|
Scooped by
mhryu@live.com
April 30, 11:47 PM
|
Proteins interact with several RNA types to facilitate a broad spectrum of cellular functions. However, the underlying interaction data are sparse, and existing methods for predicting RNA-binding residues (RBRs) in protein sequences are almost exclusively RNA type-agnostic, limiting their utility. To this end, we introduced RNAdetector, a sequence-based method that accurately predicts messenger RNA-, ribosomal RNA-, small nuclear RNA (snRNA)-, and transfer RNA-binding residues and type-agnostic RBRs. RNAdetector employs an innovative deep transformer network architecture and transfer learning, which together produce a large boost to predictive performance and minimize cross-predictions, defined as incorrectly predicting wrong types of RBRs. Moreover, our design has a low computational footprint and produces accurate predictions in about 9 s per protein, facilitating analysis of large collections of proteins. A comparative evaluation on a low-similarity test dataset showed that RNAdetector provided substantially more accurate RNA-type-specific predictions, a much shorter runtime, additionally covered snRNA, and significantly reduced cross-predictions compared to the only other RNA-type-specific predictor that is cross-prediction prone. Moreover, we empirically showed that RNAdetector’s predictions of type-agnostic RBRs are modestly more accurate than those generated by several representative predictors of RNA-type-agnostic RBRs and RNA-binding proteins. http://biomine.cs.vcu.edu/servers/RNAdetector/.
|
Scooped by
mhryu@live.com
April 30, 3:31 PM
|
Polysaccharide degradation by microorganisms is essential in driving global nutrient cycling, developing renewable fuels or chemicals, and promoting human gut health. However, complex polysaccharides are often insoluble, making it challenging to study degradation with soluble enzymes or to measure microbial growth. Several protocols exist to address this challenge, including 3D-printed biomass containment devices or agar capture systems to facilitate the study of insoluble carbohydrate degradation. While these methods are functional, they are constrained by variable substrate loading and time-consuming preparation. To address these shortcomings, a 3D-printed pipette was designed and tested as part of a newly developed agar immobilization method to quantitatively monitor the degradation of insoluble polysaccharides for a 96-well microtiter assay. The pipette and method were validated via two mechanisms. First, growth analyses of Cellvibrio japonicus wild-type or bgl2A, cbp2D, and cbp2E single deletion mutant strains were conducted using glucose, barley β-glucan, starch, cellulose, chitin, pectin, galactan, or intact yeast biomass as a sole carbon source to benchmark the new method against previously described methods. Second, the agar pipette was used to screen the growth of bacteria or yeasts capable of utilizing insoluble polysaccharides as the sole carbon source. The 3D-printed pipette and agar immobilization method enabled a faster, more consistent, and cost-effective way for high-throughput screening of bacterial or yeast growth using insoluble carbohydrates, suggesting that it can be a useful tool for environmental and applied microbial research.
|
Scooped by
mhryu@live.com
April 30, 3:19 PM
|
The issues of how microorganisms survive very long periods of desiccation and how they react during both drying and rehydration phases have long been topics of interest in a range of relevant fields, including desert ecosystem microbiomics, food storage, ancient microbe studies, and even astrobiology. The recently published study by Carini et al., who used a combination of transcriptomics and metabolomics to investigate steady-state gene expression and cellular metabolite profiles at different states of bacterial cellular desiccation, during both drying and rewetting phases, adds some valuable insights into how members of bacterial communities can survive in the driest habitats on earth (P. Carini, A. Gomez-Buckley, C. R. Guerrero, M. R. Kridler, et al., mSystems 11:e00493-25, 2026, https://doi.org/10.1128/msystems.00493-25).
|
Scooped by
mhryu@live.com
April 30, 2:16 PM
|
Since their discovery, CRISPR-Cas systems have been widely applied in areas ranging from genome editing to biosensing, owing to their specific, RNA-guided target recognition. Their performance in complex biological environments has been extensively studied, particularly to optimize guide RNA (gRNA) design and minimize off-target cleavage. Here, we focus on the kinetic inhibition of the interaction between Cas12a—a Class 2, Type V effector—and its target, caused by interference from non-cognate background nucleic acids. This effect is particularly relevant for sensing applications in complex mixtures or cellular contexts, where genome- and transcriptome-derived sequences may impede CRISPR-Cas activity. Using in vitro assays under defined conditions, we systematically examine the influence of background single-stranded RNA and double-stranded DNA (dsDNA) on reaction kinetics. We find that both the purine-to-pyrimidine ratio and the GC content of the gRNA seed region significantly affect kinetic inhibition by background polynucleotides. gRNAs with low GC content and a high purine fraction in the seed region were least affected by background sequences. A gRNA with high uracil content in the seed region exhibited particularly strong inhibition in the presence of a dsDNA background. Experiments with dCas12a-based gene activation in living cells indicate that our in vitro findings may also be relevant for in vivo applications.
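The seed-region features named in this abstract (GC content, purine fraction, uracil content) are simple to compute for a candidate gRNA. The sketch below is illustrative only: the abstract does not state the seed length, so `seed_length=8` is an assumption, as is taking the seed from the 5' (PAM-proximal) end of the spacer. The function name and example spacer are hypothetical.

```python
def seed_composition(spacer, seed_length=8):
    """Return GC, purine, and uracil fractions of a spacer's seed region.

    Assumptions (not from the abstract): the seed is the first
    `seed_length` nucleotides at the 5' end of the spacer.
    """
    # Treat DNA-style input as RNA so T and U are counted together.
    seed = spacer.upper().replace("T", "U")[:seed_length]
    if not seed or set(seed) - set("ACGU"):
        raise ValueError("spacer must be a non-empty A/C/G/U(T) sequence")
    n = len(seed)
    return {
        "seed": seed,
        "gc_fraction": sum(base in "GC" for base in seed) / n,
        "purine_fraction": sum(base in "AG" for base in seed) / n,
        "uracil_fraction": seed.count("U") / n,
    }


if __name__ == "__main__":
    # Hypothetical spacer, used only to show the call.
    print(seed_composition("AAGGAAGGCCCCUUUUACGU"))
```

Under the reported trend, a guide with a low `gc_fraction` and a high `purine_fraction` would be expected to be less affected by background nucleic acids. The code only reports composition and does not predict inhibition.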
|
Scooped by
mhryu@live.com
April 30, 2:05 PM
|
RNA naturally regulates many cellular processes, yet the engineering of RNA for use in synthetic cellular control schemes lags behind protein-based systems. Recent advancements in synthetic biology, investment in RNA therapeutics, and a better understanding of RNA structural dynamics have driven the development of novel RNA sensors and actuators. The genetic information encoded within RNA enables facile sensing and interactions with other nucleic acids, while its dynamic structure facilitates binding to a broad array of small-molecule and protein ligands. RNA can be engineered to sense these diverse inputs and transduce signals to regulate cellular activity on the transcriptional, translational, and post-translational levels to enhance microbial biosynthesis and create targeted gene therapies.
|
Scooped by
mhryu@live.com
April 30, 1:45 PM
|
Non-viral targeted integration of large DNA cargoes into human primary T cells typically requires the induction of genomic double-strand breaks (DSBs), a process associated with cytotoxicity and potential tumorigenic chromosomal abnormalities. Here we report PRIME-In, a novel genome-editing platform that uses a prime editing-engineered donor template coupled with either single (PRIME-In 1.0) or paired (PRIME-In 2.0) genomic nicks to enable precise integration of substantial DNA payloads into human cells without reliance on DSB repair pathways. Compared with traditional DSB-dependent methods, PRIME-In demonstrates markedly enhanced editing efficiency and specificity while eliminating detectable on-target and off-target chromosomal aberrations. Subsequent refinement of reagent composition and delivery protocols enabled PRIME-In-mediated engineering of primary human T cells with minimal toxicity, achieving up to 50% integration efficiency for a 3-kb CAR construct. These advances establish PRIME-In as a transformative platform for streamlining the non-viral production of genome-edited T cells, offering substantial potential for T cell-based immunotherapies. PRIME-In couples a prime edited donor template with either single or paired genomic nicks to enable precise integration of even large constructs into human cells, as exemplified by high-efficiency integration of a 3-kb CAR construct in T cells.
|
Scooped by
mhryu@live.com
April 30, 1:27 PM
|
One of the most relevant objectives in microbiome studies is the identification of microbial species that are differentially abundant across conditions. However, the compositional nature of microbiome data complicates this task. Interdependence among components leads to spurious associations when the abundances of each component are analyzed separately. Due to the growing awareness of the challenges of compositional data analysis (CoDA), log-ratio transformations, such as the additive log-ratio (alr) or the centered log-ratio (clr) transformations, have become increasingly popular in microbiome studies. Several studies have compared the performance of compositional and non-compositional methods through simulations. However, the debate between these two frameworks remains unresolved, creating confusion among researchers. Rather than relying on simulation-based results, this work provides theoretical results that enable a more rigorous and conclusive analysis of the problem, contributing to a better understanding of differential abundance estimation. We provide theoretical expressions of the bias of differential abundance estimation related to the use of proportions (total sum scaling) and log-ratio transformations (alr and clr) when estimates are interpreted as absolute rather than relative to a reference. The factors that most strongly influence the bias are the magnitude and direction of the effects, the dimension of the composition, the proportion of differentially abundant variables, and the distribution of relative abundances. The findings of this work strongly support the use of CoDA transformations; however, they also highlight that even when log-ratio transformations are applied, interpreting the results outside of a CoDA framework can still lead to biased conclusions. 
Among CoDA transformations, alr has several advantages over clr: its reference is more explicit, which reduces the risk of interpreting estimates as absolute rather than relative, and it facilitates the replication of results in independent studies, as it only requires assessing changes relative to the same reference rather than reconstructing the full composition. In this work, we propose a heuristic method for selecting a suitable alr reference component, which will enable a more widespread use of this transformation.
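The transformations discussed here can be illustrated with a small worked example. This is a minimal sketch with made-up counts, not the paper's analysis: only taxon 0 changes in absolute abundance, yet proportions and clr values shift for every taxon, while alr against an unchanged reference recovers the change. The function names are hypothetical, and all parts are assumed strictly positive (real data would need zero handling).

```python
import math


def tss(counts):
    """Total sum scaling: convert counts to proportions."""
    total = sum(counts)
    return [c / total for c in counts]


def clr(composition):
    """Centered log-ratio: log of each part relative to the geometric mean."""
    logs = [math.log(x) for x in composition]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


def alr(composition, ref_index):
    """Additive log-ratio: log of each part relative to one reference part."""
    ref_log = math.log(composition[ref_index])
    return [
        math.log(x) - ref_log
        for i, x in enumerate(composition)
        if i != ref_index
    ]


if __name__ == "__main__":
    # Hypothetical absolute abundances: only taxon 0 increases (4-fold).
    control = [100, 100, 100, 100]
    treated = [400, 100, 100, 100]

    print("TSS control:", tss(control))
    print("TSS treated:", tss(treated))  # unchanged taxa appear to drop
    print("clr treated:", clr(tss(treated)))  # unchanged taxa go negative
    print("alr treated (ref = taxon 3):", alr(tss(treated), 3))
```

With taxon 3 as the reference, the alr values for the unchanged taxa stay at zero and taxon 0 shows a log fold change of log 4. The clr values for the unchanged taxa become negative, which would be misleading if read as absolute changes.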
|
Scooped by
mhryu@live.com
April 30, 12:52 PM
|
Engineering pyrenoid-based CO2-concentrating mechanisms (pCCMs) into crop plants is a promising strategy to boost photosynthetic performance, enhance yield potential, and improve resilience to future climates. Achieving this goal requires a deeper understanding of the molecular principles that enable pyrenoids to elevate CO2 around Rubisco. Although pyrenoids span a remarkable diversity of forms across algae and hornworts, they consistently exhibit three core features: a condensed Rubisco matrix, specialized membranes that deliver inorganic carbon, and a diffusion barrier that restricts CO2 leakage. Here, we review recent mechanistic advances from Chlamydomonas reinhardtii alongside emerging insights from other algae and hornworts that highlight both conserved strategies and lineage-specific innovations in Rubisco condensation, membrane-associated inorganic carbon channelling, and matrix encapsulation. Together, these findings refine our understanding of pCCM diversity and provide an increasingly robust blueprint for reconstructing pCCMs in vascular plant chloroplasts.
|
Scooped by
mhryu@live.com
April 30, 12:25 PM
|
RNA structure critically governs biological function in both physiological and pathological contexts, making high-resolution structural maps essential for RNA-targeted therapeutics. Yet, despite recent advances, well-validated structural targets for drug design remain limited. To help bridge this gap, we generated the first genome-scale map of the human RNA structurome by applying ScanFold to >230 000 annotated human pre-mRNA transcripts, identifying sequences likely evolved to form highly stable and functional secondary structures. We also performed a global analysis of regions with z-scores ≤ –2 and statistically characterized their two-dimensional folding patterns. In addition, we developed the RNA-Annotator Pipeline to integrate 20 diverse biological annotations, such as tissue-specific expression and protein interactions, with the structural data. Our results reveal local folding propensities and unusually stable structures with high-confidence architectures, providing insights for prioritizing RNA targets and guiding therapeutic design, including antisense oligonucleotides and small molecules. All ScanFold results are publicly available through RNAStructuromeDB. Using the RNA-Annotator Pipeline, analysis of SMN1 and SMN2 pre-mRNAs showed that a single C-to-T transition in SMN2 induces structural rearrangements that disrupt a critical splicing enhancer. This toolkit establishes an integrated workflow that enables researchers to explore RNA structure–function relationships and accelerate advances in RNA-targeted drug discovery and RNA biology.
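The z-score threshold mentioned above can be read as follows: a thermodynamic z-score compares the folding free energy of a native sequence window with the energies of shuffled versions of the same window. The sketch below assumes that common definition and uses hypothetical energy values. It does not call ScanFold or any folding program, and the use of the sample standard deviation is an assumption.

```python
import statistics


def thermodynamic_z_score(native_mfe, shuffled_mfes):
    """z = (native MFE - mean shuffled MFE) / std of shuffled MFEs.

    Energies are in kcal/mol. More negative z means the native window
    is more stable than expected for its nucleotide composition.
    """
    if len(shuffled_mfes) < 2:
        raise ValueError("need at least two shuffled energies")
    spread = statistics.stdev(shuffled_mfes)
    if spread == 0:
        raise ValueError("shuffled energies have zero variance")
    return (native_mfe - statistics.mean(shuffled_mfes)) / spread


if __name__ == "__main__":
    # Hypothetical values for one window and five shuffles.
    z = thermodynamic_z_score(-30.0, [-20.0, -22.0, -18.0, -21.0, -19.0])
    print(f"z-score = {z:.2f}, unusually stable: {z <= -2}")
```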
|
Scooped by
mhryu@live.com
April 30, 1:56 AM
|
Bacteria drive crucial processes across ecosystems and profoundly impact human health, yet tools to rewrite microbiomes remain limited. Here, we show that bridge recombinase enables versatile and programmable genome editing across the bacterial tree of life. In E. coli, we achieved 142 kb insertions at >90% efficiency, megabase-scale inversions (2.3 Mb), and pathway-scale 50 kb excisions. With a single ortholog and bridge RNA (bRNA), we edited bacterial isolates spanning five phyla and performed metagenomic editing in human gut microbiomes. We overcame cross-reactivity between co-expressed bRNAs to establish single-step search-and-replace editing and demonstrated capture and interphylum transfer of functional chromosomal pathways, enabling programmable horizontal gene transfer. These advances establish bridge recombinase as a foundation for orchestrating controlled gene flow in complex microbial ecosystems.
|
Scooped by
mhryu@live.com
April 30, 1:39 AM
|
Band-pass filters, which selectively transmit signals within a defined range of input magnitudes, are fundamental components of signal-processing systems. In cellular gene circuits, band-pass behavior has likewise been pursued as a means to implement complex signal-processing functions. However, previously reported genetic band-pass circuits have typically relied not only on a large number of regulatory components but also on transcriptional cascades involving multiple transcription factors, resulting in long DNA sequences and increased circuit complexity. Here, we first propose a band-pass gene circuit that operates without transcriptional cascades. By co-expressing two variants of the transcription factor BetI that exhibit opposite input–response behaviors (one acting as an inducer-dependent activator and the other as an inducer-dependent repressor), band-pass filtering is achieved solely through differential tuning of their inducer sensitivities. This minimal architecture enables gene expression only within a specific range of intracellular choline concentrations. Furthermore, we demonstrate that this cascade-free band-pass circuit can be exploited to generate spatial expression patterns in E. coli populations in response to a choline diffusion gradient, illustrating its utility for pattern formation in multicellular contexts.
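The idea of obtaining a band-pass response from two regulators with different inducer sensitivities can be sketched with a toy model. Everything below is an assumption made for illustration: the Hill-function form, the multiplicative combination of the two regulators, and all parameter values are hypothetical and are not taken from the paper.

```python
def hill(concentration, half_max, n=2.0):
    """Fraction of regulator in its inducer-bound state (Hill function)."""
    return concentration**n / (half_max**n + concentration**n)


def band_pass_output(choline, k_activator=1.0, k_repressor=100.0, n=2.0):
    """Toy relative expression level for a cascade-free band-pass circuit.

    The activator variant switches on at low inducer levels (k_activator),
    the repressor variant switches on at higher levels (k_repressor).
    Expression needs the activator on and the repressor off.
    """
    activation = hill(choline, k_activator, n)
    repression = hill(choline, k_repressor, n)
    return activation * (1.0 - repression)


if __name__ == "__main__":
    # Sweep inducer concentration over several orders of magnitude
    # (arbitrary units) to see the low / high / low output pattern.
    for c in (0.01, 0.1, 1, 10, 100, 1000, 10000):
        print(f"choline={c:>8}: output={band_pass_output(c):.4f}")
```

In this sketch the width and position of the pass band depend only on the gap between `k_activator` and `k_repressor`, which mirrors the abstract's point that tuning the two inducer sensitivities is enough to set the band.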
|
Scooped by
mhryu@live.com
April 30, 1:32 AM
|
The global escalation of antibiotic resistance has renewed interest in antimicrobial peptides (AMP) as promising alternatives to conventional antibiotics. Although extensive experimental evidence supports their effectiveness against drug-resistant pathogens, identifying new candidates remains time-consuming and labor-intensive. Recent advances in artificial intelligence (AI) are helping to alleviate this bottleneck by accelerating the discovery of AMPs. In particular, generative approaches, including variational autoencoders (VAEs), generative adversarial networks (GANs), and diffusion models, have enabled the de novo design of AMPs with potent activity and reduced toxicity. Despite this progress, key challenges remain, including dataset quality and bias, the calibration and reliability of activity/toxicity prediction, controllable generation under realistic constraints, and rigorous, reproducible validation. This review summarizes recent advances in generative models for AMP discovery, compares different modeling strategies, and outlines mainstream validation methods used to assess peptide efficacy and safety. We provide a systematic comparison and new insights for peptide-centered generative modeling, discuss current limitations and unresolved questions, and propose practical considerations and future directions to guide the development of AMP generation models that are effective, safe, and clinically translatable.
|
Scooped by
mhryu@live.com
April 30, 1:24 AM
|
Predicting the thermodynamic stability of proteins upon single-point mutations is a pivotal step in both protein engineering and medicine. Computational methods for this task extract features at either the local or the global level, and each approach has its own advantages and limitations. To leverage the advantages of both feature types, we developed MuFaDDG, a novel sequence-based method that integrates multiscale feature fusion for improved prediction of protein stability changes (ΔΔG). MuFaDDG achieves comparable performance on the S669 benchmark, with strong performance on stabilizing mutations. Notably, it shows a significant advantage in the ACC metric, with values of 0.75, 0.88, and 0.81 on the direct, reverse, and overall datasets of the CAGI5 Challenge’s Frataxin, respectively. Furthermore, our method outperforms leading sequence-based approaches including THPLM, DDGemb, DDGun, and INPS-Seq on myoglobin stability prediction. Additionally, MuFaDDG achieves higher PCC and ACC on the protein ThreeFoil, which is not curated in the FireProtDB and ProThermDB databases.
|
Scooped by
mhryu@live.com
April 30, 11:27 PM
|
Transformer models enable functionally meaningful representation of complex biological data, such as nucleotide or protein sequences. Existing foundation transformer models are trained on large multi-domain corpuses of unlabelled DNA or protein data, showing unmatched task generalization. However, these foundation models are often outperformed on domain-specific tasks by models trained on taxonomically-constrained data, such as gene classification in prokaryotes. By extension, species-specific transformer models hold promise for targeted analyses, given sufficient training data are available. Epidemiological analysis of bacterial pathogens exemplifies the use-case of species-specific transformers, due to the wealth of genome data available, coupled with pathogen-specific analyses carried out during routine and outbreak surveillance. Here, we trained a transformer model, PanBART, on the gene content and gene order of two important and biologically distinct bacterial pathogens, Escherichia coli and Streptococcus pneumoniae, benchmarking against state-of-the-art non-transformer approaches for genomic epidemiology. We show PanBART learns representations of population structure in an unsupervised manner, and can be used to accurately assign genomes to biologically-meaningful sequence clusters. PanBART is also able to identify emergent lineages, differentiating them from pre-existing lineages, and can accurately predict genomes likely to uptake genes involved in antibiotic resistance before a transfer event has occurred. Finally, PanBART can be used to conduct co-selection analysis to identify pairs of genes likely to be found together. Our work demonstrates that species-specific transformer models can be employed in many critical public health scenarios. We lay the groundwork for wider application of such models in epidemiological analysis, and provide scenarios where such models excel.
|
Scooped by
mhryu@live.com
April 30, 3:27 PM
|
The adaptation of CRISPR technologies for molecular detection marks a significant advancement in the field of biosurveillance and infectious disease response. CRISPR-based detection systems offer superior specificity and sensitivity compared to traditional PCR methods by directly binding and cleaving target DNA or RNA sequences, thus signaling the presence of specific pathogens. These advantages include the elimination of non-specific amplification and the reduction of required genetic material, leading to faster time to results without the need for extensive amplification cycling. However, the efficacy of CRISPR technologies heavily depends on the design of specific guide RNA (gRNA) sequences tailored for each genomic target, a process that can be intricate and time-consuming. We present Cas-CRISPR Automated Design and Evaluation (CasCADE), a state-of-the-art gRNA design software platform with a high degree of flexibility and modularity. CasCADE incorporates k-mer set operations to reduce time to answer for large data inputs when compared to computationally costly multiple sequence alignment methodologies and uses an agnostic whole genome approach to maximize gRNA discovery. CasCADE can be scaled efficiently to problems of any input sequence size and can be used for design, candidate evaluation, or both, depending on user need.
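To make the phrase "k-mer set operations" concrete, the sketch below shows one generic way set operations can replace multiple sequence alignment when shortlisting target sites: keep k-mers shared by all target genomes, then subtract any k-mer seen in background genomes. This is not CasCADE's actual algorithm or interface. The function names, the default `k=20`, and the toy sequences are all assumptions.

```python
def kmers(sequence, k):
    """Return the set of all length-k substrings of a sequence."""
    sequence = sequence.upper()
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def candidate_targets(target_genomes, background_genomes, k=20):
    """k-mers present in every target genome and in no background genome."""
    if not target_genomes:
        raise ValueError("at least one target genome is required")
    # Intersection: conserved across all targets.
    shared = set.intersection(*(kmers(g, k) for g in target_genomes))
    # Union: anything found in any background genome is excluded.
    excluded = set().union(*(kmers(g, k) for g in background_genomes))
    return shared - excluded


if __name__ == "__main__":
    # Toy sequences with k=4 so the result is easy to check by eye.
    targets = ["ACGTACGG", "TTACGTAC"]
    backgrounds = ["GGCGTAGG"]
    print(sorted(candidate_targets(targets, backgrounds, k=4)))
```

A real design tool would also need to handle reverse complements, PAM constraints, and mismatch tolerance, none of which this sketch covers.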
|
Scooped by
mhryu@live.com
April 30, 3:18 PM
|
Molecular profiling enabled by meta-omics technologies has significantly expanded our knowledge of the microbial catalog across diverse environments. Increasing attention has now been focused on identifying ecologically significant taxa, particularly keystone taxa that stabilize communities, rare taxa that underpin functional redundancy, and indicator taxa that reflect environmental gradients. However, current pipeline methods remain limited in deciphering complex ecological relationships and modeling the evolution of community dynamics. As a transformative computational tool, deep learning (DL) offers novel strategies to address these challenges through autonomous feature extraction, nonlinear interaction modeling, and integration of multi-modal data sets. Nevertheless, there are still obstacles to the widespread adoption of DL for collaborative identification of specific microbial taxa, primarily the intrinsic heterogeneity and imbalance of data sets, the difficulty of model generalization across diverse ecosystems, and the limited ecological interpretability of model outputs. This review summarizes existing research advances and proposes to build a unified DL framework for multi-modal data, exploring its implementation pathways, challenges, and potential coping strategies. The envisioned framework establishes a multi-task learning architecture for unified identification of keystone, rare, and indicator taxa, incorporating domain knowledge through ecological constraint layers and explainable AI modules, while providing flexible implementation pathways for heterogeneous data integration and model customization across microbial ecosystems. This framework has the potential to form a closed-loop verification in combination with synthetic microbial community experiments, reshape the paradigm of microbial community research, and promote the transition from empirical classification to mechanistic ecological cognition.
|
Scooped by
mhryu@live.com
April 30, 2:11 PM
|
Selective identification of translationally active cells remains a key challenge in linking microbial function to population dynamics. Bioorthogonal non-canonical amino acid tagging (BONCAT) enables pulse-labelling of newly synthesized proteins in live cells, providing a time-resolved view of translational activity. While BONCAT is typically combined with fluorescent tagging and fluorescence-activated cell sorting (FACS), the reliance on high-end instrumentation and transparent matrices limits its use. Here, we present a fully benchtop workflow by coupling BONCAT with magnetic affinity-based cell separation (BONCAT–ABCS) to enable enrichment of translationally active E. coli subpopulations from a mixed culture without specialized instrumentation. Using a diauxic growth model with glucose and lactose as carbon sources and two near-isogenic strains (MV1300 (Lac-) and MV1717 (Lac+)), we quantified BONCAT-labelled and unlabelled fractions across five timepoints using AHA pulse labelling, click chemistry, and absolute qPCR targeting strain-specific chromosomal markers. BONCAT-labelled fractions exhibited significantly higher gene copy numbers and enrichment factors (up to 6.9-fold; p < 0.001, ANOVA with Tukey’s HSD) compared to non-labelled controls, with recovery efficiencies ranging from 12 to 21% and background capture < 4%. BONCAT–ABCS resolved physiological divergence during the diauxic shift, selectively enriching Lac+ cells during lactose metabolism while Lac- cells remained translationally inactive. Importantly, enrichment reflected activity-weighted population shifts rather than rare-cell amplification, highlighting the suitability of BONCAT–ABCS for bulk metabolic profiling. These results support BONCAT–ABCS as an accessible affinity-based enrichment strategy for quantifying translationally active subpopulations and highlight its potential application in bulk metabolic profiling, engineered microbial systems, microbiome analyses, and bioprocess monitoring.
|
Scooped by
mhryu@live.com
April 30, 1:49 PM
|
Targeting DNA payloads into human induced pluripotent stem cells (hiPSCs) typically requires multiple inefficient steps, slowing the testing of gene circuits and cell-fate programmes. Here we show that STRAIGHT-IN Dual enables simultaneous, allele-specific, single-copy integration of two DNA constructs efficiently within 1 week. STRAIGHT-IN Dual leverages the STRAIGHT-IN platform for near-scarless payload integration, facilitating the recycling of components for further modifications. Using STRAIGHT-IN Dual, we investigate how promoter choice and gene syntax influence transgene silencing and how these design features affect reporter expression and forward programming of hiPSCs into neurons, motor neurons and endothelial cells. We also incorporate a grazoprevir-inducible synthetic gene switch that complements tetracycline-inducible control, providing tunable and temporally controlled expression of different transcription factors within the same cell. STRAIGHT-IN Dual generates homogeneous engineered hiPSC populations, accelerating synthetic biology design–build–test cycles in stem cells and enabling controlled comparisons of circuit performances. STRAIGHT-IN Dual enables allele-specific integration of two DNA constructs into hiPSCs within 1 week. The system was used to programme hiPSCs into distinct cell types and for tunable expression of different transcription factors within the same cell.
|
Scooped by
mhryu@live.com
April 30, 1:35 PM
|
An analogous biological pump has been postulated in soils, with microbial processing of plant organic matter locking up photosynthetically fixed carbon in the form of microbial residues, particularly in the deeper mineral layers. We now believe that most plant organic matter is decomposed by soil microorganisms on a relatively short timescale (months to years), whereas it is microbially processed organic matter (dead microbial residues, or necromass) that is more chemically recalcitrant and sticks to reactive minerals, allowing it to persist for much longer (decades). Microbial carbon use efficiency, the amount of available carbon that is invested in biomass production, has become a key ecophysiological trait that determines the transfer of carbon from photosynthetically fixed to persistent soil carbon pools. Stabilization of microbial metabolites and residues on reactive mineral surfaces has also received considerable attention since the publication of this paper. One of the key areas of future research remains the question of how microbial death rates and different modes of death differentially affect the chemical recalcitrance, mineral affinity, and therefore residence time of microbial necromass in soil.
|
Scooped by
mhryu@live.com
April 30, 12:56 PM
|
Liver diseases, including hepatocellular carcinoma (HCC), non-alcoholic fatty liver disease (NAFLD), and alcoholic liver disease (ALD), impose a significant global health burden, with over 2 million deaths annually and substantial economic losses. Current treatments, primarily pharmacological, face challenges such as insufficient efficacy, poor absorption, and drug resistance. The liver-gut axis, a critical pathway linking the liver and intestines, offers a therapeutic target for these diseases. Engineered live bacteria, modified through genetic engineering and synthetic biology, have emerged as a promising alternative. These bacteria can be designed to deliver therapeutic agents directly to the liver or gut, enhancing efficacy and reducing systemic side effects. This review explores the application of engineered live bacteria in treating liver diseases, focusing on strains such as Bifidobacterium, Escherichia coli Nissle 1917, Bacillus subtilis, Saccharomyces boulardii, and Lactobacillus reuteri. It also pays attention to the internal genetic modification and external covalent connection of the original live bacteria. We discuss the design of dosage forms, including capsule formulations, microencapsulation, and nanopreparations, and administration methods like oral and in situ injection. Additionally, we address the challenges and future prospects of using engineered live bacteria to target liver diseases and related conditions, aiming to advance their clinical application and reduce the global burden of liver diseases.
|
Scooped by
mhryu@live.com
April 30, 12:43 PM
|
RNAanalyzer3 (“RNA analyzer cubic” https://rnaanalyzer.bioapps.biozentrum.uni-wuerzburg.de) substitutes the frequently consulted current RNAanalyzer webserver (https://rnaanalyzer-old.bioapps.biozentrum.uni-wuerzburg.de). RNAanalyzer3 is free/open via the secure HTTPS protocol, with example data, help and tutorial, web-link to results, and rich data output. We combine a general detailed structure analysis with motif analyses. It accepts either a single plain-text nucleotide sequence or batch submission in FASTA format, which can be pasted or uploaded as a FASTA file. Our tool (i) has up-to-date software and operating systems, (ii) combines diverse RNA motif analyses with RNA structure prediction, (iii) puts found motifs into structural context, and (iv) offers dedicated tools for probing RNA–protein binding interactions. RNAanalyzer3 links motif searches to Rfam and miRNA search to miRbase. It focuses on structural features first, looks for stem-loops, hairpins, and specific enrichment regions such as stem-GG pairs, plus AU-rich regions with their locations for easier identification, while providing structural context and interactive RNA structure visualization. A tabulated overview shows all RNA features including structure details, coding potential, untranslated regions (UTRs, including Shine–Dalgarno sequences, Kozak sequences, and polyadenylation signals), transfer RNA (tRNA), microRNA (miRNA), long noncoding RNA (lncRNA), trans-splicing motifs, iron response elements (IRE), riboswitches, small nuclear ribonucleoprotein (snRNP) motifs, and spliceosomal Sm-sites.
|
Scooped by
mhryu@live.com
April 30, 2:00 AM
|
High-quality protein structure models have become widely available, offering insights into protein function, yet they remain underutilized. Here, we introduce metagenomic-deepFRI, a framework incorporating structural templates into functional annotation pipelines at speeds comparable to sequence-alignment methods. Notably, structural features improved GO term prediction confidence and Information Content by up to 50%. Applied to metagenomic datasets, the framework achieves nearly 90% annotation coverage, enabling protein function inference without explicit orthology-based transfer.
|
Scooped by
mhryu@live.com
April 30, 1:47 AM
|
Many common bacteria use quorum sensing to regulate cell density-dependent phenotypes, including luminescence, biofilm formation, virulence, and symbiosis. The LuxI/R system is the best-characterized quorum sensing pathway in Gram-negative bacteria and consists of a LuxI-type synthase that produces an N-acyl L-homoserine lactone (AHL) autoinducer and a LuxR-type transcription factor that is regulated by AHL binding. Binding of native AHL signal promotes DNA binding and transcriptional regulation in some LuxR homologs (associative-type), while other homologs regulate transcription in the absence of ligand and are inactivated by native signal binding (dissociative-type). To better characterize what features determine ligand-response type, we generated structural mutants of two associative receptors (LasR of Pseudomonas aeruginosa and MrtR of Mesorhizobium tianshanense) and two dissociative receptors (EsaR of Pantoea stewartii and ExpR2 of Pectobacterium versatile). Swapping domains between these receptors revealed that the ligand-binding domain primarily determines associative vs. dissociative activity in response to native AHL agonists. Further, non-native AHL-derived antagonists maintained their activity profiles in receptors with interchanged DNA-binding domains. We also found that the extended linker between domains observed in the dissociative receptors does not determine mechanism of ligand response, and that inter-domain interactions may play an important role in activation for some receptors but not others. Notably, deletion of just one residue from the dissociative receptor EsaR produced a mutant with associative activity, the first time such mechanism switching has been reported for a LuxR-type receptor. These findings illuminate features essential for ligand response and highlight the mechanistic diversity of the LuxR family.
|
Scooped by
mhryu@live.com
April 30, 1:36 AM
|
The rapid expansion of eukaryotic genome sequencing has created an urgent demand for scalable and accurate gene annotation, particularly for large-scale genomic initiatives such as the Earth BioGenome Project (EBP). Existing ab initio methods often struggle with complex gene architectures and exhibit limited cross-lineage generalizability. Moreover, these frameworks typically treat repetitive DNA sequences (repeats) as genomic noise to be pre-masked, leaving the joint modeling of genes and repeats largely unexplored. Here we present OrionGeno, a multispecies phylogeny-aware deep learning framework for end-to-end eukaryotic genome annotation. By integrating phylogenetic context into model learning, OrionGeno resolves complex gene structure variations across divergent lineages, jointly predicting exon-intron architectures, UTRs, and repeats directly from genomic sequences. Across Vertebrates, Invertebrates, Viridiplantae and Fungi, OrionGeno consistently outperforms state-of-the-art methods, achieving a 37.2% relative improvement in protein-level F1 score over the existing best-performing method. Beyond benchmarking, OrionGeno identifies novel loci within well-curated model genomes and generates high-confidence annotations for ~1,200 previously uncharacterized species, expanding NCBI's family-level coverage by 40.5%. As an evidence-independent approach, OrionGeno bridges the gap between genome sequencing and functional discovery, holding promise for large-scale biodiversity initiatives like the EBP.
|
Scooped by
mhryu@live.com
April 30, 1:27 AM
|
Most proteins act through interactions with other molecules, yet predicting how single mutations perturb these interactions (defined as ‘protein codes’) remains a central challenge in computational biology. Here we introduce eSIG-Net, the edgetic mutation sequence-based interaction grammar network, a language model that integrates protein sequence embeddings with syntax-aware and evolution-aware mutation encoding and contrastive learning to predict mutation-driven interaction changes. eSIG-Net outperforms state-of-the-art sequence-based and structure-based methods, nominates causal variants and provides mechanistic insights. Together, these results show that eSIG-Net is a mutation-centric interaction language model that accurately predicts interaction-specific network rewiring from sequence information alone and generalizes across biological contexts.
|