WorldWideScience

Sample records for sequence analysis reveals

  1. Sequence analysis of serum albumins reveals the molecular evolution of ligand recognition properties.

    Science.gov (United States)

    Fanali, Gabriella; Ascenzi, Paolo; Bernardi, Giorgio; Fasano, Mauro

    2012-01-01

    Serum albumin (SA) is a circulating protein providing a depot and carrier for many endogenous and exogenous compounds. At least seven major binding sites have been identified by structural and functional investigations mainly in human SA. SA is conserved in vertebrates, with at least 49 entries in protein sequence databases. The multiple sequence analysis of this set of entries leads to the definition of a cladistic tree for the molecular evolution of SA orthologs in vertebrates, thus showing the clustering of the considered species, with lamprey SAs (Lethenteron japonicum and Petromyzon marinus) in a separate outgroup. Sequence analysis aimed at searching conserved domains revealed that most SA sequences are made up by three repeated domains (about 600 residues), as extensively characterized for human SA. On the contrary, lamprey SAs are giant proteins (about 1400 residues) comprising seven repeated domains. The phylogenetic analysis of the SA family reveals a stringent correlation with the taxonomic classification of the species available in sequence databases. A focused inspection of the sequences of ligand binding sites in SA revealed that in all sites most residues involved in ligand binding are conserved, although the versatility towards different ligands could be peculiar of higher organisms. Moreover, the analysis of molecular links between the different sites suggests that allosteric modulation mechanisms could be restricted to higher vertebrates.

  2. Multilocus sequence analysis of nectar pseudomonads reveals high genetic diversity and contrasting recombination patterns.

    Science.gov (United States)

    Alvarez-Pérez, Sergio; de Vega, Clara; Herrera, Carlos M

    2013-01-01

    The genetic and evolutionary relationships among floral nectar-dwelling Pseudomonas 'sensu stricto' isolates associated to South African and Mediterranean plants were investigated by multilocus sequence analysis (MLSA) of four core housekeeping genes (rrs, gyrB, rpoB and rpoD). A total of 35 different sequence types were found for the 38 nectar bacterial isolates characterised. Phylogenetic analyses resulted in the identification of three main clades [nectar groups (NGs) 1, 2 and 3] of nectar pseudomonads, which were closely related to five intrageneric groups: Pseudomonas oryzihabitans (NG 1); P. fluorescens, P. lutea and P. syringae (NG 2); and P. rhizosphaerae (NG 3). Linkage disequilibrium analysis pointed to a mostly clonal population structure, even when the analysis was restricted to isolates from the same floristic region or belonging to the same NG. Nevertheless, signatures of recombination were observed for NG 3, which exclusively included isolates retrieved from the floral nectar of insect-pollinated Mediterranean plants. In contrast, the other two NGs comprised both South African and Mediterranean isolates. Analyses relating diversification to floristic region and pollinator type revealed that there has been more unique evolution of the nectar pseudomonads within the Mediterranean region than would be expected by chance. This is the first work analysing the sequence of multiple loci to reveal geno- and ecotypes of nectar bacteria.

  3. Multilocus Sequence Analysis of Nectar Pseudomonads Reveals High Genetic Diversity and Contrasting Recombination Patterns

    Science.gov (United States)

    Álvarez-Pérez, Sergio; de Vega, Clara; Herrera, Carlos M.

    2013-01-01

    The genetic and evolutionary relationships among floral nectar-dwelling Pseudomonas ‘sensu stricto’ isolates associated to South African and Mediterranean plants were investigated by multilocus sequence analysis (MLSA) of four core housekeeping genes (rrs, gyrB, rpoB and rpoD). A total of 35 different sequence types were found for the 38 nectar bacterial isolates characterised. Phylogenetic analyses resulted in the identification of three main clades [nectar groups (NGs) 1, 2 and 3] of nectar pseudomonads, which were closely related to five intrageneric groups: Pseudomonas oryzihabitans (NG 1); P. fluorescens, P. lutea and P. syringae (NG 2); and P. rhizosphaerae (NG 3). Linkage disequilibrium analysis pointed to a mostly clonal population structure, even when the analysis was restricted to isolates from the same floristic region or belonging to the same NG. Nevertheless, signatures of recombination were observed for NG 3, which exclusively included isolates retrieved from the floral nectar of insect-pollinated Mediterranean plants. In contrast, the other two NGs comprised both South African and Mediterranean isolates. Analyses relating diversification to floristic region and pollinator type revealed that there has been more unique evolution of the nectar pseudomonads within the Mediterranean region than would be expected by chance. This is the first work analysing the sequence of multiple loci to reveal geno- and ecotypes of nectar bacteria. PMID:24116076

  4. Re-Analysis of Metagenomic Sequences from Acute Flaccidmyelitis Patients Reveals Alternatives to Enterovirus D68 Infection

    Science.gov (United States)

    2015-07-13

    caused in some cases by infection with enterovirus D68. We found that among the patients whose symptoms were previously attributed to enterovirus D68...distribution is unlimited. Re-analysis of metagenomic sequences from acute flaccidmyelitis patients reveals alternatives to enterovirus D68...Street Baltimore, MD 21218 -2685 ABSTRACT Re-analysis of metagenomic sequences from acute flaccidmyelitis patients reveals alternatives to enterovirus

  5. Molecular evolution and diversification of snake toxin genes, revealed by analysis of intron sequences.

    Science.gov (United States)

    Fujimi, T J; Nakajyo, T; Nishimura, E; Ogura, E; Tsuchiya, T; Tamiya, T

    2003-08-14

    The genes encoding erabutoxin (short chain neurotoxin) isoforms (Ea, Eb, and Ec), LsIII (long chain neurotoxin) and a novel long chain neurotoxin pseudogene were cloned from a Laticauda semifasciata genomic library. Short and long chain neurotoxin genes were also cloned from the genome of Laticauda laticaudata, a closely related species of L. semifasciata, by PCR. A putative matrix attached region (MAR) sequence was found in the intron I of the LsIII gene. Comparative analysis of 11 structurally relevant snake toxin genes (three-finger-structure toxins) revealed the molecular evolution of these toxins. Three-finger-structure toxin genes diverged from a common ancestor through two types of evolutionary pathways (long and short types), early in the course of evolution. At a later stage of evolution in each gene, the accumulation of mutations in the exons, especially exon II, by accelerated evolution may have caused the increased diversification in their functions. It was also revealed that the putative MAR sequence found in the LsIII gene was integrated into the gene after the species-level divergence.

  6. Deep sequencing analysis of the developing mouse brain reveals a novel microRNA

    Directory of Open Access Journals (Sweden)

    Piltz Sandra

    2011-04-01

    Full Text Available Abstract Background MicroRNAs (miRNAs are small non-coding RNAs that can exert multilevel inhibition/repression at a post-transcriptional or protein synthesis level during disease or development. Characterisation of miRNAs in adult mammalian brains by deep sequencing has been reported previously. However, to date, no small RNA profiling of the developing brain has been undertaken using this method. We have performed deep sequencing and small RNA analysis of a developing (E15.5 mouse brain. Results We identified the expression of 294 known miRNAs in the E15.5 developing mouse brain, which were mostly represented by let-7 family and other brain-specific miRNAs such as miR-9 and miR-124. We also discovered 4 putative 22-23 nt miRNAs: mm_br_e15_1181, mm_br_e15_279920, mm_br_e15_96719 and mm_br_e15_294354 each with a 70-76 nt predicted pre-miRNA. We validated the 4 putative miRNAs and further characterised one of them, mm_br_e15_1181, throughout embryogenesis. Mm_br_e15_1181 biogenesis was Dicer1-dependent and was expressed in E3.5 blastocysts and E7 whole embryos. Embryo-wide expression patterns were observed at E9.5 and E11.5 followed by a near complete loss of expression by E13.5, with expression restricted to a specialised layer of cells within the developing and early postnatal brain. Mm_br_e15_1181 was upregulated during neurodifferentiation of P19 teratocarcinoma cells. This novel miRNA has been identified as miR-3099. Conclusions We have generated and analysed the first deep sequencing dataset of small RNA sequences of the developing mouse brain. The analysis revealed a novel miRNA, miR-3099, with potential regulatory effects on early embryogenesis, and involvement in neuronal cell differentiation/function in the brain during late embryonic and early neonatal development.

  7. Deep Sequence Analysis of AgoshRNA Processing Reveals 3' A Addition and Trimming.

    Science.gov (United States)

    Harwig, Alex; Herrera-Carrillo, Elena; Jongejan, Aldo; van Kampen, Antonius Hubertus; Berkhout, Ben

    2015-07-14

    The RNA interference (RNAi) pathway, in which microprocessor and Dicer collaborate to process microRNAs (miRNA), was recently expanded by the description of alternative processing routes. In one of these noncanonical pathways, Dicer action is replaced by the Argonaute2 (Ago2) slicer function. It was recently shown that the stem-length of precursor-miRNA or short hairpin RNA (shRNA) molecules is a major determinant for Dicer versus Ago2 processing. Here we present the results of a deep sequence study on the processing of shRNAs with different stem length and a top G·U wobble base pair (bp). This analysis revealed some unexpected properties of these so-called AgoshRNA molecules that are processed by Ago2 instead of Dicer. First, we confirmed the gradual shift from Dicer to Ago2 processing upon shortening of the hairpin length. Second, hairpins with a stem larger than 19 base pair are inefficiently cleaved by Ago2 and we noticed a shift in the cleavage site. Third, the introduction of a top G·U bp in a regular shRNA can promote Ago2-cleavage, which coincides with a loss of Ago2-loading of the Dicer-cleaved 3' strand. Fourth, the Ago2-processed AgoshRNAs acquire a short 3' tail of 1-3 A-nucleotides (nt) and we present evidence that this product is subsequently trimmed by the poly(A)-specific ribonuclease (PARN).

  8. Genome sequencing and analysis reveals possible determinants of Staphylococcus aureus nasal carriage

    Directory of Open Access Journals (Sweden)

    Cole Alexander M

    2008-09-01

    Full Text Available Abstract Background Nasal carriage of Staphylococcus aureus is a major risk factor in clinical and community settings due to the range of etiologies caused by the organism. We have identified unique immunological and ultrastructural properties associated with nasal carriage isolates denoting a role for bacterial factors in nasal carriage. However, despite extensive molecular level characterizations by several groups suggesting factors necessary for colonization on nasal epithelium, genetic determinants of nasal carriage are unknown. Herein, we have set a genomic foundation for unraveling the bacterial determinants of nasal carriage in S. aureus. Results MLST analysis revealed no lineage specific differences between carrier and non-carrier strains suggesting a role for mobile genetic elements. We completely sequenced a model carrier isolate (D30 and a model non-carrier strain (930918-3 to identify differential gene content. Comparison revealed the presence of 84 genes unique to the carrier strain and strongly suggests a role for Type VII secretion systems in nasal carriage. These genes, along with a putative pathogenicity island (SaPIBov present uniquely in the carrier strains are likely important in affecting carriage. Further, PCR-based genotyping of other clinical isolates for a specific subset of these 84 genes raise the possibility of nasal carriage being caused by multiple gene sets. Conclusion Our data suggest that carriage is likely a heterogeneic phenotypic trait and implies a role for nucleotide level polymorphism in carriage. Complete genome level analyses of multiple carriage strains of S. aureus will be important in clarifying molecular determinants of S. aureus nasal carriage.

  9. Complete genome sequence analysis of novel human bocavirus reveals genetic recombination between human bocavirus 2 and human bocavirus 4.

    Science.gov (United States)

    Khamrin, Pattara; Okitsu, Shoko; Ushijima, Hiroshi; Maneekarn, Niwat

    2013-07-01

    Epidemiological surveillance of human bocavirus (HBoV) was conducted on fecal specimens collected from hospitalized children with diarrhea in Chiang Mai, Thailand in 2011. By partial sequence analysis of VP1 gene, an unusual strain of HBoV (CMH-S011-11), was initially identified as HBoV4. The complete genome sequence of CMH-S011-11 was performed and analyzed further to clarify whether it was a recombinant strain or a new HBoV variant. Analysis of complete genome sequence revealed that the coding sequence starting from NS1, NP1 to VP1/VP2 was 4795 nucleotides long. Interestingly, the nucleotide sequence of NS1 gene of CMH-S011-11 was most closely related to the HBoV2 reference strains detected in Pakistan, which contradicted to the initial genotyping result of the partial VP1 region in the previous study. In addition, comparison of NP1 nucleotide sequence of CMH-S011-11 with those of other HBoV1-4 reference strains also revealed a high level of sequence identity with HBoV2. On the other hand, nucleotide sequence of VP1/VP2 gene of CMH-S011-11 was most closely related to those of HBoV4 reference strains detected in Nigeria. The overall full-length sequence analysis revealed that this CMH-S011-11 was grouped within HBoV4 species, but located in a separate branch from other HBoV4 prototype strains. Recombination analysis revealed that CMH-S011-11 was the result of recombination between HBoV2 and HBoV4 strains with the break point located near the start codon of VP2. Copyright © 2013 Elsevier B.V. All rights reserved.

  10. Deep Sequence Analysis of AgoshRNA Processing Reveals 3’ A Addition and Trimming

    Directory of Open Access Journals (Sweden)

    Alex Harwig

    2015-01-01

    Full Text Available The RNA interference (RNAi pathway, in which microprocessor and Dicer collaborate to process microRNAs (miRNA, was recently expanded by the description of alternative processing routes. In one of these noncanonical pathways, Dicer action is replaced by the Argonaute2 (Ago2 slicer function. It was recently shown that the stem-length of precursor-miRNA or short hairpin RNA (shRNA molecules is a major determinant for Dicer versus Ago2 processing. Here we present the results of a deep sequence study on the processing of shRNAs with different stem length and a top G·U wobble base pair (bp. This analysis revealed some unexpected properties of these so-called AgoshRNA molecules that are processed by Ago2 instead of Dicer. First, we confirmed the gradual shift from Dicer to Ago2 processing upon shortening of the hairpin length. Second, hairpins with a stem larger than 19 base pair are inefficiently cleaved by Ago2 and we noticed a shift in the cleavage site. Third, the introduction of a top G·U bp in a regular shRNA can promote Ago2-cleavage, which coincides with a loss of Ago2-loading of the Dicer-cleaved 3’ strand. Fourth, the Ago2-processed AgoshRNAs acquire a short 3’ tail of 1–3 A-nucleotides (nt and we present evidence that this product is subsequently trimmed by the poly(A-specific ribonuclease (PARN.

  11. Time Correlations of Lightning Flash Sequences in Thunderstorms Revealed by Fractal Analysis

    Science.gov (United States)

    Gou, Xueqiang; Chen, Mingli; Zhang, Guangshu

    2018-01-01

    By using the data of lightning detection and ranging system at the Kennedy Space Center, the temporal fractal and correlation of interevent time series of lightning flash sequences in thunderstorms have been investigated with Allan factor (AF), Fano factor (FF), and detrended fluctuation analysis (DFA) methods. AF, FF, and DFA methods are powerful tools to detect the time-scaling structures and correlations in point processes. Totally 40 thunderstorms with distinguishing features of a single-cell storm and apparent increase and decrease in the total flash rate were selected for the analysis. It is found that the time-scaling exponents for AF (αAF) and FF (αFF) analyses are 1.62 and 0.95 in average, respectively, indicating a strong time correlation of the lightning flash sequences. DFA analysis shows that there is a crossover phenomenon—a crossover timescale (τc) ranging from 54 to 195 s with an average of 114 s. The occurrence of a lightning flash in a thunderstorm behaves randomly at timescales τc but shows strong time correlation at scales >τc. Physically, these may imply that the establishment of an extensive strong electric field necessary for the occurrence of a lightning flash needs a timescale >τc, which behaves strongly time correlated. But the initiation of a lightning flash within a well-established extensive strong electric field may involve the heterogeneities of the electric field at a timescale τc, which behave randomly.

  12. Comparative genomic sequence analysis of strawberry and other rosids reveals significant microsynteny

    Directory of Open Access Journals (Sweden)

    Abbott Albert

    2010-06-01

    Full Text Available Abstract Background Fragaria belongs to the Rosaceae, an economically important family that includes a number of important fruit producing genera such as Malus and Prunus. Using genomic sequences from 50 Fragaria fosmids, we have examined the microsynteny between Fragaria and other plant models. Results In more than half of the strawberry fosmids, we found syntenic regions that are conserved in Populus, Vitis, Medicago and/or Arabidopsis with Populus containing the greatest number of syntenic regions with Fragaria. The longest syntenic region was between LG VIII of the poplar genome and the strawberry fosmid 72E18, where seven out of twelve predicted genes were collinear. We also observed an unexpectedly high level of conserved synteny between Fragaria (rosid I and Vitis (basal rosid. One of the strawberry fosmids, 34E24, contained a cluster of R gene analogs (RGAs with NBS and LRR domains. We detected clusters of RGAs with high sequence similarity to those in 34E24 in all the genomes compared. In the phylogenetic tree we have generated, all the NBS-LRR genes grouped together with Arabidopsis CNL-A type NBS-LRR genes. The Fragaria RGA grouped together with those of Vitis and Populus in the phylogenetic tree. Conclusions Our analysis shows considerable microsynteny between Fragaria and other plant genomes such as Populus, Medicago, Vitis, and Arabidopsis to a lesser degree. We also detected a cluster of NBS-LRR type genes that are conserved in all the genomes compared.

  13. Computational sequence analysis of predicted long dsRNA transcriptomes of major crops reveals sequence complementarity with human genes.

    Science.gov (United States)

    Jensen, Peter D; Zhang, Yuanji; Wiggins, B Elizabeth; Petrick, Jay S; Zhu, Jin; Kerstetter, Randall A; Heck, Gregory R; Ivashuta, Sergey I

    2013-01-01

    Long double-stranded RNAs (long dsRNAs) are precursors for the effector molecules of sequence-specific RNA-based gene silencing in eukaryotes. Plant cells can contain numerous endogenous long dsRNAs. This study demonstrates that such endogenous long dsRNAs in plants have sequence complementarity to human genes. Many of these complementary long dsRNAs have perfect sequence complementarity of at least 21 nucleotides to human genes; enough complementarity to potentially trigger gene silencing in targeted human cells if delivered in functional form. However, the number and diversity of long dsRNA molecules in plant tissue from crops such as lettuce, tomato, corn, soy and rice with complementarity to human genes that have a long history of safe consumption supports a conclusion that long dsRNAs do not present a significant dietary risk.

  14. Multilocus sequence analysis (MLSA) of Bradyrhizobium strains: revealing high diversity of tropical diazotrophic symbiotic bacteria.

    Science.gov (United States)

    Delamuta, Jakeline Renata Marçon; Ribeiro, Renan Augusto; Menna, Pâmela; Bangel, Eliane Villamil; Hungria, Mariangela

    2012-04-01

    Symbiotic association of several genera of bacteria collectively called as rhizobia and plants belonging to the family Leguminosae (=Fabaceae) results in the process of biological nitrogen fixation, playing a key role in global N cycling, and also bringing relevant contributions to the agriculture. Bradyrhizobium is considered as the ancestral of all nitrogen-fixing rhizobial species, probably originated in the tropics. The genus encompasses a variety of diverse bacteria, but the diversity captured in the analysis of the 16S rRNA is often low. In this study, we analyzed twelve Bradyrhizobium strains selected from previous studies performed by our group for showing high genetic diversity in relation to the described species. In addition to the 16S rRNA, five housekeeping genes (recA, atpD, glnII, gyrB and rpoB) were analyzed in the MLSA (multilocus sequence analysis) approach. Analysis of each gene and of the concatenated housekeeping genes captured a considerably higher level of genetic diversity, with indication of putative new species. The results highlight the high genetic variability associated with Bradyrhizobium microsymbionts of a variety of legumes. In addition, the MLSA approach has proved to represent a rapid and reliable method to be employed in phylogenetic and taxonomic studies, speeding the identification of the still poorly known diversity of nitrogen-fixing rhizobia in the tropics.

  15. Salmon louse (Lepeophtheirus salmonis transcriptomes during post molting maturation and egg production, revealed using EST-sequencing and microarray analysis

    Directory of Open Access Journals (Sweden)

    Jonassen Inge

    2008-03-01

    Full Text Available Abstract Background Lepeophtheirus salmonis is an ectoparasitic copepod feeding on skin, mucus and blood from salmonid hosts. Initial analysis of EST sequences from pre adult and adult stages of L. salmonis revealed a large proportion of novel transcripts. In order to link unknown transcripts to biological functions we have combined EST sequencing and microarray analysis to characterize female salmon louse transcriptomes during post molting maturation and egg production. Results EST sequence analysis shows that 43% of the ESTs have no significant hits in GenBank. Sequenced ESTs assembled into 556 contigs and 1614 singletons and whenever homologous genes were identified no clear correlation with homologous genes from any specific animal group was evident. Sequence comparison of 27 L. salmonis proteins with homologous proteins in humans, zebrafish, insects and crustaceans revealed an almost identical sequence identity with all species. Microarray analysis of maturing female adult salmon lice revealed two major transcription patterns; up-regulation during the final molting followed by down regulation and female specific up regulation during post molting growth and egg production. For a third minor group of ESTs transcription decreased during molting from pre-adult II to immature adults. Genes regulated during molting typically gave hits with cuticula proteins whilst transcripts up regulated during post molting growth were female specific, including two vitellogenins. Conclusion The copepod L.salmonis contains high a level of novel genes. Among analyzed L.salmonis proteins, sequence identities with homologous proteins in crustaceans are no higher than to homologous proteins in humans. Three distinct processes, molting, post molting growth and egg production correlate with transcriptional regulation of three groups of transcripts; two including genes related to growth, one including genes related to egg production. The function of the regulated

  16. Sequence analysis reveals how G protein-coupled receptors transduce the signal to the G protein.

    NARCIS (Netherlands)

    Oliveira, L.; Paiva, P.B.; Paiva, A.C.; Vriend, G.

    2003-01-01

    Sequence entropy-variability plots based on alignments of very large numbers of sequences-can indicate the location in proteins of the main active site and modulator sites. In the previous article in this issue, we applied this observation to a series of well-studied proteins and concluded that it

  17. Complete sequence analysis reveals two distinct poleroviruses infecting cucurbits in China.

    Science.gov (United States)

    Xiang, Hai-ying; Shang, Qiao-xia; Han, Cheng-gui; Li, Da-wei; Yu, Jia-lin

    2008-01-01

    The complete RNA genomes of a Chinese isolate of cucurbit aphid-borne yellows virus (CABYV-CHN) and a new polerovirus tentatively referred to as melon aphid-borne yellows virus (MABYV) were determined. The entire genome of CABYV-CHN shared 89.0% nucleotide sequence identity with the French CABYV isolate. In contrast, nucleotide sequence identities between MABYV and CABYV and other poleroviruses were in the range of 50.7-74.2%, with amino acid sequence identities ranging from 24.8 to 82.9% for individual gene products. We propose that CABYV-CHN is a strain of CABYV and that MABYV is a member of a tentative distinct species within the genus Polerovirus.

  18. Sequencing analysis reveals a unique gene organization in the gyrB region of Mycoplasma hominis

    DEFF Research Database (Denmark)

    Ladefoged, Søren; Christiansen, Gunna

    1994-01-01

    of which showed similarity to that which encodes the LicA protein of Haemophilus influenzae. The organization of the genes in the region showed no resemblance to that in the corresponding regions of other bacteria sequenced so far. The gyrA gene was mapped 35 kb downstream from the gyrB gene.......The homolog of the gyrB gene, which has been reported to be present in the vicinity of the initiation site of replication in bacteria, was mapped on the Mycoplasma hominis genome, and the region was subsequently sequenced. Five open reading frames were identified flanking the gyrB gene, one...

  19. Sequence analysis of RNase MRP RNA reveals its origination from eukaryotic RNase P RNA

    Science.gov (United States)

    Zhu, Yanglong; Stribinskis, Vilius; Ramos, Kenneth S.; Li, Yong

    2006-01-01

    RNase MRP is a eukaryote-specific endoribonuclease that generates RNA primers for mitochondrial DNA replication and processes precursor rRNA. RNase P is a ubiquitous endoribonuclease that cleaves precursor tRNA transcripts to produce their mature 5′ termini. We found extensive sequence homology of catalytic domains and specificity domains between their RNA subunits in many organisms. In Candida glabrata, the internal loop of helix P3 is 100% conserved between MRP and P RNAs. The helix P8 of MRP RNA from microsporidia Encephalitozoon cuniculi is identical to that of P RNA. Sequence homology can be widely spread over the whole molecule of MRP RNA and P RNA, such as those from Dictyostelium discoideum. These conserved nucleotides between the MRP and P RNAs strongly support the hypothesis that the MRP RNA is derived from the P RNA molecule in early eukaryote evolution. PMID:16540690

  20. Bacterial Pathogens and Community Composition in Advanced Sewage Treatment Systems Revealed by Metagenomics Analysis Based on High-Throughput Sequencing

    Science.gov (United States)

    Lu, Xin; Zhang, Xu-Xiang; Wang, Zhu; Huang, Kailong; Wang, Yuan; Liang, Weigang; Tan, Yunfei; Liu, Bo; Tang, Junying

    2015-01-01

    This study used 454 pyrosequencing, Illumina high-throughput sequencing and metagenomic analysis to investigate bacterial pathogens and their potential virulence in a sewage treatment plant (STP) applying both conventional and advanced treatment processes. Pyrosequencing and Illumina sequencing consistently demonstrated that Arcobacter genus occupied over 43.42% of total abundance of potential pathogens in the STP. At species level, potential pathogens Arcobacter butzleri, Aeromonas hydrophila and Klebsiella pneumonia dominated in raw sewage, which was also confirmed by quantitative real time PCR. Illumina sequencing also revealed prevalence of various types of pathogenicity islands and virulence proteins in the STP. Most of the potential pathogens and virulence factors were eliminated in the STP, and the removal efficiency mainly depended on oxidation ditch. Compared with sand filtration, magnetic resin seemed to have higher removals in most of the potential pathogens and virulence factors. However, presence of the residual A. butzleri in the final effluent still deserves more concerns. The findings indicate that sewage acts as an important source of environmental pathogens, but STPs can effectively control their spread in the environment. Joint use of the high-throughput sequencing technologies is considered a reliable method for deep and comprehensive overview of environmental bacterial virulence. PMID:25938416

  1. Major soybean maturity gene haplotypes revealed by SNPViz analysis of 72 sequenced soybean genomes.

    Directory of Open Access Journals (Sweden)

    Tiffany Langewisch

    Full Text Available In this Genomics Era, vast amounts of next-generation sequencing data have become publicly available for multiple genomes across hundreds of species. Analyses of these large-scale datasets can become cumbersome, especially when comparing nucleotide polymorphisms across many samples within a dataset and among different datasets or organisms. To facilitate the exploration of allelic variation and diversity, we have developed and deployed an in-house computer software to categorize and visualize these haplotypes. The SNPViz software enables users to analyze region-specific haplotypes from single nucleotide polymorphism (SNP datasets for different sequenced genomes. The examination of allelic variation and diversity of important soybean [Glycine max (L. Merr.] flowering time and maturity genes may provide additional insight into flowering time regulation and enhance researchers' ability to target soybean breeding for particular environments. For this study, we utilized two available soybean genomic datasets for a total of 72 soybean genotypes encompassing cultivars, landraces, and the wild species Glycine soja. The major soybean maturity genes E1, E2, E3, and E4 along with the Dt1 gene for plant growth architecture were analyzed in an effort to determine the number of major haplotypes for each gene, to evaluate the consistency of the haplotypes with characterized variant alleles, and to identify evidence of artificial selection. The results indicated classification of a small number of predominant haplogroups for each gene and important insights into possible allelic diversity for each gene within the context of known causative mutations. The software has both a stand-alone and web-based version and can be used to analyze other genes, examine additional soybean datasets, and view similar genome sequence and SNP datasets from other species.

  2. Comparative transcriptome analysis within the Lolium/Festuca species complex reveals high sequence conservation

    DEFF Research Database (Denmark)

    Czaban, Adrian; Sharma, Sapna; Byrne, Stephen

    2015-01-01

    species from the Lolium-Festuca complex, ranging from 52,166 to 72,133 transcripts per assembly. We have also predicted a set of proteins and validated it with a high-confidence protein database from three closely related species (H. vulgare, B. distachyon and O. sativa). We have obtained gene family...... clusters for the four species using OrthoMCL and analyzed their inferred phylogenetic relationships. Our results indicate that VRN2 is a candidate gene for differentiating vernalization and non-vernalization types in the Lolium-Festuca complex. Grouping of the gene families based on their BLAST identity...... enabled us to divide ortholog groups into those that are very conserved and those that are more evolutionarily relaxed. The ratio of the non-synonumous to synonymous substitutions enabled us to pinpoint protein sequences evolving in response to positive selection. These proteins may explain some...

  3. Sequence analysis of chromosome 1 revealed different selection patterns between Chinese wild mice and laboratory strains.

    Science.gov (United States)

    Xu, Fuyi; Hu, Shixian; Chao, Tianzhu; Wang, Maochun; Li, Kai; Zhou, Yuxun; Xu, Hongyan; Xiao, Junhua

    2017-10-01

    Both natural and artificial selection play a critical role in animals' adaptation to the environment. Detection of the signature of selection in genomic regions can provide insights for understanding the function of specific phenotypes. It is generally assumed that laboratory mice may experience intense artificial selection while wild mice more natural selection. However, the differences of selection signature in the mouse genome and underlying genes between wild and laboratory mice remain unclear. In this study, we used two mouse populations: chromosome 1 (Chr 1) substitution lines (C1SLs) derived from Chinese wild mice and mouse genome project (MGP) sequenced inbred strains and two selection detection statistics: Fst and Tajima's D to identify the signature of selection footprint on Chr 1. For the differentiation between the C1SLs and MGP, 110 candidate selection regions containing 47 protein coding genes were detected. A total of 149 selection regions which encompass 7.215 Mb were identified in the C1SLs by Tajima's D approach. While for the MGP, we identified nearly twice selection regions (243) compared with the C1SLs which accounted for 13.27 Mb Chr 1 sequence. Through functional annotation, we identified several biological processes with significant enrichment including seven genes in the olfactory transduction pathway. In addition, we searched the phenotypes associated with the 47 candidate selection genes identified by Fst. These genes were involved in behavior, growth or body weight, mortality or aging, and immune systems which align well with the phenotypic differences between wild and laboratory mice. Therefore, the findings would be helpful for our understanding of the phenotypic differences between wild and laboratory mice and applications for using this new mouse resource (C1SLs) for further genetics studies.

  4. Revealing stable processing products from ribosome-associated small RNAs by deep-sequencing data analysis.

    Science.gov (United States)

    Zywicki, Marek; Bakowska-Zywicka, Kamilla; Polacek, Norbert

    2012-05-01

    The exploration of the non-protein-coding RNA (ncRNA) transcriptome is currently focused on profiling of microRNA expression and detection of novel ncRNA transcription units. However, recent studies suggest that RNA processing can be a multi-layer process leading to the generation of ncRNAs of diverse functions from a single primary transcript. Up to date no methodology has been presented to distinguish stable functional RNA species from rapidly degraded side products of nucleases. Thus the correct assessment of widespread RNA processing events is one of the major obstacles in transcriptome research. Here, we present a novel automated computational pipeline, named APART, providing a complete workflow for the reliable detection of RNA processing products from next-generation-sequencing data. The major features include efficient handling of non-unique reads, detection of novel stable ncRNA transcripts and processing products and annotation of known transcripts based on multiple sources of information. To disclose the potential of APART, we have analyzed a cDNA library derived from small ribosome-associated RNAs in Saccharomyces cerevisiae. By employing the APART pipeline, we were able to detect and confirm by independent experimental methods multiple novel stable RNA molecules differentially processed from well known ncRNAs, like rRNAs, tRNAs or snoRNAs, in a stress-dependent manner.

  5. Genome Sequencing and Comparative Analysis of Stenotrophomonas acidaminiphila Reveal Evolutionary Insights Into Sulfamethoxazole Resistance

    Directory of Open Access Journals (Sweden)

    Yao-Ting Huang

    2018-05-01

    Full Text Available Stenotrophomonas acidaminiphila is an aerobic, glucose non-fermentative, Gram-negative bacterium that been isolated from various environmental sources, particularly aquatic ecosystems. Although resistance to multiple antimicrobial agents has been reported in S. acidaminiphila, the mechanisms are largely unknown. Here, for the first time, we report the complete genome and antimicrobial resistome analysis of a clinical isolate S. acidaminiphila SUNEO which is resistant to sulfamethoxazole. Comparative analysis among closely related strains identified common and strain-specific genes. In particular, comparison with a sulfamethoxazole-sensitive strain identified a mutation within the sulfonamide-binding site of folP in SUNEO, which may reduce the binding affinity of sulfamethoxazole. Selection pressure analysis indicated folP in SUNEO is under purifying selection, which may be owing to long-term administration of sulfonamide against Stenotrophomonas.

  6. Metagenome Sequence Analysis of Filamentous Microbial Communities Obtained from Geochemically Distinct Geothermal Channels Reveals Specialization of Three Aquificales Lineages

    Directory of Open Access Journals (Sweden)

    Cristina eTakacs-vesbach

    2013-05-01

    Full Text Available The Aquificales are thermophilic microorganisms that inhabit hydrothermal systems worldwide and are considered one of the earliest lineages of the domain Bacteria. We analyzed metagenome sequence obtained from six thermal ‘filamentous streamer’ communities (~40 Mbp per site, which targeted three different groups of Aquificales found in Yellowstone National Park (YNP. Unassembled metagenome sequence and PCR-amplified 16S rRNA gene libraries revealed that acidic, sulfidic sites were dominated by Hydrogenobaculum (Aquificaceae populations, whereas the circumneutral pH (6.5 - 7.8 sites containing dissolved sulfide were dominated by Sulfurihydrogenibium spp. (Hydrogenothermaceae. Thermocrinis (Aquificaceae populations were found primarily in the circumneutral sites with undetectable sulfide, and to a lesser extent in one sulfidic system at pH 8. Phylogenetic analysis of assembled sequence containing 16S rRNA genes as well as conserved protein-encoding genes revealed that the composition and function of these communities varied across geochemical conditions. Each Aquificales lineage contained genes for CO2 fixation by the reverse TCA cycle, but only the Sulfurihydrogenibium populations perform citrate cleavage using ATP citrate lyase (Acl. The Aquificaceae populations use an alternative pathway catalyzed by two separate enzymes, citryl CoA synthetase (Ccs and citryl CoA lyase (Ccl. All three Aquificales lineages contained evidence of aerobic respiration, albeit due to completely different types of heme Cu oxidases (subunit I involved in oxygen reduction. The distribution of Aquificales populations and differences among functional genes involved in energy generation and electron transport is consistent with the hypothesis that geochemical parameters (e.g., pH, sulfide, H2, O2 have resulted in niche specialization among members of the Aquificales.

  7. Multilocus sequence typing analysis reveals that Cryptococcus neoformans var. neoformans is a recombinant population.

    Science.gov (United States)

    Cogliati, Massimo; Zani, Alberto; Rickerts, Volker; McCormick, Ilka; Desnos-Ollivier, Marie; Velegraki, Aristea; Escandon, Patricia; Ichikawa, Tomoe; Ikeda, Reiko; Bienvenu, Anne-Lise; Tintelnot, Kathrin; Tore, Okan; Akcaglar, Sevim; Lockhart, Shawn; Tortorano, Anna Maria; Varma, Ashok

    2016-02-01

    Cryptococcus neoformans var. neoformans (serotype D) represents about 30% of the clinical isolates in Europe and is present less frequently in the other continents. It is the prevalent etiological agent in primary cutaneous cryptococcosis as well as in cryptococcal skin lesions of disseminated cryptococcosis. Very little is known about the genotypic diversity of this Cryptococcus subtype. The aim of this study was to investigate the genotypic diversity among a set of clinical and environmental C. neoformans var. neoformans isolates and to evaluate the relationship between genotypes, geographical origin and clinical manifestations. A total of 83 globally collected C. neoformans var. neoformans isolates from Italy, Germany, France, Belgium, Denmark, Greece, Turkey, Thailand, Japan, Colombia, and the USA, recovered from different sources (primary and secondary cutaneous cryptococcosis, disseminated cryptococcosis, the environment, and animals), were included in the study. All isolates were confirmed to belong to genotype VNIV by molecular typing and they were further investigated by MLST analysis. Maximum likelihood phylogenetic as well as network analysis strongly suggested the existence of a recombinant rather than a clonal population structure. Geographical origin and source of isolation were not correlated with a specific MLST genotype. The comparison with a set of outgroup C. neoformans var. grubii isolates provided clear evidence that the two varieties have different population structures. Copyright © 2016 Elsevier Inc. All rights reserved.

  8. Sequence and structural analysis of the chitinase insertion domain reveals two conserved motifs involved in chitin-binding.

    Directory of Open Access Journals (Sweden)

    Hai Li

    2010-01-01

    Full Text Available Chitinases are prevalent in life and are found in species including archaea, bacteria, fungi, plants, and animals. They break down chitin, which is the second most abundant carbohydrate in nature after cellulose. Hence, they are important for maintaining a balance between carbon and nitrogen trapped as insoluble chitin in biomass. Chitinases are classified into two families, 18 and 19 glycoside hydrolases. In addition to a catalytic domain, which is a triosephosphate isomerase barrel, many family 18 chitinases contain another module, i.e., chitinase insertion domain. While numerous studies focus on the biological role of the catalytic domain in chitinase activity, the function of the chitinase insertion domain is not completely understood. Bioinformatics offers an important avenue in which to facilitate understanding the role of residues within the chitinase insertion domain in chitinase function.Twenty-seven chitinase insertion domain sequences, which include four experimentally determined structures and span five kingdoms, were aligned and analyzed using a modified sequence entropy parameter. Thirty-two positions with conserved residues were identified. The role of these conserved residues was explored by conducting a structural analysis of a number of holo-enzymes. Hydrogen bonding and van der Waals calculations revealed a distinct subset of four conserved residues constituting two sequence motifs that interact with oligosaccharides. The other conserved residues may be key to the structure, folding, and stability of this domain.Sequence and structural studies of the chitinase insertion domains conducted within the framework of evolution identified four conserved residues which clearly interact with the substrates. Furthermore, evolutionary studies propose a link between the appearance of the chitinase insertion domain and the function of family 18 chitinases in the subfamily A.

  9. Integrated analysis of whole genome and transcriptome sequencing reveals diverse transcriptomic aberrations driven by somatic genomic changes in liver cancers.

    Directory of Open Access Journals (Sweden)

    Yuichi Shiraishi

    Full Text Available Recent studies applying high-throughput sequencing technologies have identified several recurrently mutated genes and pathways in multiple cancer genomes. However, transcriptional consequences from these genomic alterations in cancer genome remain unclear. In this study, we performed integrated and comparative analyses of whole genomes and transcriptomes of 22 hepatitis B virus (HBV-related hepatocellular carcinomas (HCCs and their matched controls. Comparison of whole genome sequence (WGS and RNA-Seq revealed much evidence that various types of genomic mutations triggered diverse transcriptional changes. Not only splice-site mutations, but also silent mutations in coding regions, deep intronic mutations and structural changes caused splicing aberrations. HBV integrations generated diverse patterns of virus-human fusion transcripts depending on affected gene, such as TERT, CDK15, FN1 and MLL4. Structural variations could drive over-expression of genes such as WNT ligands, with/without creating gene fusions. Furthermore, by taking account of genomic mutations causing transcriptional aberrations, we could improve the sensitivity of deleterious mutation detection in known cancer driver genes (TP53, AXIN1, ARID2, RPS6KA3, and identified recurrent disruptions in putative cancer driver genes such as HNF4A, CPS1, TSC1 and THRAP3 in HCCs. These findings indicate genomic alterations in cancer genome have diverse transcriptomic effects, and integrated analysis of WGS and RNA-Seq can facilitate the interpretation of a large number of genomic alterations detected in cancer genome.

  10. RNA Sequencing and Coexpression Analysis Reveal Key Genes Involved in α-Linolenic Acid Biosynthesis in Perilla frutescens Seed

    Directory of Open Access Journals (Sweden)

    Tianyuan Zhang

    2017-11-01

    Full Text Available Perilla frutescen is used as traditional food and medicine in East Asia. Its seeds contain high levels of α-linolenic acid (ALA, which is important for health, but is scarce in our daily meals. Previous reports on RNA-seq of perilla seed had identified fatty acid (FA and triacylglycerol (TAG synthesis genes, but the underlying mechanism of ALA biosynthesis and its regulation still need to be further explored. So we conducted Illumina RNA-sequencing in seven temporal developmental stages of perilla seeds. Sequencing generated a total of 127 million clean reads, containing 15.88 Gb of valid data. The de novo assembly of sequence reads yielded 64,156 unigenes with an average length of 777 bp. A total of 39,760 unigenes were annotated and 11,693 unigenes were found to be differentially expressed in all samples. According to Kyoto Encyclopedia of Genes and Genomes (KEGG pathway analysis, 486 unigenes were annotated in the “lipid metabolism” pathway. Of these, 150 unigenes were found to be involved in fatty acid (FA biosynthesis and triacylglycerol (TAG assembly in perilla seeds. A coexpression analysis showed that a total of 104 genes were highly coexpressed (r > 0.95. The coexpression network could be divided into two main subnetworks showing over expression in the medium or earlier and late phases, respectively. In order to identify the putative regulatory genes, a transcription factor (TF analysis was performed. This led to the identification of 45 gene families, mainly including the AP2-EREBP, bHLH, MYB, and NAC families, etc. After coexpression analysis of TFs with highly expression of FAD2 and FAD3 genes, 162 TFs were found to be significantly associated with two FAD genes (r > 0.95. Those TFs were predicted to be the key regulatory factors in ALA biosynthesis in perilla seed. The qRT-PCR analysis also verified the relevance of expression pattern between two FAD genes and partial candidate TFs. Although it has been reported that some TFs

  11. Phylogenetic diversity and genotypical complexity of H9N2 influenza A viruses revealed by genomic sequence analysis.

    Directory of Open Access Journals (Sweden)

    Guoying Dong

    Full Text Available H9N2 influenza A viruses have become established worldwide in terrestrial poultry and wild birds, and are occasionally transmitted to mammals including humans and pigs. To comprehensively elucidate the genetic and evolutionary characteristics of H9N2 influenza viruses, we performed a large-scale sequence analysis of 571 viral genomes from the NCBI Influenza Virus Resource Database, representing the spectrum of H9N2 influenza viruses isolated from 1966 to 2009. Our study provides a panoramic framework for better understanding the genesis and evolution of H9N2 influenza viruses, and for describing the history of H9N2 viruses circulating in diverse hosts. Panorama phylogenetic analysis of the eight viral gene segments revealed the complexity and diversity of H9N2 influenza viruses. The 571 H9N2 viral genomes were classified into 74 separate lineages, which had marked host and geographical differences in phylogeny. Panorama genotypical analysis also revealed that H9N2 viruses include at least 98 genotypes, which were further divided according to their HA lineages into seven series (A-G. Phylogenetic analysis of the internal genes showed that H9N2 viruses are closely related to H3, H4, H5, H7, H10, and H14 subtype influenza viruses. Our results indicate that H9N2 viruses have undergone extensive reassortments to generate multiple reassortants and genotypes, suggesting that the continued circulation of multiple genotypical H9N2 viruses throughout the world in diverse hosts has the potential to cause future influenza outbreaks in poultry and epidemics in humans. We propose a nomenclature system for identifying and unifying all lineages and genotypes of H9N2 influenza viruses in order to facilitate international communication on the evolution, ecology and epidemiology of H9N2 influenza viruses.

  12. Location analysis for the estrogen receptor-? reveals binding to diverse ERE sequences and widespread binding within repetitive DNA elements

    OpenAIRE

    Mason, Christopher E.; Shu, Feng-Jue; Wang, Cheng; Session, Ryan M.; Kallen, Roland G.; Sidell, Neil; Yu, Tianwei; Liu, Mei Hui; Cheung, Edwin; Kallen, Caleb B.

    2010-01-01

    Location analysis for estrogen receptor-? (ER?)-bound cis-regulatory elements was determined in MCF7 cells using chromatin immunoprecipitation (ChIP)-on-chip. Here, we present the estrogen response element (ERE) sequences that were identified at ER?-bound loci and quantify the incidence of ERE sequences under two stringencies of detection:

  13. RNA sequencing analysis reveals new findings of hyperbaric oxygen treatment on rats with acute carbon monoxide poisoning.

    Science.gov (United States)

    Wang, Wenlan; Xue, Li; Li, Ya; Li, Rong; Xie, Xiaoping; Bao, Junxiang; Hai, Chunxu; Li, Jinsheng

    2016-01-01

    To elucidate the altered gene network in the brains of carbon monoxide (CO) poisoned rats after treatment with hyperbaric oxygen (HBO₂). RNA sequencing (RNA-seq) analysis was performed to examine differentially expressed genes (DEGs) in brain tissue samples from nine male rats: a normal control group; a CO poisoning group; and an HBO₂ treatment group (three rats/group). Reverse transcription polymerase chain reaction (RT-PCR) and real-time quantitative PCR were used for validation of the DEGs in another 18 male rats (six rats/group). RNA-seq revealed that two genes were upregulated (4.18 and 8.76 log to the base 2 fold change) (p⟨0.05) in the CO-poisoned rats relative to the control rats; two genes were upregulated (3.88 and 7.69 log to the base 2 fold change); and 23 genes were downregulated (3.49-15.12 log to the base 2 fold change) (p⟨0.05) in the brains of the HBO₂-treated rats relative to the CO-poisoned rats. Target prediction of DEGs by gene network analysis and analysis of pathways affected suggested that regulation of gene expressions of dopamine metabolism and nitric oxide (NO) synthesis were significantly affected by CO poisoning and HBO₂ treatment. Results of RT-PCR and real-time quantitative PCR indicated that four genes (Pomc, GH-1, Pr1 and Fshβ) associated with hormone secretion in the hypothalamic-pituitary system have potential as markers for prognosis of CO. This study is the first RNA-seq analysis profile of HBO₂ treatment on rats with acute CO poisoning. It concludes that changes of hormone secretion in the hypothalamic-pituitary system, dopamine metabolism and NO synthesis involved in brain damage and behavior abnormalities after CO poisoning and HBO₂ therapy may regulate these changes.

  14. Analysis of sequences from field samples reveals the presence of the recently described pepper vein yellows virus (genus Polerovirus) in six additional countries.

    Science.gov (United States)

    Knierim, Dennis; Tsai, Wen-Shi; Kenyon, Lawrence

    2013-06-01

    Polerovirus infection was detected by reverse transcription polymerase chain reaction (RT-PCR) in 29 pepper plants (Capsicum spp.) and one black nightshade plant (Solanum nigrum) sample collected from fields in India, Indonesia, Mali, Philippines, Thailand and Taiwan. At least two representative samples for each country were selected to generate a general polerovirus RT-PCR product of 1.4 kb length for sequencing. Sequence analysis of the partial genome sequences revealed the presence of pepper vein yellows virus (PeVYV) in all 13 samples. A 1990 Australian herbarium sample of pepper described by serological means as infected with capsicum yellows virus (CYV) was identified by sequence analysis of a partial CP sequence as probably infected with a potato leaf roll virus (PLRV) isolate.

  15. Transcriptome sequencing and metabolite analysis reveals the role of delphinidin metabolism in flower colour in grape hyacinth.

    Science.gov (United States)

    Lou, Qian; Liu, Yali; Qi, Yinyan; Jiao, Shuzhen; Tian, Feifei; Jiang, Ling; Wang, Yuejin

    2014-07-01

    Grape hyacinth (Muscari) is an important ornamental bulbous plant with an extraordinary blue colour. Muscari armeniacum, whose flowers can be naturally white, provides an opportunity to unravel the complex metabolic networks underlying certain biochemical traits, especially colour. A blue flower cDNA library of M. armeniacum and a white flower library of M. armeniacum f. album were used for transcriptome sequencing. A total of 89 926 uni-transcripts were isolated, 143 of which could be identified as putative homologues of colour-related genes in other species. Based on a comprehensive analysis relating colour compounds to gene expression profiles, the mechanism of colour biosynthesis was studied in M. armeniacum. Furthermore, a new hypothesis explaining the lack of colour phenotype of the grape hyacinth flower is proposed. Alteration of the substrate competition between flavonol synthase (FLS) and dihydroflavonol 4-reductase (DFR) may lead to elimination of blue pigmentation while the multishunt from the limited flux in the cyanidin (Cy) synthesis pathway seems to be the most likely reason for the colour change in the white flowers of M. armeniacum. Moreover, mass sequence data obtained by the deep sequencing of M. armeniacum and its white variant provided a platform for future function and molecular biological research on M. armeniacum. © The Author 2014. Published by Oxford University Press on behalf of the Society for Experimental Biology.

  16. Complete genome sequence analysis of Nocardia brasiliensis HUJEG-1 reveals a saprobic lifestyle and the genes needed for human pathogenesis.

    Science.gov (United States)

    Vera-Cabrera, Lucio; Ortiz-Lopez, Rocio; Elizondo-Gonzalez, Ramiro; Ocampo-Candiani, Jorge

    2013-01-01

    Nocardia brasiliensis is an important etiologic agent of mycetoma. These bacteria live as a saprobe in soil or organic material and enter the tissue via minor trauma. Mycetoma is characterized by tumefaction and the production of fistula and abscesses, with no spontaneous cure. By using mass sequencing, we determined the complete genomic nucleotide sequence of the bacteria. According to our data, the genome is a circular chromosome 9,436,348-bp long with 68% G+C content that encodes 8,414 proteins. We observed orthologs for virulence factors, a higher number of genes involved in lipid biosynthesis and catabolism, and gene clusters for the synthesis of bioactive compounds, such as antibiotics, terpenes, and polyketides. An in silico analysis of the sequence supports the conclusion that the bacteria acquired diverse genes by horizontal transfer from other soil bacteria, even from eukaryotic organisms. The genome composition reflects the evolution of bacteria via the acquisition of a large amount of DNA, which allows it to survive in new ecological niches, including humans.

  17. Complete genome sequence analysis of Nocardia brasiliensis HUJEG-1 reveals a saprobic lifestyle and the genes needed for human pathogenesis.

    Directory of Open Access Journals (Sweden)

    Lucio Vera-Cabrera

    Full Text Available Nocardia brasiliensis is an important etiologic agent of mycetoma. These bacteria live as a saprobe in soil or organic material and enter the tissue via minor trauma. Mycetoma is characterized by tumefaction and the production of fistula and abscesses, with no spontaneous cure. By using mass sequencing, we determined the complete genomic nucleotide sequence of the bacteria. According to our data, the genome is a circular chromosome 9,436,348-bp long with 68% G+C content that encodes 8,414 proteins. We observed orthologs for virulence factors, a higher number of genes involved in lipid biosynthesis and catabolism, and gene clusters for the synthesis of bioactive compounds, such as antibiotics, terpenes, and polyketides. An in silico analysis of the sequence supports the conclusion that the bacteria acquired diverse genes by horizontal transfer from other soil bacteria, even from eukaryotic organisms. The genome composition reflects the evolution of bacteria via the acquisition of a large amount of DNA, which allows it to survive in new ecological niches, including humans.

  18. Analysis of 90 Mb of the potato genome reveals conservation of gene structures and order with tomato but divergence in repetitive sequence composition

    Directory of Open Access Journals (Sweden)

    O'Brien Kimberly

    2008-06-01

    Full Text Available Abstract Background The Solanaceae family contains a number of important crop species including potato (Solanum tuberosum which is grown for its underground storage organ known as a tuber. Albeit the 4th most important food crop in the world, other than a collection of ~220,000 Expressed Sequence Tags, limited genomic sequence information is currently available for potato and advances in potato yield and nutrition content would be greatly assisted through access to a complete genome sequence. While morphologically diverse, Solanaceae species such as potato, tomato, pepper, and eggplant share not only genes but also gene order thereby permitting highly informative comparative genomic analyses. Results In this study, we report on analysis 89.9 Mb of potato genomic sequence representing 10.2% of the genome generated through end sequencing of a potato bacterial artificial chromosome (BAC clone library (87 Mb and sequencing of 22 potato BAC clones (2.9 Mb. The GC content of potato is very similar to Solanum lycopersicon (tomato and other dicotyledonous species yet distinct from the monocotyledonous grass species, Oryza sativa. Parallel analyses of repetitive sequences in potato and tomato revealed substantial differences in their abundance, 34.2% in potato versus 46.3% in tomato, which is consistent with the increased genome size per haploid genome of these two Solanum species. Specific classes and types of repetitive sequences were also differentially represented between these two species including a telomeric-related repetitive sequence, ribosomal DNA, and a number of unclassified repetitive sequences. Comparative analyses between tomato and potato at the gene level revealed a high level of conservation of gene content, genic feature, and gene order although discordances in synteny were observed. Conclusion Genomic level analyses of potato and tomato confirm that gene sequence and gene order are conserved between these solanaceous species and that

  19. Exome sequencing analysis reveals variants in primary immunodeficiency genes in patients with very early onset inflammatory bowel disease.

    Science.gov (United States)

    Kelsen, Judith R; Dawany, Noor; Moran, Christopher J; Petersen, Britt-Sabina; Sarmady, Mahdi; Sasson, Ariella; Pauly-Hubbard, Helen; Martinez, Alejandro; Maurer, Kelly; Soong, Joanne; Rappaport, Eric; Franke, Andre; Keller, Andreas; Winter, Harland S; Mamula, Petar; Piccoli, David; Artis, David; Sonnenberg, Gregory F; Daly, Mark; Sullivan, Kathleen E; Baldassano, Robert N; Devoto, Marcella

    2015-11-01

    Very early onset inflammatory bowel disease (VEO-IBD), IBD diagnosed at 5 years of age or younger, frequently presents with a different and more severe phenotype than older-onset IBD. We investigated whether patients with VEO-IBD carry rare or novel variants in genes associated with immunodeficiencies that might contribute to disease development. Patients with VEO-IBD and parents (when available) were recruited from the Children's Hospital of Philadelphia from March 2013 through July 2014. We analyzed DNA from 125 patients with VEO-IBD (age, 3 wk to 4 y) and 19 parents, 4 of whom also had IBD. Exome capture was performed by Agilent SureSelect V4, and sequencing was performed using the Illumina HiSeq platform. Alignment to human genome GRCh37 was achieved followed by postprocessing and variant calling. After functional annotation, candidate variants were analyzed for change in protein function, minor allele frequency less than 0.1%, and scaled combined annotation-dependent depletion scores of 10 or less. We focused on genes associated with primary immunodeficiencies and related pathways. An additional 210 exome samples from patients with pediatric IBD (n = 45) or adult-onset Crohn's disease (n = 20) and healthy individuals (controls, n = 145) were obtained from the University of Kiel, Germany, and used as control groups. Four hundred genes and regions associated with primary immunodeficiency, covering approximately 6500 coding exons totaling more than 1 Mbp of coding sequence, were selected from the whole-exome data. Our analysis showed novel and rare variants within these genes that could contribute to the development of VEO-IBD, including rare heterozygous missense variants in IL10RA and previously unidentified variants in MSH5 and CD19. In an exome sequence analysis of patients with VEO-IBD and their parents, we identified variants in genes that regulate B- and T-cell functions and could contribute to pathogenesis. Our analysis could lead to the

  20. Genome sequencing and comparative genomics analysis revealed pathogenic potential in Penicillium capsulatum as a novel fungal pathogen belonging to Eurotiales

    Directory of Open Access Journals (Sweden)

    Ying Yang

    2016-10-01

    Full Text Available Penicillium capsulatum is a rare Penicillium species used in paper manufacturing, but recently it has been reported to cause invasive infection. To research the pathogenicity of the clinical Penicillium strain, we sequenced the genomes and transcriptome of the clinical and environmental strains of P. capsulatum. Comparative analyses of these two P. capsulatum strains and close related strains belonging to Eurotiales were performed. The assembled genome sizes of P. capsulatum are approximately 34.4 Mbp in length and encode 11,080 predicted genes. The different isolates of P. capsulatum are highly similar, with the exception of several unique genes, INDELs or SNP in the genes coding for glycosyl hydrolases, amino acid transporters and circumsporozoite protein. A phylogenomic analysis was performed based on the whole genome data of 38 strains belonging to Eurotiales. By comparing the whole genome sequences and the virulence-related genes from 20 important related species, including fungal pathogens and non-human pathogens belonging to Eurotiales, we found meaningful pathogenicity characteristics between P. capsulatum and its closely related species. Our research indicated that P. capsulatum may be a neglected opportunistic pathogen. This study is beneficial for mycologists, geneticists and epidemiologists to achieve a deeper understanding of the genetic basis of the role of P. capsulatum as a newly reported fungal pathogen.

  1. Diversity of Pea-Associated F. proliferatum and F. verticillioides Populations Revealed by FUM1 Sequence Analysis and Fumonisin Biosynthesis

    Science.gov (United States)

    Waśkiewicz, Agnieszka; Stępień, Łukasz; Wilman, Karolina; Kachlicki, Piotr

    2013-01-01

    Fusarium proliferatum and F. verticillioides are considered as minor pathogens of pea (Pisum sativum L.). Both species can survive in seed material without visible disease symptoms, but still contaminating it with fumonisins. Two populations of pea-derived F. proliferatum and F. verticillioides strains were subjected to FUM1 sequence divergence analysis, forming a distinct group when compared to the collection strains originating from different host species. Furthermore, the mycotoxigenic abilities of those strains were evaluated on the basis of in planta and in vitro fumonisin biosynthesis. No differences were observed in fumonisin B (FB) levels measured in pea seeds (maximum level reached 1.5 μg g−1); however, in rice cultures, the majority of F. proliferatum genotypes produced higher amounts of FB1–FB3 than F. verticillioides strains. PMID:23470545

  2. Comparative analysis of the complete genome sequence of the California MSW strain of myxoma virus reveals potential host adaptations.

    Science.gov (United States)

    Kerr, Peter J; Rogers, Matthew B; Fitch, Adam; Depasse, Jay V; Cattadori, Isabella M; Hudson, Peter J; Tscharke, David C; Holmes, Edward C; Ghedin, Elodie

    2013-11-01

    Myxomatosis is a rapidly lethal disease of European rabbits that is caused by myxoma virus (MYXV). The introduction of a South American strain of MYXV into the European rabbit population of Australia is the classic case of host-pathogen coevolution following cross-species transmission. The most virulent strains of MYXV for European rabbits are the Californian viruses, found in the Pacific states of the United States and the Baja Peninsula, Mexico. The natural host of Californian MYXV is the brush rabbit, Sylvilagus bachmani. We determined the complete sequence of the MSW strain of Californian MYXV and performed a comparative analysis with other MYXV genomes. The MSW genome is larger than that of the South American Lausanne (type) strain of MYXV due to an expansion of the terminal inverted repeats (TIRs) of the genome, with duplication of the M156R, M154L, M153R, M152R, and M151R genes and part of the M150R gene from the right-hand (RH) end of the genome at the left-hand (LH) TIR. Despite the extreme virulence of MSW, no novel genes were identified; five genes were disrupted by multiple indels or mutations to the ATG start codon, including two genes, M008.1L/R and M152R, with major virulence functions in European rabbits, and a sixth gene, M000.5L/R, was absent. The loss of these gene functions suggests that S. bachmani is a relatively recent host for MYXV and that duplication of virulence genes in the TIRs, gene loss, or sequence variation in other genes can compensate for the loss of M008.1L/R and M152R in infections of European rabbits.

  3. Bulk segregant analysis by high-throughput sequencing reveals a novel xylose utilization gene from Saccharomyces cerevisiae.

    Directory of Open Access Journals (Sweden)

    Jared W Wenger

    2010-05-01

    Full Text Available Fermentation of xylose is a fundamental requirement for the efficient production of ethanol from lignocellulosic biomass sources. Although they aggressively ferment hexoses, it has long been thought that native Saccharomyces cerevisiae strains cannot grow fermentatively or non-fermentatively on xylose. Population surveys have uncovered a few naturally occurring strains that are weakly xylose-positive, and some S. cerevisiae have been genetically engineered to ferment xylose, but no strain, either natural or engineered, has yet been reported to ferment xylose as efficiently as glucose. Here, we used a medium-throughput screen to identify Saccharomyces strains that can increase in optical density when xylose is presented as the sole carbon source. We identified 38 strains that have this xylose utilization phenotype, including strains of S. cerevisiae, other sensu stricto members, and hybrids between them. All the S. cerevisiae xylose-utilizing strains we identified are wine yeasts, and for those that could produce meiotic progeny, the xylose phenotype segregates as a single gene trait. We mapped this gene by Bulk Segregant Analysis (BSA using tiling microarrays and high-throughput sequencing. The gene is a putative xylitol dehydrogenase, which we name XDH1, and is located in the subtelomeric region of the right end of chromosome XV in a region not present in the S288c reference genome. We further characterized the xylose phenotype by performing gene expression microarrays and by genetically dissecting the endogenous Saccharomyces xylose pathway. We have demonstrated that natural S. cerevisiae yeasts are capable of utilizing xylose as the sole carbon source, characterized the genetic basis for this trait as well as the endogenous xylose utilization pathway, and demonstrated the feasibility of BSA using high-throughput sequencing.

  4. Comparative sequence analysis of Solanum and Arabidopsis in a hot spot for pathogen resistance on potato chromosome V reveals a patchwork of conserved and rapidly evolving genome segments

    Directory of Open Access Journals (Sweden)

    Bruggmann Rémy

    2007-05-01

    Full Text Available Abstract Background Quantitative phenotypic variation of agronomic characters in crop plants is controlled by environmental and genetic factors (quantitative trait loci = QTL. To understand the molecular basis of such QTL, the identification of the underlying genes is of primary interest and DNA sequence analysis of the genomic regions harboring QTL is a prerequisite for that. QTL mapping in potato (Solanum tuberosum has identified a region on chromosome V tagged by DNA markers GP21 and GP179, which contains a number of important QTL, among others QTL for resistance to late blight caused by the oomycete Phytophthora infestans and to root cyst nematodes. Results To obtain genomic sequence for the targeted region on chromosome V, two local BAC (bacterial artificial chromosome contigs were constructed and sequenced, which corresponded to parts of the homologous chromosomes of the diploid, heterozygous genotype P6/210. Two contiguous sequences of 417,445 and 202,781 base pairs were assembled and annotated. Gene-by-gene co-linearity was disrupted by non-allelic insertions of retrotransposon elements, stretches of diverged intergenic sequences, differences in gene content and gene order. The latter was caused by inversion of a 70 kbp genomic fragment. These features were also found in comparison to orthologous sequence contigs from three homeologous chromosomes of Solanum demissum, a wild tuber bearing species. Functional annotation of the sequence identified 48 putative open reading frames (ORF in one contig and 22 in the other, with an average of one ORF every 9 kbp. Ten ORFs were classified as resistance-gene-like, 11 as F-box-containing genes, 13 as transposable elements and three as transcription factors. Comparing potato to Arabidopsis thaliana annotated proteins revealed five micro-syntenic blocks of three to seven ORFs with A. thaliana chromosomes 1, 3 and 5. Conclusion Comparative sequence analysis revealed highly conserved collinear regions

  5. Massively parallel sequencing and genome-wide copy number analysis revealed a clonal relationship in benign metastasizing leiomyoma.

    Science.gov (United States)

    Wu, Ren-Chin; Chao, An-Shine; Lee, Li-Yu; Lin, Gigin; Chen, Shu-Jen; Lu, Yen-Jung; Huang, Huei-Jean; Yen, Chi-Feng; Han, Chien Min; Lee, Yun-Shien; Wang, Tzu-Hao; Chao, Angel

    2017-07-18

    Benign metastasizing leiomyoma (BML) is a rare disease entity typically presenting as multiple extrauterine leiomyomas associated with a uterine leiomyoma. It has been hypothesized that the extrauterine leiomyomata represent distant metastasis of the uterine leiomyoma. To date, the only molecular evidence supporting this hypothesis was derived from clonality analyses based on X-chromosome inactivation assays. Here, we sought to address this issue by examining paired specimens of synchronous pulmonary and uterine leiomyomata from three patients using targeted massively parallel sequencing and molecular inversion probe array analysis for detecting somatic mutations and copy number aberrations. We detected identical non-hot-spot somatic mutations and similar patterns of copy number aberrations (CNAs) in paired pulmonary and uterine leiomyomata from two patients, indicating the clonal relationship between pulmonary and uterine leiomyomata. In addition to loss of chromosome 22q found in the literature, we identified additional recurrent CNAs including losses of chromosome 3q and 11q. In conclusion, our findings of the clonal relationship between synchronous pulmonary and uterine leiomyomas support the hypothesis that BML represents a condition wherein a uterine leiomyoma disseminates to distant extrauterine locations.

  6. Massively parallel sequencing and genome-wide copy number analysis revealed a clonal relationship in benign metastasizing leiomyoma

    Science.gov (United States)

    Lee, Li-Yu; Lin, Gigin; Chen, Shu-Jen; Lu, Yen-Jung; Huang, Huei-Jean; Yen, Chi-Feng; Han, Chien Min; Lee, Yun-Shien; Wang, Tzu-Hao; Chao, Angel

    2017-01-01

    Benign metastasizing leiomyoma (BML) is a rare disease entity typically presenting as multiple extrauterine leiomyomas associated with a uterine leiomyoma. It has been hypothesized that the extrauterine leiomyomata represent distant metastasis of the uterine leiomyoma. To date, the only molecular evidence supporting this hypothesis was derived from clonality analyses based on X-chromosome inactivation assays. Here, we sought to address this issue by examining paired specimens of synchronous pulmonary and uterine leiomyomata from three patients using targeted massively parallel sequencing and molecular inversion probe array analysis for detecting somatic mutations and copy number aberrations. We detected identical non-hot-spot somatic mutations and similar patterns of copy number aberrations (CNAs) in paired pulmonary and uterine leiomyomata from two patients, indicating the clonal relationship between pulmonary and uterine leiomyomata. In addition to loss of chromosome 22q found in the literature, we identified additional recurrent CNAs including losses of chromosome 3q and 11q. In conclusion, our findings of the clonal relationship between synchronous pulmonary and uterine leiomyomas support the hypothesis that BML represents a condition wherein a uterine leiomyoma disseminates to distant extrauterine locations. PMID:28533481

  7. RNA Sequencing Analysis Reveals Transcriptomic Variations in Tobacco (Nicotiana tabacum Leaves Affected by Climate, Soil, and Tillage Factors

    Directory of Open Access Journals (Sweden)

    Bo Lei

    2014-04-01

    Full Text Available The growth and development of plants are sensitive to their surroundings. Although numerous studies have analyzed plant transcriptomic variation, few have quantified the effect of combinations of factors or identified factor-specific effects. In this study, we performed RNA sequencing (RNA-seq analysis on tobacco leaves derived from 10 treatment combinations of three groups of ecological factors, i.e., climate factors (CFs, soil factors (SFs, and tillage factors (TFs. We detected 4980, 2916, and 1605 differentially expressed genes (DEGs that were affected by CFs, SFs, and TFs, which included 2703, 768, and 507 specific and 703 common DEGs (simultaneously regulated by CFs, SFs, and TFs, respectively. GO and KEGG enrichment analyses showed that genes involved in abiotic stress responses and secondary metabolic pathways were overrepresented in the common and CF-specific DEGs. In addition, we noted enrichment in CF-specific DEGs related to the circadian rhythm, SF-specific DEGs involved in mineral nutrient absorption and transport, and SF- and TF-specific DEGs associated with photosynthesis. Based on these results, we propose a model that explains how plants adapt to various ecological factors at the transcriptomic level. Additionally, the identified DEGs lay the foundation for future investigations of stress resistance, circadian rhythm and photosynthesis in tobacco.

  8. Protein domain analysis of genomic sequence data reveals regulation of LRR related domains in plant transpiration in Ficus.

    Science.gov (United States)

    Lang, Tiange; Yin, Kangquan; Liu, Jinyu; Cao, Kunfang; Cannon, Charles H; Du, Fang K

    2014-01-01

    Predicting protein domains is essential for understanding a protein's function at the molecular level. However, up till now, there has been no direct and straightforward method for predicting protein domains in species without a reference genome sequence. In this study, we developed a functionality with a set of programs that can predict protein domains directly from genomic sequence data without a reference genome. Using whole genome sequence data, the programming functionality mainly comprised DNA assembly in combination with next-generation sequencing (NGS) assembly methods and traditional methods, peptide prediction and protein domain prediction. The proposed new functionality avoids problems associated with de novo assembly due to micro reads and small single repeats. Furthermore, we applied our functionality for the prediction of leucine rich repeat (LRR) domains in four species of Ficus with no reference genome, based on NGS genomic data. We found that the LRRNT_2 and LRR_8 domains are related to plant transpiration efficiency, as indicated by the stomata index, in the four species of Ficus. The programming functionality established in this study provides new insights for protein domain prediction, which is particularly timely in the current age of NGS data expansion.

  9. Transcriptome sequencing and metabolite analysis reveals the role of delphinidin metabolism in flower colour in grape hyacinth

    OpenAIRE

    Lou, Qian; Liu, Yali; Qi, Yinyan; Jiao, Shuzhen; Tian, Feifei; Jiang, Ling; Wang, Yuejin

    2014-01-01

    Grape hyacinth (Muscari) is an important ornamental bulbous plant with an extraordinary blue colour. Muscari armeniacum, whose flowers can be naturally white, provides an opportunity to unravel the complex metabolic networks underlying certain biochemical traits, especially colour. A blue flower cDNA library of M. armeniacum and a white flower library of M. armeniacum f. album were used for transcriptome sequencing. A total of 89 926 uni-transcripts were isolated, 143 of which could be identi...

  10. Sequence analysis of the internal transcribed spacer (ITS) region reveals a novel clade of Ichthyophonus sp. from rainbow trout.

    Science.gov (United States)

    Rasmussen, C; Purcell, M K; Gregg, J L; LaPatra, S E; Winton, J R; Hershberger, P K

    2010-03-09

    The mesomycetozoean parasite Ichthyophonus hoferi is most commonly associated with marine fish hosts but also occurs in some components of the freshwater rainbow trout Oncorhynchus mykiss aquaculture industry in Idaho, USA. It is not certain how the parasite was introduced into rainbow trout culture, but it might have been associated with the historical practice of feeding raw, ground common carp Cyprinus carpio that were caught by commercial fisherman. Here, we report a major genetic division between west coast freshwater and marine isolates of Ichthyophonus hoferi. Sequence differences were not detected in 2 regions of the highly conserved small subunit (18S) rDNA gene; however, nucleotide variation was seen in internal transcribed spacer loci (ITS1 and ITS2), both within and among the isolates. Intra-isolate variation ranged from 2.4 to 7.6 nucleotides over a region consisting of approximately 740 bp. Majority consensus sequences from marine/anadromous hosts differed in only 0 to 3 nucleotides (99.6 to 100% nucleotide identity), while those derived from freshwater rainbow trout had no nucleotide substitutions relative to each other. However, the consensus sequences between isolates from freshwater rainbow trout and those from marine/anadromous hosts differed in 13 to 16 nucleotides (97.8 to 98.2% nucleotide identity).

  11. Analysis of the 9p21.3 sequence associated with coronary artery disease reveals a tendency for duplication in a CAD patient

    Science.gov (United States)

    Kouprina, Natalay; Noskov, Vladimir N.; Waterfall, Joshua J.; Walker, Robert L.; Meltzer, Paul S.; Topol, Eric J.; Larionov, Vladimir

    2018-01-01

    Tandem segmental duplications (SDs) greater than 10 kb are widespread in complex genomes. They provide material for gene divergence and evolutionary adaptation, while formation of specific de novo SDs is a hallmark of cancer and some human diseases. Most SDs map to distinct genomic regions termed ‘duplication blocks’. SDs organization within these blocks is often poorly characterized as they are mosaics of ancestral duplicons juxtaposed with younger duplicons arising from more recent duplication events. Structural and functional analysis of SDs is further hampered as long repetitive DNA structures are underrepresented in existing BAC and YAC libraries. We applied Transformation-Associated Recombination (TAR) cloning, a versatile technique for large DNA manipulation, to selectively isolate the coronary artery disease (CAD) interval sequence within the 9p21.3 chromosome locus from a patient with coronary artery disease and normal individuals. Four tandem head-to-tail duplicons, each ∼50 kb long, were recovered in the patient but not in normal individuals. Sequence analysis revealed that the repeats varied by 10-15 SNPs between each other and by 82 SNPs between the human genome sequence (version hg19). SNPs polymorphism within the junctions between repeats allowed two junction types to be distinguished, Type 1 and Type 2, which were found at a 2:1 ratio. The junction sequences contained an Alu element, a sequence previously shown to play a role in duplication. Knowledge of structural variation in the CAD interval from more patients could help link this locus to cardiovascular diseases susceptibility, and maybe relevant to other cases of regional amplification, including cancer. PMID:29632643

  12. Complete sequencing and pan-genomic analysis of Lactobacillus delbrueckii subsp. bulgaricus reveal its genetic basis for industrial yogurt production.

    Science.gov (United States)

    Hao, Pei; Zheng, Huajun; Yu, Yao; Ding, Guohui; Gu, Wenyi; Chen, Shuting; Yu, Zhonghao; Ren, Shuangxi; Oda, Munehiro; Konno, Tomonobu; Wang, Shengyue; Li, Xuan; Ji, Zai-Si; Zhao, Guoping

    2011-01-17

    Lactobacillus delbrueckii subsp. bulgaricus (Lb. bulgaricus) is an important species of Lactic Acid Bacteria (LAB) used for cheese and yogurt fermentation. The genome of Lb. bulgaricus 2038, an industrial strain mainly used for yogurt production, was completely sequenced and compared against the other two ATCC collection strains of the same subspecies. Specific physiological properties of strain 2038, such as lysine biosynthesis, formate production, aspartate-related carbon-skeleton intermediate metabolism, unique EPS synthesis and efficient DNA restriction/modification systems, are all different from those of the collection strains that might benefit the industrial production of yogurt. Other common features shared by Lb. bulgaricus strains, such as efficient protocooperation with Streptococcus thermophilus and lactate production as well as well-equipped stress tolerance mechanisms may account for it being selected originally for yogurt fermentation industry. Multiple lines of evidence suggested that Lb. bulgaricus 2038 was genetically closer to the common ancestor of the subspecies than the other two sequenced collection strains, probably due to a strict industrial maintenance process for strain 2038 that might have halted its genome decay and sustained a gene network suitable for large scale yogurt production.

  13. Complete sequencing and pan-genomic analysis of Lactobacillus delbrueckii subsp. bulgaricus reveal its genetic basis for industrial yogurt production.

    Directory of Open Access Journals (Sweden)

    Pei Hao

    Full Text Available Lactobacillus delbrueckii subsp. bulgaricus (Lb. bulgaricus is an important species of Lactic Acid Bacteria (LAB used for cheese and yogurt fermentation. The genome of Lb. bulgaricus 2038, an industrial strain mainly used for yogurt production, was completely sequenced and compared against the other two ATCC collection strains of the same subspecies. Specific physiological properties of strain 2038, such as lysine biosynthesis, formate production, aspartate-related carbon-skeleton intermediate metabolism, unique EPS synthesis and efficient DNA restriction/modification systems, are all different from those of the collection strains that might benefit the industrial production of yogurt. Other common features shared by Lb. bulgaricus strains, such as efficient protocooperation with Streptococcus thermophilus and lactate production as well as well-equipped stress tolerance mechanisms may account for it being selected originally for yogurt fermentation industry. Multiple lines of evidence suggested that Lb. bulgaricus 2038 was genetically closer to the common ancestor of the subspecies than the other two sequenced collection strains, probably due to a strict industrial maintenance process for strain 2038 that might have halted its genome decay and sustained a gene network suitable for large scale yogurt production.

  14. Complete Sequencing and Pan-Genomic Analysis of Lactobacillus delbrueckii subsp. bulgaricus Reveal Its Genetic Basis for Industrial Yogurt Production

    Science.gov (United States)

    Ding, Guohui; Gu, Wenyi; Chen, Shuting; Yu, Zhonghao; Ren, Shuangxi; Oda, Munehiro; Konno, Tomonobu; Wang, Shengyue; Li, Xuan; Ji, Zai-Si; Zhao, Guoping

    2011-01-01

    Lactobacillus delbrueckii subsp. bulgaricus (Lb. bulgaricus) is an important species of Lactic Acid Bacteria (LAB) used for cheese and yogurt fermentation. The genome of Lb. bulgaricus 2038, an industrial strain mainly used for yogurt production, was completely sequenced and compared against the other two ATCC collection strains of the same subspecies. Specific physiological properties of strain 2038, such as lysine biosynthesis, formate production, aspartate-related carbon-skeleton intermediate metabolism, unique EPS synthesis and efficient DNA restriction/modification systems, are all different from those of the collection strains that might benefit the industrial production of yogurt. Other common features shared by Lb. bulgaricus strains, such as efficient protocooperation with Streptococcus thermophilus and lactate production as well as well-equipped stress tolerance mechanisms may account for it being selected originally for yogurt fermentation industry. Multiple lines of evidence suggested that Lb. bulgaricus 2038 was genetically closer to the common ancestor of the subspecies than the other two sequenced collection strains, probably due to a strict industrial maintenance process for strain 2038 that might have halted its genome decay and sustained a gene network suitable for large scale yogurt production. PMID:21264216

  15. Bioinformatic Analysis Reveals Conservation of Intrinsic Disorder in the Linker Sequences of Prokaryotic Dual-family Immunophilin Chaperones.

    Science.gov (United States)

    Barik, Sailen

    2018-01-01

    The two classical immunophilin families, found essentially in all living cells, are: cyclophilin (CYN) and FK506-binding protein (FKBP). We previously reported a novel class of immunophilins that are natural chimera of these two, which we named dual-family immunophilin (DFI). The DFIs were found in either of two conformations: CYN-linker-FKBP (CFBP) or FKBP-3TPR-CYN (FCBP). While the 3TPR domain can serve as a flexible linker between the FKBP and CYN modules in the FCBP-type DFI, the linker sequences in the CFBP-type DFIs are relatively short, diverse in sequence, and contain no discernible motif or signature. Here, I present several lines of computational evidence that, regardless of their primary structure, these CFBP linkers are intrinsically disordered. This report provides the first molecular foundation for the model that the CFBP linker acts as an unstructured, flexible loop, allowing the two flanking chaperone modules function independently while linked in cis , likely to assist in the folding of multisubunit client complexes.

  16. Conserved antigenic sites between MERS-CoV and Bat-coronavirus are revealed through sequence analysis.

    Science.gov (United States)

    Sharmin, Refat; Islam, Abul B M M K

    2016-01-01

    MERS-CoV is a newly emerged human coronavirus reported closely related with HKU4 and HKU5 Bat coronaviruses. Bat and MERS corona-viruses are structurally related. Therefore, it is of interest to estimate the degree of conserved antigenic sites among them. It is of importance to elucidate the shared antigenic-sites and extent of conservation between them to understand the evolutionary dynamics of MERS-CoV. Multiple sequence alignment of the spike (S), membrane (M), enveloped (E) and nucleocapsid (N) proteins was employed to identify the sequence conservation among MERS and Bat (HKU4, HKU5) coronaviruses. We used various in silico tools to predict the conserved antigenic sites. We found that MERS-CoV shared 30 % of its S protein antigenic sites with HKU4 and 70 % with HKU5 bat-CoV. Whereas 100 % of its E, M and N protein's antigenic sites are found to be conserved with those in HKU4 and HKU5. This sharing suggests that in case of pathogenicity MERS-CoV is more closely related to HKU5 bat-CoV than HKU4 bat-CoV. The conserved epitopes indicates their evolutionary relationship and ancestry of pathogenicity.

  17. Sequence tagging reveals unexpected modifications in toxicoproteomics

    Science.gov (United States)

    Dasari, Surendra; Chambers, Matthew C.; Codreanu, Simona G.; Liebler, Daniel C.; Collins, Ben C.; Pennington, Stephen R.; Gallagher, William M.; Tabb, David L.

    2010-01-01

    Toxicoproteomic samples are rich in posttranslational modifications (PTMs) of proteins. Identifying these modifications via standard database searching can incur significant performance penalties. Here we describe the latest developments in TagRecon, an algorithm that leverages inferred sequence tags to identify modified peptides in toxicoproteomic data sets. TagRecon identifies known modifications more effectively than the MyriMatch database search engine. TagRecon outperformed state of the art software in recognizing unanticipated modifications from LTQ, Orbitrap, and QTOF data sets. We developed user-friendly software for detecting persistent mass shifts from samples. We follow a three-step strategy for detecting unanticipated PTMs in samples. First, we identify the proteins present in the sample with a standard database search. Next, identified proteins are interrogated for unexpected PTMs with a sequence tag-based search. Finally, additional evidence is gathered for the detected mass shifts with a refinement search. Application of this technology on toxicoproteomic data sets revealed unintended cross-reactions between proteins and sample processing reagents. Twenty five proteins in rat liver showed signs of oxidative stress when exposed to potentially toxic drugs. These results demonstrate the value of mining toxicoproteomic data sets for modifications. PMID:21214251

  18. Computational Analysis of G-Quadruplex Forming Sequences across Chromosomes Reveals High Density Patterns Near the Terminal Ends.

    Directory of Open Access Journals (Sweden)

    Julia H Chariker

    Full Text Available G-quadruplex structures (G4 are found throughout the human genome and are known to play a regulatory role in a variety of molecular processes. Structurally, they have many configurations and can form from one or more DNA strands. At the gene level, they regulate gene expression and protein synthesis. In this paper, chromosomal-level patterns of distribution are analyzed on the human genome to identify high-level distribution patterns potentially related to global functional processes. Here we show unique high density banding patterns on individual chromosomes that are highly correlated, appearing in a mirror pattern, across forward and reverse DNA strands. The highest density of G4 sequences occurs within four megabases of one end of most chromosomes and contains G4 motifs that bind with zinc finger proteins. These findings suggest that G4 may play a role in global chromosomal processes such as those found in meiosis.

  19. Whole genome sequencing and analysis of Campylobacter coli YH502 from retail chicken reveals a plasmid-borne type VI secretion system

    Directory of Open Access Journals (Sweden)

    Sandeep Ghatak

    2017-03-01

    Full Text Available Campylobacter is a major cause of foodborne illnesses worldwide. Campylobacter infections, commonly caused by ingestion of undercooked poultry and meat products, can lead to gastroenteritis and chronic reactive arthritis in humans. Whole genome sequencing (WGS is a powerful technology that provides comprehensive genetic information about bacteria and is increasingly being applied to study foodborne pathogens: e.g., evolution, epidemiology/outbreak investigation, and detection. Herein we report the complete genome sequence of Campylobacter coli strain YH502 isolated from retail chicken in the United States. WGS, de novo assembly, and annotation of the genome revealed a chromosome of 1,718,974 bp and a mega-plasmid (pCOS502 of 125,964 bp. GC content of the genome was 31.2% with 1931 coding sequences and 53 non-coding RNAs. Multiple virulence factors including a plasmid-borne type VI secretion system and antimicrobial resistance genes (beta-lactams, fluoroquinolones, and aminoglycoside were found. The presence of T6SS in a mobile genetic element (plasmid suggests plausible horizontal transfer of these virulence genes to other organisms. The C. coli YH502 genome also harbors CRISPR sequences and associated proteins. Phylogenetic analysis based on average nucleotide identity and single nucleotide polymorphisms identified closely related C. coli genomes available in the NCBI database. Taken together, the analyzed genomic data of this potentially virulent strain of C. coli will facilitate further understanding of this important foodborne pathogen most likely leading to better control strategies. The chromosome and plasmid sequences of C. coli YH502 have been deposited in GenBank under the accession numbers CP018900.1 and CP018901.1, respectively.

  20. Mass Spectrometry Analysis Coupled with de novo Sequencing Reveals Amino Acid Substitutions in Nucleocapsid Protein from Influenza A Virus

    Directory of Open Access Journals (Sweden)

    Zijian Li

    2014-02-01

    Full Text Available Amino acid substitutions in influenza A virus are the main reasons for both antigenic shift and virulence change, which result from non-synonymous mutations in the viral genome. Nucleocapsid protein (NP, one of the major structural proteins of influenza virus, is responsible for regulation of viral RNA synthesis and replication. In this report we used LC-MS/MS to analyze tryptic digestion of nucleocapsid protein of influenza virus (A/Puerto Rico/8/1934 H1N1, which was isolated and purified by SDS poly-acrylamide gel electrophoresis. Thus, LC-MS/MS analyses, coupled with manual de novo sequencing, allowed the determination of three substituted amino acid residues R452K, T423A and N430T in two tryptic peptides. The obtained results provided experimental evidence that amino acid substitutions resulted from non-synonymous gene mutations could be directly characterized by mass spectrometry in proteins of RNA viruses such as influenza A virus.

  1. Deep sequencing-based transcriptome profiling analysis of bacteria-challenged Lateolabrax japonicus reveals insight into the immune-relevant genes in marine fish

    Directory of Open Access Journals (Sweden)

    Xiang Li-xin

    2010-08-01

    Full Text Available Abstract Background Systematic research on fish immunogenetics is indispensable in understanding the origin and evolution of immune systems. This has long been a challenging task because of the limited number of deep sequencing technologies and genome backgrounds of non-model fish available. The newly developed Solexa/Illumina RNA-seq and Digital gene expression (DGE are high-throughput sequencing approaches and are powerful tools for genomic studies at the transcriptome level. This study reports the transcriptome profiling analysis of bacteria-challenged Lateolabrax japonicus using RNA-seq and DGE in an attempt to gain insights into the immunogenetics of marine fish. Results RNA-seq analysis generated 169,950 non-redundant consensus sequences, among which 48,987 functional transcripts with complete or various length encoding regions were identified. More than 52% of these transcripts are possibly involved in approximately 219 known metabolic or signalling pathways, while 2,673 transcripts were associated with immune-relevant genes. In addition, approximately 8% of the transcripts appeared to be fish-specific genes that have never been described before. DGE analysis revealed that the host transcriptome profile of Vibrio harveyi-challenged L. japonicus is considerably altered, as indicated by the significant up- or down-regulation of 1,224 strong infection-responsive transcripts. Results indicated an overall conservation of the components and transcriptome alterations underlying innate and adaptive immunity in fish and other vertebrate models. Analysis suggested the acquisition of numerous fish-specific immune system components during early vertebrate evolution. Conclusion This study provided a global survey of host defence gene activities against bacterial challenge in a non-model marine fish. Results can contribute to the in-depth study of candidate genes in marine fish immunity, and help improve current understanding of host

  2. Sequence and phylogenetic analysis of virulent Newcastle disease virus isolates from Pakistan during 2009–2013 reveals circulation of new sub genotype

    International Nuclear Information System (INIS)

    Siddique, Naila; Naeem, Khalid; Abbas, Muhammad Athar; Ali Malik, Akbar; Rashid, Farooq; Rafique, Saba; Ghafar, Abdul; Rehman, Abdul

    2013-01-01

    Despite observing the standard bio-security measures at commercial poultry farms and extensive use of Newcastle disease vaccines, a new genotype VII-f of Newcastle disease virus (NDV) got introduced in Pakistan during 2011. In this regard 300 ND outbreaks recorded so far have resulted into huge losses of approximately USD 200 million during 2011–2013. A total of 33 NDV isolates recovered during 2009–2013 throughout Pakistan were characterized biologically and phylogenetically. The phylogenetic analysis revealed a new velogenic sub genotype VII-f circulating in commercial and domestic poultry along with the earlier reported sub genotype VII-b. Partial sequencing of Fusion gene revealed two types of cleavage site motifs; lentogenic 112 GRQGRL 117 and velogenic 112 RRQKRF 117 along with some point mutations indicative of genetic diversity. We report here a new sub genotype of virulent NDV circulating in commercial and backyard poultry in Pakistan and provide evidence for the possible genetic diversity which may be causing new NDV out breaks. - Highlights: • The first report of isolation of new genotype VII-f of virulent Newcastle disease virus (NDV) in Pakistan. • We report the partial Fusion gene sequences of new genotype VII-f of virulent NDV from Pakistan. • We report the phylogenetic relationship of new NDV strains with reported NDV strains. • Provide outbreak history of new virulent NDV strain in commercial and backyard poultry in Pakistan. • We provide possible evidence for the role of backyard poultry in NDV outbreaks

  3. Genetic diversity and relatedness of Fasciola spp. isolates from different hosts and geographic regions revealed by analysis of mitochondrial DNA sequences.

    Science.gov (United States)

    Ai, L; Weng, Y B; Elsheikha, H M; Zhao, G H; Alasaad, S; Chen, J X; Li, J; Li, H L; Wang, C R; Chen, M X; Lin, R Q; Zhu, X Q

    2011-09-27

    The present study examined sequence variability in a portion of the mitochondrial cytochrome c oxidase subunit 1 (pcox1) and NADH dehydrogenase subunits 4 and 5 (pnad4 and pnad5) among 39 isolates of Fasciola spp., from different hosts from China, Niger, France, the United States of America, and Spain; and their phylogenetic relationships were re-constructed. Intra-species sequence variations were 0.0-1.1% for pcox1, 0.0-2.7% for pnad4, and 0.0-3.3% for pnad5 for Fasciola hepatica; 0.0-1.8% for pcox1, 0.0-2.5% for pnad4, and 0.0-4.2% for pnad5 for Fasciola gigantica, and 0.0-0.9% for pcox1, 0.0-0.2% for pnad4, and 0.0-1.1% for pnad5 for the intermediate Fasciola form. Whereas, nucleotide differences were 2.1-2.7% for pcox1, 3.1-3.3% for pnad4, and 4.2-4.8% for pnad5 between F. hepatica and F. gigantica; were 1.3-1.5% for pcox1, 2.1-2.9% for pnad4, 3.1-3.4% for pnad5 between F. hepatica and the intermediate form; and were 0.9-1.1% for pcox1, 1.4-1.8% for pnad4, 2.2-2.4% for pnad5 between F. gigantica and the intermediate form. Phylogenetic analysis based on the combined sequences of pcox1, pnad4 and pnad5 revealed distinct groupings of isolates of F. hepatica, F. gigantica, or the intermediate Fasciola form irrespective of their origin, demonstrating the usefulness of the mtDNA sequences for the delineation of Fasciola species, and reinforcing the genetic evidence for the existence of the intermediate Fasciola form. Copyright © 2011 Elsevier B.V. All rights reserved.

  4. Analysis of complete nucleotide sequences of Angolan hepatitis B virus isolates reveals the existence of a separate lineage within genotype E.

    Directory of Open Access Journals (Sweden)

    Barbara V Lago

    Full Text Available Hepatitis B virus genotype E (HBV/E is highly prevalent in Western Africa. In this work, 30 HBV/E isolates from HBsAg positive Angolans (staff and visitors of a private hospital in Luanda were genetically characterized: 16 of them were completely sequenced and the pre-S/S sequences of the remaining 14 were determined. A high proportion (12/30, 40% of subjects tested positive for both HBsAg and anti-HBs markers. Deduced amino acid sequences revealed the existence of specific substitutions and deletions in the B- and T-cell epitopes of the surface antigen (pre-S1- and pre-S2 regions of the virus isolates derived from 8/12 individuals with concurrent HBsAg/anti-HBs. Phylogenetic analysis performed with 231 HBV/E full-length sequences, including 16 from this study, showed that all isolates from Angola, Namibia and the Democratic Republic of Congo (n = 28 clustered in a separate lineage, divergent from the HBV/E isolates from nine other African countries, namely Cameroon, Central African Republic, Côte d'Ivoire, Ghana, Guinea, Madagascar, Niger, Nigeria and Sudan, with a Bayesian posterior probability of 1. Five specific mutations, namely small S protein T57I, polymerase Q177H, G245W and M612L, and X protein V30L, were observed in 79-96% of the isolates of the separate lineage, compared to a frequency of 0-12% among the other HBV/E African isolates.

  5. Whole-Genome Sequencing and Comparative Analysis of Mycobacterium brisbanense Reveals a Possible Soil Origin and Capability in Fertiliser Synthesis.

    Directory of Open Access Journals (Sweden)

    Wei Yee Wee

    Full Text Available Mycobacterium brisbanense is a member of Mycobacterium fortuitum third biovariant complex, which includes rapidly growing Mycobacterium spp. that normally inhabit soil, dust and water, and can sometimes cause respiratory tract infections in humans. We present the first whole-genome analysis of M. brisbanense UM_WWY which was isolated from a 70-year-old Malaysian patient. Molecular phylogenetic analyses confirmed the identification of this strain as M. brisbanense and showed that it has an unusually large genome compared with related mycobacteria. The large genome size of M. brisbanense UM_WWY (~7.7Mbp is consistent with further findings that this strain has a highly variable genome structure that contains many putative horizontally transferred genomic islands and prophage. Comparative analysis showed that M. brisbanense UM_WWY is the only Mycobacterium species that possesses a complete set of genes encoding enzymes involved in the urea cycle, suggesting that this soil bacterium is able to synthesize urea for use as plant fertilizers. It is likely that M. brisbanense UM_WWY is adapted to live in soil as its primary habitat since the genome contains many genes associated with nitrogen metabolism. Nevertheless, a large number of predicted virulence genes were identified in M. brisbanense UM_WWY that are mostly shared with well-studied mycobacterial pathogens such as Mycobacterium tuberculosis and Mycobacterium abscessus. These findings are consistent with the role of M. brisbanense as an opportunistic pathogen of humans. The whole-genome study of UM_WWY has provided the basis for future work of M. brisbanense.

  6. Whole-Genome Sequencing and Comparative Analysis of Mycobacterium brisbanense Reveals a Possible Soil Origin and Capability in Fertiliser Synthesis.

    Science.gov (United States)

    Wee, Wei Yee; Tan, Tze King; Jakubovics, Nicholas S; Choo, Siew Woh

    2016-01-01

    Mycobacterium brisbanense is a member of Mycobacterium fortuitum third biovariant complex, which includes rapidly growing Mycobacterium spp. that normally inhabit soil, dust and water, and can sometimes cause respiratory tract infections in humans. We present the first whole-genome analysis of M. brisbanense UM_WWY which was isolated from a 70-year-old Malaysian patient. Molecular phylogenetic analyses confirmed the identification of this strain as M. brisbanense and showed that it has an unusually large genome compared with related mycobacteria. The large genome size of M. brisbanense UM_WWY (~7.7Mbp) is consistent with further findings that this strain has a highly variable genome structure that contains many putative horizontally transferred genomic islands and prophage. Comparative analysis showed that M. brisbanense UM_WWY is the only Mycobacterium species that possesses a complete set of genes encoding enzymes involved in the urea cycle, suggesting that this soil bacterium is able to synthesize urea for use as plant fertilizers. It is likely that M. brisbanense UM_WWY is adapted to live in soil as its primary habitat since the genome contains many genes associated with nitrogen metabolism. Nevertheless, a large number of predicted virulence genes were identified in M. brisbanense UM_WWY that are mostly shared with well-studied mycobacterial pathogens such as Mycobacterium tuberculosis and Mycobacterium abscessus. These findings are consistent with the role of M. brisbanense as an opportunistic pathogen of humans. The whole-genome study of UM_WWY has provided the basis for future work of M. brisbanense.

  7. Illumina sequencing-based analysis of a microbial community enriched under anaerobic methane oxidation condition coupled to denitrification revealed coexistence of aerobic and anaerobic methanotrophs.

    Science.gov (United States)

    Siniscalchi, Luciene Alves Batista; Leite, Laura Rabelo; Oliveira, Guilherme; Chernicharo, Carlos Augusto Lemos; de Araújo, Juliana Calabria

    2017-07-01

    Methane is produced in anaerobic environments, such as reactors used to treat wastewaters, and can be consumed by methanotrophs. The composition and structure of a microbial community enriched from anaerobic sewage sludge under methane-oxidation condition coupled to denitrification were investigated. Denaturing gradient gel electrophoresis (DGGE) analysis retrieved sequences of Methylocaldum and Chloroflexi. Deep sequencing analysis revealed a complex community that changed over time and was affected by methane concentration. Methylocaldum (8.2%), Methylosinus (2.3%), Methylomonas (0.02%), Methylacidiphilales (0.45%), Nitrospirales (0.18%), and Methanosarcinales (0.3%) were detected. Despite denitrifying conditions provided, Nitrospirales and Methanosarcinales, known to perform anaerobic methane oxidation coupled to denitrification (DAMO) process, were in very low abundance. Results demonstrated that aerobic and anaerobic methanotrophs coexisted in the reactor together with heterotrophic microorganisms, suggesting that a diverse microbial community was important to sustain methanotrophic activity. The methanogenic sludge was a good inoculum to enrich methanotrophs, and cultivation conditions play a selective role in determining community composition.

  8. Biological sequence analysis

    DEFF Research Database (Denmark)

    Durbin, Richard; Eddy, Sean; Krogh, Anders Stærmose

    This book provides an up-to-date and tutorial-level overview of sequence analysis methods, with particular emphasis on probabilistic modelling. Discussed methods include pairwise alignment, hidden Markov models, multiple alignment, profile searches, RNA secondary structure analysis, and phylogene...

  9. Sequence and phylogenetic analysis of virulent Newcastle disease virus isolates from Pakistan during 2009–2013 reveals circulation of new sub genotype

    Energy Technology Data Exchange (ETDEWEB)

    Siddique, Naila, E-mail: naila.nrlpd@gmail.com [National Reference Laboratory for Poultry Diseases, Animal Sciences Institute, National Agricultural Research Center, Islamabad (Pakistan); Naeem, Khalid; Abbas, Muhammad Athar; Ali Malik, Akbar; Rashid, Farooq; Rafique, Saba; Ghafar, Abdul; Rehman, Abdul [National Reference Laboratory for Poultry Diseases, Animal Sciences Institute, National Agricultural Research Center, Islamabad (Pakistan)

    2013-09-15

    Despite observing the standard bio-security measures at commercial poultry farms and extensive use of Newcastle disease vaccines, a new genotype VII-f of Newcastle disease virus (NDV) got introduced in Pakistan during 2011. In this regard 300 ND outbreaks recorded so far have resulted into huge losses of approximately USD 200 million during 2011–2013. A total of 33 NDV isolates recovered during 2009–2013 throughout Pakistan were characterized biologically and phylogenetically. The phylogenetic analysis revealed a new velogenic sub genotype VII-f circulating in commercial and domestic poultry along with the earlier reported sub genotype VII-b. Partial sequencing of Fusion gene revealed two types of cleavage site motifs; lentogenic {sup 112}GRQGRL{sup 117} and velogenic {sup 112}RRQKRF{sup 117} along with some point mutations indicative of genetic diversity. We report here a new sub genotype of virulent NDV circulating in commercial and backyard poultry in Pakistan and provide evidence for the possible genetic diversity which may be causing new NDV out breaks. - Highlights: • The first report of isolation of new genotype VII-f of virulent Newcastle disease virus (NDV) in Pakistan. • We report the partial Fusion gene sequences of new genotype VII-f of virulent NDV from Pakistan. • We report the phylogenetic relationship of new NDV strains with reported NDV strains. • Provide outbreak history of new virulent NDV strain in commercial and backyard poultry in Pakistan. • We provide possible evidence for the role of backyard poultry in NDV outbreaks.

  10. Next generation sequencing reveals the hidden diversity of zooplankton assemblages.

    Directory of Open Access Journals (Sweden)

    Penelope K Lindeque

    Full Text Available BACKGROUND: Zooplankton play an important role in our oceans, in biogeochemical cycling and providing a food source for commercially important fish larvae. However, difficulties in correctly identifying zooplankton hinder our understanding of their roles in marine ecosystem functioning, and can prevent detection of long term changes in their community structure. The advent of massively parallel next generation sequencing technology allows DNA sequence data to be recovered directly from whole community samples. Here we assess the ability of such sequencing to quantify richness and diversity of a mixed zooplankton assemblage from a productive time series site in the Western English Channel. METHODOLOGY/PRINCIPLE FINDINGS: Plankton net hauls (200 µm were taken at the Western Channel Observatory station L4 in September 2010 and January 2011. These samples were analysed by microscopy and metagenetic analysis of the 18S nuclear small subunit ribosomal RNA gene using the 454 pyrosequencing platform. Following quality control a total of 419,041 sequences were obtained for all samples. The sequences clustered into 205 operational taxonomic units using a 97% similarity cut-off. Allocation of taxonomy by comparison with the National Centre for Biotechnology Information database identified 135 OTUs to species level, 11 to genus level and 1 to order, <2.5% of sequences were classified as unknowns. By comparison a skilled microscopic analyst was able to routinely enumerate only 58 taxonomic groups. CONCLUSIONS: Metagenetics reveals a previously hidden taxonomic richness, especially for Copepoda and hard-to-identify meroplankton such as Bivalvia, Gastropoda and Polychaeta. It also reveals rare species and parasites. We conclude that Next Generation Sequencing of 18S amplicons is a powerful tool for elucidating the true diversity and species richness of zooplankton communities. While this approach allows for broad diversity assessments of plankton it may

  11. A Next-Generation Sequencing Data Analysis Pipeline for Detecting Unknown Pathogens from Mixed Clinical Samples and Revealing Their Genetic Diversity.

    Directory of Open Access Journals (Sweden)

    Yu-Nong Gong

    Full Text Available Forty-two cytopathic effect (CPE-positive isolates were collected from 2008 to 2012. All isolates could not be identified for known viral pathogens by routine diagnostic assays. They were pooled into 8 groups of 5-6 isolates to reduce the sequencing cost. Next-generation sequencing (NGS was conducted for each group of mixed samples, and the proposed data analysis pipeline was used to identify viral pathogens in these mixed samples. Polymerase chain reaction (PCR or enzyme-linked immunosorbent assay (ELISA was individually conducted for each of these 42 isolates depending on the predicted viral types in each group. Two isolates remained unknown after these tests. Moreover, iteration mapping was implemented for each of these 2 isolates, and predicted human parechovirus (HPeV in both. In summary, our NGS pipeline detected the following viruses among the 42 isolates: 29 human rhinoviruses (HRVs, 10 HPeVs, 1 human adenovirus (HAdV, 1 echovirus and 1 rotavirus. We then focused on the 10 identified Taiwanese HPeVs because of their reported clinical significance over HRVs. Their genomes were assembled and their genetic diversity was explored. One novel 6-bp deletion was found in one HPeV-1 virus. In terms of nucleotide heterogeneity, 64 genetic variants were detected from these HPeVs using the mapped NGS reads. Most importantly, a recombination event was found between our HPeV-3 and a known HPeV-4 strain in the database. Similar event was detected in the other HPeV-3 strains in the same clade of the phylogenetic tree. These findings demonstrated that the proposed NGS data analysis pipeline identified unknown viruses from the mixed clinical samples, revealed their genetic identity and variants, and characterized their genetic features in terms of viral evolution.

  12. Comparative sequence analysis of the potato cyst nematode resistance locus H1 reveals a major lack of co-linearity between three haplotypes in potato (Solanum tuberosum ssp.).

    Science.gov (United States)

    Finkers-Tomczak, Anna; Bakker, Erin; de Boer, Jan; van der Vossen, Edwin; Achenbach, Ute; Golas, Tomasz; Suryaningrat, Suwardi; Smant, Geert; Bakker, Jaap; Goverse, Aska

    2011-02-01

    The H1 locus confers resistance to the potato cyst nematode Globodera rostochiensis pathotypes 1 and 4. It is positioned at the distal end of chromosome V of the diploid Solanum tuberosum genotype SH83-92-488 (SH) on an introgression segment derived from S. tuberosum ssp. andigena. Markers from a high-resolution genetic map of the H1 locus (Bakker et al. in Theor Appl Genet 109:146-152, 2004) were used to screen a BAC library to construct a physical map covering a 341-kb region of the resistant haplotype coming from SH. For comparison, physical maps were also generated of the two haplotypes from the diploid susceptible genotype RH89-039-16 (S. tuberosum ssp. tuberosum/S. phureja), spanning syntenic regions of 700 and 319 kb. Gene predictions on the genomic segments resulted in the identification of a large cluster consisting of variable numbers of the CC-NB-LRR type of R genes for each haplotype. Furthermore, the regions were interspersed with numerous transposable elements and genes coding for an extensin-like protein and an amino acid transporter. Comparative analysis revealed a major lack of gene order conservation in the sequences of the three closely related haplotypes. Our data provide insight in the evolutionary mechanisms shaping the H1 locus and will facilitate the map-based cloning of the H1 resistance gene.

  13. Location analysis for the estrogen receptor-α reveals binding to diverse ERE sequences and widespread binding within repetitive DNA elements

    Science.gov (United States)

    Mason, Christopher E.; Shu, Feng-Jue; Wang, Cheng; Session, Ryan M.; Kallen, Roland G.; Sidell, Neil; Yu, Tianwei; Liu, Mei Hui; Cheung, Edwin; Kallen, Caleb B.

    2010-01-01

    Location analysis for estrogen receptor-α (ERα)-bound cis-regulatory elements was determined in MCF7 cells using chromatin immunoprecipitation (ChIP)-on-chip. Here, we present the estrogen response element (ERE) sequences that were identified at ERα-bound loci and quantify the incidence of ERE sequences under two stringencies of detection: ERE sequence. We demonstrate that ∼50% of all ERα-bound loci do not have a discernable ERE and show that most ERα-bound EREs are not perfect consensus EREs. Approximately one-third of all ERα-bound ERE sequences reside within repetitive DNA sequences, most commonly of the AluS family. In addition, the 3-bp spacer between the inverted ERE half-sites, rather than being random nucleotides, is C(A/T)G-enriched at bona fide receptor targets. Diverse ERα-bound loci were validated using electrophoretic mobility shift assay and ChIP-polymerase chain reaction (PCR). The functional significance of receptor-bound loci was demonstrated using luciferase reporter assays which proved that repetitive element ERE sequences contribute to enhancer function. ChIP-PCR demonstrated estrogen-dependent recruitment of the coactivator SRC3 to these loci in vivo. Our data demonstrate that ERα binds to widely variant EREs with less sequence specificity than had previously been suspected and that binding at repetitive and nonrepetitive genomic targets is favored by specific trinucleotide spacers. PMID:20047966

  14. Location analysis for the estrogen receptor-alpha reveals binding to diverse ERE sequences and widespread binding within repetitive DNA elements.

    Science.gov (United States)

    Mason, Christopher E; Shu, Feng-Jue; Wang, Cheng; Session, Ryan M; Kallen, Roland G; Sidell, Neil; Yu, Tianwei; Liu, Mei Hui; Cheung, Edwin; Kallen, Caleb B

    2010-04-01

    Location analysis for estrogen receptor-alpha (ERalpha)-bound cis-regulatory elements was determined in MCF7 cells using chromatin immunoprecipitation (ChIP)-on-chip. Here, we present the estrogen response element (ERE) sequences that were identified at ERalpha-bound loci and quantify the incidence of ERE sequences under two stringencies of detection: ERE sequence. We demonstrate that approximately 50% of all ERalpha-bound loci do not have a discernable ERE and show that most ERalpha-bound EREs are not perfect consensus EREs. Approximately one-third of all ERalpha-bound ERE sequences reside within repetitive DNA sequences, most commonly of the AluS family. In addition, the 3-bp spacer between the inverted ERE half-sites, rather than being random nucleotides, is C(A/T)G-enriched at bona fide receptor targets. Diverse ERalpha-bound loci were validated using electrophoretic mobility shift assay and ChIP-polymerase chain reaction (PCR). The functional significance of receptor-bound loci was demonstrated using luciferase reporter assays which proved that repetitive element ERE sequences contribute to enhancer function. ChIP-PCR demonstrated estrogen-dependent recruitment of the coactivator SRC3 to these loci in vivo. Our data demonstrate that ERalpha binds to widely variant EREs with less sequence specificity than had previously been suspected and that binding at repetitive and nonrepetitive genomic targets is favored by specific trinucleotide spacers.

  15. Image sequence analysis

    CERN Document Server

    1981-01-01

    The processing of image sequences has a broad spectrum of important applica­ tions including target tracking, robot navigation, bandwidth compression of TV conferencing video signals, studying the motion of biological cells using microcinematography, cloud tracking, and highway traffic monitoring. Image sequence processing involves a large amount of data. However, because of the progress in computer, LSI, and VLSI technologies, we have now reached a stage when many useful processing tasks can be done in a reasonable amount of time. As a result, research and development activities in image sequence analysis have recently been growing at a rapid pace. An IEEE Computer Society Workshop on Computer Analysis of Time-Varying Imagery was held in Philadelphia, April 5-6, 1979. A related special issue of the IEEE Transactions on Pattern Anal­ ysis and Machine Intelligence was published in November 1980. The IEEE Com­ puter magazine has also published a special issue on the subject in 1981. The purpose of this book ...

  16. Fetal anatomy revealed with fast MR sequences.

    Science.gov (United States)

    Levine, D; Hatabu, H; Gaa, J; Atkinson, M W; Edelman, R R

    1996-10-01

    Although all the imaging studies in this pictorial essay were done for maternal rather than fetal indications, fetal anatomy was well visualized. However, when scans are undertaken for fetal indications, fetal motion in between scout views and imaging sequences may make specific image planes difficult to obtain. Of the different techniques described in this review, we preferred the HASTE technique and use it almost exclusively for scanning pregnant patients. The T2-weighting is ideal for delineating fetal organs. Also, the HASTE technique allows images to be obtained in 430 msec, limiting artifacts arising from maternal and fetal motion. MR imaging should play a more important role in evaluating equivocal sonographic cases as fast scanning techniques are more widely used. Obstetric MR imaging no longer will be limited by fetal motion artifacts. When complex anatomy requires definition in a complicated pregnant patient, MR imaging should be considered as a useful adjunct to sonography.

  17. Cytosolic Glutamine Synthetase is Important for Photosynthetic Efficiency and Water Use Efficiency in Potato as Revealed by High Throughput Sequencing QTL analysis

    DEFF Research Database (Denmark)

    Kaminski, Kacper Piotr; Sørensen, Kirsten Kørup; Andersen, Mathias Neumann

    2015-01-01

    was observed. Two extreme WUE bulks of clones were identified and pools of genomic DNA from them as well as the parents were sequenced and mapped to reference potato genome. Following a novel data analysis approach, two highly resolved QTLs were found on chromosome 1 and 9. Interestingly, three genes encoding...

  18. Meta-analysis of sequence-based association studies across three cattle breeds reveals 25 QTL for fat and protein percentages in milk at nucleotide resolution.

    Science.gov (United States)

    Pausch, Hubert; Emmerling, Reiner; Gredler-Grandl, Birgit; Fries, Ruedi; Daetwyler, Hans D; Goddard, Michael E

    2017-11-09

    Genotyping and whole-genome sequencing data have been generated for hundreds of thousands of cattle. International consortia used these data to compile imputation reference panels that facilitate the imputation of sequence variant genotypes for animals that have been genotyped using dense microarrays. Association studies with imputed sequence variant genotypes allow for the characterization of quantitative trait loci (QTL) at nucleotide resolution particularly when individuals from several breeds are included in the mapping populations. We imputed genotypes for 28 million sequence variants in 17,229 cattle of the Braunvieh, Fleckvieh and Holstein breeds in order to compile large mapping populations that provide high power to identify QTL for milk production traits. Association tests between imputed sequence variant genotypes and fat and protein percentages in milk uncovered between six and thirteen QTL (P < 1e-8) per breed. Eight of the detected QTL were significant in more than one breed. We combined the results across breeds using meta-analysis and identified a total of 25 QTL including six that were not significant in the within-breed association studies. Two missense mutations in the ABCG2 (p.Y581S, rs43702337, P = 4.3e-34) and GHR (p.F279Y, rs385640152, P = 1.6e-74) genes were the top variants at QTL on chromosomes 6 and 20. Another known causal missense mutation in the DGAT1 gene (p.A232K, rs109326954, P = 8.4e-1436) was the second top variant at a QTL on chromosome 14 but its allelic substitution effects were inconsistent across breeds. It turned out that the conflicting allelic substitution effects resulted from flaws in the imputed genotypes due to the use of a multi-breed reference population for genotype imputation. Many QTL for milk production traits segregate across breeds and across-breed meta-analysis has greater power to detect such QTL than within-breed association testing. Association testing between imputed sequence variant genotypes and

  19. Transcriptional analysis of the HeT-A retrotransposon in mutant and wild type stocks reveals high sequence variability at Drosophila telomeres and other unusual features

    Directory of Open Access Journals (Sweden)

    Piñeyro David

    2011-11-01

    Full Text Available Abstract Background Telomere replication in Drosophila depends on the transposition of a domesticated retroelement, the HeT-A retrotransposon. The sequence of the HeT-A retrotransposon changes rapidly resulting in differentiated subfamilies. This pattern of sequence change contrasts with the essential function with which the HeT-A is entrusted and brings about questions concerning the extent of sequence variability, the telomere contribution of different subfamilies, and whether wild type and mutant Drosophila stocks show different HeT-A scenarios. Results A detailed study on the variability of HeT-A reveals that both the level of variability and the number of subfamilies are higher than previously reported. Comparisons between GIII, a strain with longer telomeres, and its parental strain Oregon-R indicate that both strains have the same set of HeT-A subfamilies. Finally, the presence of a highly conserved splicing pattern only in its antisense transcripts indicates a putative regulatory, functional or structural role for the HeT-A RNA. Interestingly, our results also suggest that most HeT-A copies are actively expressed regardless of which telomere and where in the telomere they are located. Conclusions Our study demonstrates how the HeT-A sequence changes much faster than previously reported resulting in at least nine different subfamilies most of which could actively contribute to telomere extension in Drosophila. Interestingly, the only significant difference observed between Oregon-R and GIII resides in the nature and proportion of the antisense transcripts, suggesting a possible mechanism that would in part explain the longer telomeres of the GIII stock.

  20. Automated analysis of high-throughput B-cell sequencing data reveals a high frequency of novel immunoglobulin V gene segment alleles.

    Science.gov (United States)

    Gadala-Maria, Daniel; Yaari, Gur; Uduman, Mohamed; Kleinstein, Steven H

    2015-02-24

    Individual variation in germline and expressed B-cell immunoglobulin (Ig) repertoires has been associated with aging, disease susceptibility, and differential response to infection and vaccination. Repertoire properties can now be studied at large-scale through next-generation sequencing of rearranged Ig genes. Accurate analysis of these repertoire-sequencing (Rep-Seq) data requires identifying the germline variable (V), diversity (D), and joining (J) gene segments used by each Ig sequence. Current V(D)J assignment methods work by aligning sequences to a database of known germline V(D)J segment alleles. However, existing databases are likely to be incomplete and novel polymorphisms are hard to differentiate from the frequent occurrence of somatic hypermutations in Ig sequences. Here we develop a Tool for Ig Genotype Elucidation via Rep-Seq (TIgGER). TIgGER analyzes mutation patterns in Rep-Seq data to identify novel V segment alleles, and also constructs a personalized germline database containing the specific set of alleles carried by a subject. This information is then used to improve the initial V segment assignments from existing tools, like IMGT/HighV-QUEST. The application of TIgGER to Rep-Seq data from seven subjects identified 11 novel V segment alleles, including at least one in every subject examined. These novel alleles constituted 13% of the total number of unique alleles in these subjects, and impacted 3% of V(D)J segment assignments. These results reinforce the highly polymorphic nature of human Ig V genes, and suggest that many novel alleles remain to be discovered. The integration of TIgGER into Rep-Seq processing pipelines will increase the accuracy of V segment assignments, thus improving B-cell repertoire analyses.

  1. Genome sequence analysis of five Canadian isolates of strawberry mottle virus reveals extensive intra-species diversity and a longer RNA2 with increased coding capacity compared to a previously characterized European isolate.

    Science.gov (United States)

    Bhagwat, Basdeo; Dickison, Virginia; Ding, Xinlun; Walker, Melanie; Bernardy, Michael; Bouthillier, Michel; Creelman, Alexa; DeYoung, Robyn; Li, Yinzi; Nie, Xianzhou; Wang, Aiming; Xiang, Yu; Sanfaçon, Hélène

    2016-06-01

    In this study, we report the genome sequence of five isolates of strawberry mottle virus (family Secoviridae, order Picornavirales) from strawberry field samples with decline symptoms collected in Eastern Canada. The Canadian isolates differed from the previously characterized European isolate 1134 in that they had a longer RNA2, resulting in a 239-amino-acid extension of the C-terminal region of the polyprotein. Sequence analysis suggests that reassortment and recombination occurred among the isolates. Phylogenetic analysis revealed that the Canadian isolates are diverse, grouping in two separate branches along with isolates from Europe and the Americas.

  2. Sequence analysis of measles virus strains collected during the pre- and early-vaccination era in Denmark reveals a considerable diversity of ancient strains

    DEFF Research Database (Denmark)

    Christensen, Laurids Siig; Schöller, S.; Schierup, M. H.

    2002-01-01

    A total of 199 serum samples from patients with measles collected in Denmark, Greenland and the Faroe Islands from 1964 to 1983 were analysed by PCR. Measles virus (MV) RNA could be detected in 38 (19%) of the samples and a total of 18 strains were subjected to partial sequence analysis of the he......A total of 199 serum samples from patients with measles collected in Denmark, Greenland and the Faroe Islands from 1964 to 1983 were analysed by PCR. Measles virus (MV) RNA could be detected in 38 (19%) of the samples and a total of 18 strains were subjected to partial sequence analysis...... of the hemagglutinin gene. The strains exhibited a considerable genomic diversity, which is at odds with the assumption that one genome type prevailed among globally circulating MV strains prior to the advent of live-attenuated vaccines. Our data indicate that the similarity of the various vaccine strains...... is attributed to their having originated from the same primary isolate. Consequently, it is implied that a small number of clinical manifestations of MV worldwide from which strains similar to the vaccine strain were identified were vaccine related rather than being caused by members of a persistently...

  3. High-throughput sequencing and pathway analysis reveal alteration of the pituitary transcriptome by 17α-ethynylestradiol (EE2) in female coho salmon, Oncorhynchus kisutch

    Energy Technology Data Exchange (ETDEWEB)

    Harding, Louisa B. [School of Aquatic and Fishery Sciences, University of Washington, Seattle, WA 98195 (United States); Schultz, Irvin R. [Battelle, Marine Sciences Laboratory – Pacific Northwest National Laboratory, 1529 West Sequim Bay Road, Sequim, WA 98382 (United States); Goetz, Giles W. [School of Aquatic and Fishery Sciences, University of Washington, Seattle, WA 98195 (United States); Luckenbach, J. Adam [Northwest Fisheries Science Center, National Marine Fisheries Service, National Oceanic and Atmospheric Administration, 2725 Montlake Blvd E, Seattle, WA 98112 (United States); Center for Reproductive Biology, Washington State University, Pullman, WA 98164 (United States); Young, Graham [School of Aquatic and Fishery Sciences, University of Washington, Seattle, WA 98195 (United States); Center for Reproductive Biology, Washington State University, Pullman, WA 98164 (United States); Goetz, Frederick W. [Northwest Fisheries Science Center, National Marine Fisheries Service, National Oceanic and Atmospheric Administration, Manchester Research Station, P.O. Box 130, Manchester, WA 98353 (United States); Swanson, Penny, E-mail: penny.swanson@noaa.gov [Northwest Fisheries Science Center, National Marine Fisheries Service, National Oceanic and Atmospheric Administration, 2725 Montlake Blvd E, Seattle, WA 98112 (United States); Center for Reproductive Biology, Washington State University, Pullman, WA 98164 (United States)

    2013-10-15

    Highlights: •Studied impacts of ethynylestradiol (EE2) exposure on salmon pituitary transcriptome. •High-throughput sequencing, RNAseq, and pathway analysis were performed. •EE2 altered mRNAs for genes in circadian rhythm, GnRH, and TGFβ signaling pathways. •LH and FSH beta subunit mRNAs were most highly up- and down-regulated by EE2, respectively. •Estrogens may alter processes associated with reproductive timing in salmon. -- Abstract: Considerable research has been done on the effects of endocrine disrupting chemicals (EDCs) on reproduction and gene expression in the brain, liver and gonads of teleost fish, but information on impacts to the pituitary gland are still limited despite its central role in regulating reproduction. The aim of this study was to further our understanding of the potential effects of natural and synthetic estrogens on the brain–pituitary–gonad axis in fish by determining the effects of 17α-ethynylestradiol (EE2) on the pituitary transcriptome. We exposed sub-adult coho salmon (Oncorhynchus kisutch) to 0 or 12 ng EE2/L for up to 6 weeks and effects on the pituitary transcriptome of females were assessed using high-throughput Illumina{sup ®} sequencing, RNA-Seq and pathway analysis. After 1 or 6 weeks, 218 and 670 contiguous sequences (contigs) respectively, were differentially expressed in pituitaries of EE2-exposed fish relative to control. Two of the most highly up- and down-regulated contigs were luteinizing hormone β subunit (241-fold and 395-fold at 1 and 6 weeks, respectively) and follicle-stimulating hormone β subunit (−3.4-fold at 6 weeks). Additional contigs related to gonadotropin synthesis and release were differentially expressed in EE2-exposed fish relative to controls. These included contigs involved in gonadotropin releasing hormone (GNRH) and transforming growth factor-β signaling. There was an over-representation of significantly affected contigs in 33 and 18 canonical pathways at 1 and 6 weeks

  4. High-throughput sequencing and pathway analysis reveal alteration of the pituitary transcriptome by 17α-ethynylestradiol (EE2) in female coho salmon, Oncorhynchus kisutch

    International Nuclear Information System (INIS)

    Harding, Louisa B.; Schultz, Irvin R.; Goetz, Giles W.; Luckenbach, J. Adam; Young, Graham; Goetz, Frederick W.; Swanson, Penny

    2013-01-01

    Highlights: •Studied impacts of ethynylestradiol (EE2) exposure on salmon pituitary transcriptome. •High-throughput sequencing, RNAseq, and pathway analysis were performed. •EE2 altered mRNAs for genes in circadian rhythm, GnRH, and TGFβ signaling pathways. •LH and FSH beta subunit mRNAs were most highly up- and down-regulated by EE2, respectively. •Estrogens may alter processes associated with reproductive timing in salmon. -- Abstract: Considerable research has been done on the effects of endocrine disrupting chemicals (EDCs) on reproduction and gene expression in the brain, liver and gonads of teleost fish, but information on impacts to the pituitary gland are still limited despite its central role in regulating reproduction. The aim of this study was to further our understanding of the potential effects of natural and synthetic estrogens on the brain–pituitary–gonad axis in fish by determining the effects of 17α-ethynylestradiol (EE2) on the pituitary transcriptome. We exposed sub-adult coho salmon (Oncorhynchus kisutch) to 0 or 12 ng EE2/L for up to 6 weeks and effects on the pituitary transcriptome of females were assessed using high-throughput Illumina ® sequencing, RNA-Seq and pathway analysis. After 1 or 6 weeks, 218 and 670 contiguous sequences (contigs) respectively, were differentially expressed in pituitaries of EE2-exposed fish relative to control. Two of the most highly up- and down-regulated contigs were luteinizing hormone β subunit (241-fold and 395-fold at 1 and 6 weeks, respectively) and follicle-stimulating hormone β subunit (−3.4-fold at 6 weeks). Additional contigs related to gonadotropin synthesis and release were differentially expressed in EE2-exposed fish relative to controls. These included contigs involved in gonadotropin releasing hormone (GNRH) and transforming growth factor-β signaling. There was an over-representation of significantly affected contigs in 33 and 18 canonical pathways at 1 and 6 weeks

  5. 16S-23S rDNA intergenic spacer region polymorphism of Lactococcus garvieae, Lactococcus raffinolactis and Lactococcus lactis as revealed by PCR and nucleotide sequence analysis.

    Science.gov (United States)

    Blaiotta, Giuseppe; Pepe, Olimpia; Mauriello, Gianluigi; Villani, Francesco; Andolfi, Rosamaria; Moschetti, Giancarlo

    2002-12-01

    The intergenic spacer region (ISR) between the 16S and 23S rRNA genes was tested as a tool for differentiating lactococci commonly isolated in a dairy environment. 17 reference strains, representing 11 different species belonging to the genera Lactococcus, Streptococcus, Lactobacillus, Enterococcus and Leuconostoc, and 127 wild streptococcal strains isolated during the whole fermentation process of "Fior di Latte" cheese were analyzed. After 16S-23S rDNA ISR amplification by PCR, species or genus-specific patterns were obtained for most of the reference strains tested. Moreover, results obtained after nucleotide analysis show that the 16S-23S rDNA ISR sequences vary greatly, in size and sequence, among Lactococcus garvieae, Lactococcus raffinolactis, Lactococcus lactis as well as other streptococci from dairy environments. Because of the high degree of inter-specific polymorphism observed, 16S-23S rDNA ISR can be considered a good potential target for selecting species-specific molecular assays, such as PCR primer or probes, for a rapid and extremely reliable differentiation of dairy lactococcal isolates.

  6. Re-analysis of RNA-Sequencing Data on Apple Stem Grooving Virus infected Apple reveals more significant differentially expressed genes

    Directory of Open Access Journals (Sweden)

    Bipin Balan

    2017-12-01

    Full Text Available RNA sequencing (RNA-Seq technology has enabled the researchers to investigate the host global gene expression changes in plant-virus interactions which helped to understand the molecular basis of virus diseases. The re-analysis of RNA-Seq studies using most updated genome version and the available best analysis pipeline will produce most accurate results. In this study, we re-analysed the Apple stem grooving virus (ASGV infected apple shoots in comparison with that of virus-free in vitro shoots [1] using the most updated Malus x domestica genome downloaded from Phytozome database. The re-analysis was done by using HISAT2 software and Cufflinks program was used to mine the differentially expressed genes. We found that ~20% more reads was mapped to the latest genome using the updated pipeline, which proved the significance of such re-analysis. The comparison of the updated results with that of previous was done. In addition, we performed protein-protein interaction (PPI to investigate the proteins affected by ASGV infection.

  7. RNA-sequencing and pathway analysis reveal alteration of hepatic steroid biosynthesis and retinol metabolism by tributyltin exposure in male rare minnow (Gobiocypris rarus).

    Science.gov (United States)

    Zhang, Jiliang; Zhang, Chunnuan; Sun, Ping; Huang, Maoxian; Fan, Mingzhen; Liu, Min

    2017-07-01

    Tributyltin (TBT) is widely spread in aquatic ecosystems. Although adverse effects of TBT on reproduction and lipogenesis are observed in fishes, the underlying mechanisms, especially in livers, are still scarce and inconclusive. Thus, RNA-sequencing runs were performed on the hepatic libraries of adult male rare minnow (Gobiocypris rarus) after TBT exposure for 60d. After differentially expressed genes were identified, enrichment analysis and validation by quantitative real-time PCR were conducted. The results showed that TBT up-regulated the profile of hepatic genes in the steroid biosynthesis pathway and down-regulated the profile of hepatic genes in the retinol metabolism pathway. In the hepatic steroid biosynthesis pathway, TBT might induce biosynthesis of cholesterol, which could affect the bioavailability of steroid hormones. More important, 3beta-hydroxysteroid 3-dehydrogenase, a key enzyme in the biosynthesis of all active steroid hormones, was up-regulated by TBT exposure. In the hepatic retinol metabolism pathway, TBT impaired retinoic acid homeostasis which plays essential roles in both reproduction and lipogenesis. The results of two pathways offered new mechanisms underlying the toxicology of TBT and represented a starting point from which detailed mechanistic links should be explored. Copyright © 2017 Elsevier B.V. All rights reserved.

  8. Analysis of The Cancer Genome Atlas sequencing data reveals novel properties of the human papillomavirus 16 genome in head and neck squamous cell carcinoma.

    Science.gov (United States)

    Nulton, Tara J; Olex, Amy L; Dozmorov, Mikhail; Morgan, Iain M; Windle, Brad

    2017-03-14

    Human papillomavirus (HPV) DNA is detected in up to 80% of oropharyngeal carcinomas (OPC) and this HPV positive disease has reached epidemic proportions. To increase our understanding of the disease, we investigated the status of the HPV16 genome in HPV-positive head and neck cancers (HNC). Raw RNA-Seq and Whole Genome Sequence data from The Cancer Genome Atlas HNC samples were analyzed to gain a full understanding of the HPV genome status for these tumors. Several remarkable and novel observations were made following this analysis. Firstly, there are three main HPV genome states in these tumors that are split relatively evenly: An episomal only state, an integrated state, and a state in which the viral genome exists as a hybrid episome with human DNA. Secondly, none of the tumors expressed high levels of E6; E6*I is the dominant variant expressed in all tumors. The most striking conclusion from this study is that around three quarters of HPV16 positive HNC contain episomal versions of the viral genome that are likely replicating in an E1-E2 dependent manner. The clinical and therapeutic implications of these observations are discussed.

  9. A comparative sequence analysis reveals a common GBD/FH3-FH1-FH2-DAD architecture in formins from Dictyostelium, fungi and metazoa

    Directory of Open Access Journals (Sweden)

    Uyeda Taro QP

    2005-03-01

    Full Text Available Abstract Background Formins are multidomain proteins defined by a conserved FH2 (formin homology 2 domain with actin nucleation activity preceded by a proline-rich FH1 (formin homology 1 domain. Formins act as profilin-modulated processive actin nucleators conserved throughout a wide range of eukaryotes. Results We present a detailed sequence analysis of the 10 formins (ForA to J identified in the genome of the social amoeba Dictyostelium discoideum. With the exception of ForI and ForC all other formins conform to the domain structure GBD/FH3-FH1-FH2-DAD, where DAD is the Diaphanous autoinhibition domain and GBD/FH3 is the Rho GTPase-binding domain/formin homology 3 domain that we propose to represent a single domain. ForC lacks a FH1 domain, ForI lacks recognizable GBD/FH3 and DAD domains and ForA, E and J have additional unique domains. To establish the relationship between formins of Dictyostelium and other organisms we constructed a phylogenetic tree based on the alignment of FH2 domains. Real-time PCR was used to study the expression pattern of formin genes. Expression of forC, D, I and J increased during transition to multi-cellular stages, while the rest of genes displayed less marked developmental variations. During sexual development, expression of forH and forI displayed a significant increase in fusion competent cells. Conclusion Our analysis allows some preliminary insight into the functionality of Dictyostelium formins: all isoforms might display actin nucleation activity and, with the exception of ForI, might also be susceptible to autoinhibition and to regulation by Rho GTPases. The architecture GBD/FH3-FH1-FH2-DAD appears common to almost all Dictyostelium, fungal and metazoan formins, for which we propose the denomination of conventional formins, and implies a common regulatory mechanism.

  10. A comparative sequence analysis reveals a common GBD/FH3-FH1-FH2-DAD architecture in formins from Dictyostelium, fungi and metazoa.

    Science.gov (United States)

    Rivero, Francisco; Muramoto, Tetsuya; Meyer, Ann-Kathrin; Urushihara, Hideko; Uyeda, Taro Q P; Kitayama, Chikako

    2005-03-01

    Formins are multidomain proteins defined by a conserved FH2 (formin homology 2) domain with actin nucleation activity preceded by a proline-rich FH1 (formin homology 1) domain. Formins act as profilin-modulated processive actin nucleators conserved throughout a wide range of eukaryotes. We present a detailed sequence analysis of the 10 formins (ForA to J) identified in the genome of the social amoeba Dictyostelium discoideum. With the exception of ForI and ForC all other formins conform to the domain structure GBD/FH3-FH1-FH2-DAD, where DAD is the Diaphanous autoinhibition domain and GBD/FH3 is the Rho GTPase-binding domain/formin homology 3 domain that we propose to represent a single domain. ForC lacks a FH1 domain, ForI lacks recognizable GBD/FH3 and DAD domains and ForA, E and J have additional unique domains. To establish the relationship between formins of Dictyostelium and other organisms we constructed a phylogenetic tree based on the alignment of FH2 domains. Real-time PCR was used to study the expression pattern of formin genes. Expression of forC, D, I and J increased during transition to multi-cellular stages, while the rest of genes displayed less marked developmental variations. During sexual development, expression of forH and forI displayed a significant increase in fusion competent cells. Our analysis allows some preliminary insight into the functionality of Dictyostelium formins: all isoforms might display actin nucleation activity and, with the exception of ForI, might also be susceptible to autoinhibition and to regulation by Rho GTPases. The architecture GBD/FH3-FH1-FH2-DAD appears common to almost all Dictyostelium, fungal and metazoan formins, for which we propose the denomination of conventional formins, and implies a common regulatory mechanism.

  11. Small RNA sequence analysis of adenovirus VA RNA-derived miRNAs reveals an unexpected serotype-specific difference in structure and abundance.

    Directory of Open Access Journals (Sweden)

    Wael Kamel

    Full Text Available Human adenoviruses (HAds encode for one or two highly abundant virus-associated RNAs, designated VA RNAI and VA RNAII, which fold into stable hairpin structures resembling miRNA precursors. Here we show that the terminal stem of the VA RNAs originating from Ad4, Ad5, Ad11 and Ad37, all undergo Dicer dependent processing into virus-specific miRNAs (so-called mivaRNAs. We further show that the mivaRNA duplex is subjected to a highly asymmetric RISC loading with the 3'-strand from all VA RNAs being the favored strand, except for the Ad37 VA RNAII, where the 5'-mivaRNAII strand was preferentially assembled into RISC. Although the mivaRNA seed sequences are not fully conserved between the HAds a bioinformatics prediction approach suggests that a large fraction of the VA RNAII-, but not the VA RNAI-derived mivaRNAs still are able to target the same cellular genes. Using small RNA deep sequencing we demonstrate that the Dicer processing event in the terminal stem of the VA RNAs is not unique and generates 3'-mivaRNAs with a slight variation of the position of the 5' terminal nucleotide in the RISC loaded guide strand. Also, we show that all analyzed VA RNAs, except Ad37 VA RNAI and Ad5 VA RNAII, utilize an alternative upstream A start site in addition to the classical +1 G start site. Further, the 5'-mivaRNAs with an A start appears to be preferentially incorporated into RISC. Although the majority of mivaRNA research has been done using Ad5 as the model system our analysis demonstrates that the mivaRNAs expressed in Ad11- and Ad37-infected cells are the most abundant mivaRNAs associated with Ago2-containing RISC. Collectively, our results show an unexpected variability in Dicer processing of the VA RNAs and a serotype-specific loading of mivaRNAs into Ago2-based RISC.

  12. Next generation sequencing analysis reveals that the ribonucleases RNase II, RNase R and PNPase affect bacterial motility and biofilm formation in E. coli.

    Science.gov (United States)

    Pobre, Vânia; Arraiano, Cecília M

    2015-02-14

    The RNA steady-state levels in the cell are a balance between synthesis and degradation rates. Although transcription is important, RNA processing and turnover are also key factors in the regulation of gene expression. In Escherichia coli there are three main exoribonucleases (RNase II, RNase R and PNPase) involved in RNA degradation. Although there are many studies about these exoribonucleases not much is known about their global effect in the transcriptome. In order to study the effects of the exoribonucleases on the transcriptome, we sequenced the total RNA (RNA-Seq) from wild-type cells and from mutants for each of the exoribonucleases (∆rnb, ∆rnr and ∆pnp). We compared each of the mutant transcriptome with the wild-type to determine the global effects of the deletion of each exoribonucleases in exponential phase. We determined that the deletion of RNase II significantly affected 187 transcripts, while deletion of RNase R affects 202 transcripts and deletion of PNPase affected 226 transcripts. Surprisingly, many of the transcripts are actually down-regulated in the exoribonuclease mutants when compared to the wild-type control. The results obtained from the transcriptomic analysis pointed to the fact that these enzymes were changing the expression of genes related with flagellum assembly, motility and biofilm formation. The three exoribonucleases affected some stable RNAs, but PNPase was the main exoribonuclease affecting this class of RNAs. We confirmed by qPCR some fold-change values obtained from the RNA-Seq data, we also observed that all the exoribonuclease mutants were significantly less motile than the wild-type cells. Additionally, RNase II and RNase R mutants were shown to produce more biofilm than the wild-type control while the PNPase mutant did not form biofilms. In this work we demonstrate how deep sequencing can be used to discover new and relevant functions of the exoribonucleases. We were able to obtain valuable information about the

  13. Transcriptome sequence analysis of an ornamental plant, Ananas comosus var. bracteatus, revealed the potential unigenes involved in terpenoid and phenylpropanoid biosynthesis.

    Science.gov (United States)

    Ma, Jun; Kanakala, S; He, Yehua; Zhang, Junli; Zhong, Xiaolan

    2015-01-01

    Ananas comosus var. bracteatus (Red Pineapple) is an important ornamental plant for its colorful leaves and decorative red fruits. Because of its complex genome, it is difficult to understand the molecular mechanisms involved in the growth and development. Thus high-throughput transcriptome sequencing of Ananas comosus var. bracteatus is necessary to generate large quantities of transcript sequences for the purpose of gene discovery and functional genomic studies. The Ananas comosus var. bracteatus transcriptome was sequenced by the Illumina paired-end sequencing technology. We obtained a total of 23.5 million high quality sequencing reads, 1,555,808 contigs and 41,052 unigenes. In total 41,052 unigenes of Ananas comosus var. bracteatus, 23,275 unigenes were annotated in the NCBI non-redundant protein database and 23,134 unigenes were annotated in the Swiss-Port database. Out of these, 17,748 and 8,505 unigenes were assigned to gene ontology categories and clusters of orthologous groups, respectively. Functional annotation against Kyoto Encyclopedia of Genes and Genomes Pathway database identified 5,825 unigenes which were mapped to 117 pathways. The assembly predicted many unigenes that were previously unknown. The annotated unigenes were compared against pineapple, rice, maize, Arabidopsis, and sorghum. Unigenes that did not match any of those five sequence datasets are considered to be Ananas comosus var. bracteatus unique. We predicted unigenes encoding enzymes involved in terpenoid and phenylpropanoid biosynthesis. The sequence data provide the most comprehensive transcriptomic resource currently available for Ananas comosus var. bracteatus. To our knowledge; this is the first report on the de novo transcriptome sequencing of the Ananas comosus var. bracteatus. Unigenes obtained in this study, may help improve future gene expression, genetic and genomics studies in Ananas comosus var. bracteatus.

  14. Transcriptome Sequence Analysis of an Ornamental Plant, Ananas comosus var. bracteatus, Revealed the Potential Unigenes Involved in Terpenoid and Phenylpropanoid Biosynthesis

    OpenAIRE

    Ma, Jun; Kanakala, S.; He, Yehua; Zhang, Junli; Zhong, Xiaolan

    2015-01-01

    Background Ananas comosus var. bracteatus (Red Pineapple) is an important ornamental plant for its colorful leaves and decorative red fruits. Because of its complex genome, it is difficult to understand the molecular mechanisms involved in the growth and development. Thus high-throughput transcriptome sequencing of Ananas comosus var. bracteatus is necessary to generate large quantities of transcript sequences for the purpose of gene discovery and functional genomic studies. Results The Anana...

  15. Transcriptome sequence analysis of an ornamental plant, Ananas comosus var. bracteatus, revealed the potential unigenes involved in terpenoid and phenylpropanoid biosynthesis.

    Directory of Open Access Journals (Sweden)

    Jun Ma

    Full Text Available Ananas comosus var. bracteatus (Red Pineapple is an important ornamental plant for its colorful leaves and decorative red fruits. Because of its complex genome, it is difficult to understand the molecular mechanisms involved in the growth and development. Thus high-throughput transcriptome sequencing of Ananas comosus var. bracteatus is necessary to generate large quantities of transcript sequences for the purpose of gene discovery and functional genomic studies.The Ananas comosus var. bracteatus transcriptome was sequenced by the Illumina paired-end sequencing technology. We obtained a total of 23.5 million high quality sequencing reads, 1,555,808 contigs and 41,052 unigenes. In total 41,052 unigenes of Ananas comosus var. bracteatus, 23,275 unigenes were annotated in the NCBI non-redundant protein database and 23,134 unigenes were annotated in the Swiss-Port database. Out of these, 17,748 and 8,505 unigenes were assigned to gene ontology categories and clusters of orthologous groups, respectively. Functional annotation against Kyoto Encyclopedia of Genes and Genomes Pathway database identified 5,825 unigenes which were mapped to 117 pathways. The assembly predicted many unigenes that were previously unknown. The annotated unigenes were compared against pineapple, rice, maize, Arabidopsis, and sorghum. Unigenes that did not match any of those five sequence datasets are considered to be Ananas comosus var. bracteatus unique. We predicted unigenes encoding enzymes involved in terpenoid and phenylpropanoid biosynthesis.The sequence data provide the most comprehensive transcriptomic resource currently available for Ananas comosus var. bracteatus. To our knowledge; this is the first report on the de novo transcriptome sequencing of the Ananas comosus var. bracteatus. Unigenes obtained in this study, may help improve future gene expression, genetic and genomics studies in Ananas comosus var. bracteatus.

  16. Transcriptome Sequencing Analysis Reveals a Difference in Monoterpene Biosynthesis between Scented Lilium ‘Siberia’ and Unscented Lilium ‘Novano’

    Directory of Open Access Journals (Sweden)

    Zenghui Hu

    2017-08-01

    Full Text Available Lilium is a world famous fragrant bulb flower with high ornamental and economic values, and significant differences in fragrance are found among different Lilium genotypes. In order to explore the mechanism underlying the different fragrances, the floral scents of Lilium ‘Sibeia’, with a strong fragrance, and Lilium ‘Novano’, with a very faint fragrance, were collected in vivo using a dynamic headspace technique. These scents were identified using automated thermal desorption—gas chromatography/mass spectrometry (ATD-GC/MS at different flowering stages. We used RNA-Seq technique to determine the petal transcriptome at the full-bloom stage and analyzed differentially expressed genes (DEGs to investigate the molecular mechanism of floral scent biosynthesis. The results showed that a significantly higher amount of Lilium ‘Siberia’ floral scent was released compared with Lilium ‘Novano’. Moreover, monoterpenes played a dominant role in the floral scent of Lilium ‘Siberia’; therefore, it is believed that the different emissions of monoterpenes mainly contributed to the difference in the floral scent between the two Lilium genotypes. Transcriptome sequencing analysis indicated that ~29.24 Gb of raw data were generated and assembled into 124,233 unigenes, of which 35,749 unigenes were annotated. Through a comparison of gene expression between these two Lilium genotypes, 6,496 DEGs were identified. The genes in the terpenoid backbone biosynthesis pathway showed significantly different expression levels. The gene expressions of 1-deoxy-D-xylulose 5-phosphate synthase (DXS, 1-deoxy-D-xylulose-5-phosphate reductoisomerase (DXR, 4-hydroxy-3-methylbut-2-enyl diphosphate synthase (HDS, 4-hydroxy-3-methylbut-2-enyl diphosphate reductase (HDR, isopentenyl diphosphate isomerase (IDI, and geranyl diphosphate synthase (GPS/GGPS, were upregulated in Lilium ‘Siberia’ compared to Lilium ‘Novano’, and two monoterpene synthase genes

  17. Direct, rapid RNA sequence analysis

    International Nuclear Information System (INIS)

    Peattie, D.A.

    1987-01-01

    The original methods of RNA sequence analysis were based on enzymatic production and chromatographic separation of overlapping oligonucleotide fragments from within an RNA molecule followed by identification of the mononucleotides comprising the oligomer. Over the past decade the field of nucleic acid sequencing has changed dramatically, however, and RNA molecules now can be sequenced in a variety of more streamlined fashions. Most of the more recent advances in RNA sequencing have involved one-dimensional electrophoretic separation of 32 P-end-labeled oligoribonucleotides on polyacrylamide gels. In this chapter the author discusses two of these methods for determining the nucleotide sequences of RNA molecules rapidly: the chemical method and the enzymatic method. Both methods are direct and degradative, i.e., they rely on fragmatic and chemical approaches should be utilized. The single-strand-specific ribonucleases (A, T 1 , T 2 , and S 1 ) provide an efficient means to locate double-helical regions rapidly, and the chemical reactions provide a means to determine the RNA sequence within these regions. In addition, the chemical reactions allow one to assign interactions to specific atoms and to distinguish secondary interactions from tertiary ones. If the RNA molecule is small enough to be sequenced directly by the enzymatic or chemical method, the probing reactions can be done easily at the same time as sequencing reactions

  18. Experimental evolution, genetic analysis and genome re-sequencing reveal the mutation conferring artemisinin resistance in an isogenic lineage of malaria parasites

    KAUST Repository

    Hunt, Paul; Martinelli, Axel; Modrzynska, Katarzyna; Borges, Sofia; Creasey, Alison; Rodrigues, Louise; Beraldi, Dario; Loewe, Laurence; Fawcett, Richard; Kumar, Sujai; Thomson, Marian; Trivedi, Urmi; Otto, Thomas D; Pain, Arnab; Blaxter, Mark; Cravo, Pedro

    2010-01-01

    was mapped to a region of chromosome 2 by Linkage Group Selection in two different genetic crosses. Whole-genome deep coverage short-read re-sequencing (IlluminaSolexa) defined the point mutations, insertions, deletions and copy-number variations arising

  19. Sequence analysis of two alleles reveals that intra-and intergenic recombination played a role in the evolution of the radish fertility restorer (Rfo

    Directory of Open Access Journals (Sweden)

    Budar Françoise

    2010-02-01

    Full Text Available Abstract Background Land plant genomes contain multiple members of a eukaryote-specific gene family encoding proteins with pentatricopeptide repeat (PPR motifs. Some PPR proteins were shown to participate in post-transcriptional events involved in organellar gene expression, and this type of function is now thought to be their main biological role. Among PPR genes, restorers of fertility (Rf of cytoplasmic male sterility systems constitute a peculiar subgroup that is thought to evolve in response to the presence of mitochondrial sterility-inducing genes. Rf genes encoding PPR proteins are associated with very close relatives on complex loci. Results We sequenced a non-restoring allele (L7rfo of the Rfo radish locus whose restoring allele (D81Rfo was previously described, and compared the two alleles and their PPR genes. We identified a ca 13 kb long fragment, likely originating from another part of the radish genome, inserted into the L7rfo sequence. The L7rfo allele carries two genes (PPR-1 and PPR-2 closely related to the three previously described PPR genes of the restorer D81Rfo allele (PPR-A, PPR-B, and PPR-C. Our results indicate that alleles of the Rfo locus have experienced complex evolutionary events, including recombination and insertion of extra-locus sequences, since they diverged. Our analyses strongly suggest that present coding sequences of Rfo PPR genes result from intragenic recombination. We found that the 10 C-terminal PPR repeats in Rfo PPR gene encoded proteins result from the tandem duplication of a 5 PPR repeat block. Conclusions The Rfo locus appears to experience more complex evolution than its flanking sequences. The Rfo locus and PPR genes therein are likely to evolve as a result of intergenic and intragenic recombination. It is therefore not possible to determine which genes on the two alleles are direct orthologs. Our observations recall some previously reported data on pathogen resistance complex loci.

  20. Expressed sequence tag (EST) analysis of two subspecies of Metarhizium anisopliae reveals a plethora of secreted proteins with potential activity in insect hosts.

    Science.gov (United States)

    Freimoser, Florian M; Screen, Steven; Bagga, Savita; Hu, Gang; St Leger, Raymond J

    2003-01-01

    Expressed sequence tag (EST) libraries for Metarhizium anisopliae, the causative agent of green muscardine disease, were developed from the broad host-range pathogen Metarhizium anisopliae sf. anisopliae and the specific grasshopper pathogen, M. anisopliae sf. acridum. Approximately 1,700 5' end sequences from each subspecies were generated from cDNA libraries representing fungi grown under conditions that maximize secretion of cuticle-degrading enzymes. Both subspecies had ESTs for virtually all pathogenicity-related genes cloned to date from M. anisopliae, but many novel genes encoding potential virulence factors were also tagged. Enzymes with potential targets in the insect host included proteases, chitinases, phospholipases, lipases, esterases, phosphatases and enzymes producing toxic secondary metabolites. A diverse array of proteases composed 36 % of all M. anisopliae sf. anisopliae ESTs. Eighty percent of the ESTs that could be clustered into functional groups had significant matches (Ehistory of this clade.

  1. Analysis of the genome sequence of the pathogenic Muscovy duck parvovirus strain YY reveals a 14-nucleotide-pair deletion in the inverted terminal repeats.

    Science.gov (United States)

    Wang, Jianye; Huang, Yu; Zhou, Mingxu; Zhu, Guoqiang

    2016-09-01

    Genomic information about Muscovy duck parvovirus is still limited. In this study, the genome of the pathogenic MDPV strain YY was sequenced. The full-length genome of YY is 5075 nucleotides (nt) long, 57 nt shorter than that of strain FM. Sequence alignment indicates that the 5' and 3' inverted terminal repeats (ITR) of strain YY contain a 14-nucleotide-pair deletion in the stem of the palindromic hairpin structure in comparison to strain FM and FZ91-30. The deleted region contains one "E-box" site and one repeated motif with the sequence "TTCCGGT" or "ACCGGAA". Phylogenetic trees constructed based the protein coding genes concordantly showed that YY, together with nine other MDPV isolates from various places, clustered in a separate branch, distinct from the branch formed by goose parvovirus (GPV) strains. These results demonstrate that, despite the distinctive deletion, the YY strain still belongs to the classical MDPV group. Moreover, the deletion of ITR may contribute to the genome evolution of MDPV under immunization pressure.

  2. Analysis of alkaptonuria (AKU) mutations and polymorphisms reveals that the CCC sequence motif is a mutational hot spot in the homogentisate 1,2 dioxygenase gene (HGO).

    Science.gov (United States)

    Beltrán-Valero de Bernabé, D; Jimenez, F J; Aquaron, R; Rodríguez de Córdoba, S

    1999-01-01

    We recently showed that alkaptonuria (AKU) is caused by loss-of-function mutations in the homogentisate 1,2 dioxygenase gene (HGO). Herein we describe haplotype and mutational analyses of HGO in seven new AKU pedigrees. These analyses identified two novel single-nucleotide polymorphisms (INV4+31A-->G and INV11+18A-->G) and six novel AKU mutations (INV1-1G-->A, W60G, Y62C, A122D, P230T, and D291E), which further illustrates the remarkable allelic heterogeneity found in AKU. Reexamination of all 29 mutations and polymorphisms thus far described in HGO shows that these nucleotide changes are not randomly distributed; the CCC sequence motif and its inverted complement, GGG, are preferentially mutated. These analyses also demonstrated that the nucleotide substitutions in HGO do not involve CpG dinucleotides, which illustrates important differences between HGO and other genes for the occurrence of mutation at specific short-sequence motifs. Because the CCC sequence motifs comprise a significant proportion (34.5%) of all mutated bases that have been observed in HGO, we conclude that the CCC triplet is a mutational hot spot in HGO. PMID:10205262

  3. Comparative sequence analysis revealed altered chromosomal organization and a novel insertion sequence encoding DNA modification and potentially stress-related functions in an Escherichia coli O157:H7 foodborne isolate

    Science.gov (United States)

    We recently described the complete genome of enterohemorrhagic Escherichia coli (EHEC) O157:H7 strain NADC 6564, an isolate of strain 86-24 linked to the 1986 disease outbreak. In the current study, we compared the chromosomal sequence of NADC 6564 to the well-characterized chromosomal sequences of ...

  4. Integrated sequence analysis. Final report

    International Nuclear Information System (INIS)

    Andersson, K.; Pyy, P.

    1998-02-01

    The NKS/RAK subprojet 3 'integrated sequence analysis' (ISA) was formulated with the overall objective to develop and to test integrated methodologies in order to evaluate event sequences with significant human action contribution. The term 'methodology' denotes not only technical tools but also methods for integration of different scientific disciplines. In this report, we first discuss the background of ISA and the surveys made to map methods in different application fields, such as man machine system simulation software, human reliability analysis (HRA) and expert judgement. Specific event sequences were, after the surveys, selected for application and testing of a number of ISA methods. The event sequences discussed in the report were cold overpressure of BWR, shutdown LOCA of BWR, steam generator tube rupture of a PWR and BWR disturbed signal view in the control room after an external event. Different teams analysed these sequences by using different ISA and HRA methods. Two kinds of results were obtained from the ISA project: sequence specific and more general findings. The sequence specific results are discussed together with each sequence description. The general lessons are discussed under a separate chapter by using comparisons of different case studies. These lessons include areas ranging from plant safety management (design, procedures, instrumentation, operations, maintenance and safety practices) to methodological findings (ISA methodology, PSA,HRA, physical analyses, behavioural analyses and uncertainty assessment). Finally follows a discussion about the project and conclusions are presented. An interdisciplinary study of complex phenomena is a natural way to produce valuable and innovative results. This project came up with structured ways to perform ISA and managed to apply the in practice. The project also highlighted some areas where more work is needed. In the HRA work, development is required for the use of simulators and expert judgement as

  5. Experimental evolution, genetic analysis and genome re-sequencing reveal the mutation conferring artemisinin resistance in an isogenic lineage of malaria parasites

    KAUST Repository

    Hunt, Paul

    2010-09-16

    Background: Classical and quantitative linkage analyses of genetic crosses have traditionally been used to map genes of interest, such as those conferring chloroquine or quinine resistance in malaria parasites. Next-generation sequencing technologies now present the possibility of determining genome-wide genetic variation at single base-pair resolution. Here, we combine in vivo experimental evolution, a rapid genetic strategy and whole genome re-sequencing to identify the precise genetic basis of artemisinin resistance in a lineage of the rodent malaria parasite, Plasmodium chabaudi. Such genetic markers will further the investigation of resistance and its control in natural infections of the human malaria, P. falciparum.Results: A lineage of isogenic in vivo drug-selected mutant P. chabaudi parasites was investigated. By measuring the artemisinin responses of these clones, the appearance of an in vivo artemisinin resistance phenotype within the lineage was defined. The underlying genetic locus was mapped to a region of chromosome 2 by Linkage Group Selection in two different genetic crosses. Whole-genome deep coverage short-read re-sequencing (IlluminaSolexa) defined the point mutations, insertions, deletions and copy-number variations arising in the lineage. Eight point mutations arise within the mutant lineage, only one of which appears on chromosome 2. This missense mutation arises contemporaneously with artemisinin resistance and maps to a gene encoding a de-ubiquitinating enzyme.Conclusions: This integrated approach facilitates the rapid identification of mutations conferring selectable phenotypes, without prior knowledge of biological and molecular mechanisms. For malaria, this model can identify candidate genes before resistant parasites are commonly observed in natural human malaria populations. 2010 Hunt et al; licensee BioMed Central Ltd.

  6. A whole genome analysis reveals the presence of a plant PR1 sequence in the potato pathogen Streptomyces scabies and other Streptomyces species.

    Science.gov (United States)

    Armijos-Jaramillo, Vinicio; Santander-Gordón, Daniela; Soria, Rosa; Pazmiño-Betancourth, Mauro; Echeverría, María Cristina

    2017-09-01

    Streptomyces scabies is a common soil bacterium that causes scab symptoms in potatoes. Strong evidence indicates horizontal gene transfer (HGT) among bacteria has influenced the evolution of this plant pathogen and other Streptomyces spp. To extend the study of the HGT to the Streptomyces genus, we explored the effects of the inter-domain HGT in the S. scabies genome. We employed a semi-automatic pipeline based on BLASTp searches and phylogenetic reconstruction. The data show low impact of inter-domain HGT in the S. scabies genome; however, we found a putative plant pathogenesis related 1 (PR1) sequence in the genome of S. scabies and other species of the genus. It is possible that this gene could be used by S. scabies to out-compete other soil organisms. Copyright © 2016 Elsevier Inc. All rights reserved.

  7. A novel pathogenic variant in an Iranian Ataxia telangiectasia family revealed by next-generation sequencing followed by in silico analysis.

    Science.gov (United States)

    Tabatabaiefar, Mohammad Amin; Alipour, Paria; Pourahmadiyan, Azam; Fattahi, Najmeh; Shariati, Laleh; Golchin, Neda; Mohammadi-Asl, Javad

    2017-08-15

    Ataxia telangiectasia (A-T) is a neurodegenerative autosomal recessive disorder with the main characteristics of progressive cerebellar degeneration, sensitivity to ionizing radiation, immunodeficiency, telangiectasia, premature aging, recurrent sinopulmonary infections, and increased risk of malignancy, especially of lymphoid origin. Ataxia Telangiectasia Mutated gene, ATM, as a causative gene for the A-T disorder, encodes the ATM protein, which plays an important role in the activation of cell-cycle checkpoints and initiation of DNA repair in response to DNA damage. Targeted next-generation sequencing (NGS) was performed on an Iranian 5-year-old boy presented with truncal and limb ataxia, telangiectasia of the eye, Hodgkin lymphoma, hyper pigmentation, total alopecia, hepatomegaly, and dysarthria. Sanger sequencing was used to confirm the candidate pathogenic variants. Computational docking was done using the HEX software to examine how this change affects the interactions of ATM with the upstream and downstream proteins. Three different variants were identified comprising two homozygous SNPs and one novel homozygous frameshift variant (c.80468047delTA, p.Thr2682ThrfsX5), which creates a stop codon in exon 57 leaving the protein truncated at its C-terminal portion. Therefore, the activation and phosphorylation of target proteins are lost. Moreover, the HEX software confirmed that the mutated protein lost its interaction with upstream and downstream proteins. The variant was classified as pathogenic based on the American College of Medical Genetics and Genomics guideline. This study expands the spectrum of ATM pathogenic variants in Iran and demonstrates the utility of targeted NGS in genetic diagnostics. Copyright © 2017. Published by Elsevier B.V.

  8. Whole transcriptome analysis of Acinetobacter baumannii assessed by RNA-sequencing reveals different mRNA expression profiles in biofilm compared to planktonic cells.

    Directory of Open Access Journals (Sweden)

    Soraya Rumbo-Feal

    Full Text Available Acinetobacterbaumannii has emerged as a dangerous opportunistic pathogen, with many strains able to form biofilms and thus cause persistent infections. The aim of the present study was to use high-throughput sequencing techniques to establish complete transcriptome profiles of planktonic (free-living and sessile (biofilm forms of A. baumannii ATCC 17978 and thereby identify differences in their gene expression patterns. Collections of mRNA from planktonic (both exponential and stationary phase cultures and sessile (biofilm cells were sequenced. Six mRNA libraries were prepared following the mRNA-Seq protocols from Illumina. Reads were obtained in a HiScanSQ platform and mapped against the complete genome to describe the complete mRNA transcriptomes of planktonic and sessile cells. The results showed that the gene expression pattern of A. baumannii biofilm cells was distinct from that of planktonic cells, including 1621 genes over-expressed in biofilms relative to stationary phase cells and 55 genes expressed only in biofilms. These differences suggested important changes in amino acid and fatty acid metabolism, motility, active transport, DNA-methylation, iron acquisition, transcriptional regulation, and quorum sensing, among other processes. Disruption or deletion of five of these genes caused a significant decrease in biofilm formation ability in the corresponding mutant strains. Among the genes over-expressed in biofilm cells were those in an operon involved in quorum sensing. One of them, encoding an acyl carrier protein, was shown to be involved in biofilm formation as demonstrated by the significant decrease in biofilm formation by the corresponding knockout strain. The present work serves as a basis for future studies examining the complex network systems that regulate bacterial biofilm formation and maintenance.

  9. Multilocus sequence typing reveals a novel subspeciation of Lactobacillus delbrueckii.

    Science.gov (United States)

    Tanigawa, Kana; Watanabe, Koichi

    2011-03-01

    Currently, the species Lactobacillus delbrueckii is divided into four subspecies, L. delbrueckii subsp. delbrueckii, L. delbrueckii subsp. bulgaricus, L. delbrueckii subsp. indicus and L. delbrueckii subsp. lactis. These classifications were based mainly on phenotypic identification methods and few studies have used genotypic identification methods. As a result, these subspecies have not yet been reliably delineated. In this study, the four subspecies of L. delbrueckii were discriminated by phenotype and by genotypic identification [amplified-fragment length polymorphism (AFLP) and multilocus sequence typing (MLST)] methods. The MLST method developed here was based on the analysis of seven housekeeping genes (fusA, gyrB, hsp60, ileS, pyrG, recA and recG). The MLST method had good discriminatory ability: the 41 strains of L. delbrueckii examined were divided into 34 sequence types, with 29 sequence types represented by only a single strain. The sequence types were divided into eight groups. These groups could be discriminated as representing different subspecies. The results of the AFLP and MLST analyses were consistent. The type strain of L. delbrueckii subsp. delbrueckii, YIT 0080(T), was clearly discriminated from the other strains currently classified as members of this subspecies, which were located close to strains of L. delbrueckii subsp. lactis. The MLST scheme developed in this study should be a useful tool for the identification of strains of L. delbrueckii to the subspecies level.

  10. Taxonomy of anaerobic digestion microbiome reveals biases associated with the applied high throughput sequencing strategies

    DEFF Research Database (Denmark)

    Campanaro, Stefano; Treu, Laura; Kougias, Panagiotis

    2018-01-01

    In the past few years, many studies investigated the anaerobic digestion microbiome by means of 16S rRNA amplicon sequencing. Results obtained from these studies were compared to each other without taking into consideration the followed procedure for amplicons preparation and data analysis...... specifically, the microbial compositions of three laboratory scale biogas reactors were analyzed before and after addition of sodium oleate by sequencing the microbiome with three different approaches: 16S rRNA amplicon sequencing, shotgun DNA and shotgun RNA. This comparative analysis revealed that......, in amplicon sequencing, abundance of some taxa (Euryarchaeota and Spirochaetes) was biased by the inefficiency of universal primers to hybridize all the templates. Reliability of the results obtained was also influenced by the number of hypervariable regions under investigation. Finally, amplicon sequencing...

  11. Diversity of the microbiota involved in wine and organic apple cider submerged vinegar production as revealed by DHPLC analysis and next-generation sequencing.

    Science.gov (United States)

    Trček, Janja; Mahnič, Aleksander; Rupnik, Maja

    2016-04-16

    Unfiltered vinegar samples collected from three oxidation cycles of the submerged industrial production of each, red wine and organic apple cider vinegars, were sampled in a Slovene vinegar producing company. The samples were systematically collected from the beginning to the end of an oxidation cycle and used for culture-independent microbial analyses carried out by denaturing high pressure liquid chromatography (DHPLC) and Illumina MiSeq sequencing of 16S rRNA gene variable regions. Both approaches showed a very homogeneous bacterial structure during wine vinegar production but more heterogeneous during organic apple cider vinegar production. In all wine vinegar samples Komagataeibacter oboediens (formerly Gluconacetobacter oboediens) was a predominating species. In apple cider vinegar the acetic acid and lactic acid bacteria were two major groups of bacteria. The acetic acid bacterial consortium was composed of Acetobacter and Komagataeibacter with the Komagataeibacter genus outcompeting the Acetobacter in all apple cider vinegar samples at the end of oxidation cycle. Among the lactic acid bacterial consortium two dominating genera were identified, Lactobacillus and Oenococcus, with Oenococcus prevailing with increasing concentration of acetic acid in vinegars. Unexpectedly, a minor genus of the acetic acid bacterial consortium in organic apple cider vinegar was Gluconobacter, suggesting a possible development of the Gluconobacter population with a tolerance against ethanol and acetic acid. Among the accompanying bacteria of the wine vinegar, the genus Rhodococcus was detected, but it decreased substantially by the end of oxidation cycles. Copyright © 2016 Elsevier B.V. All rights reserved.

  12. T-cell receptor repertoire of human peripheral CD161hiTRAV1-2+ MAIT cells revealed by next generation sequencing and single cell analysis.

    Science.gov (United States)

    Held, Kathrin; Beltrán, Eduardo; Moser, Markus; Hohlfeld, Reinhard; Dornmair, Klaus

    2015-09-01

    Mucosal-associated invariant T (MAIT) cells are a T-cell subset that expresses a conserved TRAV1-2 (Vα7.2) T-cell receptor (TCR) chain and the surface marker CD161. They are involved in the defence against microbes as they recognise small organic molecules of microbial origin that are presented by the non-classical MHC molecule 1 (MR1). MAIT cells express a semi-restricted TCR α chain with TRAV1-2 preferentially linked to TRAJ33, TRAJ12, or TRAJ20 which pairs with a limited set of β chains. To investigate the TCR repertoire of human CD161(hi)TRAV1-2(+) T cells in depth we analysed the α and β chains of this T-cell subset by next generation sequencing. Concomitantly we analysed 132 paired α and β chains from single cells to assess the αβ pairing preferences. We found that the CD161(hi)TRAV1-2(+) TCR repertoire in addition to the typical MAIT TCRs further contains polyclonal elements reminiscent of classical αβ T cells. Copyright © 2015 American Society for Histocompatibility and Immunogenetics. Published by Elsevier Inc. All rights reserved.

  13. Fractals in DNA sequence analysis

    Institute of Scientific and Technical Information of China (English)

    Yu Zu-Guo(喻祖国); Vo Anh; Gong Zhi-Min(龚志民); Long Shun-Chao(龙顺潮)

    2002-01-01

    Fractal methods have been successfully used to study many problems in physics, mathematics, engineering, finance,and even in biology. There has been an increasing interest in unravelling the mysteries of DNA; for example, how can we distinguish coding and noncoding sequences, and the problems of classification and evolution relationship of organisms are key problems in bioinformatics. Although much research has been carried out by taking into consideration the long-range correlations in DNA sequences, and the global fractal dimension has been used in these works by other people, the models and methods are somewhat rough and the results are not satisfactory. In recent years, our group has introduced a time series model (statistical point of view) and a visual representation (geometrical point of view)to DNA sequence analysis. We have also used fractal dimension, correlation dimension, the Hurst exponent and the dimension spectrum (multifractal analysis) to discuss problems in this field. In this paper, we introduce these fractal models and methods and the results of DNA sequence analysis.

  14. Comparative analysis of seven viral nuclear export signals (NESs reveals the crucial role of nuclear export mediated by the third NES consensus sequence of nucleoprotein (NP in influenza A virus replication.

    Directory of Open Access Journals (Sweden)

    Nopporn Chutiwitoonchai

    Full Text Available The assembly of influenza virus progeny virions requires machinery that exports viral genomic ribonucleoproteins from the cell nucleus. Currently, seven nuclear export signal (NES consensus sequences have been identified in different viral proteins, including NS1, NS2, M1, and NP. The present study examined the roles of viral NES consensus sequences and their significance in terms of viral replication and nuclear export. Mutation of the NP-NES3 consensus sequence resulted in a failure to rescue viruses using a reverse genetics approach, whereas mutation of the NS2-NES1 and NS2-NES2 sequences led to a strong reduction in viral replication kinetics compared with the wild-type sequence. While the viral replication kinetics for other NES mutant viruses were also lower than those of the wild-type, the difference was not so marked. Immunofluorescence analysis after transient expression of NP-NES3, NS2-NES1, or NS2-NES2 proteins in host cells showed that they accumulated in the cell nucleus. These results suggest that the NP-NES3 consensus sequence is mostly required for viral replication. Therefore, each of the hydrophobic (Φ residues within this NES consensus sequence (Φ1, Φ2, Φ3, or Φ4 was mutated, and its viral replication and nuclear export function were analyzed. No viruses harboring NP-NES3 Φ2 or Φ3 mutants could be rescued. Consistent with this, the NP-NES3 Φ2 and Φ3 mutants showed reduced binding affinity with CRM1 in a pull-down assay, and both accumulated in the cell nucleus. Indeed, a nuclear export assay revealed that these mutant proteins showed lower nuclear export activity than the wild-type protein. Moreover, the Φ2 and Φ3 residues (along with other Φ residues within the NP-NES3 consensus were highly conserved among different influenza A viruses, including human, avian, and swine. Taken together, these results suggest that the Φ2 and Φ3 residues within the NP-NES3 protein are important for its nuclear export function

  15. Sequence and Expression Analysis of Interferon Regulatory Factor 10 (IRF10 in Three Diverse Teleost Fish Reveals Its Role in Antiviral Defense.

    Directory of Open Access Journals (Sweden)

    Qiaoqing Xu

    Full Text Available Interferon regulatory factor (IRF 10 was first found in birds and is present in the genome of other tetrapods (but not humans and mice, as well as in teleost fish. The functional role of IRF10 in vertebrate immunity is relatively unknown compared to IRF1-9. The target of this research was to clone and characterize the IRF10 genes in three economically important fish species that will facilitate future evaluation of this molecule in fish innate and adaptive immunity.In the present study, a single IRF10 gene was cloned in grass carp Ctenopharyngodon idella and Asian swamp eel Monopterus albus, and two, named IRF10a and IRF10b, in rainbow trout Oncorhynchus mykiss. The fish IRF10 molecules share highest identities to other vertebrate IRF10s, and have a well conserved DNA binding domain, IRF-associated domain, and an 8 exon/7 intron structure with conserved intron phase. The presence of an upstream ATG or open reading frame (ORF in the 5'-untranslated region of different fish IRF10 cDNA sequences suggests potential regulation at the translational level, and this has been verified by in vitro transcription/translation experiments of the trout IRF10a cDNA, but would still need to be validated in fish cells.Both trout IRF10 paralogues are highly expressed in thymus, blood and spleen but are relatively low in head kidney and caudal kidney. Trout IRF10b expression is significantly higher than IRF10a in integumentary tissues i.e. gills, scales, skin, intestine, adipose fin and tail fins, suggesting that IRF10b may be more important in mucosal immunity. The expression of both trout IRF10 paralogues is up-regulated by recombinant IFN-γ. The expression of the IRF10 genes is highly induced by Poly I:C in vitro and in vivo, and by viral infection, but is less responsive to peptidoglycan and bacterial infection, suggesting an important role of fish IRF10 in antiviral defense.

  16. Integrated sequence analysis. Final report

    Energy Technology Data Exchange (ETDEWEB)

    Andersson, K.; Pyy, P

    1998-02-01

    The NKS/RAK subprojet 3 `integrated sequence analysis` (ISA) was formulated with the overall objective to develop and to test integrated methodologies in order to evaluate event sequences with significant human action contribution. The term `methodology` denotes not only technical tools but also methods for integration of different scientific disciplines. In this report, we first discuss the background of ISA and the surveys made to map methods in different application fields, such as man machine system simulation software, human reliability analysis (HRA) and expert judgement. Specific event sequences were, after the surveys, selected for application and testing of a number of ISA methods. The event sequences discussed in the report were cold overpressure of BWR, shutdown LOCA of BWR, steam generator tube rupture of a PWR and BWR disturbed signal view in the control room after an external event. Different teams analysed these sequences by using different ISA and HRA methods. Two kinds of results were obtained from the ISA project: sequence specific and more general findings. The sequence specific results are discussed together with each sequence description. The general lessons are discussed under a separate chapter by using comparisons of different case studies. These lessons include areas ranging from plant safety management (design, procedures, instrumentation, operations, maintenance and safety practices) to methodological findings (ISA methodology, PSA,HRA, physical analyses, behavioural analyses and uncertainty assessment). Finally follows a discussion about the project and conclusions are presented. An interdisciplinary study of complex phenomena is a natural way to produce valuable and innovative results. This project came up with structured ways to perform ISA and managed to apply the in practice. The project also highlighted some areas where more work is needed. In the HRA work, development is required for the use of simulators and expert judgement as

  17. Widespread alternative and aberrant splicing revealed by lariat sequencing

    Science.gov (United States)

    Stepankiw, Nicholas; Raghavan, Madhura; Fogarty, Elizabeth A.; Grimson, Andrew; Pleiss, Jeffrey A.

    2015-01-01

    Alternative splicing is an important and ancient feature of eukaryotic gene structure, the existence of which has likely facilitated eukaryotic proteome expansions. Here, we have used intron lariat sequencing to generate a comprehensive profile of splicing events in Schizosaccharomyces pombe, amongst the simplest organisms that possess mammalian-like splice site degeneracy. We reveal an unprecedented level of alternative splicing, including alternative splice site selection for over half of all annotated introns, hundreds of novel exon-skipping events, and thousands of novel introns. Moreover, the frequency of these events is far higher than previous estimates, with alternative splice sites on average activated at ∼3% the rate of canonical sites. Although a subset of alternative sites are conserved in related species, implying functional potential, the majority are not detectably conserved. Interestingly, the rate of aberrant splicing is inversely related to expression level, with lowly expressed genes more prone to erroneous splicing. Although we validate many events with RNAseq, the proportion of alternative splicing discovered with lariat sequencing is far greater, a difference we attribute to preferential decay of aberrantly spliced transcripts. Together, these data suggest the spliceosome possesses far lower fidelity than previously appreciated, highlighting the potential contributions of alternative splicing in generating novel gene structures. PMID:26261211

  18. Whole Exome Sequencing Reveals Genetic Predisposition in a Large Family with Retinitis Pigmentosa

    Directory of Open Access Journals (Sweden)

    Juan Wu

    2014-01-01

    Full Text Available Next-generation sequencing has become more widely used to reveal genetic defect in monogenic disorders. Retinitis pigmentosa (RP, the leading cause of hereditary blindness worldwide, has been attributed to more than 67 disease-causing genes. Due to the extreme genetic heterogeneity, using general molecular screening alone is inadequate for identifying genetic predispositions in susceptible individuals. In order to identify underlying mutation rapidly, we utilized next-generation sequencing in a four-generation Chinese family with RP. Two affected patients and an unaffected sibling were subjected to whole exome sequencing. Through bioinformatics analysis and direct sequencing confirmation, we identified p.R135W transition in the rhodopsin gene. The mutation was subsequently confirmed to cosegregate with the disease in the family. In this study, our results suggest that whole exome sequencing is a robust method in diagnosing familial hereditary disease.

  19. OTU analysis using metagenomic shotgun sequencing data.

    Directory of Open Access Journals (Sweden)

    Xiaolin Hao

    Full Text Available Because of technological limitations, the primer and amplification biases in targeted sequencing of 16S rRNA genes have veiled the true microbial diversity underlying environmental samples. However, the protocol of metagenomic shotgun sequencing provides 16S rRNA gene fragment data with natural immunity against the biases raised during priming and thus the potential of uncovering the true structure of microbial community by giving more accurate predictions of operational taxonomic units (OTUs. Nonetheless, the lack of statistically rigorous comparison between 16S rRNA gene fragments and other data types makes it difficult to interpret previously reported results using 16S rRNA gene fragments. Therefore, in the present work, we established a standard analysis pipeline that would help confirm if the differences in the data are true or are just due to potential technical bias. This pipeline is built by using simulated data to find optimal mapping and OTU prediction methods. The comparison between simulated datasets revealed a relationship between 16S rRNA gene fragments and full-length 16S rRNA sequences that a 16S rRNA gene fragment having a length >150 bp provides the same accuracy as a full-length 16S rRNA sequence using our proposed pipeline, which could serve as a good starting point for experimental design and making the comparison between 16S rRNA gene fragment-based and targeted 16S rRNA sequencing-based surveys possible.

  20. Sequencing of 50 human exomes reveals adaptation to high altitude

    DEFF Research Database (Denmark)

    Yi, Xin; Liang, Yu; Huerta-Sanchez, Emilia

    2010-01-01

    Residents of the Tibetan Plateau show heritable adaptations to extreme altitude. We sequenced 50 exomes of ethnic Tibetans, encompassing coding sequences of 92% of human genes, with an average coverage of 18x per individual. Genes showing population-specific allele frequency changes, which repres...... in genetic adaptation to high altitude.......Residents of the Tibetan Plateau show heritable adaptations to extreme altitude. We sequenced 50 exomes of ethnic Tibetans, encompassing coding sequences of 92% of human genes, with an average coverage of 18x per individual. Genes showing population-specific allele frequency changes, which...... represent strong candidates for altitude adaptation, were identified. The strongest signal of natural selection came from endothelial Per-Arnt-Sim (PAS) domain protein 1 (EPAS1), a transcription factor involved in response to hypoxia. One single-nucleotide polymorphism (SNP) at EPAS1 shows a 78% frequency...

  1. Functional and RNA-sequencing analysis revealed expression of a novel stay-green gene from Zoysia japonica (ZjSGR caused chlorophyll degradation and accelerated senescence in Arabidopsis

    Directory of Open Access Journals (Sweden)

    Ke Teng

    2016-12-01

    Full Text Available Senescence is not only an important developmental process, but also a responsive regulation to abiotic and biotic stress for plants. Stay-green protein plays crucial roles in plant senescence and chlorophyll degradation. However, the underlying mechanisms were not well studied, particularly in non-model plants. In this study, a novel stay-green gene, ZjSGR, was isolated from Zoysia japonica. Subcellular localization result demonstrated that ZjSGR was localized in the chloroplasts. Quantitative real-time PCR results together with promoter activity determination using transgenic Arabidopsis confirmed that ZjSGR could be induced by darkness, ABA and MeJA. Its expression levels could also be up-regulated by natural senescence, but suppressed by SA treatments. Overexpression of ZjSGR in Arabidopsis resulted in a rapid yellowing phenotype; complementary experiments proved that ZjSGR was a functional homologue of AtNYE1 from Arabidopsis thaliana. Overexpression of ZjSGR accelerated chlorophyll degradation and impaired photosynthesis in Arabidopsis. Transmission electron microscopy observation revealed that overexpression of ZjSGR decomposed the chloroplasts structure. RNA sequencing analysis showed that ZjSGR could play multiple roles in senescence and chlorophyll degradation by regulating hormone signal transduction and the expression of a large number of senescence and environmental stress related genes. Our study provides a better understanding of the roles of SGRs, and new insight into the senescence and chlorophyll degradation mechanisms in plants.

  2. Whole-genome sequencing reveals a potential causal mutation for dwarfism in the Miniature Shetland pony.

    Science.gov (United States)

    Metzger, Julia; Gast, Alana Christina; Schrimpf, Rahel; Rau, Janina; Eikelberg, Deborah; Beineke, Andreas; Hellige, Maren; Distl, Ottmar

    2017-04-01

    The Miniature Shetland pony represents a horse breed with an extremely small body size. Clinical examination of a dwarf Miniature Shetland pony revealed a lowered size at the withers, malformed skull and brachygnathia superior. Computed tomography (CT) showed a shortened maxilla and a cleft of the hard and soft palate which protruded into the nasal passage leading to breathing difficulties. Pathological examination confirmed these findings but did not reveal histopathological signs of premature ossification in limbs or cranial sutures. Whole-genome sequencing of this dwarf Miniature Shetland pony and comparative sequence analysis using 26 reference equids from NCBI Sequence Read Archive revealed three probably damaging missense variants which could be exclusively found in the affected foal. Validation of these three missense mutations in 159 control horses from different horse breeds and five donkeys revealed only the aggrecan (ACAN)-associated g.94370258G>C variant as homozygous wild-type in all control samples. The dwarf Miniature Shetland pony had the homozygous mutant genotype C/C of the ACAN:g.94370258G>C variant and the normal parents were heterozygous G/C. An unaffected full sib and 3/5 unaffected half-sibs were heterozygous G/C for the ACAN:g.94370258G>C variant. In summary, we could demonstrate a dwarf phenotype in a miniature pony breed perfectly associated with a missense mutation within the ACAN gene.

  3. Transcriptome Sequencing Revealed Significant Alteration of Cortical Promoter Usage and Splicing in Schizophrenia

    Science.gov (United States)

    Wu, Jing Qin; Wang, Xi; Beveridge, Natalie J.; Tooney, Paul A.; Scott, Rodney J.; Carr, Vaughan J.; Cairns, Murray J.

    2012-01-01

    Background While hybridization based analysis of the cortical transcriptome has provided important insight into the neuropathology of schizophrenia, it represents a restricted view of disease-associated gene activity based on predetermined probes. By contrast, sequencing technology can provide un-biased analysis of transcription at nucleotide resolution. Here we use this approach to investigate schizophrenia-associated cortical gene expression. Methodology/Principal Findings The data was generated from 76 bp reads of RNA-Seq, aligned to the reference genome and assembled into transcripts for quantification of exons, splice variants and alternative promoters in postmortem superior temporal gyrus (STG/BA22) from 9 male subjects with schizophrenia and 9 matched non-psychiatric controls. Differentially expressed genes were then subjected to further sequence and functional group analysis. The output, amounting to more than 38 Gb of sequence, revealed significant alteration of gene expression including many previously shown to be associated with schizophrenia. Gene ontology enrichment analysis followed by functional map construction identified three functional clusters highly relevant to schizophrenia including neurotransmission related functions, synaptic vesicle trafficking, and neural development. Significantly, more than 2000 genes displayed schizophrenia-associated alternative promoter usage and more than 1000 genes showed differential splicing (FDRschizophrenia-associated transcriptional diversity within the STG, and revealed variants with important implications for the complex pathophysiology of schizophrenia. PMID:22558445

  4. Targeted exome sequencing reveals novel USH2A mutations in Chinese patients with simplex Usher syndrome.

    Science.gov (United States)

    Shu, Hai-Rong; Bi, Huai; Pan, Yang-Chun; Xu, Hang-Yu; Song, Jian-Xin; Hu, Jie

    2015-09-16

    Usher syndrome (USH) is an autosomal recessive disorder characterized by hearing impairment and vision dysfunction due to retinitis pigmentosa. Phenotypic and genetic heterogeneities of this disease make it impractical to obtain a genetic diagnosis by conventional Sanger sequencing. In this study, we applied a next-generation sequencing approach to detect genetic abnormalities in patients with USH. Two unrelated Chinese families were recruited, consisting of two USH afflicted patients and four unaffected relatives. We selected 199 genes related to inherited retinal diseases as targets for deep exome sequencing. Through systematic data analysis using an established bioinformatics pipeline, all variants that passed filter criteria were validated by Sanger sequencing and co-segregation analysis. A homozygous frameshift mutation (c.4382delA, p.T1462Lfs*2) was revealed in exon20 of gene USH2A in the F1 family. Two compound heterozygous mutations, IVS47 + 1G > A and c.13156A > T (p.I4386F), located in intron 48 and exon 63 respectively, of USH2A, were identified as causative mutations for the F2 family. Of note, the missense mutation c.13156A > T has not been reported so far. In conclusion, targeted exome sequencing precisely and rapidly identified the genetic defects in two Chinese USH families and this technique can be applied as a routine examination for these disorders with significant clinical and genetic heterogeneity.

  5. Sequence analysis of Leukemia DNA

    Science.gov (United States)

    Nacong, Nasria; Lusiyanti, Desy; Irawan, Muhammad. Isa

    2018-03-01

    Cancer is a very deadly disease, one of which is leukemia disease or better known as blood cancer. The cancer cell can be detected by taking DNA in laboratory test. This study focused on local alignment of leukemia and non leukemia data resulting from NCBI in the form of DNA sequences by using Smith-Waterman algorithm. SmithWaterman algorithm was invented by TF Smith and MS Waterman in 1981. These algorithms try to find as much as possible similarity of a pair of sequences, by giving a negative value to the unequal base pair (mismatch), and positive values on the same base pair (match). So that will obtain the maximum positive value as the end of the alignment, and the minimum value as the initial alignment. This study will use sequences of leukemia and 3 sequences of non leukemia.

  6. Sequencing the extrachromosomal circular mobilome reveals retrotransposon activity in plants

    OpenAIRE

    Lanciano, Sophie; Carpentier, M. C.; Llauro, C.; Jobet, E.; Robakowska-Hyzorek, D.; Lasserre, E.; Ghesquière, Alain; Panaud, O.; Mirouze, Marie

    2017-01-01

    Retrotransposons are mobile genetic elements abundant in plant and animal genomes. While efficiently silenced by the epigenetic machinery, they can be reactivated upon stress or during development. Their level of transcription not reflecting their transposition ability, it is thus difficult to evaluate their contribution to the active mobilome. Here we applied a simple methodology based on the high throughput sequencing of extrachromosomal circular DNA (eccDNA) forms of active retrotransposon...

  7. Deep sequencing of foot-and-mouth disease virus reveals RNA sequences involved in genome packaging.

    Science.gov (United States)

    Logan, Grace; Newman, Joseph; Wright, Caroline F; Lasecka-Dykes, Lidia; Haydon, Daniel T; Cottam, Eleanor M; Tuthill, Tobias J

    2017-10-18

    Non-enveloped viruses protect their genomes by packaging them into an outer shell or capsid of virus-encoded proteins. Packaging and capsid assembly in RNA viruses can involve interactions between capsid proteins and secondary structures in the viral genome as exemplified by the RNA bacteriophage MS2 and as proposed for other RNA viruses of plants, animals and human. In the picornavirus family of non-enveloped RNA viruses, the requirements for genome packaging remain poorly understood. Here we show a novel and simple approach to identify predicted RNA secondary structures involved in genome packaging in the picornavirus foot-and-mouth disease virus (FMDV). By interrogating deep sequencing data generated from both packaged and unpackaged populations of RNA we have determined multiple regions of the genome with constrained variation in the packaged population. Predicted secondary structures of these regions revealed stem loops with conservation of structure and a common motif at the loop. Disruption of these features resulted in attenuation of virus growth in cell culture due to a reduction in assembly of mature virions. This study provides evidence for the involvement of predicted RNA structures in picornavirus packaging and offers a readily transferable methodology for identifying packaging requirements in many other viruses. Importance In order to transmit their genetic material to a new host, non-enveloped viruses must protect their genomes by packaging them into an outer shell or capsid of virus-encoded proteins. For many non-enveloped RNA viruses the requirements for this critical part of the viral life cycle remain poorly understood. We have identified RNA sequences involved in genome packaging of the picornavirus foot-and-mouth disease virus. This virus causes an economically devastating disease of livestock affecting both the developed and developing world. The experimental methods developed to carry out this work are novel, simple and transferable to the

  8. Targeted Genome Sequencing Reveals Varicella-Zoster Virus Open Reading Frame 12 Deletion.

    Science.gov (United States)

    Cohrs, Randall J; Lee, Katherine S; Beach, Addilynn; Sanford, Bridget; Baird, Nicholas L; Como, Christina; Graybill, Chiharu; Jones, Dallas; Tekeste, Eden; Ballard, Mitchell; Chen, Xiaomi; Yalacki, David; Frietze, Seth; Jones, Kenneth; Lenac Rovis, Tihana; Jonjić, Stipan; Haas, Jürgen; Gilden, Don

    2017-10-15

    The neurotropic herpesvirus varicella-zoster virus (VZV) establishes a lifelong latent infection in humans following primary infection. The low abundance of VZV nucleic acids in human neurons has hindered an understanding of the mechanisms that regulate viral gene transcription during latency. To overcome this critical barrier, we optimized a targeted capture protocol to enrich VZV DNA and cDNA prior to whole-genome/transcriptome sequence analysis. Since the VZV genome is remarkably stable, it was surprising to detect that VZV32, a VZV laboratory strain with no discernible growth defect in tissue culture, contained a 2,158-bp deletion in open reading frame (ORF) 12. Consequently, ORF 12 and 13 protein expression was abolished and Akt phosphorylation was inhibited. The discovery of the ORF 12 deletion, revealed through targeted genome sequencing analysis, points to the need to authenticate the VZV genome when the virus is propagated in tissue culture. IMPORTANCE Viruses isolated from clinical samples often undergo genetic modifications when cultured in the laboratory. Historically, VZV is among the most genetically stable herpesviruses, a notion supported by more than 60 complete genome sequences from multiple isolates and following multiple in vitro passages. However, application of enrichment protocols to targeted genome sequencing revealed the unexpected deletion of a significant portion of VZV ORF 12 following propagation in cultured human fibroblast cells. While the enrichment protocol did not introduce bias in either the virus genome or transcriptome, the findings indicate the need for authentication of VZV by sequencing when the virus is propagated in tissue culture. Copyright © 2017 American Society for Microbiology.

  9. Sequence alignment reveals possible MAPK docking motifs on HIV proteins.

    Directory of Open Access Journals (Sweden)

    Perry Evans

    Full Text Available Over the course of HIV infection, virus replication is facilitated by the phosphorylation of HIV proteins by human ERK1 and ERK2 mitogen-activated protein kinases (MAPKs. MAPKs are known to phosphorylate their substrates by first binding with them at a docking site. Docking site interactions could be viable drug targets because the sequences guiding them are more specific than phosphorylation consensus sites. In this study we use multiple bioinformatics tools to discover candidate MAPK docking site motifs on HIV proteins known to be phosphorylated by MAPKs, and we discuss the possibility of targeting docking sites with drugs. Using sequence alignments of HIV proteins of different subtypes, we show that MAPK docking patterns previously described for human proteins appear on the HIV matrix, Tat, and Vif proteins in a strain dependent manner, but are absent from HIV Rev and appear on all HIV Nef strains. We revise the regular expressions of previously annotated MAPK docking patterns in order to provide a subtype independent motif that annotates all HIV proteins. One revision is based on a documented human variant of one of the substrate docking motifs, and the other reduces the number of required basic amino acids in the standard docking motifs from two to one. The proposed patterns are shown to be consistent with in silico docking between ERK1 and the HIV matrix protein. The motif usage on HIV proteins is sufficiently different from human proteins in amino acid sequence similarity to allow for HIV specific targeting using small-molecule drugs.

  10. Genome Sequencing and Analysis Conference IV

    Energy Technology Data Exchange (ETDEWEB)

    1993-12-31

    J. Craig Venter and C. Thomas Caskey co-chaired Genome Sequencing and Analysis Conference IV held at Hilton Head, South Carolina from September 26--30, 1992. Venter opened the conference by noting that approximately 400 researchers from 16 nations were present four times as many participants as at Genome Sequencing Conference I in 1989. Venter also introduced the Data Fair, a new component of the conference allowing exchange and on-site computer analysis of unpublished sequence data.

  11. Targeted cancer exome sequencing reveals recurrent mutations in myeloproliferative neoplasms

    Science.gov (United States)

    Tenedini, E; Bernardis, I; Artusi, V; Artuso, L; Roncaglia, E; Guglielmelli, P; Pieri, L; Bogani, C; Biamonte, F; Rotunno, G; Mannarelli, C; Bianchi, E; Pancrazzi, A; Fanelli, T; Malagoli Tagliazucchi, G; Ferrari, S; Manfredini, R; Vannucchi, A M; Tagliafico, E

    2014-01-01

    With the intent of dissecting the molecular complexity of Philadelphia-negative myeloproliferative neoplasms (MPN), we designed a target enrichment panel to explore, using next-generation sequencing (NGS), the mutational status of an extensive list of 2000 cancer-associated genes and microRNAs. The genomic DNA of granulocytes and in vitro-expanded CD3+T-lymphocytes, as a germline control, was target-enriched and sequenced in a learning cohort of 20 MPN patients using Roche 454 technology. We identified 141 genuine somatic mutations, most of which were not previously described. To test the frequency of the identified variants, a larger validation cohort of 189 MPN patients was additionally screened for these mutations using Ion Torrent AmpliSeq NGS. Excluding the genes already described in MPN, for 8 genes (SCRIB, MIR662, BARD1, TCF12, FAT4, DAP3, POLG and NRAS), we demonstrated a mutation frequency between 3 and 8%. We also found that mutations at codon 12 of NRAS (NRASG12V and NRASG12D) were significantly associated, for primary myelofibrosis (PMF), with highest dynamic international prognostic scoring system (DIPSS)-plus score categories. This association was then confirmed in 66 additional PMF patients composing a final dataset of 168 PMF showing a NRAS mutation frequency of 4.7%, which was associated with a worse outcome, as defined by the DIPSS plus score. PMID:24150215

  12. Intratumor Heterogeneity and Branched Evolution Revealed by Multiregion Sequencing

    DEFF Research Database (Denmark)

    Gerlinger, Marco; Rowan, Andrew J.; Horswell, Stuart

    2012-01-01

    .RESULTS: Phylogenetic reconstruction revealed branched evolutionary tumor growth, with 63 to 69% of all somatic mutations not detectable across every tumor region. Intratumor heterogeneity was observed for a mutation within an autoinhibitory domain of the mammalian target of rapamycin (mTOR) kinase, correlating with S6...

  13. Sequencing the extrachromosomal circular mobilome reveals retrotransposon activity in plants.

    Directory of Open Access Journals (Sweden)

    Sophie Lanciano

    2017-02-01

    Full Text Available Retrotransposons are mobile genetic elements abundant in plant and animal genomes. While efficiently silenced by the epigenetic machinery, they can be reactivated upon stress or during development. Their level of transcription not reflecting their transposition ability, it is thus difficult to evaluate their contribution to the active mobilome. Here we applied a simple methodology based on the high throughput sequencing of extrachromosomal circular DNA (eccDNA forms of active retrotransposons to characterize the repertoire of mobile retrotransposons in plants. This method successfully identified known active retrotransposons in both Arabidopsis and rice material where the epigenome is destabilized. When applying mobilome-seq to developmental stages in wild type rice, we identified PopRice as a highly active retrotransposon producing eccDNA forms in the wild type endosperm. The mobilome-seq strategy opens new routes for the characterization of a yet unexplored fraction of plant genomes.

  14. Sequencing the extrachromosomal circular mobilome reveals retrotransposon activity in plants.

    Science.gov (United States)

    Lanciano, Sophie; Carpentier, Marie-Christine; Llauro, Christel; Jobet, Edouard; Robakowska-Hyzorek, Dagmara; Lasserre, Eric; Ghesquière, Alain; Panaud, Olivier; Mirouze, Marie

    2017-02-01

    Retrotransposons are mobile genetic elements abundant in plant and animal genomes. While efficiently silenced by the epigenetic machinery, they can be reactivated upon stress or during development. Their level of transcription not reflecting their transposition ability, it is thus difficult to evaluate their contribution to the active mobilome. Here we applied a simple methodology based on the high throughput sequencing of extrachromosomal circular DNA (eccDNA) forms of active retrotransposons to characterize the repertoire of mobile retrotransposons in plants. This method successfully identified known active retrotransposons in both Arabidopsis and rice material where the epigenome is destabilized. When applying mobilome-seq to developmental stages in wild type rice, we identified PopRice as a highly active retrotransposon producing eccDNA forms in the wild type endosperm. The mobilome-seq strategy opens new routes for the characterization of a yet unexplored fraction of plant genomes.

  15. Transcriptome sequencing from diverse human populations reveals differentiated regulatory architecture.

    Directory of Open Access Journals (Sweden)

    Alicia R Martin

    2014-08-01

    Full Text Available Large-scale sequencing efforts have documented extensive genetic variation within the human genome. However, our understanding of the origins, global distribution, and functional consequences of this variation is far from complete. While regulatory variation influencing gene expression has been studied within a handful of populations, the breadth of transcriptome differences across diverse human populations has not been systematically analyzed. To better understand the spectrum of gene expression variation, alternative splicing, and the population genetics of regulatory variation in humans, we have sequenced the genomes, exomes, and transcriptomes of EBV transformed lymphoblastoid cell lines derived from 45 individuals in the Human Genome Diversity Panel (HGDP. The populations sampled span the geographic breadth of human migration history and include Namibian San, Mbuti Pygmies of the Democratic Republic of Congo, Algerian Mozabites, Pathan of Pakistan, Cambodians of East Asia, Yakut of Siberia, and Mayans of Mexico. We discover that approximately 25.0% of the variation in gene expression found amongst individuals can be attributed to population differences. However, we find few genes that are systematically differentially expressed among populations. Of this population-specific variation, 75.5% is due to expression rather than splicing variability, and we find few genes with strong evidence for differential splicing across populations. Allelic expression analyses indicate that previously mapped common regulatory variants identified in eight populations from the International Haplotype Map Phase 3 project have similar effects in our seven sampled HGDP populations, suggesting that the cellular effects of common variants are shared across diverse populations. Together, these results provide a resource for studies analyzing functional differences across populations by estimating the degree of shared gene expression, alternative splicing, and

  16. Transcriptome sequencing revealed significant alteration of cortical promoter usage and splicing in schizophrenia.

    Directory of Open Access Journals (Sweden)

    Jing Qin Wu

    Full Text Available While hybridization based analysis of the cortical transcriptome has provided important insight into the neuropathology of schizophrenia, it represents a restricted view of disease-associated gene activity based on predetermined probes. By contrast, sequencing technology can provide un-biased analysis of transcription at nucleotide resolution. Here we use this approach to investigate schizophrenia-associated cortical gene expression.The data was generated from 76 bp reads of RNA-Seq, aligned to the reference genome and assembled into transcripts for quantification of exons, splice variants and alternative promoters in postmortem superior temporal gyrus (STG/BA22 from 9 male subjects with schizophrenia and 9 matched non-psychiatric controls. Differentially expressed genes were then subjected to further sequence and functional group analysis. The output, amounting to more than 38 Gb of sequence, revealed significant alteration of gene expression including many previously shown to be associated with schizophrenia. Gene ontology enrichment analysis followed by functional map construction identified three functional clusters highly relevant to schizophrenia including neurotransmission related functions, synaptic vesicle trafficking, and neural development. Significantly, more than 2000 genes displayed schizophrenia-associated alternative promoter usage and more than 1000 genes showed differential splicing (FDR<0.05. Both types of transcriptional isoforms were exemplified by reads aligned to the neurodevelopmentally significant doublecortin-like kinase 1 (DCLK1 gene.This study provided the first deep and un-biased analysis of schizophrenia-associated transcriptional diversity within the STG, and revealed variants with important implications for the complex pathophysiology of schizophrenia.

  17. [Exome sequencing revealed Allan-Herndon-Dudley syndrome underlying multiple disabilities].

    Science.gov (United States)

    Arvio, Maria; Philips, Anju K; Ahvenainen, Minna; Somer, Mirja; Kalscheuer, Vera; Järvelä, Irma

    2014-01-01

    Normal function of the thyroid gland is the cornerstone of a child's mental development and physical growth. We describe a Finnish family, in which the diagnosis of three brothers became clear after investigations that lasted for more than 30 years. Two of the sons have already died. DNA analysis of the third one, a 16-year-old boy, revealed in exome sequencing of the complete X chromosome a mutation in the SLC16A2 gene, i.e. MCT8, coding for a thyroid hormone transport protein. Allan-Herndon-Dudley syndrome was thus shown to be the cause of multiple disabilities.

  18. Key roles for freshwater Actinobacteria revealed by deep metagenomic sequencing.

    Science.gov (United States)

    Ghai, Rohit; Mizuno, Carolina Megumi; Picazo, Antonio; Camacho, Antonio; Rodriguez-Valera, Francisco

    2014-12-01

    Freshwater ecosystems are critical but fragile environments directly affecting society and its welfare. However, our understanding of genuinely freshwater microbial communities, constrained by our capacity to manipulate its prokaryotic participants in axenic cultures, remains very rudimentary. Even the most abundant components, freshwater Actinobacteria, remain largely unknown. Here, applying deep metagenomic sequencing to the microbial community of a freshwater reservoir, we were able to circumvent this traditional bottleneck and reconstruct de novo seven distinct streamlined actinobacterial genomes. These genomes represent three new groups of photoheterotrophic, planktonic Actinobacteria. We describe for the first time genomes of two novel clades, acMicro (Micrococcineae, related to Luna2,) and acAMD (Actinomycetales, related to acTH1). Besides, an aggregate of contigs belonged to a new branch of the Acidimicrobiales. All are estimated to have small genomes (approximately 1.2 Mb), and their GC content varied from 40 to 61%. One of the Micrococcineae genomes encodes a proteorhodopsin, a rhodopsin type reported for the first time in Actinobacteria. The remarkable potential capacity of some of these genomes to transform recalcitrant plant detrital material, particularly lignin-derived compounds, suggests close linkages between the terrestrial and aquatic realms. Moreover, abundances of Actinobacteria correlate inversely to those of Cyanobacteria that are responsible for prolonged and frequently irretrievable damage to freshwater ecosystems. This suggests that they might serve as sentinels of impending ecological catastrophes. © 2014 John Wiley & Sons Ltd.

  19. The analysis of novel microRNA mimic sequences in cancer cells reveals lack of specificity in stem-loop RT-qPCR-based microRNA detection.

    Science.gov (United States)

    Winata, Patrick; Williams, Marissa; McGowan, Eileen; Nassif, Najah; van Zandwijk, Nico; Reid, Glen

    2017-11-17

    MicroRNAs are frequently downregulated in cancer, and restoring expression has tumour suppressive activity in tumour cells. Our recent phase I clinical trial investigated microRNA-based therapy in patients with malignant pleural mesothelioma. Treatment with TargomiRs, microRNA mimics with novel sequence packaged in EGFR antibody-targeted bacterial minicells, revealed clear signs of clinical activity. In order to detect delivery of microRNA mimics to tumour cells in future clinical trials, we tested hydrolysis probe-based assays specific for the sequence of the novel mimics in transfected mesothelioma cell lines using RT-qPCR. The custom assays efficiently and specifically amplified the consensus mimics. However, we found that these assays gave a signal when total RNA from untransfected and control mimic-transfected cells were used as templates. Further investigation revealed that the reverse transcription step using stem-loop primers appeared to introduce substantial non-specific amplification with either total RNA or synthetic RNA templates. This suggests that reverse transcription using stem-loop primers suffers from an intrinsic lack of specificity for the detection of highly similar microRNAs in the same family, especially when analysing total RNA. These results suggest that RT-qPCR is unlikely to be an effective means to detect delivery of microRNA mimic-based drugs to tumour cells in patients.

  20. Whole-Exome Sequencing Reveals Clinically Relevant Variants in Family Affected with Autism Spectrum Disorder

    Directory of Open Access Journals (Sweden)

    Jiaxiu Zhou

    2016-10-01

    Full Text Available Chromosomal microarray (CMA has been suggested as a first tier clinical diagnostic test for ASD. High-throughput sequencing (HTS has associated hundreds of genes associated with ASD. Whole Exome Sequencing (WES was used in combination with CMA to identify clinically-relevant ASD variants. In prior work, a trio-based (father, mother, and proband WGS (Whole Genome Sequencing was used to reveal clinically-relevant de novo, or inherited, rare variants in half (16 / 32 of the ASD families in which all probands had normal, or VOUS (Variant of Uncertain Clinical Significance, CMA results. In this study, after CMA screening chromosome structural abnormalities of a proband affected with ASD, a WES was performed on the patient and parents. Some rare de novo, and inherited, variants were detected using trio-based bioinformatics analysis. ASD variants were ranked by SFARI Gene score, HPO (human phenotype ontology, protein function damage, and manual searching PubMed. Sanger sequencing was used to validated some candidate variants in family members. A de novo homozygous mutation in SPG11 (p.C209F, two inherited, compound-heterozygote mutations in SCN9A (p.Q10R and p.R1893H and BEST1 (p.A135V and p.A297V were confirmed. Heterozygous mutations in TSC1 (p.S487C and SHANK2 (p.Arg569His inherited from mother were also confirmed.

  1. Classic selective sweeps revealed by massive sequencing in cattle.

    Directory of Open Access Journals (Sweden)

    Saber Qanbari

    2014-02-01

    Full Text Available Human driven selection during domestication and subsequent breed formation has likely left detectable signatures within the genome of modern cattle. The elucidation of these signatures of selection is of interest from the perspective of evolutionary biology, and for identifying domestication-related genes that ultimately may help to further genetically improve this economically important animal. To this end, we employed a panel of more than 15 million autosomal SNPs identified from re-sequencing of 43 Fleckvieh animals. We mainly applied two somewhat complementary statistics, the integrated Haplotype Homozygosity Score (iHS reflecting primarily ongoing selection, and the Composite of Likelihood Ratio (CLR having the most power to detect completed selection after fixation of the advantageous allele. We find 106 candidate selection regions, many of which are harboring genes related to phenotypes relevant in domestication, such as coat coloring pattern, neurobehavioral functioning and sensory perception including KIT, MITF, MC1R, NRG4, Erbb4, TMEM132D and TAS2R16, among others. To further investigate the relationship between genes with signatures of selection and genes identified in QTL mapping studies, we use a sample of 3062 animals to perform four genome-wide association analyses using appearance traits, body size and somatic cell count. We show that regions associated with coat coloring significantly (P<0.0001 overlap with the candidate selection regions, suggesting that the selection signals we identify are associated with traits known to be affected by selection during domestication. Results also provide further evidence regarding the complexity of the genetics underlying coat coloring in cattle. This study illustrates the potential of population genetic approaches for identifying genomic regions affecting domestication-related phenotypes and further helps to identify specific regions targeted by selection during speciation, domestication and

  2. Genetic variation in the Staphylococcus aureus 8325 strain lineage revealed by whole-genome sequencing.

    Directory of Open Access Journals (Sweden)

    Kristoffer T Bæk

    Full Text Available Staphylococcus aureus strains of the 8325 lineage, especially 8325-4 and derivatives lacking prophage, have been used extensively for decades of research. We report herein the results of our deep sequence analysis of strain 8325-4. Assignment of sequence variants compared with the reference strain 8325 (NRS77/PS47 required correction of errors in the 8325 reference genome, and reassessment of variation previously attributed to chemical mutagenesis of the restriction-defective RN4220. Using an extensive strain pedigree analysis, we discovered that 8325-4 contains 16 single nucleotide polymorphisms (SNP arising prior to the construction of RN4220. We identified 5 indels in 8325-4 compared with 8325. Three indels correspond to expected Φ11, 12, 13 excisions, one indel is explained by a sequence assembly artifact, and the final indel (Δ63bp in the spa-sarS intergenic region is common to only a sub-lineage of 8325-4 strains including SH1000. This deletion was found to significantly decrease (75% steady state sarS but not spa transcript levels in post-exponential phase. The sub-lineage 8325-4 was also found to harbor 4 additional SNPs. We also found large sequence variation between 8325, 8325-4 and RN4220 in a cluster of repetitive hypothetical proteins (SA0282 homologs near the Ess secretion cluster. The overall 8325-4 SNP set results in 17 alterations within coding sequences. Remarkably, we discovered that all tested strains of the 8325-4 lineage lack phenol soluble modulin α3 (PSMα3, a virulence determinant implicated in neutrophil chemotaxis, biofilm architecture and surface spreading. Collectively, our results clarify and define the 8325-4 pedigree and reveal clear evidence that mutations existing throughout all branches of this lineage, including the widely used RN6390 and SH1000 strains, could conceivably impact virulence regulation.

  3. Differences in acid tolerance between Bifidobacterium breve BB8 and its acid-resistant derivative B. breve BB8dpH, revealed by RNA-sequencing and physiological analysis.

    Science.gov (United States)

    Yang, Xu; Hang, Xiaomin; Tan, Jing; Yang, Hong

    2015-06-01

    Bifidobacteria are common inhabitants of the human gastrointestinal tract, and their application has increased dramatically in recent years due to their health-promoting effects. The ability of bifidobacteria to tolerate acidic environments is particularly important for their function as probiotics because they encounter such environments in food products and during passage through the gastrointestinal tract. In this study, we generated a derivative, Bifidobacterium breve BB8dpH, which displayed a stable, acid-resistant phenotype. To investigate the possible reasons for the higher acid tolerance of B. breve BB8dpH, as compared with its parental strain B. breve BB8, a combined transcriptome and physiological approach was used to characterize differences between the two strains. An analysis of the transcriptome by RNA-sequencing indicated that the expression of 121 genes was increased by more than 2-fold, while the expression of 146 genes was reduced more than 2-fold, in B. breve BB8dpH. Validation of the RNA-sequencing data using real-time quantitative PCR analysis demonstrated that the RNA-sequencing results were highly reliable. The comparison analysis, based on differentially expressed genes, suggested that the acid tolerance of B. breve BB8dpH was enhanced by regulating the expression of genes involved in carbohydrate transport and metabolism, energy production, synthesis of cell envelope components (peptidoglycan and exopolysaccharide), synthesis and transport of glutamate and glutamine, and histidine synthesis. Furthermore, an analysis of physiological data showed that B. breve BB8dpH displayed higher production of exopolysaccharide and lower H(+)-ATPase activity than B. breve BB8. The results presented here will improve our understanding of acid tolerance in bifidobacteria, and they will lead to the development of new strategies to enhance the acid tolerance of bifidobacterial strains. Copyright © 2015 Elsevier Ltd. All rights reserved.

  4. MicroRNA sequence motifs reveal asymmetry between the stem arms

    DEFF Research Database (Denmark)

    Gorodkin, Jan; Havgaard, Jakob Hull; Ensterö, M.

    2006-01-01

    The processing of micro RNAs (miRNAs) from their stemloop precursor have revealed asymmetry in the processing of the mature and its star sequence. Furthermore, the miRNA processing system between organism differ. To assess this at the sequence level we have investigated mature miRNAs in their gen......The processing of micro RNAs (miRNAs) from their stemloop precursor have revealed asymmetry in the processing of the mature and its star sequence. Furthermore, the miRNA processing system between organism differ. To assess this at the sequence level we have investigated mature mi...

  5. Robustness analysis of chiller sequencing control

    International Nuclear Information System (INIS)

    Liao, Yundan; Sun, Yongjun; Huang, Gongsheng

    2015-01-01

    Highlights: • Uncertainties with chiller sequencing control were systematically quantified. • Robustness of chiller sequencing control was systematically analyzed. • Different sequencing control strategies were sensitive to different uncertainties. • A numerical method was developed for easy selection of chiller sequencing control. - Abstract: Multiple-chiller plant is commonly employed in the heating, ventilating and air-conditioning system to increase operational feasibility and energy-efficiency under part load condition. In a multiple-chiller plant, chiller sequencing control plays a key role in achieving overall energy efficiency while not sacrifices the cooling sufficiency for indoor thermal comfort. Various sequencing control strategies have been developed and implemented in practice. Based on the observation that (i) uncertainty, which cannot be avoided in chiller sequencing control, has a significant impact on the control performance and may cause the control fail to achieve the expected control and/or energy performance; and (ii) in current literature few studies have systematically addressed this issue, this paper therefore presents a study on robustness analysis of chiller sequencing control in order to understand the robustness of various chiller sequencing control strategies under different types of uncertainty. Based on the robustness analysis, a simple and applicable method is developed to select the most robust control strategy for a given chiller plant in the presence of uncertainties, which will be verified using case studies

  6. Protein sequences from mastodon and Tyrannosaurus rex revealed by mass spectrometry.

    Science.gov (United States)

    Asara, John M; Schweitzer, Mary H; Freimark, Lisa M; Phillips, Matthew; Cantley, Lewis C

    2007-04-13

    Fossilized bones from extinct taxa harbor the potential for obtaining protein or DNA sequences that could reveal evolutionary links to extant species. We used mass spectrometry to obtain protein sequences from bones of a 160,000- to 600,000-year-old extinct mastodon (Mammut americanum) and a 68-million-year-old dinosaur (Tyrannosaurus rex). The presence of T. rex sequences indicates that their peptide bonds were remarkably stable. Mass spectrometry can thus be used to determine unique sequences from ancient organisms from peptide fragmentation patterns, a valuable tool to study the evolution and adaptation of ancient taxa from which genomic sequences are unlikely to be obtained.

  7. Probabilistic accident sequence recovery analysis

    International Nuclear Information System (INIS)

    Stutzke, Martin A.; Cooper, Susan E.

    2004-01-01

    Recovery analysis is a method that considers alternative strategies for preventing accidents in nuclear power plants during probabilistic risk assessment (PRA). Consideration of possible recovery actions in PRAs has been controversial, and there seems to be a widely held belief among PRA practitioners, utility staff, plant operators, and regulators that the results of recovery analysis should be skeptically viewed. This paper provides a framework for discussing recovery strategies, thus lending credibility to the process and enhancing regulatory acceptance of PRA results and conclusions. (author)

  8. Genome sequencing and transcriptome analysis of Trichoderma reesei QM9978 strain reveals a distal chromosome translocation to be responsible for loss of vib1 expression and loss of cellulase induction.

    Science.gov (United States)

    Ivanova, Christa; Ramoni, Jonas; Aouam, Thiziri; Frischmann, Alexa; Seiboth, Bernhard; Baker, Scott E; Le Crom, Stéphane; Lemoine, Sophie; Margeot, Antoine; Bidard, Frédérique

    2017-01-01

    expression is absent in QM9978. We propose that in T. reesei , as in Neurospora crassa , vib1 is involved in cellulase induction, although the exact mechanism remains to be elucidated. The data presented here show an example of a combined genome sequencing and transcriptomic approach to explain a specific trait, in this case the QM9978 cellulase-negative phenotype, and how it helps to better understand the mechanisms during cellulase gene regulation. When focusing on mutations on the single base-pair level, changes on the chromosome level can be easily overlooked and through this work we provide an example that stresses the importance of the big picture of the genomic landscape during analysis of sequencing data.

  9. Genomic view of bipolar disorder revealed by whole genome sequencing in a genetic isolate.

    Directory of Open Access Journals (Sweden)

    Benjamin Georgi

    2014-03-01

    Full Text Available Bipolar disorder is a common, heritable mental illness characterized by recurrent episodes of mania and depression. Despite considerable effort to elucidate the genetic underpinnings of bipolar disorder, causative genetic risk factors remain elusive. We conducted a comprehensive genomic analysis of bipolar disorder in a large Old Order Amish pedigree. Microsatellite genotypes and high-density SNP-array genotypes of 388 family members were combined with whole genome sequence data for 50 of these subjects, comprising 18 parent-child trios. This study design permitted evaluation of candidate variants within the context of haplotype structure by resolving the phase in sequenced parent-child trios and by imputation of variants into multiple unsequenced siblings. Non-parametric and parametric linkage analysis of the entire pedigree as well as on smaller clusters of families identified several nominally significant linkage peaks, each of which included dozens of predicted deleterious variants. Close inspection of exonic and regulatory variants in genes under the linkage peaks using family-based association tests revealed additional credible candidate genes for functional studies and further replication in population-based cohorts. However, despite the in-depth genomic characterization of this unique, large and multigenerational pedigree from a genetic isolate, there was no convergence of evidence implicating a particular set of risk loci or common pathways. The striking haplotype and locus heterogeneity we observed has profound implications for the design of studies of bipolar and other related disorders.

  10. Genomic View of Bipolar Disorder Revealed by Whole Genome Sequencing in a Genetic Isolate

    Science.gov (United States)

    Georgi, Benjamin; Craig, David; Kember, Rachel L.; Liu, Wencheng; Lindquist, Ingrid; Nasser, Sara; Brown, Christopher; Egeland, Janice A.; Paul, Steven M.; Bućan, Maja

    2014-01-01

    Bipolar disorder is a common, heritable mental illness characterized by recurrent episodes of mania and depression. Despite considerable effort to elucidate the genetic underpinnings of bipolar disorder, causative genetic risk factors remain elusive. We conducted a comprehensive genomic analysis of bipolar disorder in a large Old Order Amish pedigree. Microsatellite genotypes and high-density SNP-array genotypes of 388 family members were combined with whole genome sequence data for 50 of these subjects, comprising 18 parent-child trios. This study design permitted evaluation of candidate variants within the context of haplotype structure by resolving the phase in sequenced parent-child trios and by imputation of variants into multiple unsequenced siblings. Non-parametric and parametric linkage analysis of the entire pedigree as well as on smaller clusters of families identified several nominally significant linkage peaks, each of which included dozens of predicted deleterious variants. Close inspection of exonic and regulatory variants in genes under the linkage peaks using family-based association tests revealed additional credible candidate genes for functional studies and further replication in population-based cohorts. However, despite the in-depth genomic characterization of this unique, large and multigenerational pedigree from a genetic isolate, there was no convergence of evidence implicating a particular set of risk loci or common pathways. The striking haplotype and locus heterogeneity we observed has profound implications for the design of studies of bipolar and other related disorders. PMID:24625924

  11. Next-Generation Sequence Analysis Reveals Transfer of Methicillin Resistance to a Methicillin-Susceptible Staphylococcus aureus Strain That Subsequently Caused a Methicillin-Resistant Staphylococcus aureus Outbreak: a Descriptive Study.

    Science.gov (United States)

    Weterings, Veronica; Bosch, Thijs; Witteveen, Sandra; Landman, Fabian; Schouls, Leo; Kluytmans, Jan

    2017-09-01

    Resistance to methicillin in Staphylococcus aureus is caused primarily by the mecA gene, which is carried on a mobile genetic element, the staphylococcal cassette chromosome mec (SCC mec ). Horizontal transfer of this element is supposed to be an important factor in the emergence of new clones of methicillin-resistant Staphylococcus aureus (MRSA) but has been rarely observed in real time. In 2012, an outbreak occurred involving a health care worker (HCW) and three patients, all carrying a fusidic acid-resistant MRSA strain. The husband of the HCW was screened for MRSA carriage, but only a methicillin-susceptible S. aureus (MSSA) strain, which was also resistant to fusidic acid, was detected. Multiple-locus variable-number tandem-repeat analysis (MLVA) typing showed that both the MSSA and MRSA isolates were MT4053-MC0005. This finding led to the hypothesis that the MSSA strain acquired the SCC mec and subsequently caused an outbreak. To support this hypothesis, next-generation sequencing of the MSSA and MRSA isolates was performed. This study showed that the MSSA isolate clustered closely with the outbreak isolates based on whole-genome multilocus sequence typing and single-nucleotide polymorphism (SNP) analysis, with a genetic distance of 17 genes and 44 SNPs, respectively. Remarkably, there were relatively large differences in the mobile genetic elements in strains within and between individuals. The limited genetic distance between the MSSA and MRSA isolates in combination with a clear epidemiologic link supports the hypothesis that the MSSA isolate acquired a SCC mec and that the resulting MRSA strain caused an outbreak. Copyright © 2017 American Society for Microbiology.

  12. Deep Sequencing Reveals the Complete Genome and Evidence for Transcriptional Activity of the First Virus-Like Sequences Identified in Aristotelia chilensis (Maqui Berry

    Directory of Open Access Journals (Sweden)

    Javier Villacreses

    2015-04-01

    Full Text Available Here, we report the genome sequence and evidence for transcriptional activity of a virus-like element in the native Chilean berry tree Aristotelia chilensis. We propose to name the endogenous sequence as Aristotelia chilensis Virus 1 (AcV1. High-throughput sequencing of the genome of this tree uncovered an endogenous viral element, with a size of 7122 bp, corresponding to the complete genome of AcV1. Its sequence contains three open reading frames (ORFs: ORFs 1 and 2 shares 66%–73% amino acid similarity with members of the Caulimoviridae virus family, especially the Petunia vein clearing virus (PVCV, Petuvirus genus. ORF1 encodes a movement protein (MP; ORF2 a Reverse Transcriptase (RT and a Ribonuclease H (RNase H domain; and ORF3 showed no amino acid sequence similarity with any other known virus proteins. Analogous to other known endogenous pararetrovirus sequences (EPRVs, AcV1 is integrated in the genome of Maqui Berry and showed low viral transcriptional activity, which was detected by deep sequencing technology (DNA and RNA-seq. Phylogenetic analysis of AcV1 and other pararetroviruses revealed a closer resemblance with Petuvirus. Overall, our data suggests that AcV1 could be a new member of Caulimoviridae family, genus Petuvirus, and the first evidence of this kind of virus in a fruit plant.

  13. Whale phylogeny and rapid radiation events revealed using novel retroposed elements and their flanking sequences

    Directory of Open Access Journals (Sweden)

    Zhou Kaiya

    2011-10-01

    Full Text Available Abstract Background A diversity of hypotheses have been proposed based on both morphological and molecular data to reveal phylogenetic relationships within the order Cetacea (dolphins, porpoises, and whales, and great progress has been made in the past two decades. However, there is still some controversy concerning relationships among certain cetacean taxa such as river dolphins and delphinoid species, which needs to be further addressed with more markers in an effort to address unresolved portions of the phylogeny. Results An analysis of additional SINE insertions and SINE-flanking sequences supported the monophyly of the order Cetacea as well as Odontocete, Delphinoidea (Delphinidae + Phocoenidae + Mondontidae, and Delphinidae. A sister relationship between Delphinidae and Phocoenidae + Mondontidae was supported, and members of classical river dolphins and the genera Tursiops and Stenella were found to be paraphyletic. Estimates of divergence times revealed rapid divergences of basal Odontocete lineages in the Oligocene and Early Miocene, and a recent rapid diversification of Delphinidae in the Middle-Late Miocene and Pliocene within a narrow time frame. Conclusions Several novel SINEs were found to differentiate Delphinidae from the other two families (Monodontidae and Phocoenidae, whereas the sister grouping of the latter two families with exclusion of Delphinidae was further revealed using the SINE-flanking sequences. Interestingly, some anomalous PCR amplification patterns of SINE insertions were detected, which can be explained as the result of potential ancestral SINE polymorphisms and incomplete lineage sorting. Although a few loci were potentially anomalous, this study demonstrated that the SINE-based approach is a powerful tool in phylogenetic studies. Identifying additional SINE elements that resolve the relationships in the superfamily Delphinoidea and family Delphinidae will be important steps forward in completely resolving

  14. Whale phylogeny and rapid radiation events revealed using novel retroposed elements and their flanking sequences.

    Science.gov (United States)

    Chen, Zhuo; Xu, Shixia; Zhou, Kaiya; Yang, Guang

    2011-10-27

    A diversity of hypotheses have been proposed based on both morphological and molecular data to reveal phylogenetic relationships within the order Cetacea (dolphins, porpoises, and whales), and great progress has been made in the past two decades. However, there is still some controversy concerning relationships among certain cetacean taxa such as river dolphins and delphinoid species, which needs to be further addressed with more markers in an effort to address unresolved portions of the phylogeny. An analysis of additional SINE insertions and SINE-flanking sequences supported the monophyly of the order Cetacea as well as Odontocete, Delphinoidea (Delphinidae + Phocoenidae + Mondontidae), and Delphinidae. A sister relationship between Delphinidae and Phocoenidae + Mondontidae was supported, and members of classical river dolphins and the genera Tursiops and Stenella were found to be paraphyletic. Estimates of divergence times revealed rapid divergences of basal Odontocete lineages in the Oligocene and Early Miocene, and a recent rapid diversification of Delphinidae in the Middle-Late Miocene and Pliocene within a narrow time frame. Several novel SINEs were found to differentiate Delphinidae from the other two families (Monodontidae and Phocoenidae), whereas the sister grouping of the latter two families with exclusion of Delphinidae was further revealed using the SINE-flanking sequences. Interestingly, some anomalous PCR amplification patterns of SINE insertions were detected, which can be explained as the result of potential ancestral SINE polymorphisms and incomplete lineage sorting. Although a few loci were potentially anomalous, this study demonstrated that the SINE-based approach is a powerful tool in phylogenetic studies. Identifying additional SINE elements that resolve the relationships in the superfamily Delphinoidea and family Delphinidae will be important steps forward in completely resolving cetacean phylogenetic relationships in the future.

  15. Sequence analysis by iterated maps, a review.

    Science.gov (United States)

    Almeida, Jonas S

    2014-05-01

    Among alignment-free methods, Iterated Maps (IMs) are on a particular extreme: they are also scale free (order free). The use of IMs for sequence analysis is also distinct from other alignment-free methodologies in being rooted in statistical mechanics instead of computational linguistics. Both of these roots go back over two decades to the use of fractal geometry in the characterization of phase-space representations. The time series analysis origin of the field is betrayed by the title of the manuscript that started this alignment-free subdomain in 1990, 'Chaos Game Representation'. The clash between the analysis of sequences as continuous series and the better established use of Markovian approaches to discrete series was almost immediate, with a defining critique published in same journal 2 years later. The rest of that decade would go by before the scale-free nature of the IM space was uncovered. The ensuing decade saw this scalability generalized for non-genomic alphabets as well as an interest in its use for graphic representation of biological sequences. Finally, in the past couple of years, in step with the emergence of BigData and MapReduce as a new computational paradigm, there is a surprising third act in the IM story. Multiple reports have described gains in computational efficiency of multiple orders of magnitude over more conventional sequence analysis methodologies. The stage appears to be now set for a recasting of IMs with a central role in processing nextgen sequencing results.

  16. Single-Cell RNA-Sequencing Reveals a Continuous Spectrum of Differentiation in Hematopoietic Cells

    Directory of Open Access Journals (Sweden)

    Iain C. Macaulay

    2016-02-01

    Full Text Available The transcriptional programs that govern hematopoiesis have been investigated primarily by population-level analysis of hematopoietic stem and progenitor cells, which cannot reveal the continuous nature of the differentiation process. Here we applied single-cell RNA-sequencing to a population of hematopoietic cells in zebrafish as they undergo thrombocyte lineage commitment. By reconstructing their developmental chronology computationally, we were able to place each cell along a continuum from stem cell to mature cell, refining the traditional lineage tree. The progression of cells along this continuum is characterized by a highly coordinated transcriptional program, displaying simultaneous suppression of genes involved in cell proliferation and ribosomal biogenesis as the expression of lineage specific genes increases. Within this program, there is substantial heterogeneity in the expression of the key lineage regulators. Overall, the total number of genes expressed, as well as the total mRNA content of the cell, decreases as the cells undergo lineage commitment.

  17. Genetic signatures of adaptation revealed from transcriptome sequencing of Arctic and red foxes.

    Science.gov (United States)

    Kumar, Vikas; Kutschera, Verena E; Nilsson, Maria A; Janke, Axel

    2015-08-07

    The genus Vulpes (true foxes) comprises numerous species that inhabit a wide range of habitats and climatic conditions, including one species, the Arctic fox (Vulpes lagopus) which is adapted to the arctic region. A close relative to the Arctic fox, the red fox (Vulpes vulpes), occurs in subarctic to subtropical habitats. To study the genetic basis of their adaptations to different environments, transcriptome sequences from two Arctic foxes and one red fox individual were generated and analyzed for signatures of positive selection. In addition, the data allowed for a phylogenetic analysis and divergence time estimate between the two fox species. The de novo assembly of reads resulted in more than 160,000 contigs/transcripts per individual. Approximately 17,000 homologous genes were identified using human and the non-redundant databases. Positive selection analyses revealed several genes involved in various metabolic and molecular processes such as energy metabolism, cardiac gene regulation, apoptosis and blood coagulation to be under positive selection in foxes. Branch site tests identified four genes to be under positive selection in the Arctic fox transcriptome, two of which are fat metabolism genes. In the red fox transcriptome eight genes are under positive selection, including molecular process genes, notably genes involved in ATP metabolism. Analysis of the three transcriptomes and five Sanger re-sequenced genes in additional individuals identified a lower genetic variability within Arctic foxes compared to red foxes, which is consistent with distribution range differences and demographic responses to past climatic fluctuations. A phylogenomic analysis estimated that the Arctic and red fox lineages diverged about three million years ago. Transcriptome data are an economic way to generate genomic resources for evolutionary studies. Despite not representing an entire genome, this transcriptome analysis identified numerous genes that are relevant to arctic

  18. Preliminary hazard analysis using sequence tree method

    International Nuclear Information System (INIS)

    Huang Huiwen; Shih Chunkuan; Hung Hungchih; Chen Minghuei; Yih Swu; Lin Jiinming

    2007-01-01

    A system level PHA using sequence tree method was developed to perform Safety Related digital I and C system SSA. The conventional PHA is a brainstorming session among experts on various portions of the system to identify hazards through discussions. However, this conventional PHA is not a systematic technique, the analysis results strongly depend on the experts' subjective opinions. The analysis quality cannot be appropriately controlled. Thereby, this research developed a system level sequence tree based PHA, which can clarify the relationship among the major digital I and C systems. Two major phases are included in this sequence tree based technique. The first phase uses a table to analyze each event in SAR Chapter 15 for a specific safety related I and C system, such as RPS. The second phase uses sequence tree to recognize what I and C systems are involved in the event, how the safety related systems work, and how the backup systems can be activated to mitigate the consequence if the primary safety systems fail. In the sequence tree, the defense-in-depth echelons, including Control echelon, Reactor trip echelon, ESFAS echelon, and Indication and display echelon, are arranged to construct the sequence tree structure. All the related I and C systems, include digital system and the analog back-up systems are allocated in their specific echelon. By this system centric sequence tree based analysis, not only preliminary hazard can be identified systematically, the vulnerability of the nuclear power plant can also be recognized. Therefore, an effective simplified D3 evaluation can be performed as well. (author)

  19. Electrophoretic mobility shift assay reveals a novel recognition sequence for Setaria italica NAC protein.

    Science.gov (United States)

    Puranik, Swati; Kumar, Karunesh; Srivastava, Prem S; Prasad, Manoj

    2011-10-01

    The NAC (NAM/ATAF1,2/CUC2) proteins are among the largest family of plant transcription factors. Its members have been associated with diverse plant processes and intricately regulate the expression of several genes. Inspite of this immense progress, knowledge of their DNA-binding properties are still limited. In our recent publication,1 we reported isolation of a membrane-associated NAC domain protein from Setaria italica (SiNAC). Transactivation analysis revealed that it was a functionally active transcription factor as it could stimulate expression of reporter genes in vivo. Truncations of the transmembrane region of the protein lead to its nuclear localization. Here we describe expression and purification of SiNAC DNA-binding domain. We further report identification of a novel DNA-binding site, [C/G][A/T][T/A][G/C]TC[C/G][A/T][C/G][G/C] for SiNAC by electrophoretic mobility shift assay. The SiNAC-GST protein could bind to the NAC recognition sequence in vitro as well as to sequences where some bases had been reshuffled. The results presented here contribute to our understanding of the DNA-binding specificity of SiNAC protein.

  20. Whole-exome sequencing revealed two novel mutations in Usher syndrome.

    Science.gov (United States)

    Koparir, Asuman; Karatas, Omer Faruk; Atayoglu, Ali Timucin; Yuksel, Bayram; Sagiroglu, Mahmut Samil; Seven, Mehmet; Ulucan, Hakan; Yuksel, Adnan; Ozen, Mustafa

    2015-06-01

    Usher syndrome is a clinically and genetically heterogeneous autosomal recessive inherited disorder accompanied by hearing loss and retinitis pigmentosa (RP). Since the associated genes are various and quite large, we utilized whole-exome sequencing (WES) as a diagnostic tool to identify the molecular basis of Usher syndrome. DNA from a 12-year-old male diagnosed with Usher syndrome was analyzed by WES. Mutations detected were confirmed by Sanger sequencing. The pathogenicity of these mutations was determined by in silico analysis. A maternally inherited deleterious frameshift mutation, c.14439_14454del in exon 66 and a paternally inherited non-sense c.10830G>A stop-gain SNV in exon 55 of USH2A were found as two novel compound heterozygous mutations. Both of these mutations disrupt the C terminal of USH2A protein. As a result, WES revealed two novel compound heterozygous mutations in a Turkish USH2A patient. This approach gave us an opportunity to have an appropriate diagnosis and provide genetic counseling to the family within a reasonable time. Copyright © 2015 Elsevier B.V. All rights reserved.

  1. Diversity and Structure of Diazotrophic Communities in Mangrove Rhizosphere, Revealed by High-Throughput Sequencing.

    Science.gov (United States)

    Zhang, Yanying; Yang, Qingsong; Ling, Juan; Van Nostrand, Joy D; Shi, Zhou; Zhou, Jizhong; Dong, Junde

    2017-01-01

    Diazotrophic communities make an essential contribution to the productivity through providing new nitrogen. However, knowledge of the roles that both mangrove tree species and geochemical parameters play in shaping mangove rhizosphere diazotrophic communities is still elusive. Here, a comprehensive examination of the diversity and structure of microbial communities in the rhizospheres of three mangrove species, Rhizophora apiculata , Avicennia marina , and Ceriops tagal , was undertaken using high - throughput sequencing of the 16S rRNA and nifH genes. Our results revealed a great diversity of both the total microbial composition and the diazotrophic composition specifically in the mangrove rhizosphere. Deltaproteobacteria and Gammaproteobacteria were both ubiquitous and dominant, comprising an average of 45.87 and 86.66% of total microbial and diazotrophic communities, respectively. Sulfate-reducing bacteria belonging to the Desulfobacteraceae and Desulfovibrionaceae were the dominant diazotrophs. Community statistical analyses suggested that both mangrove tree species and additional environmental variables played important roles in shaping total microbial and potential diazotroph communities in mangrove rhizospheres. In contrast to the total microbial community investigated by analysis of 16S rRNA gene sequences, most of the dominant diazotrophic groups identified by nifH gene sequences were significantly different among mangrove species. The dominant diazotrophs of the family Desulfobacteraceae were positively correlated with total phosphorus, but negatively correlated with the nitrogen to phosphorus ratio. The Pseudomonadaceae were positively correlated with the concentration of available potassium, suggesting that diazotrophs potentially play an important role in biogeochemical cycles, such as those of nitrogen, phosphorus, sulfur, and potassium, in the mangrove ecosystem.

  2. Diversity and Structure of Diazotrophic Communities in Mangrove Rhizosphere, Revealed by High-Throughput Sequencing

    Directory of Open Access Journals (Sweden)

    Yanying Zhang

    2017-10-01

    Full Text Available Diazotrophic communities make an essential contribution to the productivity through providing new nitrogen. However, knowledge of the roles that both mangrove tree species and geochemical parameters play in shaping mangove rhizosphere diazotrophic communities is still elusive. Here, a comprehensive examination of the diversity and structure of microbial communities in the rhizospheres of three mangrove species, Rhizophora apiculata, Avicennia marina, and Ceriops tagal, was undertaken using high-throughput sequencing of the 16S rRNA and nifH genes. Our results revealed a great diversity of both the total microbial composition and the diazotrophic composition specifically in the mangrove rhizosphere. Deltaproteobacteria and Gammaproteobacteria were both ubiquitous and dominant, comprising an average of 45.87 and 86.66% of total microbial and diazotrophic communities, respectively. Sulfate-reducing bacteria belonging to the Desulfobacteraceae and Desulfovibrionaceae were the dominant diazotrophs. Community statistical analyses suggested that both mangrove tree species and additional environmental variables played important roles in shaping total microbial and potential diazotroph communities in mangrove rhizospheres. In contrast to the total microbial community investigated by analysis of 16S rRNA gene sequences, most of the dominant diazotrophic groups identified by nifH gene sequences were significantly different among mangrove species. The dominant diazotrophs of the family Desulfobacteraceae were positively correlated with total phosphorus, but negatively correlated with the nitrogen to phosphorus ratio. The Pseudomonadaceae were positively correlated with the concentration of available potassium, suggesting that diazotrophs potentially play an important role in biogeochemical cycles, such as those of nitrogen, phosphorus, sulfur, and potassium, in the mangrove ecosystem.

  3. High quality maize centromere 10 sequence reveals evidence of frequent recombination events

    Directory of Open Access Journals (Sweden)

    Thomas Kai Wolfgruber

    2016-03-01

    Full Text Available The ancestral centromeres of maize contain long stretches of the tandemly arranged CentC repeat. The abundance of tandem DNA repeats and centromeric retrotransposons (CR have presented a significant challenge to completely assembling centromeres using traditional sequencing methods. Here we report a nearly complete assembly of the 1.85 Mb maize centromere 10 from inbred B73 using PacBio technology and BACs from the reference genome project. The error rates estimated from overlapping BAC sequences are 7 x 10-6 and 5 x 10-5 for mismatches and indels, respectively. The number of gaps in the region covered by the reassembly was reduced from 140 in the reference genome to three. Three expressed genes are located between 92 and 477 kb of the inferred ancestral CentC cluster, which lies within the region of highest centromeric repeat density. The improved assembly increased the count of full-length centromeric retrotransposons from 5 to 55 and revealed a 22.7 kb segmental duplication that occurred approximately 121,000 years ago. Our analysis provides evidence of frequent recombination events in the form of partial retrotransposons, deletions within retrotransposons, chimeric retrotransposons, segmental duplications including higher order CentC repeats, a deleted CentC monomer, centromere-proximal inversions, and insertion of mitochondrial sequences. Double-strand DNA break (DSB repair is the most plausible mechanism for these events and may be the major driver of centromere repeat evolution and diversity. This repair appears to be mediated by microhomology, suggesting that tandem repeats may have evolved to facilitate the repair of frequent DSBs in centromeres.

  4. Peripheral blood transcriptome sequencing reveals rejection-relevant genes in long-term heart transplantation.

    Science.gov (United States)

    Chen, Yan; Zhang, Haibo; Xiao, Xue; Jia, Yixin; Wu, Weili; Liu, Licheng; Jiang, Jun; Zhu, Baoli; Meng, Xu; Chen, Weijun

    2013-10-03

    Peripheral blood-based gene expression patterns have been investigated as biomarkers to monitor the immune system and rule out rejection after heart transplantation. Recent advances in the high-throughput deep sequencing (HTS) technologies provide new leads in transcriptome analysis. By performing Solexa/Illumina's digital gene expression (DGE) profiling, we analyzed gene expression profiles of PBMCs from 6 quiescent (grade 0) and 6 rejection (grade 2R&3R) heart transplant recipients at more than 6 months after transplantation. Subsequently, quantitative real-time polymerase chain reaction (qRT-PCR) was carried out in an independent validation cohort of 47 individuals from three rejection groups (ISHLT, grade 0,1R, 2R&3R). Through DGE sequencing and qPCR validation, 10 genes were identified as informative genes for detection of cardiac transplant rejection. A further clustering analysis showed that the 10 genes were not only effective for distinguishing patients with acute cardiac allograft rejection, but also informative for discriminating patients with renal allograft rejection based on both blood and biopsy samples. Moreover, PPI network analysis revealed that the 10 genes were connected to each other within a short interaction distance. We proposed a 10-gene signature for heart transplant patients at high-risk of developing severe rejection, which was found to be effective as well in other organ transplant. Moreover, we supposed that these genes function systematically as biomarkers in long-time allograft rejection. Further validation in broad transplant population would be required before the non-invasive biomarkers can be generally utilized to predict the risk of transplant rejection. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.

  5. The complete genome sequence of Fibrobacter succinogenes S85 reveals a cellulolytic and metabolic specialist.

    Directory of Open Access Journals (Sweden)

    Garret Suen

    Full Text Available Fibrobacter succinogenes is an important member of the rumen microbial community that converts plant biomass into nutrients usable by its host. This bacterium, which is also one of only two cultivated species in its phylum, is an efficient and prolific degrader of cellulose. Specifically, it has a particularly high activity against crystalline cellulose that requires close physical contact with this substrate. However, unlike other known cellulolytic microbes, it does not degrade cellulose using a cellulosome or by producing high extracellular titers of cellulase enzymes. To better understand the biology of F. succinogenes, we sequenced the genome of the type strain S85 to completion. A total of 3,085 open reading frames were predicted from its 3.84 Mbp genome. Analysis of sequences predicted to encode for carbohydrate-degrading enzymes revealed an unusually high number of genes that were classified into 49 different families of glycoside hydrolases, carbohydrate binding modules (CBMs, carbohydrate esterases, and polysaccharide lyases. Of the 31 identified cellulases, none contain CBMs in families 1, 2, and 3, typically associated with crystalline cellulose degradation. Polysaccharide hydrolysis and utilization assays showed that F. succinogenes was able to hydrolyze a number of polysaccharides, but could only utilize the hydrolytic products of cellulose. This suggests that F. succinogenes uses its array of hemicellulose-degrading enzymes to remove hemicelluloses to gain access to cellulose. This is reflected in its genome, as F. succinogenes lacks many of the genes necessary to transport and metabolize the hydrolytic products of non-cellulose polysaccharides. The F. succinogenes genome reveals a bacterium that specializes in cellulose as its sole energy source, and provides insight into a novel strategy for cellulose degradation.

  6. Sequencing and Analysis of Neanderthal Genomic DNA

    Energy Technology Data Exchange (ETDEWEB)

    Noonan, James P.; Coop, Graham; Kudaravalli, Sridhar; Smith,Doug; Krause, Johannes; Alessi, Joe; Chen, Feng; Platt, Darren; Paabo,Svante; Pritchard, Jonathan K.; Rubin, Edward M.

    2006-06-13

    Recovery and analysis of multiple Neanderthal autosomalsequences using a metagenomic approach reveals that modern humans andNeanderthals split ~;400,000 years ago, without significant evidence ofsubsequent admixture.

  7. Isolation of Hox cluster genes from insects reveals an accelerated sequence evolution rate.

    Directory of Open Access Journals (Sweden)

    Heike Hadrys

    Full Text Available Among gene families it is the Hox genes and among metazoan animals it is the insects (Hexapoda that have attracted particular attention for studying the evolution of development. Surprisingly though, no Hox genes have been isolated from 26 out of 35 insect orders yet, and the existing sequences derive mainly from only two orders (61% from Hymenoptera and 22% from Diptera. We have designed insect specific primers and isolated 37 new partial homeobox sequences of Hox cluster genes (lab, pb, Hox3, ftz, Antp, Scr, abd-a, Abd-B, Dfd, and Ubx from six insect orders, which are crucial to insect phylogenetics. These new gene sequences provide a first step towards comparative Hox gene studies in insects. Furthermore, comparative distance analyses of homeobox sequences reveal a correlation between gene divergence rate and species radiation success with insects showing the highest rate of homeobox sequence evolution.

  8. Digital image sequence processing, compression, and analysis

    CERN Document Server

    Reed, Todd R

    2004-01-01

    IntroductionTodd R. ReedCONTENT-BASED IMAGE SEQUENCE REPRESENTATIONPedro M. Q. Aguiar, Radu S. Jasinschi, José M. F. Moura, andCharnchai PluempitiwiriyawejTHE COMPUTATION OF MOTIONChristoph Stiller, Sören Kammel, Jan Horn, and Thao DangMOTION ANALYSIS AND DISPLACEMENT ESTIMATION IN THE FREQUENCY DOMAINLuca Lucchese and Guido Maria CortelazzoQUALITY OF SERVICE ASSESSMENT IN NEW GENERATION WIRELESS VIDEO COMMUNICATIONSGaetano GiuntaERROR CONCEALMENT IN DIGITAL VIDEOFrancesco G.B. De NataleIMAGE SEQUENCE RESTORATION: A WIDER PERSPECTIVEAnil KokaramVIDEO SUMMARIZATIONCuneyt M. Taskiran and Edward

  9. Sequence comparison and phylogenetic analysis of core gene of ...

    African Journals Online (AJOL)

    Phylogenetic analysis suggests that our sequences are clustered with sequences reported from Japan. This is the first phylogenetic analysis of HCV core gene from Pakistani population. Our sequences and sequences from Japan are grouped into same cluster in the phylogenetic tree. Sequence comparison and ...

  10. [Complete genome sequencing and sequence analysis of BCG Tice].

    Science.gov (United States)

    Wang, Zhiming; Pan, Yuanlong; Wu, Jun; Zhu, Baoli

    2012-10-04

    The objective of this study is to obtain the complete genome sequence of Bacillus Calmette-Guerin Tice (BCG Tice), in order to provide more information about the molecular biology of BCG Tice and design more reasonable vaccines to prevent tuberculosis. We assembled the data from high-throughput sequencing with SOAPdenovo software, with many contigs and scaffolds obtained. There are many sequence gaps and physical gaps remained as a result of regional low coverage and low quality. We designed primers at the end of contigs and performed PCR amplification in order to link these contigs and scaffolds. With various enzymes to perform PCR amplification, adjustment of PCR reaction conditions, and combined with clone construction to sequence, all the gaps were finished. We obtained the complete genome sequence of BCG Tice and submitted it to GenBank of National Center for Biotechnology Information (NCBI). The genome of BCG Tice is 4334064 base pairs in length, with GC content 65.65%. The problems and strategies during the finishing step of BCG Tice sequencing are illuminated here, with the hope of affording some experience to those who are involved in the finishing step of genome sequencing. The microarray data were verified by our results.

  11. Comparative analyses of six solanaceous transcriptomes reveal a high degree of sequence conservation and species-specific transcripts

    Directory of Open Access Journals (Sweden)

    Ouyang Shu

    2005-09-01

    Full Text Available Abstract Background The Solanaceae is a family of closely related species with diverse phenotypes that have been exploited for agronomic purposes. Previous studies involving a small number of genes suggested sequence conservation across the Solanaceae. The availability of large collections of Expressed Sequence Tags (ESTs for the Solanaceae now provides the opportunity to assess sequence conservation and divergence on a genomic scale. Results All available ESTs and Expressed Transcripts (ETs, 449,224 sequences for six Solanaceae species (potato, tomato, pepper, petunia, tobacco and Nicotiana benthamiana, were clustered and assembled into gene indices. Examination of gene ontologies revealed that the transcripts within the gene indices encode a similar suite of biological processes. Although the ESTs and ETs were derived from a variety of tissues, 55–81% of the sequences had significant similarity at the nucleotide level with sequences among the six species. Putative orthologs could be identified for 28–58% of the sequences. This high degree of sequence conservation was supported by expression profiling using heterologous hybridizations to potato cDNA arrays that showed similar expression patterns in mature leaves for all six solanaceous species. 16–19% of the transcripts within the six Solanaceae gene indices did not have matches among Solanaceae, Arabidopsis, rice or 21 other plant gene indices. Conclusion Results from this genome scale analysis confirmed a high level of sequence conservation at the nucleotide level of the coding sequence among Solanaceae. Additionally, the results indicated that part of the Solanaceae transcriptome is likely to be unique for each species.

  12. Complete genome sequence and comparative genomic analysis of Mycobacterium massiliense JCM 15300 in the Mycobacterium abscessus group reveal a conserved genomic island MmGI-1 related to putative lipid metabolism.

    Directory of Open Access Journals (Sweden)

    Tsuyoshi Sekizuka

    Full Text Available Mycobacterium abscessus group subsp., such as M. massiliense, M. abscessus sensu stricto and M. bolletii, are an environmental organism found in soil, water and other ecological niches, and have been isolated from respiratory tract infection, skin and soft tissue infection, postoperative infection of cosmetic surgery. To determine the unique genetic feature of M. massiliense, we sequenced the complete genome of M. massiliense type strain JCM 15300 (corresponding to CCUG 48898. Comparative genomic analysis was performed among Mycobacterium spp. and among M. abscessus group subspp., showing that additional ß-oxidation-related genes and, notably, the mammalian cell entry (mce operon were located on a genomic island, M. massiliense Genomic Island 1 (MmGI-1, in M. massiliense. In addition, putative anaerobic respiration system-related genes and additional mycolic acid cyclopropane synthetase-related genes were found uniquely in M. massiliense. Japanese isolates of M. massiliense also frequently possess the MmGI-1 (14/44, approximately 32% and three unique conserved regions (26/44; approximately 60%, 34/44; approximately 77% and 40/44; approximately 91%, as well as isolates of other countries (Malaysia, France, United Kingdom and United States. The well-conserved genomic island MmGI-1 may play an important role in high growth potential with additional lipid metabolism, extra factors for survival in the environment or synthesis of complex membrane-associated lipids. ORFs on MmGI-1 showed similarities to ORFs of phylogenetically distant M. avium complex (MAC, suggesting that horizontal gene transfer or genetic recombination events might have occurred within MmGI-1 among M. massiliense and MAC.

  13. Utility of RNA Sequencing for Analysis of Maize Reproductive Transcriptomes

    Directory of Open Access Journals (Sweden)

    Rebecca M. Davidson

    2011-11-01

    Full Text Available Transcriptome sequencing is a powerful method for studying global expression patterns in large, complex genomes. Evaluation of sequence-based expression profiles during reproductive development would provide functional annotation to genes underlying agronomic traits. We generated transcriptome profiles for 12 diverse maize ( L. reproductive tissues representing male, female, developing seed, and leaf tissues using high throughput transcriptome sequencing. Overall, ∼80% of annotated genes were expressed. Comparative analysis between sequence and hybridization-based methods demonstrated the utility of ribonucleic acid sequencing (RNA-seq for expression determination and differentiation of paralagous genes (∼85% of maize genes. Analysis of 4975 gene families across reproductive tissues revealed expression divergence is proportional to family size. In all pairwise comparisons between tissues, 7 (pre- vs. postemergence cobs to 48% (pollen vs. ovule of genes were differentially expressed. Genes with expression restricted to a single tissue within this study were identified with the highest numbers observed in leaves, endosperm, and pollen. Coexpression network analysis identified 17 gene modules with complex and shared expression patterns containing many previously described maize genes. The data and analyses in this study provide valuable tools through improved gene annotation, gene family characterization, and a core set of candidate genes to further characterize maize reproductive development and improve grain yield potential.

  14. Sequence Matching Analysis for Curriculum Development

    Directory of Open Access Journals (Sweden)

    Liem Yenny Bendatu

    2015-06-01

    Full Text Available Many organizations apply information technologies to support their business processes. Using the information technologies, the actual events are recorded and utilized to conform with predefined model. Conformance checking is an approach to measure the fitness and appropriateness between process model and actual events. However, when there are multiple events with the same timestamp, the traditional approach unfit to result such measures. This study attempts to develop a sequence matching analysis. Considering conformance checking as the basis of this approach, this proposed approach utilizes the current control flow technique in process mining domain. A case study in the field of educational process has been conducted. This study also proposes a curriculum analysis framework to test the proposed approach. By considering the learning sequence of students, it results some measurements for curriculum development. Finally, the result of the proposed approach has been verified by relevant instructors for further development.

  15. Whole-genome sequencing reveals mutational landscape underlying phenotypic differences between two widespread Chinese cattle breeds.

    Directory of Open Access Journals (Sweden)

    Yao Xu

    Full Text Available Whole-genome sequencing provides a powerful tool to obtain more genetic variability that could produce a range of benefits for cattle breeding industry. Nanyang (Bos indicus and Qinchuan (Bos taurus are two important Chinese indigenous cattle breeds with distinct phenotypes. To identify the genetic characteristics responsible for variation in phenotypes between the two breeds, in the present study, we for the first time sequenced the genomes of four Nanyang and four Qinchuan cattle with 10 to 12 fold on average of 97.86% and 98.98% coverage of genomes, respectively. Comparison with the Bos_taurus_UMD_3.1 reference assembly yielded 9,010,096 SNPs for Nanyang, and 6,965,062 for Qinchuan cattle, 51% and 29% of which were novel SNPs, respectively. A total of 154,934 and 115,032 small indels (1 to 3 bp were found in the Nanyang and Qinchuan genomes, respectively. The SNP and indel distribution revealed that Nanyang showed a genetically high diversity as compared to Qinchuan cattle. Furthermore, a total of 2,907 putative cases of copy number variation (CNV were identified by aligning Nanyang to Qinchuan genome, 783 of which (27% encompassed the coding regions of 495 functional genes. The gene ontology (GO analysis revealed that many CNV genes were enriched in the immune system and environment adaptability. Among several CNV genes related to lipid transport and fat metabolism, Lepin receptor gene (LEPR overlapping with CNV_1815 showed remarkably higher copy number in Qinchuan than Nanyang (log2 (ratio = -2.34988; P value = 1.53E-102. Further qPCR and association analysis investigated that the copy number of the LEPR gene presented positive correlations with transcriptional expression and phenotypic traits, suggesting the LEPR CNV may contribute to the higher fat deposition in muscles of Qinchuan cattle. Our findings provide evidence that the distinct phenotypes of Nanyang and Qinchuan breeds may be due to the different genetic variations including SNPs

  16. Uncommon nucleotide excision repair phenotypes revealed by targeted high-throughput sequencing.

    Science.gov (United States)

    Calmels, Nadège; Greff, Géraldine; Obringer, Cathy; Kempf, Nadine; Gasnier, Claire; Tarabeux, Julien; Miguet, Marguerite; Baujat, Geneviève; Bessis, Didier; Bretones, Patricia; Cavau, Anne; Digeon, Béatrice; Doco-Fenzy, Martine; Doray, Bérénice; Feillet, François; Gardeazabal, Jesus; Gener, Blanca; Julia, Sophie; Llano-Rivas, Isabel; Mazur, Artur; Michot, Caroline; Renaldo-Robin, Florence; Rossi, Massimiliano; Sabouraud, Pascal; Keren, Boris; Depienne, Christel; Muller, Jean; Mandel, Jean-Louis; Laugel, Vincent

    2016-03-22

    Deficient nucleotide excision repair (NER) activity causes a variety of autosomal recessive diseases including xeroderma pigmentosum (XP) a disorder which pre-disposes to skin cancer, and the severe multisystem condition known as Cockayne syndrome (CS). In view of the clinical overlap between NER-related disorders, as well as the existence of multiple phenotypes and the numerous genes involved, we developed a new diagnostic approach based on the enrichment of 16 NER-related genes by multiplex amplification coupled with next-generation sequencing (NGS). Our test cohort consisted of 11 DNA samples, all with known mutations and/or non pathogenic SNPs in two of the tested genes. We then used the same technique to analyse samples from a prospective cohort of 40 patients. Multiplex amplification and sequencing were performed using AmpliSeq protocol on the Ion Torrent PGM (Life Technologies). We identified causative mutations in 17 out of the 40 patients (43%). Four patients showed biallelic mutations in the ERCC6(CSB) gene, five in the ERCC8(CSA) gene: most of them had classical CS features but some had very mild and incomplete phenotypes. A small cohort of 4 unrelated classic XP patients from the Basque country (Northern Spain) revealed a common splicing mutation in POLH (XP-variant), demonstrating a new founder effect in this population. Interestingly, our results also found ERCC2(XPD), ERCC3(XPB) or ERCC5(XPG) mutations in two cases of UV-sensitive syndrome and in two cases with mixed XP/CS phenotypes. Our study confirms that NGS is an efficient technique for the analysis of NER-related disorders on a molecular level. It is particularly useful for phenotypes with combined features or unusually mild symptoms. Targeted NGS used in conjunction with DNA repair functional tests and precise clinical evaluation permits rapid and cost-effective diagnosis in patients with NER-defects.

  17. Analysis of Pteridium ribosomal RNA sequences by rapid direct sequencing.

    Science.gov (United States)

    Tan, M K

    1991-08-01

    A total of 864 bases from 5 regions interspersed in the 18S and 26S rRNA molecules from various clones of Pteridium covering the general geographical distribution of the genus was analysed using a rapid rRNA sequencing technique. No base difference has been detected amongst the three major lineages, two of which apparently separated before the breakup of the ancient supercontinent, Pangaea. These regions of the rRNA sequences have thus been conserved for at least 160 million years and are here compared with other eukaryotic, especially plant rRNAs.

  18. Sequence analysis of the genome of carnation (Dianthus caryophyllus L.).

    Science.gov (United States)

    Yagi, Masafumi; Kosugi, Shunichi; Hirakawa, Hideki; Ohmiya, Akemi; Tanase, Koji; Harada, Taro; Kishimoto, Kyutaro; Nakayama, Masayoshi; Ichimura, Kazuo; Onozaki, Takashi; Yamaguchi, Hiroyasu; Sasaki, Nobuhiro; Miyahara, Taira; Nishizaki, Yuzo; Ozeki, Yoshihiro; Nakamura, Noriko; Suzuki, Takamasa; Tanaka, Yoshikazu; Sato, Shusei; Shirasawa, Kenta; Isobe, Sachiko; Miyamura, Yoshinori; Watanabe, Akiko; Nakayama, Shinobu; Kishida, Yoshie; Kohara, Mitsuyo; Tabata, Satoshi

    2014-06-01

    The whole-genome sequence of carnation (Dianthus caryophyllus L.) cv. 'Francesco' was determined using a combination of different new-generation multiplex sequencing platforms. The total length of the non-redundant sequences was 568,887,315 bp, consisting of 45,088 scaffolds, which covered 91% of the 622 Mb carnation genome estimated by k-mer analysis. The N50 values of contigs and scaffolds were 16,644 bp and 60,737 bp, respectively, and the longest scaffold was 1,287,144 bp. The average GC content of the contig sequences was 36%. A total of 1050, 13, 92 and 143 genes for tRNAs, rRNAs, snoRNA and miRNA, respectively, were identified in the assembled genomic sequences. For protein-encoding genes, 43 266 complete and partial gene structures excluding those in transposable elements were deduced. Gene coverage was ∼ 98%, as deduced from the coverage of the core eukaryotic genes. Intensive characterization of the assigned carnation genes and comparison with those of other plant species revealed characteristic features of the carnation genome. The results of this study will serve as a valuable resource for fundamental and applied research of carnation, especially for breeding new carnation varieties. Further information on the genomic sequences is available at http://carnation.kazusa.or.jp. © The Author 2013. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.

  19. Re-analysis of metagenomic sequences from acute flaccid myelitis patients reveals alternatives to enterovirus D68 infection [v2; ref status: indexed, http://f1000r.es/5mz

    Directory of Open Access Journals (Sweden)

    Florian P. Breitwieser

    2015-07-01

    Full Text Available Metagenomic sequence data can be used to detect the presence of infectious viruses and bacteria, but normal microbial flora make this process challenging. We re-analyzed metagenomic RNA sequence data collected during a recent outbreak of acute flaccid myelitis (AFM, caused in some cases by infection with enterovirus D68. We found that among the patients whose symptoms were previously attributed to enterovirus D68, one patient had clear evidence of infection with Haemophilus influenzae, and a second patient had a severe Staphylococcus aureus infection caused by a methicillin-resistant strain. Neither of these bacteria were identified in the original study. These observations may have relevance in cases that present with flaccid paralysis because bacterial infections, co-infections or post-infection immune responses may trigger pathogenic processes that may present as poliomyelitis-like syndromes and may mimic AFM.  A separate finding was that large numbers of human sequences were present in each of the publicly released samples, although the original study reported that human sequences had been removed before deposition.

  20. First full-length genome sequence of the polerovirus luffa aphid-borne yellows virus (LABYV) reveals the presence of at least two consensus sequences in an isolate from Thailand.

    Science.gov (United States)

    Knierim, Dennis; Maiss, Edgar; Kenyon, Lawrence; Winter, Stephan; Menzel, Wulf

    2015-10-01

    Luffa aphid-borne yellows virus (LABYV) was proposed as the name for a previously undescribed polerovirus based on partial genome sequences obtained from samples of cucurbit plants collected in Thailand between 2008 and 2013. In this study, we determined the first full-length genome sequence of LABYV. Based on phylogenetic analysis and genome properties, it is clear that this virus represents a distinct species in the genus Polerovirus. Analysis of sequences from sample TH24, which was collected in 2010 from a luffa plant in Thailand, reveals the presence of two different full-length genome consensus sequences.

  1. Validation of Genotyping-By-Sequencing Analysis in Populations of Tetraploid Alfalfa by 454 Sequencing

    Science.gov (United States)

    Rocher, Solen; Jean, Martine; Castonguay, Yves; Belzile, François

    2015-01-01

    Genotyping-by-sequencing (GBS) is a relatively low-cost high throughput genotyping technology based on next generation sequencing and is applicable to orphan species with no reference genome. A combination of genome complexity reduction and multiplexing with DNA barcoding provides a simple and affordable way to resolve allelic variation between plant samples or populations. GBS was performed on ApeKI libraries using DNA from 48 genotypes each of two heterogeneous populations of tetraploid alfalfa (Medicago sativa spp. sativa): the synthetic cultivar Apica (ATF0) and a derived population (ATF5) obtained after five cycles of recurrent selection for superior tolerance to freezing (TF). Nearly 400 million reads were obtained from two lanes of an Illumina HiSeq 2000 sequencer and analyzed with the Universal Network-Enabled Analysis Kit (UNEAK) pipeline designed for species with no reference genome. Following the application of whole dataset-level filters, 11,694 single nucleotide polymorphism (SNP) loci were obtained. About 60% had a significant match on the Medicago truncatula syntenic genome. The accuracy of allelic ratios and genotype calls based on GBS data was directly assessed using 454 sequencing on a subset of SNP loci scored in eight plant samples. Sequencing depth in this study was not sufficient for accurate tetraploid allelic dosage, but reliable genotype calls based on diploid allelic dosage were obtained when using additional quality filtering. Principal Component Analysis of SNP loci in plant samples revealed that a small proportion (<5%) of the genetic variability assessed by GBS is able to differentiate ATF0 and ATF5. Our results confirm that analysis of GBS data using UNEAK is a reliable approach for genome-wide discovery of SNP loci in outcrossed polyploids. PMID:26115486

  2. Metagenomic sequencing reveals the relationship between microbiota composition and quality of Chinese Rice Wine.

    Science.gov (United States)

    Hong, Xutao; Chen, Jing; Liu, Lin; Wu, Huan; Tan, Haiqin; Xie, Guangfa; Xu, Qian; Zou, Huijun; Yu, Wenjing; Wang, Lan; Qin, Nan

    2016-05-31

    Chinese Rice Wine (CRW) is a common alcoholic beverage in China. To investigate the influence of microbial composition on the quality of CRW, high throughput sequencing was performed for 110 wine samples on bacterial 16S rRNA gene and fungal Internal Transcribed Spacer II (ITS2). Bioinformatic analyses demonstrated that the quality of yeast starter and final wine correlated with microbial taxonomic composition, which was exemplified by our finding that wine spoilage resulted from a high proportion of genus Lactobacillus. Subsequently, based on Lactobacillus abundance of an early stage, a model was constructed to predict final wine quality. In addition, three batches of 20 representative wine samples selected from a pool of 110 samples were further analyzed in metagenomics. The results revealed that wine spoilage was due to rapid growth of Lactobacillus brevis at the early stage of fermentation. Gene functional analysis indicated the importance of some pathways such as synthesis of biotin, malolactic fermentation and production of short-chain fatty acid. These results led to a conclusion that metabolisms of microbes influence the wine quality. Thus, nurturing of beneficial microbes and inhibition of undesired ones are both important for the mechanized brewery.

  3. Genome Sequencing Reveals the Potential of Achromobacter sp. HZ01 for Bioremediation

    Directory of Open Access Journals (Sweden)

    Yue-Hui Hong

    2017-08-01

    Full Text Available Petroleum pollution is a severe environmental issue. Comprehensively revealing the genetic backgrounds of hydrocarbon-degrading microorganisms contributes to developing effective methods for bioremediation of crude oil-polluted environments. Marine bacterium Achromobacter sp. HZ01 is capable of degrading hydrocarbons and producing biosurfactants. In this study, the draft genome (5.5 Mbp of strain HZ01 has been obtained by Illumina sequencing, containing 5,162 predicted genes. Genome annotation shows that “amino acid metabolism” is the most abundant metabolic pathway. Strain HZ01 is not capable of using some common carbohydrates as the sole carbon sources, which is due to that it contains few genes associated with carbohydrate transport and lacks some important enzymes related to glycometabolism. It contains abundant proteins directly related to petroleum hydrocarbon degradation. AlkB hydroxylase and its homologs were not identified. It harbors a complete enzyme system of terminal oxidation pathway for n-alkane degradation, which may be initiated by cytochrome P450. The enzymes involved in the catechol pathway are relatively complete for the degradation of aromatic compounds. This bacterium lacks several essential enzymes for methane oxidation, and Baeyer-Villiger monooxygenase involved in the subterminal oxidation pathway and cycloalkane degradation was not identified. These results suggest that strain HZ01 degrades n-alkanes via the terminal oxidation pathway, degrades aromatic compounds primarily via the catechol pathway and cannot perform methane oxidation or cycloalkane degradation. Additionally, strain HZ01 possesses abundant genes related to the metabolism of secondary metabolites, including some genes involved in biosurfactant (such as glycolipids and lipopeptides synthesis. The genome analysis also reveals its genetic basis for nitrogen metabolism, antibiotic resistance, regulatory responses to environmental changes, cell motility

  4. 18S rDNA Sequences from Microeukaryotes Reveal Oil Indicators in Mangrove Sediment

    Science.gov (United States)

    Santos, Henrique F.; Cury, Juliano C.; Carmo, Flavia L.; Rosado, Alexandre S.; Peixoto, Raquel S.

    2010-01-01

    Background Microeukaryotes are an effective indicator of the presence of environmental contaminants. However, the characterisation of these organisms by conventional tools is often inefficient, and recent molecular studies have revealed a great diversity of microeukaryotes. The full extent of this diversity is unknown, and therefore, the distribution, ecological role and responses to anthropogenic effects of microeukaryotes are rather obscure. The majority of oil from oceanic oil spills (e.g., the May 2010 accident in the Gulf of Mexico) converges on coastal ecosystems such as mangroves, which are threatened with worldwide disappearance, highlighting the need for efficient tools to indicate the presence of oil in these environments. However, no studies have used molecular methods to assess the effects of oil contamination in mangrove sediment on microeukaryotes as a group. Methodology/Principal Findings We evaluated the population dynamics and the prevailing 18S rDNA phylotypes of microeukaryotes in mangrove sediment microcosms with and without oil contamination, using PCR/DGGE and clone libraries. We found that microeukaryotes are useful for monitoring oil contamination in mangroves. Our clone library analysis revealed a decrease in both diversity and species richness after contamination. The phylogenetic group that showed the greatest sensitivity to oil was the Nematoda. After contamination, a large increase in the abundance of the groups Bacillariophyta (diatoms) and Biosoecida was detected. The oil-contaminated samples were almost entirely dominated by organisms related to Bacillariophyta sp. and Cafeteria minima, which indicates that these groups are possible targets for biomonitoring oil in mangroves. The DGGE fingerprints also indicated shifts in microeukaryote profiles; specific band sequencing indicated the appearance of Bacillariophyta sp. only in contaminated samples and Nematoda only in non-contaminated sediment. Conclusions/Significance We believe that

  5. Ontogeny of hepatic energy metabolism genes in mice as revealed by RNA-sequencing.

    Directory of Open Access Journals (Sweden)

    Helen J Renaud

    Full Text Available The liver plays a central role in metabolic homeostasis by coordinating synthesis, storage, breakdown, and redistribution of nutrients. Hepatic energy metabolism is dynamically regulated throughout different life stages due to different demands for energy during growth and development. However, changes in gene expression patterns throughout ontogeny for factors important in hepatic energy metabolism are not well understood. We performed detailed transcript analysis of energy metabolism genes during various stages of liver development in mice. Livers from male C57BL/6J mice were collected at twelve ages, including perinatal and postnatal time points (n = 3/age. The mRNA was quantified by RNA-Sequencing, with transcript abundance estimated by Cufflinks. One thousand sixty energy metabolism genes were examined; 794 were above detection, of which 627 were significantly changed during at least one developmental age compared to adult liver. Two-way hierarchical clustering revealed three major clusters dependent on age: GD17.5-Day 5 (perinatal-enriched, Day 10-Day 20 (pre-weaning-enriched, and Day 25-Day 60 (adolescence/adulthood-enriched. Clustering analysis of cumulative mRNA expression values for individual pathways of energy metabolism revealed three patterns of enrichment: glycolysis, ketogenesis, and glycogenesis were all perinatally-enriched; glycogenolysis was the only pathway enriched during pre-weaning ages; whereas lipid droplet metabolism, cholesterol and bile acid metabolism, gluconeogenesis, and lipid metabolism were all enriched in adolescence/adulthood. This study reveals novel findings such as the divergent expression of the fatty acid β-oxidation enzymes Acyl-CoA oxidase 1 and Carnitine palmitoyltransferase 1a, indicating a switch from mitochondrial to peroxisomal β-oxidation after weaning; as well as the dynamic ontogeny of genes implicated in obesity such as Stearoyl-CoA desaturase 1 and Elongation of very long chain fatty

  6. Bovine exome sequence analysis and targeted SNP genotyping of recessive fertility defects BH1, HH2, and HH3 reveal a putative causative mutation in SMC2 for HH3.

    Science.gov (United States)

    McClure, Matthew C; Bickhart, Derek; Null, Dan; Vanraden, Paul; Xu, Lingyang; Wiggans, George; Liu, George; Schroeder, Steve; Glasscock, Jarret; Armstrong, Jon; Cole, John B; Van Tassell, Curtis P; Sonstegard, Tad S

    2014-01-01

    The recent discovery of bovine haplotypes with negative effects on fertility in the Brown Swiss, Holstein, and Jersey breeds has allowed producers to identify carrier animals using commercial single nucleotide polymorphism (SNP) genotyping assays. This study was devised to identify the causative mutations underlying defective bovine embryo development contained within three of these haplotypes (Brown Swiss haplotype 1 and Holstein haplotypes 2 and 3) by combining exome capture with next generation sequencing. Of the 68,476,640 sequence variations (SV) identified, only 1,311 genome-wide SNP were concordant with the haplotype status of 21 sequenced carriers. Validation genotyping of 36 candidate SNP identified only 1 variant that was concordant to Holstein haplotype 3 (HH3), while no variants located within the refined intervals for HH2 or BH1 were concordant. The variant strictly associated with HH3 is a non-synonymous SNP (T/C) within exon 24 of the Structural Maintenance of Chromosomes 2 (SMC2) on Chromosome 8 at position 95,410,507 (UMD3.1). This polymorphism changes amino acid 1135 from phenylalanine to serine and causes a non-neutral, non-tolerated, and evolutionarily unlikely substitution within the NTPase domain of the encoded protein. Because only exome capture sequencing was used, we could not rule out the possibility that the true causative mutation for HH3 might lie in a non-exonic genomic location. Given the essential role of SMC2 in DNA repair, chromosome condensation and segregation during cell division, our findings strongly support the non-synonymous SNP (T/C) in SMC2 as the likely causative mutation. The absence of concordant variations for HH2 or BH1 suggests either the underlying causative mutations lie within a non-exomic region or in exome regions not covered by the capture array.

  7. Bovine exome sequence analysis and targeted SNP genotyping of recessive fertility defects BH1, HH2, and HH3 reveal a putative causative mutation in SMC2 for HH3.

    Directory of Open Access Journals (Sweden)

    Matthew C McClure

    Full Text Available The recent discovery of bovine haplotypes with negative effects on fertility in the Brown Swiss, Holstein, and Jersey breeds has allowed producers to identify carrier animals using commercial single nucleotide polymorphism (SNP genotyping assays. This study was devised to identify the causative mutations underlying defective bovine embryo development contained within three of these haplotypes (Brown Swiss haplotype 1 and Holstein haplotypes 2 and 3 by combining exome capture with next generation sequencing. Of the 68,476,640 sequence variations (SV identified, only 1,311 genome-wide SNP were concordant with the haplotype status of 21 sequenced carriers. Validation genotyping of 36 candidate SNP identified only 1 variant that was concordant to Holstein haplotype 3 (HH3, while no variants located within the refined intervals for HH2 or BH1 were concordant. The variant strictly associated with HH3 is a non-synonymous SNP (T/C within exon 24 of the Structural Maintenance of Chromosomes 2 (SMC2 on Chromosome 8 at position 95,410,507 (UMD3.1. This polymorphism changes amino acid 1135 from phenylalanine to serine and causes a non-neutral, non-tolerated, and evolutionarily unlikely substitution within the NTPase domain of the encoded protein. Because only exome capture sequencing was used, we could not rule out the possibility that the true causative mutation for HH3 might lie in a non-exonic genomic location. Given the essential role of SMC2 in DNA repair, chromosome condensation and segregation during cell division, our findings strongly support the non-synonymous SNP (T/C in SMC2 as the likely causative mutation. The absence of concordant variations for HH2 or BH1 suggests either the underlying causative mutations lie within a non-exomic region or in exome regions not covered by the capture array.

  8. Whole-exome sequencing reveals GPIHBP1 mutations in infantile colitis with severe hypertriglyceridemia.

    Science.gov (United States)

    Gonzaga-Jauregui, Claudia; Mir, Sabina; Penney, Samantha; Jhangiani, Shalini; Midgen, Craig; Finegold, Milton; Muzny, Donna M; Wang, Min; Bacino, Carlos A; Gibbs, Richard A; Lupski, James R; Kellermayer, Richard; Hanchard, Neil A

    2014-07-01

    Severe congenital hypertriglyceridemia (HTG) is a rare disorder caused by mutations in genes affecting lipoprotein lipase (LPL) activity. Here we report a 5-week-old Hispanic girl with severe HTG (12,031 mg/dL, normal limit 150 mg/dL) who presented with the unusual combination of lower gastrointestinal bleeding and milky plasma. Initial colonoscopy was consistent with colitis, which resolved with reduction of triglycerides. After negative sequencing of the LPL gene, whole-exome sequencing revealed novel compound heterozygous mutations in GPIHBP1. Our study broadens the phenotype of GPIHBP1-associated HTG, reinforces the effectiveness of whole-exome sequencing in Mendelian diagnoses, and implicates triglycerides in gastrointestinal mucosal injury.

  9. Draft whole genome sequence of groundnut stem rot fungus Athelia rolfsii revealing genetic architect of its pathogenicity and virulence.

    Science.gov (United States)

    Iquebal, M A; Tomar, Rukam S; Parakhia, M V; Singla, Deepak; Jaiswal, Sarika; Rathod, V M; Padhiyar, S M; Kumar, Neeraj; Rai, Anil; Kumar, Dinesh

    2017-07-13

    Groundnut (Arachis hypogaea L.) is an important oil seed crop having major biotic constraint in production due to stem rot disease caused by fungus, Athelia rolfsii causing 25-80% loss in productivity. As chemical and biological combating strategies of this fungus are not very effective, thus genome sequencing can reveal virulence and pathogenicity related genes for better understanding of the host-parasite interaction. We report draft assembly of Athelia rolfsii genome of ~73 Mb having 8919 contigs. Annotation analysis revealed 16830 genes which are involved in fungicide resistance, virulence and pathogenicity along with putative effector and lethal genes. Secretome analysis revealed CAZY genes representing 1085 enzymatic genes, glycoside hydrolases, carbohydrate esterases, carbohydrate-binding modules, auxillary activities, glycosyl transferases and polysaccharide lyases. Repeat analysis revealed 11171 SSRs, LTR, GYPSY and COPIA elements. Comparative analysis with other existing ascomycotina genome predicted conserved domain family of WD40, CYP450, Pkinase and ABC transporter revealing insight of evolution of pathogenicity and virulence. This study would help in understanding pathogenicity and virulence at molecular level and development of new combating strategies. Such approach is imperative in endeavour of genome based solution in stem rot disease management leading to better productivity of groundnut crop in tropical region of world.

  10. Scalable whole-exome sequencing of cell-free DNA reveals high concordance with metastatic tumors.

    Science.gov (United States)

    Adalsteinsson, Viktor A; Ha, Gavin; Freeman, Samuel S; Choudhury, Atish D; Stover, Daniel G; Parsons, Heather A; Gydush, Gregory; Reed, Sarah C; Rotem, Denisse; Rhoades, Justin; Loginov, Denis; Livitz, Dimitri; Rosebrock, Daniel; Leshchiner, Ignaty; Kim, Jaegil; Stewart, Chip; Rosenberg, Mara; Francis, Joshua M; Zhang, Cheng-Zhong; Cohen, Ofir; Oh, Coyin; Ding, Huiming; Polak, Paz; Lloyd, Max; Mahmud, Sairah; Helvie, Karla; Merrill, Margaret S; Santiago, Rebecca A; O'Connor, Edward P; Jeong, Seong H; Leeson, Rachel; Barry, Rachel M; Kramkowski, Joseph F; Zhang, Zhenwei; Polacek, Laura; Lohr, Jens G; Schleicher, Molly; Lipscomb, Emily; Saltzman, Andrea; Oliver, Nelly M; Marini, Lori; Waks, Adrienne G; Harshman, Lauren C; Tolaney, Sara M; Van Allen, Eliezer M; Winer, Eric P; Lin, Nancy U; Nakabayashi, Mari; Taplin, Mary-Ellen; Johannessen, Cory M; Garraway, Levi A; Golub, Todd R; Boehm, Jesse S; Wagle, Nikhil; Getz, Gad; Love, J Christopher; Meyerson, Matthew

    2017-11-06

    Whole-exome sequencing of cell-free DNA (cfDNA) could enable comprehensive profiling of tumors from blood but the genome-wide concordance between cfDNA and tumor biopsies is uncertain. Here we report ichorCNA, software that quantifies tumor content in cfDNA from 0.1× coverage whole-genome sequencing data without prior knowledge of tumor mutations. We apply ichorCNA to 1439 blood samples from 520 patients with metastatic prostate or breast cancers. In the earliest tested sample for each patient, 34% of patients have ≥10% tumor-derived cfDNA, sufficient for standard coverage whole-exome sequencing. Using whole-exome sequencing, we validate the concordance of clonal somatic mutations (88%), copy number alterations (80%), mutational signatures, and neoantigens between cfDNA and matched tumor biopsies from 41 patients with ≥10% cfDNA tumor content. In summary, we provide methods to identify patients eligible for comprehensive cfDNA profiling, revealing its applicability to many patients, and demonstrate high concordance of cfDNA and metastatic tumor whole-exome sequencing.

  11. Reanalysis of RNA-sequencing data reveals several additional fusion genes with multiple isoforms.

    Science.gov (United States)

    Kangaspeska, Sara; Hultsch, Susanne; Edgren, Henrik; Nicorici, Daniel; Murumägi, Astrid; Kallioniemi, Olli

    2012-01-01

    RNA-sequencing and tailored bioinformatic methodologies have paved the way for identification of expressed fusion genes from the chaotic genomes of solid tumors. We have recently successfully exploited RNA-sequencing for the discovery of 24 novel fusion genes in breast cancer. Here, we demonstrate the importance of continuous optimization of the bioinformatic methodology for this purpose, and report the discovery and experimental validation of 13 additional fusion genes from the same samples. Integration of copy number profiling with the RNA-sequencing results revealed that the majority of the gene fusions were promoter-donating events that occurred at copy number transition points or involved high-level DNA-amplifications. Sequencing of genomic fusion break points confirmed that DNA-level rearrangements underlie selected fusion transcripts. Furthermore, a significant portion (>60%) of the fusion genes were alternatively spliced. This illustrates the importance of reanalyzing sequencing data as gene definitions change and bioinformatic methods improve, and highlights the previously unforeseen isoform diversity among fusion transcripts.

  12. Reanalysis of RNA-sequencing data reveals several additional fusion genes with multiple isoforms.

    Directory of Open Access Journals (Sweden)

    Sara Kangaspeska

    Full Text Available RNA-sequencing and tailored bioinformatic methodologies have paved the way for identification of expressed fusion genes from the chaotic genomes of solid tumors. We have recently successfully exploited RNA-sequencing for the discovery of 24 novel fusion genes in breast cancer. Here, we demonstrate the importance of continuous optimization of the bioinformatic methodology for this purpose, and report the discovery and experimental validation of 13 additional fusion genes from the same samples. Integration of copy number profiling with the RNA-sequencing results revealed that the majority of the gene fusions were promoter-donating events that occurred at copy number transition points or involved high-level DNA-amplifications. Sequencing of genomic fusion break points confirmed that DNA-level rearrangements underlie selected fusion transcripts. Furthermore, a significant portion (>60% of the fusion genes were alternatively spliced. This illustrates the importance of reanalyzing sequencing data as gene definitions change and bioinformatic methods improve, and highlights the previously unforeseen isoform diversity among fusion transcripts.

  13. Multi-region and single-cell sequencing reveal variable genomic heterogeneity in rectal cancer.

    Science.gov (United States)

    Liu, Mingshan; Liu, Yang; Di, Jiabo; Su, Zhe; Yang, Hong; Jiang, Beihai; Wang, Zaozao; Zhuang, Meng; Bai, Fan; Su, Xiangqian

    2017-11-23

    Colorectal cancer is a heterogeneous group of malignancies with complex molecular subtypes. While colon cancer has been widely investigated, studies on rectal cancer are very limited. Here, we performed multi-region whole-exome sequencing and single-cell whole-genome sequencing to examine the genomic intratumor heterogeneity (ITH) of rectal tumors. We sequenced nine tumor regions and 88 single cells from two rectal cancer patients with tumors of the same molecular classification and characterized their mutation profiles and somatic copy number alterations (SCNAs) at the multi-region and the single-cell levels. A variable extent of genomic heterogeneity was observed between the two patients, and the degree of ITH increased when analyzed on the single-cell level. We found that major SCNAs were early events in cancer development and inherited steadily. Single-cell sequencing revealed mutations and SCNAs which were hidden in bulk sequencing. In summary, we studied the ITH of rectal cancer at regional and single-cell resolution and demonstrated that variable heterogeneity existed in two patients. The mutational scenarios and SCNA profiles of two patients with treatment naïve from the same molecular subtype are quite different. Our results suggest each tumor possesses its own architecture, which may result in different diagnosis, prognosis, and drug responses. Remarkable ITH exists in the two patients we have studied, providing a preliminary impression of ITH in rectal cancer.

  14. The Douglas-Fir Genome Sequence Reveals Specialization of the Photosynthetic Apparatus in Pinaceae

    Directory of Open Access Journals (Sweden)

    David B. Neale

    2017-09-01

    Full Text Available A reference genome sequence for Pseudotsuga menziesii var. menziesii (Mirb. Franco (Coastal Douglas-fir is reported, thus providing a reference sequence for a third genus of the family Pinaceae. The contiguity and quality of the genome assembly far exceeds that of other conifer reference genome sequences (contig N50 = 44,136 bp and scaffold N50 = 340,704 bp. Incremental improvements in sequencing and assembly technologies are in part responsible for the higher quality reference genome, but it may also be due to a slightly lower exact repeat content in Douglas-fir vs. pine and spruce. Comparative genome annotation with angiosperm species reveals gene-family expansion and contraction in Douglas-fir and other conifers which may account for some of the major morphological and physiological differences between the two major plant groups. Notable differences in the size of the NDH-complex gene family and genes underlying the functional basis of shade tolerance/intolerance were observed. This reference genome sequence not only provides an important resource for Douglas-fir breeders and geneticists but also sheds additional light on the evolutionary processes that have led to the divergence of modern angiosperms from the more ancient gymnosperms.

  15. Homozygosity mapping and targeted sanger sequencing reveal genetic defects underlying inherited retinal disease in families from pakistan.

    Directory of Open Access Journals (Sweden)

    Maleeha Maria

    Full Text Available Homozygosity mapping has facilitated the identification of the genetic causes underlying inherited diseases, particularly in consanguineous families with multiple affected individuals. This knowledge has also resulted in a mutation dataset that can be used in a cost and time effective manner to screen frequent population-specific genetic variations associated with diseases such as inherited retinal disease (IRD.We genetically screened 13 families from a cohort of 81 Pakistani IRD families diagnosed with Leber congenital amaurosis (LCA, retinitis pigmentosa (RP, congenital stationary night blindness (CSNB, or cone dystrophy (CD. We employed genome-wide single nucleotide polymorphism (SNP array analysis to identify homozygous regions shared by affected individuals and performed Sanger sequencing of IRD-associated genes located in the sizeable homozygous regions. In addition, based on population specific mutation data we performed targeted Sanger sequencing (TSS of frequent variants in AIPL1, CEP290, CRB1, GUCY2D, LCA5, RPGRIP1 and TULP1, in probands from 28 LCA families.Homozygosity mapping and Sanger sequencing of IRD-associated genes revealed the underlying mutations in 10 families. TSS revealed causative variants in three families. In these 13 families four novel mutations were identified in CNGA1, CNGB1, GUCY2D, and RPGRIP1.Homozygosity mapping and TSS revealed the underlying genetic cause in 13 IRD families, which is useful for genetic counseling as well as therapeutic interventions that are likely to become available in the near future.

  16. 454 sequencing reveals extreme complexity of the class II Major Histocompatibility Complex in the collared flycatcher

    Directory of Open Access Journals (Sweden)

    Gustafsson Lars

    2010-12-01

    Full Text Available Abstract Background Because of their functional significance, the Major Histocompatibility Complex (MHC class I and II genes have been the subject of continuous interest in the fields of ecology, evolution and conservation. In some vertebrate groups MHC consists of multiple loci with similar alleles; therefore, the multiple loci must be genotyped simultaneously. In such complex systems, understanding of the evolutionary patterns and their causes has been limited due to challenges posed by genotyping. Results Here we used 454 amplicon sequencing to characterize MHC class IIB exon 2 variation in the collared flycatcher, an important organism in evolutionary and immuno-ecological studies. On the basis of over 152,000 sequencing reads we identified 194 putative alleles in 237 individuals. We found an extreme complexity of the MHC class IIB in the collared flycatchers, with our estimates pointing to the presence of at least nine expressed loci and a large, though difficult to estimate precisely, number of pseudogene loci. Many similar alleles occurred in the pseudogenes indicating either a series of recent duplications or extensive concerted evolution. The expressed alleles showed unambiguous signals of historical selection and the occurrence of apparent interlocus exchange of alleles. Placing the collared flycatcher's MHC sequences in the context of passerine diversity revealed transspecific MHC class II evolution within the Muscicapidae family. Conclusions 454 amplicon sequencing is an effective tool for advancing our understanding of the MHC class II structure and evolutionary patterns in Passeriformes. We found a highly dynamic pattern of evolution of MHC class IIB genes with strong signals of selection and pronounced sequence divergence in expressed genes, in contrast to the apparent sequence homogenization in pseudogenes. We show that next generation sequencing offers a universal, affordable method for the characterization and, in perspective

  17. FAST: FAST Analysis of Sequences Toolbox

    Directory of Open Access Journals (Sweden)

    Travis J. Lawrence

    2015-05-01

    Full Text Available FAST (FAST Analysis of Sequences Toolbox provides simple, powerful open source command-line tools to filter, transform, annotate and analyze biological sequence data. Modeled after the GNU (GNU’s Not Unix Textutils such as grep, cut, and tr, FAST tools such as fasgrep, fascut, and fastr make it easy to rapidly prototype expressive bioinformatic workflows in a compact and generic command vocabulary. Compact combinatorial encoding of data workflows with FAST commands can simplify the documentation and reproducibility of bioinformatic protocols, supporting better transparency in biological data science. Interface self-consistency and conformity with conventions of GNU, Matlab, Perl, BioPerl, R and GenBank help make FAST easy and rewarding to learn. FAST automates numerical, taxonomic, and text-based sorting, selection and transformation of sequence records and alignment sites based on content, index ranges, descriptive tags, annotated features, and in-line calculated analytics, including composition and codon usage. Automated content- and feature-based extraction of sites and support for molecular population genetic statistics makes FAST useful for molecular evolutionary analysis. FAST is portable, easy to install and secure thanks to the relative maturity of its Perl and BioPerl foundations, with stable releases posted to CPAN. Development as well as a publicly accessible Cookbook and Wiki are available on the FAST GitHub repository at https://github.com/tlawrence3/FAST. The default data exchange format in FAST is Multi-FastA (specifically, a restriction of BioPerl FastA format. Sanger and Illumina 1.8+ FastQ formatted files are also supported. FAST makes it easier for non-programmer biologists to interactively investigate and control biological data at the speed of thought.

  18. Bayesian Correlation Analysis for Sequence Count Data.

    Directory of Open Access Journals (Sweden)

    Daniel Sánchez-Taltavull

    Full Text Available Evaluating the similarity of different measured variables is a fundamental task of statistics, and a key part of many bioinformatics algorithms. Here we propose a Bayesian scheme for estimating the correlation between different entities' measurements based on high-throughput sequencing data. These entities could be different genes or miRNAs whose expression is measured by RNA-seq, different transcription factors or histone marks whose expression is measured by ChIP-seq, or even combinations of different types of entities. Our Bayesian formulation accounts for both measured signal levels and uncertainty in those levels, due to varying sequencing depth in different experiments and to varying absolute levels of individual entities, both of which affect the precision of the measurements. In comparison with a traditional Pearson correlation analysis, we show that our Bayesian correlation analysis retains high correlations when measurement confidence is high, but suppresses correlations when measurement confidence is low-especially for entities with low signal levels. In addition, we consider the influence of priors on the Bayesian correlation estimate. Perhaps surprisingly, we show that naive, uniform priors on entities' signal levels can lead to highly biased correlation estimates, particularly when different experiments have widely varying sequencing depths. However, we propose two alternative priors that provably mitigate this problem. We also prove that, like traditional Pearson correlation, our Bayesian correlation calculation constitutes a kernel in the machine learning sense, and thus can be used as a similarity measure in any kernel-based machine learning algorithm. We demonstrate our approach on two RNA-seq datasets and one miRNA-seq dataset.

  19. A basic analysis toolkit for biological sequences

    Directory of Open Access Journals (Sweden)

    Siragusa Enrico

    2007-09-01

    Full Text Available Abstract This paper presents a software library, nicknamed BATS, for some basic sequence analysis tasks. Namely, local alignments, via approximate string matching, and global alignments, via longest common subsequence and alignments with affine and concave gap cost functions. Moreover, it also supports filtering operations to select strings from a set and establish their statistical significance, via z-score computation. None of the algorithms is new, but although they are generally regarded as fundamental for sequence analysis, they have not been implemented in a single and consistent software package, as we do here. Therefore, our main contribution is to fill this gap between algorithmic theory and practice by providing an extensible and easy to use software library that includes algorithms for the mentioned string matching and alignment problems. The library consists of C/C++ library functions as well as Perl library functions. It can be interfaced with Bioperl and can also be used as a stand-alone system with a GUI. The software is available at http://www.math.unipa.it/~raffaele/BATS/ under the GNU GPL.

  20. Extensive structural variations between mitochondrial genomes of CMS and normal peppers (Capsicum annuum L.) revealed by complete nucleotide sequencing.

    Science.gov (United States)

    Jo, Yeong Deuk; Choi, Yoomi; Kim, Dong-Hwan; Kim, Byung-Dong; Kang, Byoung-Cheorl

    2014-07-04

    Cytoplasmic male sterility (CMS) is an inability to produce functional pollen that is caused by mutation of the mitochondrial genome. Comparative analyses of mitochondrial genomes of lines with and without CMS in several species have revealed structural differences between genomes, including extensive rearrangements caused by recombination. However, the mitochondrial genome structure and the DNA rearrangements that may be related to CMS have not been characterized in Capsicum spp. We obtained the complete mitochondrial genome sequences of the pepper CMS line FS4401 (507,452 bp) and the fertile line Jeju (511,530 bp). Comparative analysis between mitochondrial genomes of peppers and tobacco that are included in Solanaceae revealed extensive DNA rearrangements and poor conservation in non-coding DNA. In comparison between pepper lines, FS4401 and Jeju mitochondrial DNAs contained the same complement of protein coding genes except for one additional copy of an atp6 gene (ψatp6-2) in FS4401. In terms of genome structure, we found eighteen syntenic blocks in the two mitochondrial genomes, which have been rearranged in each genome. By contrast, sequences between syntenic blocks, which were specific to each line, accounted for 30,380 and 17,847 bp in FS4401 and Jeju, respectively. The previously-reported CMS candidate genes, orf507 and ψatp6-2, were located on the edges of the largest sequence segments that were specific to FS4401. In this region, large number of small sequence segments which were absent or found on different locations in Jeju mitochondrial genome were combined together. The incorporation of repeats and overlapping of connected sequence segments by a few nucleotides implied that extensive rearrangements by homologous recombination might be involved in evolution of this region. Further analysis using mtDNA pairs from other plant species revealed common features of DNA regions around CMS-associated genes. Although large portion of sequence context was

  1. Whole genome sequence analysis of Mycobacterium suricattae

    KAUST Repository

    Dippenaar, Anzaan; Parsons, Sven David Charles; Sampson, Samantha Leigh; Van Der Merwe, Ruben Gerhard; Drewe, Julian Ashley; Abdallah, Abdallah; Siame, Kabengele Keith; Gey Van Pittius, Nicolaas Claudius; Van Helden, Paul David; Pain, Arnab; Warren, Robin Mark

    2015-01-01

    Tuberculosis occurs in various mammalian hosts and is caused by a range of different lineages of the Mycobacterium tuberculosis complex (MTBC). A recently described member, Mycobacterium suricattae, causes tuberculosis in meerkats (Suricata suricatta) in Southern Africa and preliminary genetic analysis showed this organism to be closely related to an MTBC pathogen of rock hyraxes (Procavia capensis), the dassie bacillus. Here we make use of whole genome sequencing to describe the evolution of the genome of M. suricattae, including known and novel regions of difference, SNPs and IS6110 insertion sites. We used genome-wide phylogenetic analysis to show that M. suricattae clusters with the chimpanzee bacillus, previously isolated from a chimpanzee (Pan troglodytes) in West Africa. We propose an evolutionary scenario for the Mycobacterium africanum lineage 6 complex, showing the evolutionary relationship of M. africanum and chimpanzee bacillus, and the closely related members M. suricattae, dassie bacillus and Mycobacterium mungi.

  2. Whole genome sequence analysis of Mycobacterium suricattae

    KAUST Repository

    Dippenaar, Anzaan

    2015-10-21

    Tuberculosis occurs in various mammalian hosts and is caused by a range of different lineages of the Mycobacterium tuberculosis complex (MTBC). A recently described member, Mycobacterium suricattae, causes tuberculosis in meerkats (Suricata suricatta) in Southern Africa and preliminary genetic analysis showed this organism to be closely related to an MTBC pathogen of rock hyraxes (Procavia capensis), the dassie bacillus. Here we make use of whole genome sequencing to describe the evolution of the genome of M. suricattae, including known and novel regions of difference, SNPs and IS6110 insertion sites. We used genome-wide phylogenetic analysis to show that M. suricattae clusters with the chimpanzee bacillus, previously isolated from a chimpanzee (Pan troglodytes) in West Africa. We propose an evolutionary scenario for the Mycobacterium africanum lineage 6 complex, showing the evolutionary relationship of M. africanum and chimpanzee bacillus, and the closely related members M. suricattae, dassie bacillus and Mycobacterium mungi.

  3. Targeted massively parallel sequencing of angiosarcomas reveals frequent activation of the mitogen activated protein kinase pathway

    Science.gov (United States)

    Murali, Rajmohan; Chandramohan, Raghu; Möller, Inga; Scholz, Simone L.; Berger, Michael; Huberman, Kety; Viale, Agnes; Pirun, Mono; Socci, Nicholas D.; Bouvier, Nancy; Bauer, Sebastian; Artl, Monika; Schilling, Bastian; Schimming, Tobias; Sucker, Antje; Schwindenhammer, Benjamin; Grabellus, Florian; Speicher, Michael R.; Schaller, Jörg; Hillen, Uwe; Schadendorf, Dirk; Mentzel, Thomas; Cheng, Donavan T.; Wiesner, Thomas; Griewank, Klaus G.

    2015-01-01

    Angiosarcomas are rare malignant mesenchymal tumors of endothelial differentiation. The clinical behavior is usually aggressive and the prognosis for patients with advanced disease is poor with no effective therapies. The genetic bases of these tumors have been partially revealed in recent studies reporting genetic alterations such as amplifications of MYC (primarily in radiation-associated angiosarcomas), inactivating mutations in PTPRB and R707Q hotspot mutations of PLCG1. Here, we performed a comprehensive genomic analysis of 34 angiosarcomas using a clinically-approved, hybridization-based targeted next-generation sequencing assay for 341 well-established oncogenes and tumor suppressor genes. Over half of the angiosarcomas (n = 18, 53%) harbored genetic alterations affecting the MAPK pathway, involving mutations in KRAS, HRAS, NRAS, BRAF, MAPK1 and NF1, or amplifications in MAPK1/CRKL, CRAF or BRAF. The most frequently detected genetic aberrations were mutations in TP53 in 12 tumors (35%) and losses of CDKN2A in 9 tumors (26%). MYC amplifications were generally mutually exclusive of TP53 alterations and CDKN2A loss and were identified in 8 tumors (24%), most of which (n = 7, 88%) arose post-irradiation. Previously reported mutations in PTPRB (n = 10, 29%) and one (3%) PLCG1 R707Q mutation were also identified. Our results demonstrate that angiosarcomas are a genetically heterogeneous group of tumors, harboring a wide range of genetic alterations. The high frequency of genetic events affecting the MAPK pathway suggests that targeted therapies inhibiting MAPK signaling may be promising therapeutic avenues in patients with advanced angiosarcomas. PMID:26440310

  4. Next generation sequencing reveals distinct fecal pollution signatures in aquatic sediments across gradients of anthropogenic influence

    Directory of Open Access Journals (Sweden)

    Gian Marco Luna

    2016-11-01

    Full Text Available Aquatic sediments are the repository of a variety of anthropogenic pollutants, including bacteria of fecal origin, that reach the aquatic environment from a variety of sources. Although fecal bacteria can survive for long periods of time in aquatic sediments, the microbiological quality of sediments is almost entirely neglected when performing quality assessments of aquatic ecosystems. Here we investigated the relative abundance, patterns and diversity of fecal bacterial populations in two coastal areas in the Northern Adriatic Sea (Italy: the Po river prodelta (PRP, an estuarine area receiving significant contaminant discharge from one of the largest European rivers and the Lagoon of Venice (LV, a transitional environment impacted by a multitude of anthropogenic stressors. From both areas, several indicators of fecal and sewage contamination were determined in the sediments using Next Generation Sequencing (NGS of 16S rDNA amplicons. At both areas, fecal contamination was high, with fecal bacteria accounting for up to 3.96% and 1.12% of the sediment bacterial assemblages in PRP and LV, respectively. The magnitude of the fecal signature was highest in the PRP site, highlighting the major role of the Po river in spreading microbial contaminants into the adjacent coastal area. In the LV site, fecal pollution was highest in the urban area, and almost disappeared when moving to the open sea. Our analysis revealed a large number of fecal Operational Taxonomic Units (OTU, 960 and 181 in PRP and LV, respectively and showed a different fecal signature in the two areas, suggesting a diverse contribution of human and non-human sources of contamination. These results highlight the potential of NGS techniques to gain insights into the origin and fate of different fecal bacteria populations in aquatic sediments.

  5. Mitochondrial genome sequences reveal deep divergences among Anopheles punctulatus sibling species in Papua New Guinea

    Directory of Open Access Journals (Sweden)

    Logue Kyle

    2013-02-01

    Full Text Available Abstract Background Members of the Anopheles punctulatus group (AP group are the primary vectors of human malaria in Papua New Guinea. The AP group includes 13 sibling species, most of them morphologically indistinguishable. Understanding why only certain species are able to transmit malaria requires a better comprehension of their evolutionary history. In particular, understanding relationships and divergence times among Anopheles species may enable assessing how malaria-related traits (e.g. blood feeding behaviours, vector competence have evolved. Methods DNA sequences of 14 mitochondrial (mt genomes from five AP sibling species and two species of the Anopheles dirus complex of Southeast Asia were sequenced. DNA sequences from all concatenated protein coding genes (10,770 bp were then analysed using a Bayesian approach to reconstruct phylogenetic relationships and date the divergence of the AP sibling species. Results Phylogenetic reconstruction using the concatenated DNA sequence of all mitochondrial protein coding genes indicates that the ancestors of the AP group arrived in Papua New Guinea 25 to 54 million years ago and rapidly diverged to form the current sibling species. Conclusion Through evaluation of newly described mt genome sequences, this study has revealed a divergence among members of the AP group in Papua New Guinea that would significantly predate the arrival of humans in this region, 50 thousand years ago. The divergence observed among the mtDNA sequences studied here may have resulted from reproductive isolation during historical changes in sea-level through glacial minima and maxima. This leads to a hypothesis that the AP sibling species have evolved independently for potentially thousands of generations. This suggests that the evolution of many phenotypes, such as insecticide resistance will arise independently in each of the AP sibling species studied here.

  6. Deep sequencing reveals double mutations in cis of MPL exon 10 in myeloproliferative neoplasms.

    Science.gov (United States)

    Pietra, Daniela; Brisci, Angela; Rumi, Elisa; Boggi, Sabrina; Elena, Chiara; Pietrelli, Alessandro; Bordoni, Roberta; Ferrari, Maurizio; Passamonti, Francesco; De Bellis, Gianluca; Cremonesi, Laura; Cazzola, Mario

    2011-04-01

    Somatic mutations of MPL exon 10, mainly involving a W515 substitution, have been described in JAK2 (V617F)-negative patients with essential thrombocythemia and primary myelofibrosis. We used direct sequencing and high-resolution melt analysis to identify mutations of MPL exon 10 in 570 patients with myeloproliferative neoplasms, and allele specific PCR and deep sequencing to further characterize a subset of mutated patients. Somatic mutations were detected in 33 of 221 patients (15%) with JAK2 (V617F)-negative essential thrombocythemia or primary myelofibrosis. Only one patient with essential thrombocythemia carried both JAK2 (V617F) and MPL (W515L). High-resolution melt analysis identified abnormal patterns in all the MPL mutated cases, while direct sequencing did not detect the mutant MPL in one fifth of them. In 3 cases carrying double MPL mutations, deep sequencing analysis showed identical load and location in cis of the paired lesions, indicating their simultaneous occurrence on the same chromosome.

  7. Sequencing the CHO DXB11 genome reveals regional variations in genomic stability and haploidy

    DEFF Research Database (Denmark)

    Kaas, Christian Schrøder; Kristensen, Claus; Betenbaugh, Michael J.

    2015-01-01

    Background: The DHFR negative CHO DXB11 cell line (also known as DUX-B11 and DUKX) was historically the first CHO cell line to be used for large scale production of heterologous proteins and is still used for production of a number of complex proteins.  Results: Here we present the genomic sequence...... of the CHO DXB11 genome sequenced to a depth of 33x. Overall a significant genomic drift was seen favoring GC -> AT point mutations in line with the chemical mutagenesis strategy used for generation of the cell line. The sequencing depth for each gene in the genome revealed distinct peaks at sequencing...... in eight additional analyzed CHO genomes (15-20% haploidy) but not in the genome of the Chinese hamster. The dhfr gene is confirmed to be haploid in CHO DXB11; transcriptionally active and the remaining allele contains a G410C point mutation causing a Thr137Arg missense mutation. We find similar to 2...

  8. Computational analysis of sequence selection mechanisms.

    Science.gov (United States)

    Meyerguz, Leonid; Grasso, Catherine; Kleinberg, Jon; Elber, Ron

    2004-04-01

    Mechanisms leading to gene variations are responsible for the diversity of species and are important components of the theory of evolution. One constraint on gene evolution is that of protein foldability; the three-dimensional shapes of proteins must be thermodynamically stable. We explore the impact of this constraint and calculate properties of foldable sequences using 3660 structures from the Protein Data Bank. We seek a selection function that receives sequences as input, and outputs survival probability based on sequence fitness to structure. We compute the number of sequences that match a particular protein structure with energy lower than the native sequence, the density of the number of sequences, the entropy, and the "selection" temperature. The mechanism of structure selection for sequences longer than 200 amino acids is approximately universal. For shorter sequences, it is not. We speculate on concrete evolutionary mechanisms that show this behavior.

  9. Comparative analysis of sequences from PT 2013

    DEFF Research Database (Denmark)

    Mikkelsen, Susie Sommer

    Sheatfish and not EHNV. Generally, mistakes occurred at the ends of the sequences. This can be due to several factors. One is that the sequence has not been trimmed of the sequence primer sites. Another is the lack of quality control of the chromatogram. Finally, sequencing in just one direction can result...... diseases in Europe. As part of the EURL proficiency test for fish diseases it is required to sequence any RANA virus isolates found in any of the samples. It is also highly recommended to sequence the ISA virus to determine whether it be HPRΔ or HPR0. Furthermore, it is recommended that any VHSV and IHNV...... isolates be genotyped. As part of the evaluation of the proficiency results it was decided this year to look into the quality and similarity of the sequence results for selected viruses. Ampoule III in the proficiency test 2013 contained an EHNV isolate. The EURL received 43 sequences from 41 laboratories...

  10. Time fluctuation analysis of forest fire sequences

    Science.gov (United States)

    Vega Orozco, Carmen D.; Kanevski, Mikhaïl; Tonini, Marj; Golay, Jean; Pereira, Mário J. G.

    2013-04-01

    Forest fires are complex events involving both space and time fluctuations. Understanding of their dynamics and pattern distribution is of great importance in order to improve the resource allocation and support fire management actions at local and global levels. This study aims at characterizing the temporal fluctuations of forest fire sequences observed in Portugal, which is the country that holds the largest wildfire land dataset in Europe. This research applies several exploratory data analysis measures to 302,000 forest fires occurred from 1980 to 2007. The applied clustering measures are: Morisita clustering index, fractal and multifractal dimensions (box-counting), Ripley's K-function, Allan Factor, and variography. These algorithms enable a global time structural analysis describing the degree of clustering of a point pattern and defining whether the observed events occur randomly, in clusters or in a regular pattern. The considered methods are of general importance and can be used for other spatio-temporal events (i.e. crime, epidemiology, biodiversity, geomarketing, etc.). An important contribution of this research deals with the analysis and estimation of local measures of clustering that helps understanding their temporal structure. Each measure is described and executed for the raw data (forest fires geo-database) and results are compared to reference patterns generated under the null hypothesis of randomness (Poisson processes) embedded in the same time period of the raw data. This comparison enables estimating the degree of the deviation of the real data from a Poisson process. Generalizations to functional measures of these clustering methods, taking into account the phenomena, were also applied and adapted to detect time dependences in a measured variable (i.e. burned area). The time clustering of the raw data is compared several times with the Poisson processes at different thresholds of the measured function. Then, the clustering measure value

  11. SVAMP: Sequence variation analysis, maps and phylogeny

    KAUST Repository

    Naeem, Raeece

    2014-04-03

    Summary: SVAMP is a stand-alone desktop application to visualize genomic variants (in variant call format) in the context of geographical metadata. Users of SVAMP are able to generate phylogenetic trees and perform principal coordinate analysis in real time from variant call format (VCF) and associated metadata files. Allele frequency map, geographical map of isolates, Tajima\\'s D metric, single nucleotide polymorphism density, GC and variation density are also available for visualization in real time. We demonstrate the utility of SVAMP in tracking a methicillin-resistant Staphylococcus aureus outbreak from published next-generation sequencing data across 15 countries. We also demonstrate the scalability and accuracy of our software on 245 Plasmodium falciparum malaria isolates from three continents. Availability and implementation: The Qt/C++ software code, binaries, user manual and example datasets are available at http://cbrc.kaust.edu.sa/svamp. © The Author 2014.

  12. Statistical analysis of next generation sequencing data

    CERN Document Server

    Nettleton, Dan

    2014-01-01

    Next Generation Sequencing (NGS) is the latest high throughput technology to revolutionize genomic research. NGS generates massive genomic datasets that play a key role in the big data phenomenon that surrounds us today. To extract signals from high-dimensional NGS data and make valid statistical inferences and predictions, novel data analytic and statistical techniques are needed. This book contains 20 chapters written by prominent statisticians working with NGS data. The topics range from basic preprocessing and analysis with NGS data to more complex genomic applications such as copy number variation and isoform expression detection. Research statisticians who want to learn about this growing and exciting area will find this book useful. In addition, many chapters from this book could be included in graduate-level classes in statistical bioinformatics for training future biostatisticians who will be expected to deal with genomic data in basic biomedical research, genomic clinical trials and personalized med...

  13. GEITLERINEMA SPECIES (OSCILLATORIALES, CYANOBACTERIA) REVEALED BY CELLULAR MORPHOLOGY, ULTRASTRUCTURE, AND DNA SEQUENCING(1).

    Science.gov (United States)

    Do Carmo Bittencourt-Oliveira, Maria; Do Nascimento Moura, Ariadne; De Oliveira, Mariana Cabral; Sidnei Massola, Nelson

    2009-06-01

    Geitlerinema amphibium (C. Agardh ex Gomont) Anagn. and G. unigranulatum (Rama N. Singh) Komárek et M. T. P. Azevedo are morphologically close species with characteristics frequently overlapping. Ten strains of Geitlerinema (six of G. amphibium and four of G. unigranulatum) were analyzed by DNA sequencing and transmission electronic and optical microscopy. Among the investigated strains, the two species were not separated with respect to cellular dimensions, and cellular width was the most varying characteristic. The number and localization of granules, as well as other ultrastructural characteristics, did not provide a means to discriminate between the two species. The two species were not separated either by geography or environment. These results were further corroborated by the analysis of the cpcB-cpcA intergenic spacer (PC-IGS) sequences. Given the fact that morphology is very uniform, plus the coexistence of these populations in the same habitat, it would be nearly impossible to distinguish between them in nature. On the other hand, two of the analyzed strains were distinct from all others based on the PC-IGS sequences, in spite of their morphological similarity. PC-IGS sequences indicate that these two strains could be a different species of Geitlerinema. Using morphology, cell ultrastructure, and PC-IGS sequences, it is not possible to distinguish G. amphibium and G. unigranulatum. Therefore, they should be treated as one species, G. unigranulatum as a synonym of G. amphibium. © 2009 Phycological Society of America.

  14. Movement Pattern Analysis Based on Sequence Signatures

    Directory of Open Access Journals (Sweden)

    Seyed Hossein Chavoshi

    2015-09-01

    Full Text Available Increased affordability and deployment of advanced tracking technologies have led researchers from various domains to analyze the resulting spatio-temporal movement data sets for the purpose of knowledge discovery. Two different approaches can be considered in the analysis of moving objects: quantitative analysis and qualitative analysis. This research focuses on the latter and uses the qualitative trajectory calculus (QTC, a type of calculus that represents qualitative data on moving point objects (MPOs, and establishes a framework to analyze the relative movement of multiple MPOs. A visualization technique called sequence signature (SESI is used, which enables to map QTC patterns in a 2D indexed rasterized space in order to evaluate the similarity of relative movement patterns of multiple MPOs. The applicability of the proposed methodology is illustrated by means of two practical examples of interacting MPOs: cars on a highway and body parts of a samba dancer. The results show that the proposed method can be effectively used to analyze interactions of multiple MPOs in different domains.

  15. Direct chloroplast sequencing: comparison of sequencing platforms and analysis tools for whole chloroplast barcoding.

    Directory of Open Access Journals (Sweden)

    Marta Brozynska

    Full Text Available Direct sequencing of total plant DNA using next generation sequencing technologies generates a whole chloroplast genome sequence that has the potential to provide a barcode for use in plant and food identification. Advances in DNA sequencing platforms may make this an attractive approach for routine plant identification. The HiSeq (Illumina and Ion Torrent (Life Technology sequencing platforms were used to sequence total DNA from rice to identify polymorphisms in the whole chloroplast genome sequence of a wild rice plant relative to cultivated rice (cv. Nipponbare. Consensus chloroplast sequences were produced by mapping sequence reads to the reference rice chloroplast genome or by de novo assembly and mapping of the resulting contigs to the reference sequence. A total of 122 polymorphisms (SNPs and indels between the wild and cultivated rice chloroplasts were predicted by these different sequencing and analysis methods. Of these, a total of 102 polymorphisms including 90 SNPs were predicted by both platforms. Indels were more variable with different sequencing methods, with almost all discrepancies found in homopolymers. The Ion Torrent platform gave no apparent false SNP but was less reliable for indels. The methods should be suitable for routine barcoding using appropriate combinations of sequencing platform and data analysis.

  16. High-throughput sequencing of the B-cell receptor in African Burkitt lymphoma reveals clues to pathogenesis.

    Science.gov (United States)

    Lombardo, Katharine A; Coffey, David G; Morales, Alicia J; Carlson, Christopher S; Towlerton, Andrea M H; Gerdts, Sarah E; Nkrumah, Francis K; Neequaye, Janet; Biggar, Robert J; Orem, Jackson; Casper, Corey; Mbulaiteye, Sam M; Bhatia, Kishor G; Warren, Edus H

    2017-03-28

    Burkitt lymphoma (BL), the most common pediatric cancer in sub-Saharan Africa, is a malignancy of antigen-experienced B lymphocytes. High-throughput sequencing (HTS) of the immunoglobulin heavy ( IGH ) and light chain ( IGK / IGL ) loci was performed on genomic DNA from 51 primary BL tumors: 19 from Uganda and 32 from Ghana. Reverse transcription polymerase chain reaction analysis and tumor RNA sequencing (RNAseq) was performed on the Ugandan tumors to confirm and extend the findings from the HTS of tumor DNA. Clonal IGH and IGK / IGL rearrangements were identified in 41 and 46 tumors, respectively. Evidence for rearrangement of the second IGH allele was observed in only 6 of 41 tumor samples with a clonal IGH rearrangement, suggesting that the normal process of biallelic IGHD to IGHJ diversity-joining (DJ) rearrangement is often disrupted in BL progenitor cells. Most tumors, including those with a sole dominant, nonexpressed DJ rearrangement, contained many IGH and IGK / IGL sequences that differed from the dominant rearrangement by < 10 nucleotides, suggesting that the target of ongoing mutagenesis of these loci in BL tumor cells is not limited to expressed alleles. IGHV usage in both BL tumor cohorts revealed enrichment for IGHV genes that are infrequently used in memory B cells from healthy subjects. Analysis of publicly available DNA sequencing and RNAseq data revealed that these same IGHV genes were overrepresented in dominant tumor-associated IGH rearrangements in several independent BL tumor cohorts. These data suggest that BL derives from an abnormal B-cell progenitor and that aberrant mutational processes are active on the immunoglobulin loci in BL cells.

  17. Sequence analysis of putative swrW gene required for surfactant ...

    African Journals Online (AJOL)

    Serratia marcescens produces biosurfactant serrawettin, essential for its population migration behavior. Serrawettin W1 was revealed to be an antibiotic serratamolide that makes it significant for deoxyribonucleic acid (DNA) and protein sequence analysis. Four nucleotide and amino-acid sequences from local strains ...

  18. Whole genome sequencing reveals genomic heterogeneity and antibiotic purification in Mycobacterium tuberculosis isolates

    KAUST Repository

    Black, PA

    2015-10-24

    Background Whole genome sequencing has revolutionised the interrogation of mycobacterial genomes. Recent studies have reported conflicting findings on the genomic stability of Mycobacterium tuberculosis during the evolution of drug resistance. In an age where whole genome sequencing is increasingly relied upon for defining the structure of bacterial genomes, it is important to investigate the reliability of next generation sequencing to identify clonal variants present in a minor percentage of the population. This study aimed to define a reliable cut-off for identification of low frequency sequence variants and to subsequently investigate genetic heterogeneity and the evolution of drug resistance in M. tuberculosis. Methods Genomic DNA was isolated from single colonies from 14 rifampicin mono-resistant M. tuberculosis isolates, as well as the primary cultures and follow up MDR cultures from two of these patients. The whole genomes of the M. tuberculosis isolates were sequenced using either the Illumina MiSeq or Illumina HiSeq platforms. Sequences were analysed with an in-house pipeline. Results Using next-generation sequencing in combination with Sanger sequencing and statistical analysis we defined a read frequency cut-off of 30 % to identify low frequency M. tuberculosis variants with high confidence. Using this cut-off we demonstrated a high rate of genetic diversity between single colonies isolated from one population, showing that by using the current sequencing technology, single colonies are not a true reflection of the genetic diversity within a whole population and vice versa. We further showed that numerous heterogeneous variants emerge and then disappear during the evolution of isoniazid resistance within individual patients. Our findings allowed us to formulate a model for the selective bottleneck which occurs during the course of infection, acting as a genomic purification event. Conclusions Our study demonstrated true levels of genetic diversity

  19. Supplementary Material for: Whole genome sequencing reveals genomic heterogeneity and antibiotic purification in Mycobacterium tuberculosis isolates

    KAUST Repository

    Black, PA

    2015-01-01

    Abstract Background Whole genome sequencing has revolutionised the interrogation of mycobacterial genomes. Recent studies have reported conflicting findings on the genomic stability of Mycobacterium tuberculosis during the evolution of drug resistance. In an age where whole genome sequencing is increasingly relied upon for defining the structure of bacterial genomes, it is important to investigate the reliability of next generation sequencing to identify clonal variants present in a minor percentage of the population. This study aimed to define a reliable cut-off for identification of low frequency sequence variants and to subsequently investigate genetic heterogeneity and the evolution of drug resistance in M. tuberculosis. Methods Genomic DNA was isolated from single colonies from 14 rifampicin mono-resistant M. tuberculosis isolates, as well as the primary cultures and follow up MDR cultures from two of these patients. The whole genomes of the M. tuberculosis isolates were sequenced using either the Illumina MiSeq or Illumina HiSeq platforms. Sequences were analysed with an in-house pipeline. Results Using next-generation sequencing in combination with Sanger sequencing and statistical analysis we defined a read frequency cut-off of 30 % to identify low frequency M. tuberculosis variants with high confidence. Using this cut-off we demonstrated a high rate of genetic diversity between single colonies isolated from one population, showing that by using the current sequencing technology, single colonies are not a true reflection of the genetic diversity within a whole population and vice versa. We further showed that numerous heterogeneous variants emerge and then disappear during the evolution of isoniazid resistance within individual patients. Our findings allowed us to formulate a model for the selective bottleneck which occurs during the course of infection, acting as a genomic purification event. Conclusions Our study demonstrated true levels of genetic

  20. Noncoding sequence classification based on wavelet transform analysis: part I

    Science.gov (United States)

    Paredes, O.; Strojnik, M.; Romo-Vázquez, R.; Vélez Pérez, H.; Ranta, R.; Garcia-Torales, G.; Scholl, M. K.; Morales, J. A.

    2017-09-01

    DNA sequences in human genome can be divided into the coding and noncoding ones. Coding sequences are those that are read during the transcription. The identification of coding sequences has been widely reported in literature due to its much-studied periodicity. Noncoding sequences represent the majority of the human genome. They play an important role in gene regulation and differentiation among the cells. However, noncoding sequences do not exhibit periodicities that correlate to their functions. The ENCODE (Encyclopedia of DNA elements) and Epigenomic Roadmap Project projects have cataloged the human noncoding sequences into specific functions. We study characteristics of noncoding sequences with wavelet analysis of genomic signals.

  1. Which MRI sequence of the spine best reveals bone-marrow metastases of neuroblastoma?

    International Nuclear Information System (INIS)

    Meyer, James S.; Jaramillo, Diego; Siegel, Marilyn J.; Farooqui, Saleem O.; Fletcher, Barry D.; Hoffer, Fredric A.

    2005-01-01

    MRI is an effective tool in evaluating bone marrow metastases. However, no study has defined which MRI sequences or image characteristics best correlate with bone-marrow metastases in neuroblastoma. To identify and refine MRI criteria and sequence selection for the diagnosis of bone-marrow metastases in children with neuroblastoma. Ninety-one children (mean age: 3.2 years; standard deviation: 2.8 years) enrolled in the RDOG IV study participated in our study. Forty-five children had bone metastases determined by bone-marrow aspiration or biopsy (n=4), radionuclide imaging (n=2), or both (n=39). Spine lesions were characterized using coronal T1-weighted (T1W) sagittal short tau inversion recovery (STIR) and coronal gadolinium-enhanced T1-weighted (GAD) MR sequences. Contingency table analysis was performed to determine which MRI sequences and characteristics were associated with metastases. The MRI criteria for metastatic disease were then developed for each imaging sequence. The sensitivity, specificity, predictive values, and accuracy of these criteria were determined for the whole group, children younger than 12 months old, and children 12 months and older. The MR characteristics that had significant (P ≤ 0.05) associations with metastases were homogeneous low T1-signal intensity, homogeneous high STIR-signal intensity, and heterogeneous pattern on T1, STIR, or GAD. Homogeneous low T1-signal had the highest sensitivity (88%), but a specificity of 62% for detecting metastases. A heterogeneous pattern on GAD was highly specific (97%), but relatively insensitive (65%) for detecting metastases. These MR characteristics were most accurate in children 12 months and older. The combination of non-contrast-enhanced T1W and GAD sequences can be used to determine the presence of spinal metastases in children with neuroblastoma, particularly those children who are 1 year and older. (orig.)

  2. Image sequence analysis workstation for multipoint motion analysis

    Science.gov (United States)

    Mostafavi, Hassan

    1990-08-01

    This paper describes an application-specific engineering workstation designed and developed to analyze motion of objects from video sequences. The system combines the software and hardware environment of a modem graphic-oriented workstation with the digital image acquisition, processing and display techniques. In addition to automation and Increase In throughput of data reduction tasks, the objective of the system Is to provide less invasive methods of measurement by offering the ability to track objects that are more complex than reflective markers. Grey level Image processing and spatial/temporal adaptation of the processing parameters is used for location and tracking of more complex features of objects under uncontrolled lighting and background conditions. The applications of such an automated and noninvasive measurement tool include analysis of the trajectory and attitude of rigid bodies such as human limbs, robots, aircraft in flight, etc. The system's key features are: 1) Acquisition and storage of Image sequences by digitizing and storing real-time video; 2) computer-controlled movie loop playback, freeze frame display, and digital Image enhancement; 3) multiple leading edge tracking in addition to object centroids at up to 60 fields per second from both live input video or a stored Image sequence; 4) model-based estimation and tracking of the six degrees of freedom of a rigid body: 5) field-of-view and spatial calibration: 6) Image sequence and measurement data base management; and 7) offline analysis software for trajectory plotting and statistical analysis.

  3. Single nucleus genome sequencing reveals high similarity among nuclei of an endomycorrhizal fungus.

    Directory of Open Access Journals (Sweden)

    Kui Lin

    2014-01-01

    Full Text Available Nuclei of arbuscular endomycorrhizal fungi have been described as highly diverse due to their asexual nature and absence of a single cell stage with only one nucleus. This has raised fundamental questions concerning speciation, selection and transmission of the genetic make-up to next generations. Although this concept has become textbook knowledge, it is only based on studying a few loci, including 45S rDNA. To provide a more comprehensive insight into the genetic makeup of arbuscular endomycorrhizal fungi, we applied de novo genome sequencing of individual nuclei of Rhizophagus irregularis. This revealed a surprisingly low level of polymorphism between nuclei. In contrast, within a nucleus, the 45S rDNA repeat unit turned out to be highly diverged. This finding demystifies a long-lasting hypothesis on the complex genetic makeup of arbuscular endomycorrhizal fungi. Subsequent genome assembly resulted in the first draft reference genome sequence of an arbuscular endomycorrhizal fungus. Its length is 141 Mbps, representing over 27,000 protein-coding gene models. We used the genomic sequence to reinvestigate the phylogenetic relationships of Rhizophagus irregularis with other fungal phyla. This unambiguously demonstrated that Glomeromycota are more closely related to Mucoromycotina than to its postulated sister Dikarya.

  4. Genome re-sequencing of semi-wild soybean reveals a complex Soja population structure and deep introgression.

    Directory of Open Access Journals (Sweden)

    Jie Qiu

    Full Text Available Semi-wild soybean is a unique type of soybean that retains both wild and domesticated characteristics, which provides an important intermediate type for understanding the evolution of the subgenus Soja population in the Glycine genus. In this study, a semi-wild soybean line (Maliaodou and a wild line (Lanxi 1 collected from the lower Yangtze regions were deeply sequenced while nine other semi-wild lines were sequenced to a 3-fold genome coverage. Sequence analysis revealed that (1 no independent phylogenetic branch covering all 10 semi-wild lines was observed in the Soja phylogenetic tree; (2 besides two distinct subpopulations of wild and cultivated soybean in the Soja population structure, all semi-wild lines were mixed with some wild lines into a subpopulation rather than an independent one or an intermediate transition type of soybean domestication; (3 high heterozygous rates (0.19-0.49 were observed in several semi-wild lines; and (4 over 100 putative selective regions were identified by selective sweep analysis, including those related to the development of seed size. Our results suggested a hybridization origin for the semi-wild soybean, which makes a complex Soja population structure.

  5. Sequencing of Australian wild rice genomes reveals ancestral relationships with domesticated rice.

    Science.gov (United States)

    Brozynska, Marta; Copetti, Dario; Furtado, Agnelo; Wing, Rod A; Crayn, Darren; Fox, Glen; Ishikawa, Ryuji; Henry, Robert J

    2017-06-01

    The related A genome species of the Oryza genus are the effective gene pool for rice. Here, we report draft genomes for two Australian wild A genome taxa: O. rufipogon-like population, referred to as Taxon A, and O. meridionalis-like population, referred to as Taxon B. These two taxa were sequenced and assembled by integration of short- and long-read next-generation sequencing (NGS) data to create a genomic platform for a wider rice gene pool. Here, we report that, despite the distinct chloroplast genome, the nuclear genome of the Australian Taxon A has a sequence that is much closer to that of domesticated rice (O. sativa) than to the other Australian wild populations. Analysis of 4643 genes in the A genome clade showed that the Australian annual, O. meridionalis, and related perennial taxa have the most divergent (around 3 million years) genome sequences relative to domesticated rice. A test for admixture showed possible introgression into the Australian Taxon A (diverged around 1.6 million years ago) especially from the wild indica/O. nivara clade in Asia. These results demonstrate that northern Australia may be the centre of diversity of the A genome Oryza and suggest the possibility that this might also be the centre of origin of this group and represent an important resource for rice improvement. © 2016 The Authors. Plant Biotechnology Journal published by Society for Experimental Biology and The Association of Applied Biologists and John Wiley & Sons Ltd.

  6. Deep sequencing of the oral microbiome reveals signatures of periodontal disease.

    Directory of Open Access Journals (Sweden)

    Bo Liu

    Full Text Available The oral microbiome, the complex ecosystem of microbes inhabiting the human mouth, harbors several thousands of bacterial types. The proliferation of pathogenic bacteria within the mouth gives rise to periodontitis, an inflammatory disease known to also constitute a risk factor for cardiovascular disease. While much is known about individual species associated with pathogenesis, the system-level mechanisms underlying the transition from health to disease are still poorly understood. Through the sequencing of the 16S rRNA gene and of whole community DNA we provide a glimpse at the global genetic, metabolic, and ecological changes associated with periodontitis in 15 subgingival plaque samples, four from each of two periodontitis patients, and the remaining samples from three healthy individuals. We also demonstrate the power of whole-metagenome sequencing approaches in characterizing the genomes of key players in the oral microbiome, including an unculturable TM7 organism. We reveal the disease microbiome to be enriched in virulence factors, and adapted to a parasitic lifestyle that takes advantage of the disrupted host homeostasis. Furthermore, diseased samples share a common structure that was not found in completely healthy samples, suggesting that the disease state may occupy a narrow region within the space of possible configurations of the oral microbiome. Our pilot study demonstrates the power of high-throughput sequencing as a tool for understanding the role of the oral microbiome in periodontal disease. Despite a modest level of sequencing (~2 lanes Illumina 76 bp PE and high human DNA contamination (up to ~90% we were able to partially reconstruct several oral microbes and to preliminarily characterize some systems-level differences between the healthy and diseased oral microbiomes.

  7. Chromosomal structures and repetitive sequences divergence in Cucumis species revealed by comparative cytogenetic mapping.

    Science.gov (United States)

    Zhang, Yunxia; Cheng, Chunyan; Li, Ji; Yang, Shuqiong; Wang, Yunzhu; Li, Ziang; Chen, Jinfeng; Lou, Qunfeng

    2015-09-25

    Differentiation and copy number of repetitive sequences affect directly chromosome structure which contributes to reproductive isolation and speciation. Comparative cytogenetic mapping has been verified an efficient tool to elucidate the differentiation and distribution of repetitive sequences in genome. In present study, the distinct chromosomal structures of five Cucumis species were revealed through genomic in situ hybridization (GISH) technique and comparative cytogenetic mapping of major satellite repeats. Chromosome structures of five Cucumis species were investigated using GISH and comparative mapping of specific satellites. Southern hybridization was employed to study the proliferation of satellites, whose structural characteristics were helpful for analyzing chromosome evolution. Preferential distribution of repetitive DNAs at the subtelomeric regions was found in C. sativus, C hystrix and C. metuliferus, while majority was positioned at the pericentromeric heterochromatin regions in C. melo and C. anguria. Further, comparative GISH (cGISH) through using genomic DNA of other species as probes revealed high homology of repeats between C. sativus and C. hystrix. Specific satellites including 45S rDNA, Type I/II, Type III, Type IV, CentM and telomeric repeat were then comparatively mapped in these species. Type I/II and Type IV produced bright signals at the subtelomeric regions of C. sativus and C. hystrix simultaneously, which might explain the significance of their amplification in the divergence of Cucumis subgenus from the ancient ancestor. Unique positioning of Type III and CentM only at the centromeric domains of C. sativus and C. melo, respectively, combining with unique southern bands, revealed rapid evolutionary patterns of centromeric DNA in Cucumis. Obvious interstitial telomeric repeats were observed in chromosomes 1 and 2 of C. sativus, which might provide evidence of the fusion hypothesis of chromosome evolution from x = 12 to x = 7 in

  8. Novel algorithms for protein sequence analysis

    NARCIS (Netherlands)

    Ye, Kai

    2008-01-01

    Each protein is characterized by its unique sequential order of amino acids, the so-called protein sequence. Biology”s paradigm is that this order of amino acids determines the protein”s architecture and function. In this thesis, we introduce novel algorithms to analyze protein sequences. Chapter 1

  9. Pig genome sequence - analysis and publication strategy

    DEFF Research Database (Denmark)

    Archibald, Alan L.; Bolund, Lars; Churcher, Carol

    2010-01-01

    preferentially selected for sequencing. In accordance with the Bermuda and Fort Lauderdale agreements and the more recent Toronto Statement the data have been released into public sequence repositories (Genbank/EMBL, NCBI/Ensembl trace repositories) in a timely manner and in advance of publication. CONCLUSIONS...

  10. Complete genome sequence of Fer-de-Lance Virus reveals a novel gene in reptilian Paramyxoviruses

    Science.gov (United States)

    Kurath, G.; Batts, W.N.; Ahne, W.; Winton, J.R.

    2004-01-01

    The complete RNA genome sequence of the archetype reptilian paramyxovirus, Fer-de-Lance virus (FDLV), has been determined. The genome is 15,378 nucleotides in length and consists of seven nonoverlapping genes in the order 3??? N-U-P-M-F-HN-L 5???, coding for the nucleocapsid, unknown, phospho-, matrix, fusion, hemagglutinin-neuraminidase, and large polymerase proteins, respectively. The gene junctions contain highly conserved transcription start and stop signal sequences and tri-nucleotide intergenic regions similar to those of other Paramyxoviridae. The FDLV P gene expression strategy is like that of rubulaviruses, which express the accessory V protein from the primary transcript and edit a portion of the mRNA to encode P and I proteins. There is also an overlapping open reading frame potentially encoding a small basic protein in the P gene. The gene designated U (unknown), encodes a deduced protein of 19.4 kDa that has no counterpart in other paramyxoviruses and has no similarity with sequences in the National Center for Biotechnology Information database. Active transcription of the U gene in infected cells was demonstrated by Northern blot analysis, and bicistronic N-U mRNA was also evident. The genomes of two other snake paramyxovirus genotypes were also found to have U genes, with 11 to 16% nucleotide divergence from the FDLV U gene. Pairwise comparisons of amino acid identities and phylogenetic analyses of all deduced FDLV protein sequences with homologous sequences from other Paramyxoviridae indicate that FDLV represents a new genus within the subfamily Paramyxovirinae. We suggest the name Ferlavirus for the new genus, with FDLV as the type species.

  11. Mitochondrial DNA sequence data reveals association of haplogroup U with psychosis in bipolar disorder.

    Science.gov (United States)

    Frye, Mark A; Ryu, Euijung; Nassan, Malik; Jenkins, Gregory D; Andreazza, Ana C; Evans, Jared M; McElroy, Susan L; Oglesbee, Devin; Highsmith, W Edward; Biernacka, Joanna M

    2017-01-01

    Converging genetic, postmortem gene-expression, cellular, and neuroimaging data implicate mitochondrial dysfunction in bipolar disorder. This study was conducted to investigate whether mitochondrial DNA (mtDNA) haplogroups and single nucleotide variants (SNVs) are associated with sub-phenotypes of bipolar disorder. MtDNA from 224 patients with Bipolar I disorder (BPI) was sequenced, and association of sequence variations with 3 sub-phenotypes (psychosis, rapid cycling, and adolescent illness onset) was evaluated. Gene-level tests were performed to evaluate overall burden of minor alleles for each phenotype. The haplogroup U was associated with a higher risk of psychosis. Secondary analyses of SNVs provided nominal evidence for association of psychosis with variants in the tRNA, ND4 and ND5 genes. The association of psychosis with ND4 (gene that encodes NADH dehydrogenase 4) was further supported by gene-level analysis. Preliminary analysis of mtDNA sequence data suggests a higher risk of psychosis with the U haplogroup and variation in the ND4 gene implicated in electron transport chain energy regulation. Further investigation of the functional consequences of this mtDNA variation is encouraged. Copyright © 2016. Published by Elsevier Ltd.

  12. Genome sequencing of chimpanzee malaria parasites reveals possible pathways of adaptation to human hosts

    KAUST Repository

    Otto, Thomas D.

    2014-09-09

    Plasmodium falciparum causes most human malaria deaths, having prehistorically evolved from parasites of African Great Apes. Here we explore the genomic basis of P. falciparum adaptation to human hosts by fully sequencing the genome of the closely related chimpanzee parasite species P. reichenowi, and obtaining partial sequence data from a more distantly related chimpanzee parasite (P. gaboni). The close relationship between P. reichenowi and P. falciparum is emphasized by almost complete conservation of genomic synteny, but against this strikingly conserved background we observe major differences at loci involved in erythrocyte invasion. The organization of most virulence-associated multigene families, including the hypervariable var genes, is broadly conserved, but P. falciparum has a smaller subset of rif and stevor genes whose products are expressed on the infected erythrocyte surface. Genome-wide analysis identifies other loci under recent positive selection, but a limited number of changes at the host–parasite interface may have mediated host switching.

  13. Characterization and sequence analysis of cysteine and glycine-rich ...

    African Journals Online (AJOL)

    Primers specific for CSRP3 were designed using known cDNA sequences of Bos taurus published in database with different accession numbers. Polymerase chain reaction (PCR) was performed and products were purified and sequenced. Sequence analysis and alignment were carried out using CLUSTAL W (1.83).

  14. Sequencing and analysis of an Irish human genome.

    LENUS (Irish Health Repository)

    Tong, Pin

    2010-01-01

    Recent studies generating complete human sequences from Asian, African and European subgroups have revealed population-specific variation and disease susceptibility loci. Here, choosing a DNA sample from a population of interest due to its relative geographical isolation and genetic impact on further populations, we extend the above studies through the generation of 11-fold coverage of the first Irish human genome sequence.

  15. Incident sequence analysis; event trees, methods and graphical symbols

    International Nuclear Information System (INIS)

    1980-11-01

    When analyzing incident sequences, unwanted events resulting from a certain cause are looked for. Graphical symbols and explanations of graphical representations are presented. The method applies to the analysis of incident sequences in all types of facilities. By means of the incident sequence diagram, incident sequences, i.e. the logical and chronological course of repercussions initiated by the failure of a component or by an operating error, can be presented and analyzed simply and clearly

  16. Computer-aided visualization and analysis system for sequence evaluation

    Energy Technology Data Exchange (ETDEWEB)

    Chee, Mark S.; Wang, Chunwei; Jevons, Luis C.; Bernhart, Derek H.; Lipshutz, Robert J.

    2004-05-11

    A computer system for analyzing nucleic acid sequences is provided. The computer system is used to perform multiple methods for determining unknown bases by analyzing the fluorescence intensities of hybridized nucleic acid probes. The results of individual experiments are improved by processing nucleic acid sequences together. Comparative analysis of multiple experiments is also provided by displaying reference sequences in one area and sample sequences in another area on a display device.

  17. Whole-genome sequencing of Bacillus subtilis XF-1 reveals mechanisms for biological control and multiple beneficial properties in plants.

    Science.gov (United States)

    Guo, Shengye; Li, Xingyu; He, Pengfei; Ho, Honhing; Wu, Yixin; He, Yueqiu

    2015-06-01

    Bacillus subtilis XF-1 is a gram-positive, plant-associated bacterium that stimulates plant growth and produces secondary metabolites that suppress soil-borne plant pathogens. In particular, it is especially highly efficient at controlling the clubroot disease of cruciferous crops. Its 4,061,186-bp genome contains an estimated 3853 protein-coding sequences and the 1155 genes of XF-1 are present in most genome-sequenced Bacillus strains: 3757 genes in B. subtilis 168, and 1164 in B. amyloliquefaciens FZB42. Analysis using the Cluster of Orthologous Groups database of proteins shows that 60 genes control bacterial mobility, 221 genes are related to cell wall and membrane biosynthesis, and more than 112 are genes associated with secondary metabolites. In addition, the genes contributed to the strain's plant colonization, bio-control and stimulation of plant growth. Sequencing of the genome is a fundamental step for developing a desired strain to serve as an efficient biological control agent and plant growth stimulator. Similar to other members of the taxon, XF-1 has a genome that contains giant gene clusters for the non-ribosomal synthesis of antifungal lipopeptides (surfactin and fengycin), the polyketides (macrolactin and bacillaene), the siderophore bacillibactin, and the dipeptide bacilysin. There are two synthesis pathways for volatile growth-promoting compounds. The expression of biosynthesized antibiotic peptides in XF-1 was revealed by matrix-assisted laser desorption/ionization-time of flight mass spectrometry.

  18. Genome sequencing and comparative genomics reveal a repertoire of putative pathogenicity genes in chilli anthracnose fungus Colletotrichum truncatum.

    Science.gov (United States)

    Rao, Soumya; Nandineni, Madhusudan R

    2017-01-01

    Colletotrichum truncatum, a major fungal phytopathogen, causes the anthracnose disease on an economically important spice crop chilli (Capsicum annuum), resulting in huge economic losses in tropical and sub-tropical countries. It follows a subcuticular intramural infection strategy on chilli with a short, asymptomatic, endophytic phase, which contrasts with the intracellular hemibiotrophic lifestyle adopted by most of the Colletotrichum species. However, little is known about the molecular determinants and the mechanism of pathogenicity in this fungus. A high quality whole genome sequence and gene annotation based on transcriptome data of an Indian isolate of C. truncatum from chilli has been obtained. Analysis of the genome sequence revealed a rich repertoire of pathogenicity genes in C. truncatum encoding secreted proteins, effectors, plant cell wall degrading enzymes, secondary metabolism associated proteins, with potential roles in the host-specific infection strategy, placing it next only to the Fusarium species. The size of genome assembly, number of predicted genes and some of the functional categories were similar to other sequenced Colletotrichum species. The comparative genomic analyses with other species and related fungi identified some unique genes and certain highly expanded gene families of CAZymes, proteases and secondary metabolism associated genes in the genome of C. truncatum. The draft genome assembly and functional annotation of potential pathogenicity genes of C. truncatum provide an important genomic resource for understanding the biology and lifestyle of this important phytopathogen and will pave the way for designing efficient disease control regimens.

  19. High-resolution deep sequencing reveals biodiversity, population structure, and persistence of HIV-1 quasispecies within host ecosystems

    Directory of Open Access Journals (Sweden)

    Yin Li

    2012-12-01

    Full Text Available Abstract Background Deep sequencing provides the basis for analysis of biodiversity of taxonomically similar organisms in an environment. While extensively applied to microbiome studies, population genetics studies of viruses are limited. To define the scope of HIV-1 population biodiversity within infected individuals, a suite of phylogenetic and population genetic algorithms was applied to HIV-1 envelope hypervariable domain 3 (Env V3 within peripheral blood mononuclear cells from a group of perinatally HIV-1 subtype B infected, therapy-naïve children. Results Biodiversity of HIV-1 Env V3 quasispecies ranged from about 70 to 270 unique sequence clusters across individuals. Viral population structure was organized into a limited number of clusters that included the dominant variants combined with multiple clusters of low frequency variants. Next generation viral quasispecies evolved from low frequency variants at earlier time points through multiple non-synonymous changes in lineages within the evolutionary landscape. Minor V3 variants detected as long as four years after infection co-localized in phylogenetic reconstructions with early transmitting viruses or with subsequent plasma virus circulating two years later. Conclusions Deep sequencing defines HIV-1 population complexity and structure, reveals the ebb and flow of dominant and rare viral variants in the host ecosystem, and identifies an evolutionary record of low-frequency cell-associated viral V3 variants that persist for years. Bioinformatics pipeline developed for HIV-1 can be applied for biodiversity studies of virome populations in human, animal, or plant ecosystems.

  20. Whole Exome Sequencing for a Patient with Rubinstein-Taybi Syndrome Reveals de Novo Variants besides an Overt CREBBP Mutation

    Directory of Open Access Journals (Sweden)

    Hee Jeong Yoo

    2015-03-01

    Full Text Available Rubinstein-Taybi syndrome (RSTS is a rare condition with a prevalence of 1 in 125,000–720,000 births and characterized by clinical features that include facial, dental, and limb dysmorphology and growth retardation. Most cases of RSTS occur sporadically and are caused by de novo mutations. Cytogenetic or molecular abnormalities are detected in only 55% of RSTS cases. Previous genetic studies have yielded inconsistent results due to the variety of methods used for genetic analysis. The purpose of this study was to use whole exome sequencing (WES to evaluate the genetic causes of RSTS in a young girl presenting with an Autism phenotype. We used the Autism diagnostic observation schedule (ADOS and Autism diagnostic interview revised (ADI-R to confirm her diagnosis of Autism. In addition, various questionnaires were used to evaluate other psychiatric features. We used WES to analyze the DNA sequences of the patient and her parents and to search for de novo variants. The patient showed all the typical features of Autism, WES revealed a de novo frameshift mutation in CREBBP and de novo sequence variants in TNC and IGFALS genes. Mutations in the CREBBP gene have been extensively reported in RSTS patients, while potential missense mutations in TNC and IGFALS genes have not previously been associated with RSTS. The TNC and IGFALS genes are involved in central nervous system development and growth. It is possible for patients with RSTS to have additional de novo variants that could account for previously unexplained phenotypes.

  1. Next-generation sequencing can reveal in vitro-generated PCR crossover products: some artifactual sequences correspond to HLA alleles in the IMGT/HLA database.

    Science.gov (United States)

    Holcomb, C L; Rastrou, M; Williams, T C; Goodridge, D; Lazaro, A M; Tilanus, M; Erlich, H A

    2014-01-01

    The high-resolution human leukocyte antigen (HLA) genotyping assay that we developed using 454 sequencing and Conexio software uses generic polymerase chain reaction (PCR) primers for DRB exon 2. Occasionally, we observed low abundance DRB amplicon sequences that resulted from in vitro PCR 'crossing over' between DRB1 and DRB3/4/5. These hybrid sequences, revealed by the clonal sequencing property of the 454 system, were generally observed at a read depth of 5%-10% of the true alleles. They usually contained at least one mismatch with the IMGT/HLA database, and consequently, were easily recognizable and did not cause a problem for HLA genotyping. Sometimes, however, these artifactual sequences matched a rare allele and the automatic genotype assignment was incorrect. These observations raised two issues: (1) could PCR conditions be modified to reduce such artifacts? and (2) could some of the rare alleles listed in the IMGT/HLA database be artifacts rather than true alleles? Because PCR crossing over occurs during late cycles of PCR, we compared DRB genotypes resulting from 28 and (our standard) 35 cycles of PCR. For all 21 cell line DNAs amplified for 35 cycles, crossover products were detected. In 33% of the cases, these hybrid sequences corresponded to named alleles. With amplification for only 28 cycles, these artifactual sequences were not detectable. To investigate whether some rare alleles in the IMGT/HLA database might be due to PCR artifacts, we analyzed four samples obtained from the investigators who submitted the sequences. In three cases, the sequences were generated from true alleles. In one case, our 454 sequencing revealed an error in the previously submitted sequence. © 2013 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  2. Whole genome sequencing reveals a de novo SHANK3 mutation in familial autism spectrum disorder.

    Directory of Open Access Journals (Sweden)

    Sergio I Nemirovsky

    Full Text Available Clinical genomics promise to be especially suitable for the study of etiologically heterogeneous conditions such as Autism Spectrum Disorder (ASD. Here we present three siblings with ASD where we evaluated the usefulness of Whole Genome Sequencing (WGS for the diagnostic approach to ASD.We identified a family segregating ASD in three siblings with an unidentified cause. We performed WGS in the three probands and used a state-of-the-art comprehensive bioinformatic analysis pipeline and prioritized the identified variants located in genes likely to be related to ASD. We validated the finding by Sanger sequencing in the probands and their parents.Three male siblings presented a syndrome characterized by severe intellectual disability, absence of language, autism spectrum symptoms and epilepsy with negative family history for mental retardation, language disorders, ASD or other psychiatric disorders. We found germline mosaicism for a heterozygous deletion of a cytosine in the exon 21 of the SHANK3 gene, resulting in a missense sequence of 5 codons followed by a premature stop codon (NM_033517:c.3259_3259delC, p.Ser1088Profs*6.We reported an infrequent form of familial ASD where WGS proved useful in the clinic. We identified a mutation in SHANK3 that underscores its relevance in Autism Spectrum Disorder.

  3. Mycobacterium malmesburyense sp. nov., a non-tuberculous species of the genus Mycobacterium revealed by multiple gene sequence characterization.

    Science.gov (United States)

    Gcebe, Nomakorinte; Rutten, Victor; Pittius, Nicolaas Gey van; Naicker, Brendon; Michel, Anita

    2017-04-01

    Non-tuberculous mycobacteria (NTM) are ubiquitous in the environment, and an increasing number of NTM species have been isolated and characterized from both humans and animals, highlighting the zoonotic potential of these bacteria. Host exposure to NTM may impact on cross-reactive immune responsiveness, which may affect diagnosis of bovine tuberculosis and may also play a role in the variability of the efficacy of Mycobacterium bovis BCG vaccination against tuberculosis. In this study we characterized 10 NTM isolates originating from water, soil, nasal swabs of cattle and African buffalo as well as bovine tissue samples. These isolates were previously identified during an NTM survey and were all found, using 16S rRNA gene sequence analysis to be closely related to Mycobacterium moriokaense. A polyphasic approach that included phenotypic characterization, antibiotic susceptibility profiling, mycolic acid profiling and phylogenetic analysis of four gene loci, 16S rRNA, hsp65, sodA and rpoB, was employed to characterize these isolates. Sequence data analysis of the four gene loci revealed that these isolates belong to a unique species of the genus Mycobacterium. This evidence was further supported by several differences in phenotypic characteristics between the isolates and the closely related species. We propose the name Mycobacterium malmesburyense sp. nov. for this novel species. The type strain is WCM 7299T (=ATCC BAA-2759T=CIP 110822T).

  4. Dynamic Evolution of Pathogenicity Revealed by Sequencing and Comparative Genomics of 19 Pseudomonas syringae Isolates

    Science.gov (United States)

    Romanchuk, Artur; Chang, Jeff H.; Mukhtar, M. Shahid; Cherkis, Karen; Roach, Jeff; Grant, Sarah R.; Jones, Corbin D.; Dangl, Jeffery L.

    2011-01-01

    Closely related pathogens may differ dramatically in host range, but the molecular, genetic, and evolutionary basis for these differences remains unclear. In many Gram- negative bacteria, including the phytopathogen Pseudomonas syringae, type III effectors (TTEs) are essential for pathogenicity, instrumental in structuring host range, and exhibit wide diversity between strains. To capture the dynamic nature of virulence gene repertoires across P. syringae, we screened 11 diverse strains for novel TTE families and coupled this nearly saturating screen with the sequencing and assembly of 14 phylogenetically diverse isolates from a broad collection of diseased host plants. TTE repertoires vary dramatically in size and content across all P. syringae clades; surprisingly few TTEs are conserved and present in all strains. Those that are likely provide basal requirements for pathogenicity. We demonstrate that functional divergence within one conserved locus, hopM1, leads to dramatic differences in pathogenicity, and we demonstrate that phylogenetics-informed mutagenesis can be used to identify functionally critical residues of TTEs. The dynamism of the TTE repertoire is mirrored by diversity in pathways affecting the synthesis of secreted phytotoxins, highlighting the likely role of both types of virulence factors in determination of host range. We used these 14 draft genome sequences, plus five additional genome sequences previously reported, to identify the core genome for P. syringae and we compared this core to that of two closely related non-pathogenic pseudomonad species. These data revealed the recent acquisition of a 1 Mb megaplasmid by a sub-clade of cucumber pathogens. This megaplasmid encodes a type IV secretion system and a diverse set of unknown proteins, which dramatically increases both the genomic content of these strains and the pan-genome of the species. PMID:21799664

  5. Conserved properties of dentate gyrus neurogenesis across postnatal development revealed by single-cell RNA sequencing.

    Science.gov (United States)

    Hochgerner, Hannah; Zeisel, Amit; Lönnerberg, Peter; Linnarsson, Sten

    2018-02-01

    The dentate gyrus of the hippocampus is a brain region in which neurogenesis persists into adulthood; however, the relationship between developmental and adult dentate gyrus neurogenesis has not been examined in detail. Here we used single-cell RNA sequencing to reveal the molecular dynamics and diversity of dentate gyrus cell types in perinatal, juvenile, and adult mice. We found distinct quiescent and proliferating progenitor cell types, linked by transient intermediate states to neuroblast stages and fully mature granule cells. We observed shifts in the molecular identity of quiescent and proliferating radial glia and granule cells during the postnatal period that were then maintained through adult stages. In contrast, intermediate progenitor cells, neuroblasts, and immature granule cells were nearly indistinguishable at all ages. These findings demonstrate the fundamental similarity of postnatal and adult neurogenesis in the hippocampus and pinpoint the early postnatal transformation of radial glia from embryonic progenitors to adult quiescent stem cells.

  6. Whole genome sequencing reveals a novel deletion variant in the KIT gene in horses with white spotted coat colour phenotypes.

    Science.gov (United States)

    Dürig, N; Jude, R; Holl, H; Brooks, S A; Lafayette, C; Jagannathan, V; Leeb, T

    2017-08-01

    White spotting phenotypes in horses can range in severity from the common white markings up to completely white horses. EDNRB, KIT, MITF, PAX3 and TRPM1 represent known candidate genes for such phenotypes in horses. For the present study, we re-investigated a large horse family segregating a variable white spotting phenotype, for which conventional Sanger sequencing of the candidate genes' individual exons had failed to reveal the causative variant. We obtained whole genome sequence data from an affected horse and specifically searched for structural variants in the known candidate genes. This analysis revealed a heterozygous ~1.9-kb deletion spanning exons 10-13 of the KIT gene (chr3:77,740,239_77,742,136del1898insTATAT). In continuity with previously named equine KIT variants we propose to designate the newly identified deletion variant W22. We had access to 21 horses carrying the W22 allele. Four of them were compound heterozygous W20/W22 and had a completely white phenotype. Our data suggest that W22 represents a true null allele of the KIT gene, whereas the previously identified W20 leads to a partial loss of function. These findings will enable more precise genetic testing for depigmentation phenotypes in horses. © 2017 Stichting International Foundation for Animal Genetics.

  7. Genotyping by PCR and High-Throughput Sequencing of Commercial Probiotic Products Reveals Composition Biases.

    Directory of Open Access Journals (Sweden)

    Wesley Morovic

    2016-11-01

    Full Text Available Recent advances in microbiome research have brought renewed focus on beneficial bacteria, many of which are available in food and dietary supplements. Although probiotics have historically been defined as microorganisms that convey health benefits when ingested in sufficient viable amounts, this description now includes the stipulation well defined strains, encompassing definitive taxonomy for consumer consideration and regulatory oversight. Here, we evaluated 52 commercial dietary supplements covering a range of labeled species, and determined their content using plate counting, targeted genotyping. Additionally, strain identities were assessed using methods recently published by the United States Pharmacopeial Convention. We also determined the relative abundance of individual bacteria by high-throughput sequencing (HTS of the 16S rRNA sequence using paired-end 2x250bp Illumina MiSeq technology. Using multiple methods, we tested the hypothesis that products do contain the quantitative amount of labeled bacteria, and qualitative list of labeled microbial species. We found that 17 samples (33% were below label claim for CFU prior to their expiration dates. A multiplexed-PCR scheme showed that only 30/52 (58% of the products contained a correctly labeled classification, with issues encompassing incorrect taxonomy, missing species and un-labeled species. The HTS revealed that many blended products consisted predominantly of Lactobacillus acidophilus and Bifidobacterium animalis subsp. lactis. These results highlight the need for reliable methods to qualitatively determine the correct taxonomy and quantitatively ascertain the relative amounts of mixed microbial populations in commercial probiotic products.

  8. Deep sequencing reveals exceptional diversity and modes of transmission for bacterial sponge symbionts.

    Science.gov (United States)

    Webster, Nicole S; Taylor, Michael W; Behnam, Faris; Lücker, Sebastian; Rattei, Thomas; Whalan, Stephen; Horn, Matthias; Wagner, Michael

    2010-08-01

    Marine sponges contain complex bacterial communities of considerable ecological and biotechnological importance, with many of these organisms postulated to be specific to sponge hosts. Testing this hypothesis in light of the recent discovery of the rare microbial biosphere, we investigated three Australian sponges by massively parallel 16S rRNA gene tag pyrosequencing. Here we show bacterial diversity that is unparalleled in an invertebrate host, with more than 250,000 sponge-derived sequence tags being assigned to 23 bacterial phyla and revealing up to 2996 operational taxonomic units (95% sequence similarity) per sponge species. Of the 33 previously described 'sponge-specific' clusters that were detected in this study, 48% were found exclusively in adults and larvae - implying vertical transmission of these groups. The remaining taxa, including 'Poribacteria', were also found at very low abundance among the 135,000 tags retrieved from surrounding seawater. Thus, members of the rare seawater biosphere may serve as seed organisms for widely occurring symbiont populations in sponges and their host association might have evolved much more recently than previously thought. © 2009 Society for Applied Microbiology and Blackwell Publishing Ltd.

  9. Next-generation sequencing reveals phylogeographic structure and a species tree for recent bird divergences

    DEFF Research Database (Denmark)

    McCormack, John E.; Maley, James M.; Hird, Sarah M.

    2012-01-01

    divergence in four phylogenetically diverse avian systems using a method for quick and cost-effective generation of primary DNA sequence data using pyrosequencing. NGS data were processed using an analytical pipeline that reduces many reads into two called alleles per locus per individual. Using single...... throughout the genome. Using eight loci found in Zonotrichia and Junco lineages, we were also able to generate a species tree of these sparrow sister genera, demonstrating the potential of this method for generating data amenable to coalescent-based analysis. We discuss improvements that should enhance...

  10. Unprecedented high-resolution view of bacterial operon architecture revealed by RNA sequencing.

    Science.gov (United States)

    Conway, Tyrrell; Creecy, James P; Maddox, Scott M; Grissom, Joe E; Conkle, Trevor L; Shadid, Tyler M; Teramoto, Jun; San Miguel, Phillip; Shimada, Tomohiro; Ishihama, Akira; Mori, Hirotada; Wanner, Barry L

    2014-07-08

    We analyzed the transcriptome of Escherichia coli K-12 by strand-specific RNA sequencing at single-nucleotide resolution during steady-state (logarithmic-phase) growth and upon entry into stationary phase in glucose minimal medium. To generate high-resolution transcriptome maps, we developed an organizational schema which showed that in practice only three features are required to define operon architecture: the promoter, terminator, and deep RNA sequence read coverage. We precisely annotated 2,122 promoters and 1,774 terminators, defining 1,510 operons with an average of 1.98 genes per operon. Our analyses revealed an unprecedented view of E. coli operon architecture. A large proportion (36%) of operons are complex with internal promoters or terminators that generate multiple transcription units. For 43% of operons, we observed differential expression of polycistronic genes, despite being in the same operons, indicating that E. coli operon architecture allows fine-tuning of gene expression. We found that 276 of 370 convergent operons terminate inefficiently, generating complementary 3' transcript ends which overlap on average by 286 nucleotides, and 136 of 388 divergent operons have promoters arranged such that their 5' ends overlap on average by 168 nucleotides. We found 89 antisense transcripts of 397-nucleotide average length, 7 unannotated transcripts within intergenic regions, and 18 sense transcripts that completely overlap operons on the opposite strand. Of 519 overlapping transcripts, 75% correspond to sequences that are highly conserved in E. coli (>50 genomes). Our data extend recent studies showing unexpected transcriptome complexity in several bacteria and suggest that antisense RNA regulation is widespread. Importance: We precisely mapped the 5' and 3' ends of RNA transcripts across the E. coli K-12 genome by using a single-nucleotide analytical approach. Our resulting high-resolution transcriptome maps show that ca. one-third of E. coli operons are

  11. Draft genome sequence of Streptomyces coelicoflavus ZG0656 reveals the putative biosynthetic gene cluster of acarviostatin family α-amylase inhibitors.

    Science.gov (United States)

    Guo, X; Geng, P; Bai, F; Bai, G; Sun, T; Li, X; Shi, L; Zhong, Q

    2012-08-01

    The aims of this study are to obtain the draft genome sequence of Streptomyces coelicoflavus ZG0656, which produces novel acarviostatin family α-amylase inhibitors, and then to reveal the putative acarviostatin-related gene cluster and the biosynthetic pathway. The draft genome sequence of S. coelicoflavus ZG0656 was generated using a shotgun approach employing a combination of 454 and Solexa sequencing technologies. Genome analysis revealed a putative gene cluster for acarviostatin biosynthesis, termed sct-cluster. The cluster contains 13 acarviostatin synthetic genes, six transporter genes, four starch degrading or transglycosylation enzyme genes and two regulator genes. On the basis of bioinformatic analysis, we proposed a putative biosynthetic pathway of acarviostatins. The intracellular steps produce a structural core, acarviostatin I00-7-P, and the extracellular assemblies lead to diverse acarviostatin end products. The draft genome sequence of S. coelicoflavus ZG0656 revealed the putative biosynthetic gene cluster of acarviostatins and a putative pathway of acarviostatin production. To our knowledge, S. coelicoflavus ZG0656 is the first strain in this species for which a genome sequence has been reported. The analysis of sct-cluster provided important insights into the biosynthesis of acarviostatins. This work will be a platform for producing novel variants and yield improvement. © 2012 The Authors. Letters in Applied Microbiology © 2012 The Society for Applied Microbiology.

  12. Whole genome sequencing of the monomorphic pathogen Mycobacterium bovis reveals local differentiation of cattle clinical isolates.

    Science.gov (United States)

    Lasserre, Moira; Fresia, Pablo; Greif, Gonzalo; Iraola, Gregorio; Castro-Ramos, Miguel; Juambeltz, Arturo; Nuñez, Álvaro; Naya, Hugo; Robello, Carlos; Berná, Luisa

    2018-01-02

    Bovine tuberculosis (bTB) poses serious risks to animal welfare and economy, as well as to public health as a zoonosis. Its etiological agent, Mycobacterium bovis, belongs to the Mycobacterium tuberculosis complex (MTBC), a group of genetically monomorphic organisms featured by a remarkably high overall nucleotide identity (99.9%). Indeed, this characteristic is of major concern for correct typing and determination of strain-specific traits based on sequence diversity. Due to its historical economic dependence on cattle production, Uruguay is deeply affected by the prevailing incidence of Mycobacterium bovis. With the world's highest number of cattle per human, and its intensive cattle production, Uruguay represents a particularly suited setting to evaluate genomic variability among isolates, and the diversity traits associated to this pathogen. We compared 186 genomes from MTBC strains isolated worldwide, and found a highly structured population in M. bovis. The analysis of 23 new M. bovis genomes, belonging to strains isolated in Uruguay evidenced three groups present in the country. Despite presenting an expected highly conserved genomic structure and sequence, these strains segregate into a clustered manner within the worldwide phylogeny. Analysis of the non-pe/ppe differential areas against a reference genome defined four main sources of variability, namely: regions of difference (RD), variable genes, duplications and novel genes. RDs and variant analysis segregated the strains into clusters that are concordant with their spoligotype identities. Due to its high homoplasy rate, spoligotyping failed to reflect the true genomic diversity among worldwide representative strains, however, it remains a good indicator for closely related populations. This study introduces a comprehensive population structure analysis of worldwide M. bovis isolates. The incorporation and analysis of 23 novel Uruguayan M. bovis genomes, sheds light onto the genomic diversity of this

  13. Next-generation sequence analysis of cancer xenograft models.

    Directory of Open Access Journals (Sweden)

    Fernando J Rossello

    Full Text Available Next-generation sequencing (NGS studies in cancer are limited by the amount, quality and purity of tissue samples. In this situation, primary xenografts have proven useful preclinical models. However, the presence of mouse-derived stromal cells represents a technical challenge to their use in NGS studies. We examined this problem in an established primary xenograft model of small cell lung cancer (SCLC, a malignancy often diagnosed from small biopsy or needle aspirate samples. Using an in silico strategy that assign reads according to species-of-origin, we prospectively compared NGS data from primary xenograft models with matched cell lines and with published datasets. We show here that low-coverage whole-genome analysis demonstrated remarkable concordance between published genome data and internal controls, despite the presence of mouse genomic DNA. Exome capture sequencing revealed that this enrichment procedure was highly species-specific, with less than 4% of reads aligning to the mouse genome. Human-specific expression profiling with RNA-Seq replicated array-based gene expression experiments, whereas mouse-specific transcript profiles correlated with published datasets from human cancer stroma. We conclude that primary xenografts represent a useful platform for complex NGS analysis in cancer research for tumours with limited sample resources, or those with prominent stromal cell populations.

  14. Genome sequence of Thermofilum pendens reveals an exceptional loss of biosynthetic pathways without genome reduction

    Energy Technology Data Exchange (ETDEWEB)

    Kyrpides, Nikos; Anderson, Iain; Rodriguez, Jason; Susanti, Dwi; Porat, Iris; Reich, Claudia; Ulrich, Luke E.; Elkins, James G.; Mavromatis, Kostas; Lykidis, Athanasios; Kim, Edwin; Thompson, Linda S.; Nolan, Matt; Land, Miriam; Copeland, Alex; Lapidus, Alla; Lucas, Susan; Detter, Chris; Zhulin, Igor B.; Olsen, Gary J.; Whitman, William; Mukhopadhyay, Biswarup; Bristow, James; Kyrpides, Nikos

    2008-01-01

    We report the complete genome of Thermofilum pendens, a deep-branching, hyperthermophilic member of the order Thermoproteales within the archaeal kingdom Crenarchaeota. T. pendens is a sulfur-dependent, anaerobic heterotroph isolated from a solfatara in Iceland. It is an extracellular commensal, requiring an extract of Thermoproteus tenax for growth, and the genome sequence reveals that biosynthetic pathways for purines, most amino acids, and most cofactors are absent. In fact T. pendens has fewer biosynthetic enzymes than obligate intracellular parasites, although it does not display other features common among obligate parasites and thus does not appear to be in the process of becoming a parasite. It appears that T. pendens has adapted to life in an environment rich in nutrients. T. pendens was known to utilize peptides as an energy source, but the genome reveals substantial ability to grow on carbohydrates. T. pendens is the first crenarchaeote and only the second archaeon found to have a transporter of the phosphotransferase system. In addition to fermentation, T. pendens may gain energy from sulfur reduction with hydrogen and formate as electron donors. It may also be capable of sulfur-independent growth on formate with formate hydrogenlyase. Additional novel features are the presence of a monomethylamine:corrinoid methyltransferase, the first time this enzyme has been found outside of Methanosarcinales, and a presenilin-related protein. Predicted highly expressed proteins do not include housekeeping genes, and instead include ABC transporters for carbohydrates and peptides, and CRISPR-associated proteins.

  15. Pervasive within-Mitochondrion Single-Nucleotide Variant Heteroplasmy as Revealed by Single-Mitochondrion Sequencing

    Directory of Open Access Journals (Sweden)

    Jacqueline Morris

    2017-12-01

    Full Text Available Summary: A number of mitochondrial diseases arise from single-nucleotide variant (SNV accumulation in multiple mitochondria. Here, we present a method for identification of variants present at the single-mitochondrion level in individual mouse and human neuronal cells, allowing for extremely high-resolution study of mitochondrial mutation dynamics. We identified extensive heteroplasmy between individual mitochondrion, along with three high-confidence variants in mouse and one in human that were present in multiple mitochondria across cells. The pattern of variation revealed by single-mitochondrion data shows surprisingly pervasive levels of heteroplasmy in inbred mice. Distribution of SNV loci suggests inheritance of variants across generations, resulting in Poisson jackpot lines with large SNV load. Comparison of human and mouse variants suggests that the two species might employ distinct modes of somatic segregation. Single-mitochondrion resolution revealed mitochondria mutational dynamics that we hypothesize to affect risk probabilities for mutations reaching disease thresholds. : Morris et al. use independent sequencing of multiple individual mitochondria from mouse and human brain cells to show high pervasiveness of mutations. The mutations are heteroplasmic within single mitochondria and within and between cells. These findings suggest mechanisms by which mutations accumulate over time, resulting in mitochondrial dysfunction and disease. Keywords: single mitochondrion, single cell, human neuron, mouse neuron, single-nucleotide variation

  16. Re-inspection of small RNA sequence datasets reveals several novel human miRNA genes.

    Directory of Open Access Journals (Sweden)

    Thomas Birkballe Hansen

    Full Text Available BACKGROUND: miRNAs are key players in gene expression regulation. To fully understand the complex nature of cellular differentiation or initiation and progression of disease, it is important to assess the expression patterns of as many miRNAs as possible. Thereby, identifying novel miRNAs is an essential prerequisite to make possible a comprehensive and coherent understanding of cellular biology. METHODOLOGY/PRINCIPAL FINDINGS: Based on two extensive, but previously published, small RNA sequence datasets from human embryonic stem cells and human embroid bodies, respectively [1], we identified 112 novel miRNA-like structures and were able to validate miRNA processing in 12 out of 17 investigated cases. Several miRNA candidates were furthermore substantiated by including additional available small RNA datasets, thereby demonstrating the power of combining datasets to identify miRNAs that otherwise may be assigned as experimental noise. CONCLUSIONS/SIGNIFICANCE: Our analysis highlights that existing datasets are not yet exhaustedly studied and continuous re-analysis of the available data is important to uncover all features of small RNA sequencing.

  17. Establishing a framework for comparative analysis of genome sequences

    Energy Technology Data Exchange (ETDEWEB)

    Bansal, A.K.

    1995-06-01

    This paper describes a framework and a high-level language toolkit for comparative analysis of genome sequence alignment The framework integrates the information derived from multiple sequence alignment and phylogenetic tree (hypothetical tree of evolution) to derive new properties about sequences. Multiple sequence alignments are treated as an abstract data type. Abstract operations have been described to manipulate a multiple sequence alignment and to derive mutation related information from a phylogenetic tree by superimposing parsimonious analysis. The framework has been applied on protein alignments to derive constrained columns (in a multiple sequence alignment) that exhibit evolutionary pressure to preserve a common property in a column despite mutation. A Prolog toolkit based on the framework has been implemented and demonstrated on alignments containing 3000 sequences and 3904 columns.

  18. Identities among actin-encoding cDNAs of the Nile tilapia (Oreochromis niloticus and other eukaryote species revealed by nucleotide and amino acid sequence analyses

    Directory of Open Access Journals (Sweden)

    Andréia B. Poletto

    2008-01-01

    Full Text Available Actin-encoding cDNAs of Nile tilapia (Oreochromis niloticus were isolated by RT-PCR using total RNA samples of different tissues and further characterized by nucleotide sequencing and in silico amino acid (aa sequence analysis. Comparisons among the actin gene sequences of O. niloticus and those of other species evidenced that the isolated genes present a high similarity to other fish and other vertebrate actin genes. The highest nucleotide resemblance was observed between O. niloticus and O. mossambicus a-actin and b-actin genes. Analysis of the predicted aa sequences revealed two distinct types of cytoplasmic actins, one cardiac muscle actin type and one skeletal muscle actin type that were expressed in different tissues of Nile tilapia. The evolutionary relationships between the Nile tilapia actin genes and diverse other organisms is discussed.

  19. Scalable Kernel Methods and Algorithms for General Sequence Analysis

    Science.gov (United States)

    Kuksa, Pavel

    2011-01-01

    Analysis of large-scale sequential data has become an important task in machine learning and pattern recognition, inspired in part by numerous scientific and technological applications such as the document and text classification or the analysis of biological sequences. However, current computational methods for sequence comparison still lack…

  20. Recurrence plot analysis of DNA sequences

    Energy Technology Data Exchange (ETDEWEB)

    Wu Zuobing [State Key Laboratory of Nonlinear Mechanics, Institute of Mechanics, Chinese Academy of Sciences, Beijing 100080 (China)]. E-mail: wuzb@lnm.imech.ac.cn

    2004-11-15

    Recurrence plot technique of DNA sequences is established on metric representation and employed to analyze correlation structure of nucleotide strings. It is found that, in the transference of nucleotide strings, a human DNA fragment has a major correlation distance, but a yeast chromosome's correlation distance has a constant increasing.

  1. The complete genome sequence of Staphylothermus marinus reveals differences in sulfur metabolism among heterotrophic Crenarchaeota

    Energy Technology Data Exchange (ETDEWEB)

    Anderson, iain J.; Dharmarajan, Lakshmi; Rodriguez, Jason; Hooper, Sean; Porat, Iris; Ulrich, Luke E.; Elkins, James G.; Mavromatis, Kostas; Sun, Hui; Land, Miriam; Lapidus, Alla; Lucas, Susan; Barry, Kerrie; Huber, Harald; Zhulin, Igor B.; Whitman, William B.; Mukhopadhyay, Biswarup; Woese, Carl; Bristow, James; Kyrpides, Nikos

    2008-09-05

    Staphylothermus marinus is an anaerobic, sulfur-reducing peptide fermenter of the archaeal phylum Crenarchaeota. It is the third heterotrophic, obligate sulfur reducing crenarchaeote to be sequenced and provides an opportunity for comparative analysis of the three genomes. The 1.57 Mbp genome of the hyperthermophilic crenarchaeote Staphylothermus marinus has been completely sequenced. The main energy generating pathways likely involve 2-oxoacid:ferredoxin oxidoreductases and ADP-forming acetyl-CoA synthases. S. marinus possesses several enzymes not present in other crenarchaeotes including a sodium ion-translocating decarboxylase likely to be involved in amino acid degradation. S. marinus lacks sulfur-reducing enzymes present in the other two sulfur-reducing crenarchaeotes that have been sequenced - Thermofilum pendens and Hyperthermus butylicus. Instead it has three operons similar to the mbh and mbx operons of Pyrococcus furiosus, which may play a role in sulfur reduction and/or hydrogen production. The two marine organisms, S. marinus and H. butylicus, possess more sodium-dependent transporters than T. pendens and use symporters for potassium uptake while T. pendens uses an ATP-dependent potassium transporter. T. pendens has adapted to a nutrient-rich environment while H. butylicus is adapted to a nutrient-poor environment, and S. marinus lies between these two extremes. The three heterotrophic sulfur-reducing crenarchaeotes have adapted to their habitats, terrestrial vs. marine, via their transporter content, and they have also adapted to environments with differing levels of nutrients. Despite the fact that they all use sulfur as an electron acceptor, they are likely to have different pathways for sulfur reduction.

  2. Genetic divergence of Asiatic Bdellocephala (Turbellaria, Tricladida, Paludicola) as revealed by partial 18S rRNA gene sequence comparisons.

    Science.gov (United States)

    Kuznedelov, K D; Timoshkin, O A; Goldman, E

    1997-01-01

    Polymerase chain reaction (PCR) and direct sequencing of small ribosomal RNA genes were used for analysis of genetic differences among Asiatic species of freshwater triclad genus Bdellocephala. Representatives of four species and four subspecies of this genus were used to establish homology between nucleotides in the 5'-end portion of small ribosomal RNA gene sequences. Within 552 nucleotide sites of aligned sequences compared, six variable base positions were discovered, dividing Bdellocephala into five different genotypes. Sequence data allow to distinguish two groups of these genotypes. One of them unites species from Kamchatka and Japan, another one unites Baikalian taxa. Agreement between available morphological, cytological and sequence data is discussed.

  3. The complete genome sequence of Bacillus velezensis 9912D reveals its biocontrol mechanism as a novel commercial biological fungicide agent.

    Science.gov (United States)

    Pan, Hua-Qi; Li, Qing-Lian; Hu, Jiang-Chun

    2017-04-10

    A Bacillus sp. 9912 mutant, 9912D, was approved as a new biological fungicide agent by the Ministry of Agriculture of the People's Republic of China in 2016 owing to its excellent inhibitory effect on various plant pathogens and being environment-friendly. Here, we present the genome of 9912D with a circular chromosome having 4436 coding DNA sequences (CDSs), and a circular plasmid encoding 59 CDSs. This strain was finally designated as Bacillus velezensis based on phylogenomic analyses. Genome analysis revealed a total of 19 candidate gene clusters involved in secondary metabolite biosynthesis, including potential new type II lantibiotics. The absence of fengycin biosynthetic gene cluster is noteworthy. Our data offer insights into the genetic, biological and physiological characteristics of this strain and aid in deeper understanding of its biocontrol mechanism. Copyright © 2017 Elsevier B.V. All rights reserved.

  4. Massively parallel amplicon sequencing reveals isotype-specific variability of antimicrobial peptide transcripts in Mytilus galloprovincialis.

    Directory of Open Access Journals (Sweden)

    Umberto Rosani

    Full Text Available BACKGROUND: Effective innate responses against potential pathogens are essential in the living world and possibly contributed to the evolutionary success of invertebrates. Taken together, antimicrobial peptide (AMP precursors of defensin, mytilin, myticin and mytimycin can represent about 40% of the hemocyte transcriptome in mussels injected with viral-like and bacterial preparations, and unique profiles of myticin C variants are expressed in single mussels. Based on amplicon pyrosequencing, we have ascertained and compared the natural and Vibrio-induced diversity of AMP transcripts in mussel hemocytes from three European regions. METHODOLOGY/PRINCIPAL FINDINGS: Hemolymph was collected from mussels farmed in the coastal regions of Palavas (France, Vigo (Spain and Venice (Italy. To represent the AMP families known in M. galloprovincialis, nine transcript sequences have been selected, amplified from hemocyte RNA and subjected to pyrosequencing. Hemolymph from farmed (offshore and wild (lagoon Venice mussels, both injected with 10(7 Vibrio cells, were similarly processed. Amplicon pyrosequencing emphasized the AMP transcript diversity, with Single Nucleotide Changes (SNC minimal for mytilin B/C and maximal for arthropod-like defensin and myticin C. Ratio of non-synonymous vs. synonymous changes also greatly differed between AMP isotypes. Overall, each amplicon revealed similar levels of nucleotidic variation across geographical regions, with two main sequence patterns confirmed for mytimycin and no substantial changes after immunostimulation. CONCLUSIONS/SIGNIFICANCE: Barcoding and bidirectional pyrosequencing allowed us to map and compare the transcript diversity of known mussel AMPs. Though most of the genuine cds variation was common to the analyzed samples we could estimate from 9 to 106 peptide variants in hemolymph pools representing 100 mussels, depending on the AMP isoform and sampling site. In this study, no prevailing SNC patterns related

  5. Whole mitochondrial genome sequencing of domestic horses reveals incorporation of extensive wild horse diversity during domestication

    Directory of Open Access Journals (Sweden)

    Lippold Sebastian

    2011-11-01

    Full Text Available Abstract Background DNA target enrichment by micro-array capture combined with high throughput sequencing technologies provides the possibility to obtain large amounts of sequence data (e.g. whole mitochondrial DNA genomes from multiple individuals at relatively low costs. Previously, whole mitochondrial genome data for domestic horses (Equus caballus were limited to only a few specimens and only short parts of the mtDNA genome (especially the hypervariable region were investigated for larger sample sets. Results In this study we investigated whole mitochondrial genomes of 59 domestic horses from 44 breeds and a single Przewalski horse (Equus przewalski using a recently described multiplex micro-array capture approach. We found 473 variable positions within the domestic horses, 292 of which are parsimony-informative, providing a well resolved phylogenetic tree. Our divergence time estimate suggests that the mitochondrial genomes of modern horse breeds shared a common ancestor around 93,000 years ago and no later than 38,000 years ago. A Bayesian skyline plot (BSP reveals a significant population expansion beginning 6,000-8,000 years ago with an ongoing exponential growth until the present, similar to other domestic animal species. Our data further suggest that a large sample of wild horse diversity was incorporated into the domestic population; specifically, at least 46 of the mtDNA lineages observed in domestic horses (73% already existed before the beginning of domestication about 5,000 years ago. Conclusions Our study provides a window into the maternal origins of extant domestic horses and confirms that modern domestic breeds present a wide sample of the mtDNA diversity found in ancestral, now extinct, wild horse populations. The data obtained allow us to detect a population expansion event coinciding with the beginning of domestication and to estimate both the minimum number of female horses incorporated into the domestic gene pool and the

  6. Multilocus sequence data reveal dozens of putative cryptic species in a radiation of endemic Californian mygalomorph spiders (Araneae, Mygalomorphae, Nemesiidae).

    Science.gov (United States)

    Leavitt, Dean H; Starrett, James; Westphal, Michael F; Hedin, Marshal

    2015-10-01

    We use mitochondrial and multi-locus nuclear DNA sequence data to infer both species boundaries and species relationships within California nemesiid spiders. Higher-level phylogenetic data show that the California radiation is monophyletic and distantly related to European members of the genus Brachythele. As such, we consider all California nemesiid taxa to belong to the genus Calisoga Chamberlin, 1937. Rather than find support for one or two taxa as previously hypothesized, genetic data reveal Calisoga to be a species-rich radiation of spiders, including perhaps dozens of species. This conclusion is supported by multiple mitochondrial barcoding analyses, and also independent analyses of nuclear data that reveal general genealogical congruence. We discovered three instances of sympatry, and genetic data indicate reproductive isolation when in sympatry. An examination of female reproductive morphology does not reveal species-specific characters, and observed male morphological differences for a subset of putative species are subtle. Our coalescent species tree analysis of putative species lays the groundwork for future research on the taxonomy and biogeographic history of this remarkable endemic radiation. Copyright © 2015 Elsevier Inc. All rights reserved.

  7. Internal Transcribed Spacer 1 (ITS1 based sequence typing reveals phylogenetically distinct Ascaris population

    Directory of Open Access Journals (Sweden)

    Koushik Das

    2015-01-01

    Full Text Available Taxonomic differentiation among morphologically identical Ascaris species is a debatable scientific issue in the context of Ascariasis epidemiology. To explain the disease epidemiology and also the taxonomic position of different Ascaris species, genome information of infecting strains from endemic areas throughout the world is certainly crucial. Ascaris population from human has been genetically characterized based on the widely used genetic marker, internal transcribed spacer1 (ITS1. Along with previously reported and prevalent genotype G1, 8 new sequence variants of ITS1 have been identified. Genotype G1 was significantly present among female patients aged between 10 to 15 years. Intragenic linkage disequilibrium (LD analysis at target locus within our study population has identified an incomplete LD value with potential recombination events. A separate cluster of Indian isolates with high bootstrap value indicate their distinct phylogenetic position in comparison to the global Ascaris population. Genetic shuffling through recombination could be a possible reason for high population diversity and frequent emergence of new sequence variants, identified in present and other previous studies. This study explores the genetic organization of Indian Ascaris population for the first time which certainly includes some fundamental information on the molecular epidemiology of Ascariasis.

  8. Shotgun Bisulfite Sequencing of the Betula platyphylla Genome Reveals the Tree’s DNA Methylation Patterning

    Directory of Open Access Journals (Sweden)

    Chang Su

    2014-12-01

    Full Text Available DNA methylation plays a critical role in the regulation of gene expression. Most studies of DNA methylation have been performed in herbaceous plants, and little is known about the methylation patterns in tree genomes. In the present study, we generated a map of methylated cytosines at single base pair resolution for Betula platyphylla (white birch by bisulfite sequencing combined with transcriptomics to analyze DNA methylation and its effects on gene expression. We obtained a detailed view of the function of DNA methylation sequence composition and distribution in the genome of B. platyphylla. There are 34,460 genes in the whole genome of birch, and 31,297 genes are methylated. Conservatively, we estimated that 14.29% of genomic cytosines are methylcytosines in birch. Among the methylation sites, the CHH context accounts for 48.86%, and is the largest proportion. Combined transcriptome and methylation analysis showed that the genes with moderate methylation levels had higher expression levels than genes with high and low methylation. In addition, methylated genes are highly enriched for the GO subcategories of binding activities, catalytic activities, cellular processes, response to stimulus and cell death, suggesting that methylation mediates these pathways in birch trees.

  9. Deep sequencing reveals distinct patterns of DNA methylation in prostate cancer.

    Science.gov (United States)

    Kim, Jung H; Dhanasekaran, Saravana M; Prensner, John R; Cao, Xuhong; Robinson, Daniel; Kalyana-Sundaram, Shanker; Huang, Christina; Shankar, Sunita; Jing, Xiaojun; Iyer, Matthew; Hu, Ming; Sam, Lee; Grasso, Catherine; Maher, Christopher A; Palanisamy, Nallasivam; Mehra, Rohit; Kominsky, Hal D; Siddiqui, Javed; Yu, Jindan; Qin, Zhaohui S; Chinnaiyan, Arul M

    2011-07-01

    Beginning with precursor lesions, aberrant DNA methylation marks the entire spectrum of prostate cancer progression. We mapped the global DNA methylation patterns in select prostate tissues and cell lines using MethylPlex-next-generation sequencing (M-NGS). Hidden Markov model-based next-generation sequence analysis identified ∼68,000 methylated regions per sample. While global CpG island (CGI) methylation was not differential between benign adjacent and cancer samples, overall promoter CGI methylation significantly increased from ~12.6% in benign samples to 19.3% and 21.8% in localized and metastatic cancer tissues, respectively (P-value prostate tissues, 2481 differentially methylated regions (DMRs) are cancer-specific, including numerous novel DMRs. A novel cancer-specific DMR in the WFDC2 promoter showed frequent methylation in cancer (17/22 tissues, 6/6 cell lines), but not in the benign tissues (0/10) and normal PrEC cells. Integration of LNCaP DNA methylation and H3K4me3 data suggested an epigenetic mechanism for alternate transcription start site utilization, and these modifications segregated into distinct regions when present on the same promoter. Finally, we observed differences in repeat element methylation, particularly LINE-1, between ERG gene fusion-positive and -negative cancers, and we confirmed this observation using pyrosequencing on a tissue panel. This comprehensive methylome map will further our understanding of epigenetic regulation in prostate cancer progression.

  10. Penicillium arizonense, a new, genome sequenced fungal species, reveals a high chemical diversity in secreted metabolites

    Science.gov (United States)

    Grijseels, Sietske; Nielsen, Jens Christian; Randelovic, Milica; Nielsen, Jens; Nielsen, Kristian Fog; Workman, Mhairi; Frisvad, Jens Christian

    2016-01-01

    A new soil-borne species belonging to the Penicillium section Canescentia is described, Penicillium arizonense sp. nov. (type strain CBS 141311T = IBT 12289T). The genome was sequenced and assembled into 33.7 Mb containing 12,502 predicted genes. A phylogenetic assessment based on marker genes confirmed the grouping of P. arizonense within section Canescentia. Compared to related species, P. arizonense proved to encode a high number of proteins involved in carbohydrate metabolism, in particular hemicellulases. Mining the genome for genes involved in secondary metabolite biosynthesis resulted in the identification of 62 putative biosynthetic gene clusters. Extracts of P. arizonense were analysed for secondary metabolites and austalides, pyripyropenes, tryptoquivalines, fumagillin, pseurotin A, curvulinic acid and xanthoepocin were detected. A comparative analysis against known pathways enabled the proposal of biosynthetic gene clusters in P. arizonense responsible for the synthesis of all detected compounds except curvulinic acid. The capacity to produce biomass degrading enzymes and the identification of a high chemical diversity in secreted bioactive secondary metabolites, offers a broad range of potential industrial applications for the new species P. arizonense. The description and availability of the genome sequence of P. arizonense, further provides the basis for biotechnological exploitation of this species. PMID:27739446

  11. Genome sequencing reveals complex secondary metabolome in themarine actinomycete Salinispora tropica

    Energy Technology Data Exchange (ETDEWEB)

    Udwary, Daniel W.; Zeigler, Lisa; Asolkar, Ratnakar; Singan,Vasanth; Lapidus, Alla; Fenical, William; Jensen, Paul R.; Moore, BradleyS.

    2007-05-01

    Recent fermentation studies have identified actinomycetes ofthe marine-dwelling genus Salinispora as prolific natural productproducers. To further evaluate their biosynthetic potential, we analyzedall identifiable secondary natural product gene clusters from therecently sequenced 5,184,724 bp S. tropica CNB-440 circular genome. Ouranalysis shows that biosynthetic potential meets or exceeds that shown byprevious Streptomyces genome sequences as well as other naturalproduct-producing actinomycetes. The S. tropica genome features ninepolyketide synthase systems of every known formally classified family,non-ribosomal peptide synthetases and several hybrid clusters. While afew clusters appear to encode molecules previously identified inStreptomyces species,the majority of the 15 biosynthetic loci are novel.Specific chemical information about putative and observed natural productmolecules is presented and discussed. In addition, our bioinformaticanalysis was critical for the structure elucidation of the novelpolyenemacrolactam salinilactam A. This study demonstrates the potentialfor genomic analysis to complement and strengthen traditional naturalproduct isolation studies and firmly establishes the genus Salinispora asa rich source of novel drug-like molecules.

  12. ITS2 sequence-structure phylogeny reveals diverse endophytic Pseudocercospora fungi on poplars.

    Science.gov (United States)

    Yan, Dong-Hui; Gao, Qian; Sun, Xiaoming; Song, Xiaoyu; Li, Hongchang

    2018-04-01

    For matching the new fungal nomenclature to abolish pleomorphic names for a fungus, a genus Pseudocercospora s. str. was suggested to host holomorphic Pseudocercosproa fungi. But the Pseudocercosproa fungi need extra phylogenetic loci to clarify their taxonomy and diversity for their existing and coming species. Internal transcribed spacer 2 (ITS2) secondary structures have been promising in charactering species phylogeny in plants, animals and fungi. In present study, a conserved model of ITS2 secondary structures was confirmed on fungi in Pseudocercospora s. str. genus using RNAshape program. The model has a typical eukaryotic four-helix ITS2 secondary structure. But a single U base occurred in conserved motif of U-U mismatch in Helix 2, and a UG emerged in UGGU motif in Helix 3 to Pseudocercospora fungi. The phylogeny analyses based on the ITS2 sequence-secondary structures with compensatory base change characterizations are able to delimit more species for Pseudocercospora s. str. than phylogenic inferences of traditional multi-loci alignments do. The model was employed to explore the diversity of endophytic Pseudocercospora fungi in poplar trees. The analysis results also showed that endophytic Pseudocercospora fungi were diverse in species and evolved a specific lineage in poplar trees. This work suggested that ITS2 sequence-structures could become as additionally significant loci for species phylogenetic and taxonomic studies on Pseudocerospora fungi, and that Pseudocercospora endophytes could be important roles to Pseudocercospora fungi's evolution and function in ecology.

  13. Deep RNA sequencing reveals dynamic regulation of myocardial noncoding RNAs in failing human heart and remodeling with mechanical circulatory support.

    Science.gov (United States)

    Yang, Kai-Chien; Yamada, Kathryn A; Patel, Akshar Y; Topkara, Veli K; George, Isaac; Cheema, Faisal H; Ewald, Gregory A; Mann, Douglas L; Nerbonne, Jeanne M

    2014-03-04

    Microarrays have been used extensively to profile transcriptome remodeling in failing human heart, although the genomic coverage provided is limited and fails to provide a detailed picture of the myocardial transcriptome landscape. Here, we describe sequencing-based transcriptome profiling, providing comprehensive analysis of myocardial mRNA, microRNA (miRNA), and long noncoding RNA (lncRNA) expression in failing human heart before and after mechanical support with a left ventricular (LV) assist device (LVAD). Deep sequencing of RNA isolated from paired nonischemic (NICM; n=8) and ischemic (ICM; n=8) human failing LV samples collected before and after LVAD and from nonfailing human LV (n=8) was conducted. These analyses revealed high abundance of mRNA (37%) and lncRNA (71%) of mitochondrial origin. miRNASeq revealed 160 and 147 differentially expressed miRNAs in ICM and NICM, respectively, compared with nonfailing LV. Among these, only 2 (ICM) and 5 (NICM) miRNAs are normalized with LVAD. RNASeq detected 18 480, including 113 novel, lncRNAs in human LV. Among the 679 (ICM) and 570 (NICM) lncRNAs differentially expressed with heart failure, ≈10% are improved or normalized with LVAD. In addition, the expression signature of lncRNAs, but not miRNAs or mRNAs, distinguishes ICM from NICM. Further analysis suggests that cis-gene regulation represents a major mechanism of action of human cardiac lncRNAs. The myocardial transcriptome is dynamically regulated in advanced heart failure and after LVAD support. The expression profiles of lncRNAs, but not mRNAs or miRNAs, can discriminate failing hearts of different pathologies and are markedly altered in response to LVAD support. These results suggest an important role for lncRNAs in the pathogenesis of heart failure and in reverse remodeling observed with mechanical support.

  14. Analysis of Neuronal Sequences Using Pairwise Biases

    Science.gov (United States)

    2015-08-27

    semantic memory (knowledge of facts) and implicit memory (e.g., how to ride a bike ). Evidence for the participation of the hippocampus in the formation of...hippocampal formation in an attempt to be cured of severe epileptic seizures. Although the surgery was successful in regards to reducing the frequency and...very different from each other in many ways including duration and number of spikes. Still, these sequences share a similar trend in the general order

  15. Google matrix analysis of DNA sequences.

    Science.gov (United States)

    Kandiah, Vivek; Shepelyansky, Dima L

    2013-01-01

    For DNA sequences of various species we construct the Google matrix [Formula: see text] of Markov transitions between nearby words composed of several letters. The statistical distribution of matrix elements of this matrix is shown to be described by a power law with the exponent being close to those of outgoing links in such scale-free networks as the World Wide Web (WWW). At the same time the sum of ingoing matrix elements is characterized by the exponent being significantly larger than those typical for WWW networks. This results in a slow algebraic decay of the PageRank probability determined by the distribution of ingoing elements. The spectrum of [Formula: see text] is characterized by a large gap leading to a rapid relaxation process on the DNA sequence networks. We introduce the PageRank proximity correlator between different species which determines their statistical similarity from the view point of Markov chains. The properties of other eigenstates of the Google matrix are also discussed. Our results establish scale-free features of DNA sequence networks showing their similarities and distinctions with the WWW and linguistic networks.

  16. Google matrix analysis of DNA sequences.

    Directory of Open Access Journals (Sweden)

    Vivek Kandiah

    Full Text Available For DNA sequences of various species we construct the Google matrix [Formula: see text] of Markov transitions between nearby words composed of several letters. The statistical distribution of matrix elements of this matrix is shown to be described by a power law with the exponent being close to those of outgoing links in such scale-free networks as the World Wide Web (WWW. At the same time the sum of ingoing matrix elements is characterized by the exponent being significantly larger than those typical for WWW networks. This results in a slow algebraic decay of the PageRank probability determined by the distribution of ingoing elements. The spectrum of [Formula: see text] is characterized by a large gap leading to a rapid relaxation process on the DNA sequence networks. We introduce the PageRank proximity correlator between different species which determines their statistical similarity from the view point of Markov chains. The properties of other eigenstates of the Google matrix are also discussed. Our results establish scale-free features of DNA sequence networks showing their similarities and distinctions with the WWW and linguistic networks.

  17. Reverse transcriptase sequences from mulberry LTR retrotransposons: characterization analysis

    Directory of Open Access Journals (Sweden)

    Ma Bi

    2017-10-01

    Full Text Available Copia and Gypsy play important roles in structural, functional and evolutionary dynamics of plant genomes. In this study, a total of 106 and 101, Copia and Gypsy reverse transcriptase (rt were amplified respectively in the Morus notabilis genome using degenerate primers. All sequences exhibited high levels of heterogeneity, were rich in AT and possessed higher sequence divergence of Copia rt in comparison to Gypsy rt. Two reasons are likely to account for this phenomenon: a these elements often experience deletions or fragmentation by illegitimate or unequal homologous recombination in the transposition process; b strong purifying selective pressure drives the evolution of these elements through “selective silencing” with random mutation and eventual deletion from the host genome. Interestingly, mulberry rt clustered with other rt from distantly related taxa according to the phylogenetic analysis. This phenomenon did not result from horizontal transposable element transfer. Results obtained from fluorescence in situ hybridization revealed that most of the hybridization signals were preferentially concentrated in pericentromeric and distal regions of chromosomes, and these elements may play important roles in the regions in which they are found. Results of this study support the continued pursuit of further functional studies of Copia and Gypsy in the mulberry genome.

  18. Messenger RNA biomarker signatures for forensic body fluid identification revealed by targeted RNA sequencing.

    Science.gov (United States)

    Hanson, E; Ingold, S; Haas, C; Ballantyne, J

    2018-05-01

    The recovery of a DNA profile from the perpetrator or victim in criminal investigations can provide valuable 'source level' information for investigators. However, a DNA profile does not reveal the circumstances by which biological material was transferred. Some contextual information can be obtained by a determination of the tissue or fluid source of origin of the biological material as it is potentially indicative of some behavioral activity on behalf of the individual that resulted in its transfer from the body. Here, we sought to improve upon established RNA based methods for body fluid identification by developing a targeted multiplexed next generation mRNA sequencing assay comprising a panel of approximately equal sized gene amplicons. The multiplexed biomarker panel includes several highly specific gene targets with the necessary specificity to definitively identify most forensically relevant biological fluids and tissues (blood, semen, saliva, vaginal secretions, menstrual blood and skin). In developing the biomarker panel we evaluated 66 gene targets, with a progressive iteration of testing target combinations that exhibited optimal sensitivity and specificity using a training set of forensically relevant body fluid samples. The current assay comprises 33 targets: 6 blood, 6 semen, 6 saliva, 4 vaginal secretions, 5 menstrual blood and 6 skin markers. We demonstrate the sensitivity and specificity of the assay and the ability to identify body fluids in single source and admixed stains. A 16 sample blind test was carried out by one lab with samples provided by the other participating lab. The blinded lab correctly identified the body fluids present in 15 of the samples with the major component identified in the 16th. Various classification methods are being investigated to permit inference of the body fluid/tissue in dried physiological stains. These include the percentage of reads in a sample that are due to each of the 6 tissues/body fluids tested and

  19. Time-Resolved Transposon Insertion Sequencing Reveals Genome-Wide Fitness Dynamics during Infection.

    Science.gov (United States)

    Yang, Guanhua; Billings, Gabriel; Hubbard, Troy P; Park, Joseph S; Yin Leung, Ka; Liu, Qin; Davis, Brigid M; Zhang, Yuanxing; Wang, Qiyao; Waldor, Matthew K

    2017-10-03

    Transposon insertion sequencing (TIS) is a powerful high-throughput genetic technique that is transforming functional genomics in prokaryotes, because it enables genome-wide mapping of the determinants of fitness. However, current approaches for analyzing TIS data assume that selective pressures are constant over time and thus do not yield information regarding changes in the genetic requirements for growth in dynamic environments (e.g., during infection). Here, we describe structured analysis of TIS data collected as a time series, termed pattern analysis of conditional essentiality (PACE). From a temporal series of TIS data, PACE derives a quantitative assessment of each mutant's fitness over the course of an experiment and identifies mutants with related fitness profiles. In so doing, PACE circumvents major limitations of existing methodologies, specifically the need for artificial effect size thresholds and enumeration of bacterial population expansion. We used PACE to analyze TIS samples of Edwardsiella piscicida (a fish pathogen) collected over a 2-week infection period from a natural host (the flatfish turbot). PACE uncovered more genes that affect E. piscicida 's fitness in vivo than were detected using a cutoff at a terminal sampling point, and it identified subpopulations of mutants with distinct fitness profiles, one of which informed the design of new live vaccine candidates. Overall, PACE enables efficient mining of time series TIS data and enhances the power and sensitivity of TIS-based analyses. IMPORTANCE Transposon insertion sequencing (TIS) enables genome-wide mapping of the genetic determinants of fitness, typically based on observations at a single sampling point. Here, we move beyond analysis of endpoint TIS data to create a framework for analysis of time series TIS data, termed pattern analysis of conditional essentiality (PACE). We applied PACE to identify genes that contribute to colonization of a natural host by the fish pathogen

  20. Direct sequencing of FAH gene in Pakistani tyrosinemia type 1 families reveals a novel mutation.

    Science.gov (United States)

    Ijaz, Sadaqat; Zahoor, Muhammad Yasir; Imran, Muhammad; Afzal, Sibtain; Bhinder, Munir A; Ullah, Ihsan; Cheema, Huma Arshad; Ramzan, Khushnooda; Shehzad, Wasim

    2016-03-01

    Hereditary tyrosinemia type 1 (HT1) is a rare inborn error of tyrosine catabolism with a worldwide prevalence of one out of 100,000 live births. HT1 is clinically characterized by hepatic and renal dysfunction resulting from the deficiency of fumarylacetoacetate hydrolase (FAH) enzyme, caused by recessive mutations in the FAH gene. We present here the first report on identification of FAH mutations in HT1 patients from Pakistan with a novel one. Three Pakistani families, each having one child affected with HT1, were enrolled over a period of 1.5 years. Two of the affected children had died as they were presented late with acute form. All regions of the FAH gene spanning exons and splicing sites were amplified by polymerase chain reaction (PCR) and mutation analysis was carried out by direct sequencing. Results of sequencing were confirmed by restriction fragment length polymorphism (PCR-RFLP) analysis. Three different FAH mutations, one in each family, were found to co-segregate with the disease phenotype. Two of these FAH mutations have been known (c.192G>T and c.1062+5G>A [IVS12+5G>A]), while c.67T>C (p.Ser23Pro) was a novel mutation. The novel variant was not detected in any of 120 chromosomes from normal ethnically matched individuals. Most of the HT1 patients die before they present to hospitals in Pakistan, as is indicated by enrollment of only three families in 1.5 years. Most of those with late clinical presentation do not survive due to delayed diagnosis followed by untimely treatment. This tragic condition advocates the establishment of expanded newborn screening program for HT1 within Pakistan.

  1. CSReport: A New Computational Tool Designed for Automatic Analysis of Class Switch Recombination Junctions Sequenced by High-Throughput Sequencing.

    Science.gov (United States)

    Boyer, François; Boutouil, Hend; Dalloul, Iman; Dalloul, Zeinab; Cook-Moreau, Jeanne; Aldigier, Jean-Claude; Carrion, Claire; Herve, Bastien; Scaon, Erwan; Cogné, Michel; Péron, Sophie

    2017-05-15

    B cells ensure humoral immune responses due to the production of Ag-specific memory B cells and Ab-secreting plasma cells. In secondary lymphoid organs, Ag-driven B cell activation induces terminal maturation and Ig isotype class switch (class switch recombination [CSR]). CSR creates a virtually unique IgH locus in every B cell clone by intrachromosomal recombination between two switch (S) regions upstream of each C region gene. Amount and structural features of CSR junctions reveal valuable information about the CSR mechanism, and analysis of CSR junctions is useful in basic and clinical research studies of B cell functions. To provide an automated tool able to analyze large data sets of CSR junction sequences produced by high-throughput sequencing (HTS), we designed CSReport, a software program dedicated to support analysis of CSR recombination junctions sequenced with a HTS-based protocol (Ion Torrent technology). CSReport was assessed using simulated data sets of CSR junctions and then used for analysis of Sμ-Sα and Sμ-Sγ1 junctions from CH12F3 cells and primary murine B cells, respectively. CSReport identifies junction segment breakpoints on reference sequences and junction structure (blunt-ended junctions or junctions with insertions or microhomology). Besides the ability to analyze unprecedentedly large libraries of junction sequences, CSReport will provide a unified framework for CSR junction studies. Our results show that CSReport is an accurate tool for analysis of sequences from our HTS-based protocol for CSR junctions, thereby facilitating and accelerating their study. Copyright © 2017 by The American Association of Immunologists, Inc.

  2. The Douglas-fir genome sequence reveals specialization of the photosynthetic apparatus in Pinaceae

    Science.gov (United States)

    David B. Neale; Patrick E. McGuire; Nicholas C. Wheeler; Kristian A. Stevens; Marc W. Crepeau; Charis Cardeno; Aleksey V. Zimin; Daniela Puiu; Geo M. Pertea; U. Uzay Sezen; Claudio Casola; Tomasz E. Koralewski; Robin Paul; Daniel Gonzalez-Ibeas; Sumaira Zaman; Richard Cronn; Mark Yandell; Carson Holt; Charles H. Langley; James A. Yorke; Steven L. Salzberg; Jill L. Wegrzyn

    2017-01-01

    A reference genome sequence for Pseudotsuga menziesii var. menziesii (Mirb.) Franco (Coastal Douglas-fir) is reported, thus providing a reference sequence for a third genus of the family Pinaceae. The contiguity and quality of the genome assembly far exceeds that of other conifer reference genome sequences (contig N50 = 44,136 bp and scaffold N50...

  3. Error Analysis of Deep Sequencing of Phage Libraries: Peptides Censored in Sequencing

    Directory of Open Access Journals (Sweden)

    Wadim L. Matochko

    2013-01-01

    Full Text Available Next-generation sequencing techniques empower selection of ligands from phage-display libraries because they can detect low abundant clones and quantify changes in the copy numbers of clones without excessive selection rounds. Identification of errors in deep sequencing data is the most critical step in this process because these techniques have error rates >1%. Mechanisms that yield errors in Illumina and other techniques have been proposed, but no reports to date describe error analysis in phage libraries. Our paper focuses on error analysis of 7-mer peptide libraries sequenced by Illumina method. Low theoretical complexity of this phage library, as compared to complexity of long genetic reads and genomes, allowed us to describe this library using convenient linear vector and operator framework. We describe a phage library as N×1 frequency vector n=ni, where ni is the copy number of the ith sequence and N is the theoretical diversity, that is, the total number of all possible sequences. Any manipulation to the library is an operator acting on n. Selection, amplification, or sequencing could be described as a product of a N×N matrix and a stochastic sampling operator (Sa. The latter is a random diagonal matrix that describes sampling of a library. In this paper, we focus on the properties of Sa and use them to define the sequencing operator (Seq. Sequencing without any bias and errors is Seq=Sa IN, where IN is a N×N unity matrix. Any bias in sequencing changes IN to a nonunity matrix. We identified a diagonal censorship matrix (CEN, which describes elimination or statistically significant downsampling, of specific reads during the sequencing process.

  4. Sequencing chromosomal abnormalities reveals neurodevelopmental loci that confer risk across diagnostic boundaries

    Science.gov (United States)

    Talkowski, Michael E.; Rosenfeld, Jill A.; Blumenthal, Ian; Pillalamarri, Vamsee; Chiang, Colby; Heilbut, Adrian; Ernst, Carl; Hanscom, Carrie; Rossin, Elizabeth; Lindgren, Amelia; Pereira, Shahrin; Ruderfer, Douglas; Kirby, Andrew; Ripke, Stephan; Harris, David; Lee, Ji-Hyun; Ha, Kyungsoo; Kim, Hyung-Goo; Solomon, Benjamin D.; Gropman, Andrea L.; Lucente, Diane; Sims, Katherine; Ohsumi, Toshiro K.; Borowsky, Mark L.; Loranger, Stephanie; Quade, Bradley; Lage, Kasper; Miles, Judith; Wu, Bai-Lin; Shen, Yiping; Neale, Benjamin; Shaffer, Lisa G.; Daly, Mark J.; Morton, Cynthia C.; Gusella, James F.

    2012-01-01

    SUMMARY Balanced chromosomal abnormalities (BCAs) represent a reservoir of single gene disruptions in neurodevelopmental disorders (NDD). We sequenced BCAs in autism and related NDDs, revealing disruption of 33 loci in four general categories: 1) genes associated with abnormal neurodevelopment (e.g., AUTS2, FOXP1, CDKL5), 2) single gene contributors to microdeletion syndromes (MBD5, SATB2, EHMT1, SNURF-SNRPN), 3) novel risk loci (e.g., CHD8, KIRREL3, ZNF507), and 4) genes associated with later onset psychiatric disorders (e.g., TCF4, ZNF804A, PDE10A, GRIN2B, ANK3). We also discovered profoundly increased burden of copy number variants among 19,556 neurodevelopmental cases compared to 13,991 controls (p = 2.07×10−47) and enrichment of polygenic risk alleles from autism and schizophrenia genome-wide association studies (p = 0.0018 and 0.0009, respectively). Our findings suggest a polygenic risk model of autism incorporating loci of strong effect and indicate that some neurodevelopmental genes are sensitive to perturbation by multiple mutational mechanisms, leading to variable phenotypic outcomes that manifest at different life stages. PMID:22521361

  5. Deep sequencing of the Camellia sinensis transcriptome revealed candidate genes for major metabolic pathways of tea-specific compounds

    Energy Technology Data Exchange (ETDEWEB)

    Shi, CY; Yang, H; Wei, CL; Yu, O; Zhang, ZZ; Sun, J; Wan, XC

    2011-01-01

    Tea is one of the most popular non-alcoholic beverages worldwide. However, the tea plant, Camellia sinensis, is difficult to culture in vitro, to transform, and has a large genome, rendering little genomic information available. Recent advances in large-scale RNA sequencing (RNA-seq) provide a fast, cost-effective, and reliable approach to generate large expression datasets for functional genomic analysis, which is especially suitable for non-model species with un-sequenced genomes. Using high-throughput Illumina RNA-seq, the transcriptome from poly (A){sup +} RNA of C. sinensis was analyzed at an unprecedented depth (2.59 gigabase pairs). Approximate 34.5 million reads were obtained, trimmed, and assembled into 127,094 unigenes, with an average length of 355 bp and an N50 of 506 bp, which consisted of 788 contig clusters and 126,306 singletons. This number of unigenes was 10-fold higher than existing C. sinensis sequences deposited in GenBank (as of August 2010). Sequence similarity analyses against six public databases (Uniprot, NR and COGs at NCBI, Pfam, InterPro and KEGG) found 55,088 unigenes that could be annotated with gene descriptions, conserved protein domains, or gene ontology terms. Some of the unigenes were assigned to putative metabolic pathways. Targeted searches using these annotations identified the majority of genes associated with several primary metabolic pathways and natural product pathways that are important to tea quality, such as flavonoid, theanine and caffeine biosynthesis pathways. Novel candidate genes of these secondary pathways were discovered. Comparisons with four previously prepared cDNA libraries revealed that this transcriptome dataset has both a high degree of consistency with previous EST data and an approximate 20 times increase in coverage. Thirteen unigenes related to theanine and flavonoid synthesis were validated. Their expression patterns in different organs of the tea plant were analyzed by RT-PCR and quantitative real

  6. Deep sequencing of the Camellia sinensis transcriptome revealed candidate genes for major metabolic pathways of tea-specific compounds

    Directory of Open Access Journals (Sweden)

    Chen Qi

    2011-02-01

    Full Text Available Abstract Background Tea is one of the most popular non-alcoholic beverages worldwide. However, the tea plant, Camellia sinensis, is difficult to culture in vitro, to transform, and has a large genome, rendering little genomic information available. Recent advances in large-scale RNA sequencing (RNA-seq provide a fast, cost-effective, and reliable approach to generate large expression datasets for functional genomic analysis, which is especially suitable for non-model species with un-sequenced genomes. Results Using high-throughput Illumina RNA-seq, the transcriptome from poly (A+ RNA of C. sinensis was analyzed at an unprecedented depth (2.59 gigabase pairs. Approximate 34.5 million reads were obtained, trimmed, and assembled into 127,094 unigenes, with an average length of 355 bp and an N50 of 506 bp, which consisted of 788 contig clusters and 126,306 singletons. This number of unigenes was 10-fold higher than existing C. sinensis sequences deposited in GenBank (as of August 2010. Sequence similarity analyses against six public databases (Uniprot, NR and COGs at NCBI, Pfam, InterPro and KEGG found 55,088 unigenes that could be annotated with gene descriptions, conserved protein domains, or gene ontology terms. Some of the unigenes were assigned to putative metabolic pathways. Targeted searches using these annotations identified the majority of genes associated with several primary metabolic pathways and natural product pathways that are important to tea quality, such as flavonoid, theanine and caffeine biosynthesis pathways. Novel candidate genes of these secondary pathways were discovered. Comparisons with four previously prepared cDNA libraries revealed that this transcriptome dataset has both a high degree of consistency with previous EST data and an approximate 20 times increase in coverage. Thirteen unigenes related to theanine and flavonoid synthesis were validated. Their expression patterns in different organs of the tea plant were

  7. Appearances can be deceptive: revealing a hidden viral infection with deep sequencing in a plant quarantine context.

    Science.gov (United States)

    Candresse, Thierry; Filloux, Denis; Muhire, Brejnev; Julian, Charlotte; Galzi, Serge; Fort, Guillaume; Bernardo, Pauline; Daugrois, Jean-Heindrich; Fernandez, Emmanuel; Martin, Darren P; Varsani, Arvind; Roumagnac, Philippe

    2014-01-01

    Comprehensive inventories of plant viral diversity are essential for effective quarantine and sanitation efforts. The safety of regulated plant material exchanges presently relies heavily on techniques such as PCR or nucleic acid hybridisation, which are only suited to the detection and characterisation of specific, well characterised pathogens. Here, we demonstrate the utility of sequence-independent next generation sequencing (NGS) of both virus-derived small interfering RNAs (siRNAs) and virion-associated nucleic acids (VANA) for the detailed identification and characterisation of viruses infecting two quarantined sugarcane plants. Both plants originated from Egypt and were known to be infected with Sugarcane streak Egypt Virus (SSEV; Genus Mastrevirus, Family Geminiviridae), but were revealed by the NGS approaches to also be infected by a second highly divergent mastrevirus, here named Sugarcane white streak Virus (SWSV). This novel virus had escaped detection by all routine quarantine detection assays and was found to also be present in sugarcane plants originating from Sudan. Complete SWSV genomes were cloned and sequenced from six plants and all were found to share >91% genome-wide identity. With the exception of two SWSV variants, which potentially express unusually large RepA proteins, the SWSV isolates display genome characteristics very typical to those of all other previously described mastreviruses. An analysis of virus-derived siRNAs for SWSV and SSEV showed them to be strongly influenced by secondary structures within both genomic single stranded DNA and mRNA transcripts. In addition, the distribution of siRNA size frequencies indicates that these mastreviruses are likely subject to both transcriptional and post-transcriptional gene silencing. Our study stresses the potential advantages of NGS-based virus metagenomic screening in a plant quarantine setting and indicates that such techniques could dramatically reduce the numbers of non

  8. Appearances can be deceptive: revealing a hidden viral infection with deep sequencing in a plant quarantine context.

    Directory of Open Access Journals (Sweden)

    Thierry Candresse

    Full Text Available Comprehensive inventories of plant viral diversity are essential for effective quarantine and sanitation efforts. The safety of regulated plant material exchanges presently relies heavily on techniques such as PCR or nucleic acid hybridisation, which are only suited to the detection and characterisation of specific, well characterised pathogens. Here, we demonstrate the utility of sequence-independent next generation sequencing (NGS of both virus-derived small interfering RNAs (siRNAs and virion-associated nucleic acids (VANA for the detailed identification and characterisation of viruses infecting two quarantined sugarcane plants. Both plants originated from Egypt and were known to be infected with Sugarcane streak Egypt Virus (SSEV; Genus Mastrevirus, Family Geminiviridae, but were revealed by the NGS approaches to also be infected by a second highly divergent mastrevirus, here named Sugarcane white streak Virus (SWSV. This novel virus had escaped detection by all routine quarantine detection assays and was found to also be present in sugarcane plants originating from Sudan. Complete SWSV genomes were cloned and sequenced from six plants and all were found to share >91% genome-wide identity. With the exception of two SWSV variants, which potentially express unusually large RepA proteins, the SWSV isolates display genome characteristics very typical to those of all other previously described mastreviruses. An analysis of virus-derived siRNAs for SWSV and SSEV showed them to be strongly influenced by secondary structures within both genomic single stranded DNA and mRNA transcripts. In addition, the distribution of siRNA size frequencies indicates that these mastreviruses are likely subject to both transcriptional and post-transcriptional gene silencing. Our study stresses the potential advantages of NGS-based virus metagenomic screening in a plant quarantine setting and indicates that such techniques could dramatically reduce the numbers of non

  9. Phylogenetic inferences of Nepenthes species in Peninsular Malaysia revealed by chloroplast (trnL intron) and nuclear (ITS) DNA sequences

    OpenAIRE

    Bunawan, Hamidun; Yen, Choong Chee; Yaakop, Salmah; Noor, Normah Mohd

    2017-01-01

    Background The chloroplastic trnL intron and the nuclear internal transcribed spacer (ITS) region were sequenced for 11 Nepenthes species recorded in Peninsular Malaysia to examine their phylogenetic relationship and to evaluate the usage of trnL intron and ITS sequences for phylogenetic reconstruction of this genus. Results Phylogeny reconstruction was carried out using neighbor-joining, maximum parsimony and Bayesian analyses. All the trees revealed two major clusters, a lowland group consi...

  10. Cloning and sequence analysis of benzo-a-pyreneinducible ...

    African Journals Online (AJOL)

    The phylogenetic tree based on the amino acid sequences clearly shows tilapia CYP1A and killifish CYP1A to be more closely related to each other than to the other CYP1A subfamilies. Sequence analysis of 3727 bp of genomic DNA showed that the clone obtained was the structural gene of CYP1A which consists of ...

  11. Biological sequence analysis: probabilistic models of proteins and nucleic acids

    National Research Council Canada - National Science Library

    Durbin, Richard

    1998-01-01

    ... analysis methods are now based on principles of probabilistic modelling. Examples of such methods include the use of probabilistically derived score matrices to determine the significance of sequence alignments, the use of hidden Markov models as the basis for profile searches to identify distant members of sequence families, and the inference...

  12. Phylogenetic analysis of the genus Hordeum using repetitive DNA sequences

    DEFF Research Database (Denmark)

    Svitashev, S.; Bryngelsson, T.; Vershinin, A.

    1994-01-01

    A set of six cloned barley (Hordeum vulgare) repetitive DNA sequences was used for the analysis of phylogenetic relationships among 31 species (46 taxa) of the genus Hordeum, using molecular hybridization techniques. In situ hybridization experiments showed dispersed organization of the sequences...

  13. Multilocus Sequence Typing Reveals a New Cluster of Closely Related Candida tropicalis Genotypes in Italian Patients With Neurological Disorders.

    Science.gov (United States)

    Scordino, Fabio; Giuffrè, Letterio; Barberi, Giuseppina; Marino Merlo, Francesca; Orlando, Maria Grazia; Giosa, Domenico; Romeo, Orazio

    2018-01-01

    Candida tropicalis is a pathogenic yeast that has emerged as an important cause of candidemia especially in elderly patients with hematological malignancies. Infections caused by this species are mainly reported from Latin America and Asian-Pacific countries although recent epidemiological data revealed that C. tropicalis accounts for 6-16.4% of the Candida bloodstream infections (BSIs) in Italy by representing a relevant issue especially for patients receiving long-term hospital care. The aim of this study was to describe the genetic diversity of C. tropicalis isolates contaminating the hands of healthcare workers (HCWs) and hospital environments and/or associated with BSIs occurring in patients with different neurological disorders and without hematological disease. A total of 28 C. tropicalis isolates were genotyped using multilocus sequence typing analysis of six housekeeping ( ICL1, MDR1, SAPT2, SAPT4, XYR1 , and ZWF1 ) genes and data revealed the presence of only eight diploid sequence types (DSTs) of which 6 (75%) were completely new. Four eBURST clonal complexes (CC2, CC10, CC11, and CC33) contained all DSTs found in this study and the CC33 resulted in an exclusive, well-defined, clonal cluster from Italy. In conclusion, C. tropicalis could represent an important cause of BSIs in long-term hospitalized patients with no underlying hematological disease. The findings of this study also suggest a potential horizontal transmission of a specific C. tropicalis clone through hands of HCWs and expand our understanding of the molecular epidemiology of this pathogen whose population structure is still far from being fully elucidated as its complexity increases as different categories of patients and geographic areas are examined.

  14. Time-Resolved Transposon Insertion Sequencing Reveals Genome-Wide Fitness Dynamics during Infection

    Directory of Open Access Journals (Sweden)

    Guanhua Yang

    2017-10-01

    Full Text Available Transposon insertion sequencing (TIS is a powerful high-throughput genetic technique that is transforming functional genomics in prokaryotes, because it enables genome-wide mapping of the determinants of fitness. However, current approaches for analyzing TIS data assume that selective pressures are constant over time and thus do not yield information regarding changes in the genetic requirements for growth in dynamic environments (e.g., during infection. Here, we describe structured analysis of TIS data collected as a time series, termed pattern analysis of conditional essentiality (PACE. From a temporal series of TIS data, PACE derives a quantitative assessment of each mutant’s fitness over the course of an experiment and identifies mutants with related fitness profiles. In so doing, PACE circumvents major limitations of existing methodologies, specifically the need for artificial effect size thresholds and enumeration of bacterial population expansion. We used PACE to analyze TIS samples of Edwardsiella piscicida (a fish pathogen collected over a 2-week infection period from a natural host (the flatfish turbot. PACE uncovered more genes that affect E. piscicida’s fitness in vivo than were detected using a cutoff at a terminal sampling point, and it identified subpopulations of mutants with distinct fitness profiles, one of which informed the design of new live vaccine candidates. Overall, PACE enables efficient mining of time series TIS data and enhances the power and sensitivity of TIS-based analyses.

  15. Parametric inference for biological sequence analysis.

    Science.gov (United States)

    Pachter, Lior; Sturmfels, Bernd

    2004-11-16

    One of the major successes in computational biology has been the unification, by using the graphical model formalism, of a multitude of algorithms for annotating and comparing biological sequences. Graphical models that have been applied to these problems include hidden Markov models for annotation, tree models for phylogenetics, and pair hidden Markov models for alignment. A single algorithm, the sum-product algorithm, solves many of the inference problems that are associated with different statistical models. This article introduces the polytope propagation algorithm for computing the Newton polytope of an observation from a graphical model. This algorithm is a geometric version of the sum-product algorithm and is used to analyze the parametric behavior of maximum a posteriori inference calculations for graphical models.

  16. Recurrence time statistics: versatile tools for genomic DNA sequence analysis.

    Science.gov (United States)

    Cao, Yinhe; Tung, Wen-Wen; Gao, J B

    2004-01-01

    With the completion of the human and a few model organisms' genomes, and the genomes of many other organisms waiting to be sequenced, it has become increasingly important to develop faster computational tools which are capable of easily identifying the structures and extracting features from DNA sequences. One of the more important structures in a DNA sequence is repeat-related. Often they have to be masked before protein coding regions along a DNA sequence are to be identified or redundant expressed sequence tags (ESTs) are to be sequenced. Here we report a novel recurrence time based method for sequence analysis. The method can conveniently study all kinds of periodicity and exhaustively find all repeat-related features from a genomic DNA sequence. An efficient codon index is also derived from the recurrence time statistics, which has the salient features of being largely species-independent and working well on very short sequences. Efficient codon indices are key elements of successful gene finding algorithms, and are particularly useful for determining whether a suspected EST belongs to a coding or non-coding region. We illustrate the power of the method by studying the genomes of E. coli, the yeast S. cervisivae, the nematode worm C. elegans, and the human, Homo sapiens. Computationally, our method is very efficient. It allows us to carry out analysis of genomes on the whole genomic scale by a PC.

  17. DNA sequence analyses reveal abundant diversity, endemism and evidence for Asian origin of the porcini mushrooms.

    Directory of Open Access Journals (Sweden)

    Bang Feng

    Full Text Available The wild gourmet mushroom Boletus edulis and its close allies are of significant ecological and economic importance. They are found throughout the Northern Hemisphere, but despite their ubiquity there are still many unresolved issues with regard to the taxonomy, systematics and biogeography of this group of mushrooms. Most phylogenetic studies of Boletus so far have characterized samples from North America and Europe and little information is available on samples from other areas, including the ecologically and geographically diverse regions of China. Here we analyzed DNA sequence variation in three gene markers from samples of these mushrooms from across China and compared our findings with those from other representative regions. Our results revealed fifteen novel phylogenetic species (about one-third of the known species and a newly identified lineage represented by Boletus sp. HKAS71346 from tropical Asia. The phylogenetic analyses support eastern Asia as the center of diversity for the porcini sensu stricto clade. Within this clade, B. edulis is the only known holarctic species. The majority of the other phylogenetic species are geographically restricted in their distributions. Furthermore, molecular dating and geological evidence suggest that this group of mushrooms originated during the Eocene in eastern Asia, followed by dispersal to and subsequent speciation in other parts of Asia, Europe, and the Americas from the middle Miocene through the early Pliocene. In contrast to the ancient dispersal of porcini in the strict sense in the Northern Hemisphere, the occurrence of B. reticulatus and B. edulis sensu lato in the Southern Hemisphere was probably due to recent human-mediated introductions.

  18. DNA Sequence Analyses Reveal Abundant Diversity, Endemism and Evidence for Asian Origin of the Porcini Mushrooms

    Science.gov (United States)

    Feng, Bang; Xu, Jianping; Wu, Gang; Zeng, Nian-Kai; Li, Yan-Chun; Tolgor, Bau; Kost, Gerhard W.; Yang, Zhu L.

    2012-01-01

    The wild gourmet mushroom Boletus edulis and its close allies are of significant ecological and economic importance. They are found throughout the Northern Hemisphere, but despite their ubiquity there are still many unresolved issues with regard to the taxonomy, systematics and biogeography of this group of mushrooms. Most phylogenetic studies of Boletus so far have characterized samples from North America and Europe and little information is available on samples from other areas, including the ecologically and geographically diverse regions of China. Here we analyzed DNA sequence variation in three gene markers from samples of these mushrooms from across China and compared our findings with those from other representative regions. Our results revealed fifteen novel phylogenetic species (about one-third of the known species) and a newly identified lineage represented by Boletus sp. HKAS71346 from tropical Asia. The phylogenetic analyses support eastern Asia as the center of diversity for the porcini sensu stricto clade. Within this clade, B. edulis is the only known holarctic species. The majority of the other phylogenetic species are geographically restricted in their distributions. Furthermore, molecular dating and geological evidence suggest that this group of mushrooms originated during the Eocene in eastern Asia, followed by dispersal to and subsequent speciation in other parts of Asia, Europe, and the Americas from the middle Miocene through the early Pliocene. In contrast to the ancient dispersal of porcini in the strict sense in the Northern Hemisphere, the occurrence of B. reticulatus and B. edulis sensu lato in the Southern Hemisphere was probably due to recent human-mediated introductions. PMID:22629418

  19. RESEARCH NOTE Genome-based exome-sequencing analysis ...

    Indian Academy of Sciences (India)

    Navya

    2017-02-22

    Feb 22, 2017 ... Genome-based exome-sequencing analysis identifies GYG1, DIS3L, DDRGK1 genes ... Cardiology Division, Department of Internal Medicine, Severance .... with p values of <0.05 byanalyzing differences in allele distribution.

  20. Editorial: Special Issue on Algorithms for Sequence Analysis and Storage

    Directory of Open Access Journals (Sweden)

    Veli Mäkinen

    2014-03-01

    Full Text Available This special issue of Algorithms is dedicated to approaches to biological sequence analysis that have algorithmic novelty and potential for fundamental impact in methods used for genome research.

  1. Tools for integrated sequence-structure analysis with UCSF Chimera

    Directory of Open Access Journals (Sweden)

    Huang Conrad C

    2006-07-01

    Full Text Available Abstract Background Comparing related structures and viewing the structures in the context of sequence alignments are important tasks in protein structure-function research. While many programs exist for individual aspects of such work, there is a need for interactive visualization tools that: (a provide a deep integration of sequence and structure, far beyond mapping where a sequence region falls in the structure and vice versa; (b facilitate changing data of one type based on the other (for example, using only sequence-conserved residues to match structures, or adjusting a sequence alignment based on spatial fit; (c can be used with a researcher's own data, including arbitrary sequence alignments and annotations, closely or distantly related sets of proteins, etc.; and (d interoperate with each other and with a full complement of molecular graphics features. We describe enhancements to UCSF Chimera to achieve these goals. Results The molecular graphics program UCSF Chimera includes a suite of tools for interactive analyses of sequences and structures. Structures automatically associate with sequences in imported alignments, allowing many kinds of crosstalk. A novel method is provided to superimpose structures in the absence of a pre-existing sequence alignment. The method uses both sequence and secondary structure, and can match even structures with very low sequence identity. Another tool constructs structure-based sequence alignments from superpositions of two or more proteins. Chimera is designed to be extensible, and mechanisms for incorporating user-specific data without Chimera code development are also provided. Conclusion The tools described here apply to many problems involving comparison and analysis of protein structures and their sequences. Chimera includes complete documentation and is intended for use by a wide range of scientists, not just those in the computational disciplines. UCSF Chimera is free for non-commercial use and is

  2. Molecular analysis of sourdough reveals Lactobacillus mindensis sp. nov.

    Science.gov (United States)

    Ehrmann, Matthias A; Müller, Martin R A; Vogel, Rudi F

    2003-01-01

    Genotypic fingerprinting to analyse the bacterial flora of an industrial sourdough revealed a coherent group of strains which could not be associated with a valid species. Comparative 16S rDNA sequence analysis showed that these strains formed a homogeneous cluster distinct from their closest relatives, Lactobacillus farciminis, Lactobacillus alimentarius and Lactobacillus kimchii. To characterize them further, physiological (sugar fermentation, formation of DL-lactate, hydrolysis of arginine, growth temperature, CO2 production) and chemotaxonomic properties have been determined. The DNA G +C content was 37.5 0.2 mol%. The peptidoglycan was of the lysine-D-iso-asparagine (L-Lys-D-Asp) type. The strains were homofermentative, Gram-positive, catalase-negative, non-spore-forming, non-motile rods. They were found as a major stable component of a rye flour sourdough fermentation. Physiological, biochemical as well as genotypic data suggested them to be a new species of the genus Lactobacillus. This was confirmed by DNA-DNA hybridization of genomic DNA, and the name Lactobacillus mindensis is proposed. The type strain of this species is DSM 14500T (=LMG 21508T).

  3. SVAMP: Sequence variation analysis, maps and phylogeny

    KAUST Repository

    Naeem, Raeece; Hidayah, Lailatul; Preston, Mark D.; Clark, Taane G.; Pain, Arnab

    2014-01-01

    Summary: SVAMP is a stand-alone desktop application to visualize genomic variants (in variant call format) in the context of geographical metadata. Users of SVAMP are able to generate phylogenetic trees and perform principal coordinate analysis

  4. Sequence dependence of electron-induced DNA strand breakage revealed by DNA nanoarrays

    DEFF Research Database (Denmark)

    Keller, Adrian; Rackwitz, Jenny; Cauët, Emilie

    2014-01-01

    The electronic structure of DNA is determined by its nucleotide sequence, which is for instance exploited in molecular electronics. Here we demonstrate that also the DNA strand breakage induced by low-energy electrons (18 eV) depends on the nucleotide sequence. To determine the absolute cross sec...

  5. Not All Order Memory Is Equal: Test Demands Reveal Dissociations in Memory for Sequence Information

    Science.gov (United States)

    Jonker, Tanya R.; MacLeod, Colin M.

    2017-01-01

    Remembering the order of a sequence of events is a fundamental feature of episodic memory. Indeed, a number of formal models represent temporal context as part of the memory system, and memory for order has been researched extensively. Yet, the nature of the code(s) underlying sequence memory is still relatively unknown. Across 4 experiments that…

  6. Deep amplicon sequencing reveals mixed phytoplasma infection within single grapevine plants

    DEFF Research Database (Denmark)

    Nicolaisen, Mogens; Contaldo, Nicoletta; Makarova, Olga

    2011-01-01

    The diversity of phytoplasmas within single plants has not yet been fully investigated. In this project, deep amplicon sequencing was used to generate 50,926 phytoplasma sequences from 11 phytoplasma-infected grapevine samples from a PCR amplicon in the 5' end of the 16S region. After clustering ...

  7. Molecular characterization, sequence analysis and tissue expression of a porcine gene – MOSPD2

    Directory of Open Access Journals (Sweden)

    Yang Jie

    2017-01-01

    Full Text Available The full-length cDNA sequence of a porcine gene, MOSPD2, was amplified using the rapid amplification of cDNA ends method based on a pig expressed sequence tag sequence which was highly homologous to the coding sequence of the human MOSPD2 gene. Sequence prediction analysis revealed that the open reading frame of this gene encodes a protein of 491 amino acids that has high homology with the motile sperm domain-containing protein 2 (MOSPD2 of five species: horse (89%, human (90%, chimpanzee (89%, rhesus monkey (89% and mouse (85%; thus, it could be defined as a porcine MOSPD2 gene. This novel porcine gene was assigned GeneID: 100153601. This gene is structured in 15 exons and 14 introns as revealed by computer-assisted analysis. The phylogenetic analysis revealed that the porcine MOSPD2 gene has a closer genetic relationship with the MOSPD2 gene of horse. Tissue expression analysis indicated that the porcine MOSPD2 gene is generally and differentially expressed in the spleen, muscle, skin, kidney, lung, liver, fat and heart. Our experiment is the first to establish the primary foundation for further research on the porcine MOSPD2 gene.

  8. RNA Sequencing Reveals that Kaposi Sarcoma-Associated Herpesvirus Infection Mimics Hypoxia Gene Expression Signature

    Science.gov (United States)

    Viollet, Coralie; Davis, David A.; Tekeste, Shewit S.; Reczko, Martin; Pezzella, Francesco; Ragoussis, Jiannis

    2017-01-01

    Kaposi sarcoma-associated herpesvirus (KSHV) causes several tumors and hyperproliferative disorders. Hypoxia and hypoxia-inducible factors (HIFs) activate latent and lytic KSHV genes, and several KSHV proteins increase the cellular levels of HIF. Here, we used RNA sequencing, qRT-PCR, Taqman assays, and pathway analysis to explore the miRNA and mRNA response of uninfected and KSHV-infected cells to hypoxia, to compare this with the genetic changes seen in chronic latent KSHV infection, and to explore the degree to which hypoxia and KSHV infection interact in modulating mRNA and miRNA expression. We found that the gene expression signatures for KSHV infection and hypoxia have a 34% overlap. Moreover, there were considerable similarities between the genes up-regulated by hypoxia in uninfected (SLK) and in KSHV-infected (SLKK) cells. hsa-miR-210, a HIF-target known to have pro-angiogenic and anti-apoptotic properties, was significantly up-regulated by both KSHV infection and hypoxia using Taqman assays. Interestingly, expression of KSHV-encoded miRNAs was not affected by hypoxia. These results demonstrate that KSHV harnesses a part of the hypoxic cellular response and that a substantial portion of hypoxia-induced changes in cellular gene expression are induced by KSHV infection. Therefore, targeting hypoxic pathways may be a useful way to develop therapeutic strategies for KSHV-related diseases. PMID:28046107

  9. Mitochondrial genome sequences reveal evolutionary relationships of the Phytophthora 1c clade species.

    Science.gov (United States)

    Lassiter, Erica S; Russ, Carsten; Nusbaum, Chad; Zeng, Qiandong; Saville, Amanda C; Olarte, Rodrigo A; Carbone, Ignazio; Hu, Chia-Hui; Seguin-Orlando, Andaine; Samaniego, Jose A; Thorne, Jeffrey L; Ristaino, Jean B

    2015-11-01

    Phytophthora infestans is one of the most destructive plant pathogens of potato and tomato globally. The pathogen is closely related to four other Phytophthora species in the 1c clade including P. phaseoli, P. ipomoeae, P. mirabilis and P. andina that are important pathogens of other wild and domesticated hosts. P. andina is an interspecific hybrid between P. infestans and an unknown Phytophthora species. We have sequenced mitochondrial genomes of the sister species of P. infestans and examined the evolutionary relationships within the clade. Phylogenetic analysis indicates that the P. phaseoli mitochondrial lineage is basal within the clade. P. mirabilis and P. ipomoeae are sister lineages and share a common ancestor with the Ic mitochondrial lineage of P. andina. These lineages in turn are sister to the P. infestans and P. andina Ia mitochondrial lineages. The P. andina Ic lineage diverged much earlier than the P. andina Ia mitochondrial lineage and P. infestans. The presence of two mitochondrial lineages in P. andina supports the hybrid nature of this species. The ancestral state of the P. andina Ic lineage in the tree and its occurrence only in the Andean regions of Ecuador, Colombia and Peru suggests that the origin of this species hybrid in nature may occur there.

  10. DSAP: deep-sequencing small RNA analysis pipeline.

    Science.gov (United States)

    Huang, Po-Jung; Liu, Yi-Chung; Lee, Chi-Ching; Lin, Wei-Chen; Gan, Richie Ruei-Chi; Lyu, Ping-Chiang; Tang, Petrus

    2010-07-01

    DSAP is an automated multiple-task web service designed to provide a total solution to analyzing deep-sequencing small RNA datasets generated by next-generation sequencing technology. DSAP uses a tab-delimited file as an input format, which holds the unique sequence reads (tags) and their corresponding number of copies generated by the Solexa sequencing platform. The input data will go through four analysis steps in DSAP: (i) cleanup: removal of adaptors and poly-A/T/C/G/N nucleotides; (ii) clustering: grouping of cleaned sequence tags into unique sequence clusters; (iii) non-coding RNA (ncRNA) matching: sequence homology mapping against a transcribed sequence library from the ncRNA database Rfam (http://rfam.sanger.ac.uk/); and (iv) known miRNA matching: detection of known miRNAs in miRBase (http://www.mirbase.org/) based on sequence homology. The expression levels corresponding to matched ncRNAs and miRNAs are summarized in multi-color clickable bar charts linked to external databases. DSAP is also capable of displaying miRNA expression levels from different jobs using a log(2)-scaled color matrix. Furthermore, a cross-species comparative function is also provided to show the distribution of identified miRNAs in different species as deposited in miRBase. DSAP is available at http://dsap.cgu.edu.tw.

  11. Quantiprot - a Python package for quantitative analysis of protein sequences.

    Science.gov (United States)

    Konopka, Bogumił M; Marciniak, Marta; Dyrka, Witold

    2017-07-17

    The field of protein sequence analysis is dominated by tools rooted in substitution matrices and alignments. A complementary approach is provided by methods of quantitative characterization. A major advantage of the approach is that quantitative properties defines a multidimensional solution space, where sequences can be related to each other and differences can be meaningfully interpreted. Quantiprot is a software package in Python, which provides a simple and consistent interface to multiple methods for quantitative characterization of protein sequences. The package can be used to calculate dozens of characteristics directly from sequences or using physico-chemical properties of amino acids. Besides basic measures, Quantiprot performs quantitative analysis of recurrence and determinism in the sequence, calculates distribution of n-grams and computes the Zipf's law coefficient. We propose three main fields of application of the Quantiprot package. First, quantitative characteristics can be used in alignment-free similarity searches, and in clustering of large and/or divergent sequence sets. Second, a feature space defined by quantitative properties can be used in comparative studies of protein families and organisms. Third, the feature space can be used for evaluating generative models, where large number of sequences generated by the model can be compared to actually observed sequences.

  12. Plastome Sequence Determination and Comparative Analysis for Members of the Lolium-Festuca Grass Species Complex

    Science.gov (United States)

    Hand, Melanie L.; Spangenberg, German C.; Forster, John W.; Cogan, Noel O. I.

    2013-01-01

    Chloroplast genome sequences are of broad significance in plant biology, due to frequent use in molecular phylogenetics, comparative genomics, population genetics, and genetic modification studies. The present study used a second-generation sequencing approach to determine and assemble the plastid genomes (plastomes) of four representatives from the agriculturally important Lolium-Festuca species complex of pasture grasses (Lolium multiflorum, Festuca pratensis, Festuca altissima, and Festuca ovina). Total cellular DNA was extracted from either roots or leaves, was sequenced, and the output was filtered for plastome-related reads. A comparison between sources revealed fewer plastome-related reads from root-derived template but an increase in incidental bacterium-derived sequences. Plastome assembly and annotation indicated high levels of sequence identity and a conserved organization and gene content between species. However, frequent deletions within the F. ovina plastome appeared to contribute to a smaller plastid genome size. Comparative analysis with complete plastome sequences from other members of the Poaceae confirmed conservation of most grass-specific features. Detailed analysis of the rbcL–psaI intergenic region, however, revealed a “hot-spot” of variation characterized by independent deletion events. The evolutionary implications of this observation are discussed. The complete plastome sequences are anticipated to provide the basis for potential organelle-specific genetic modification of pasture grasses. PMID:23550121

  13. Analysis of 16S libraries of mouse gastrointestinal microflora reveals a large new group of mouse intestinal bacteria

    NARCIS (Netherlands)

    Salzman, NH; de Jong, H; Paterson, Y; Harmsen, HJM; Welling, GW; Bos, NA

    2002-01-01

    Total genomic DNA from samples of intact mouse small intestine, large intestine, caecum and faeces was used as template for PCR amplification of 16S rRNA gene sequences with conserved bacterial primers. Phylogenetic analysis of the amplification products revealed 40 unique 16S rDNA sequences. Of

  14. Mycobacterium malmesburyense sp. nov., a non-tuberculous species of the genus Mycobacterium revealed by multiple gene sequence characterization

    CSIR Research Space (South Africa)

    Gcebe, N

    2017-04-01

    Full Text Available Journal of Systematic and Evolutionary Microbiology: DOI 10.1099/ijsem.0.001678 Mycobacterium malmesburyense sp. nov., a non-tuberculous species of the genus Mycobacterium revealed by multiple gene sequence characterization Gcebe N Rutten V Gey...

  15. Exome sequences of multiplex, multigenerational families reveal schizophrenia risk loci with potential implications for neurocognitive performance.

    Science.gov (United States)

    Kos, Mark Z; Carless, Melanie A; Peralta, Juan; Curran, Joanne E; Quillen, Ellen E; Almeida, Marcio; Blackburn, August; Blondell, Lucy; Roalf, David R; Pogue-Geile, Michael F; Gur, Ruben C; Göring, Harald H H; Nimgaonkar, Vishwajit L; Gur, Raquel E; Almasy, Laura

    2017-12-01

    Schizophrenia is a serious mental illness, involving disruptions in thought and behavior, with a worldwide prevalence of about one percent. Although highly heritable, much of the genetic liability of schizophrenia is yet to be explained. We searched for susceptibility loci in multiplex, multigenerational families affected by schizophrenia, targeting protein-altering variation with in silico predicted functional effects. Exome sequencing was performed on 136 samples from eight European-American families, including 23 individuals diagnosed with schizophrenia or schizoaffective disorder. In total, 11,878 non-synonymous variants from 6,396 genes were tested for their association with schizophrenia spectrum disorders. Pathway enrichment analyses were conducted on gene-based test results, protein-protein interaction (PPI) networks, and epistatic effects. Using a significance threshold of FDR < 0.1, association was detected for rs10941112 (p = 2.1 × 10 -5 ; q-value = 0.073) in AMACR, a gene involved in fatty acid metabolism and previously implicated in schizophrenia, with significant cis effects on gene expression (p = 5.5 × 10 -4 ), including brain tissue data from the Genotype-Tissue Expression project (minimum p = 6.0 × 10 -5 ). A second SNP, rs10378 located in TMEM176A, also shows risk effects in the exome data (p = 2.8 × 10 -5 ; q-value = 0.073). PPIs among our top gene-based association results (p < 0.05; n = 359 genes) reveal significant enrichment of genes involved in NCAM-mediated neurite outgrowth (p = 3.0 × 10 -5 ), while exome-wide SNP-SNP interaction effects for rs10941112 and rs10378 indicate a potential role for kinase-mediated signaling involved in memory and learning. In conclusion, these association results implicate AMACR and TMEM176A in schizophrenia risk, whose effects may be modulated by genes involved in synaptic plasticity and neurocognitive performance. © 2017 Wiley Periodicals, Inc.

  16. Survey sequencing and comparative analysis of the elephant shark (Callorhinchus milii genome.

    Directory of Open Access Journals (Sweden)

    Byrappa Venkatesh

    2007-04-01

    Full Text Available Owing to their phylogenetic position, cartilaginous fishes (sharks, rays, skates, and chimaeras provide a critical reference for our understanding of vertebrate genome evolution. The relatively small genome of the elephant shark, Callorhinchus milii, a chimaera, makes it an attractive model cartilaginous fish genome for whole-genome sequencing and comparative analysis. Here, the authors describe survey sequencing (1.4x coverage and comparative analysis of the elephant shark genome, one of the first cartilaginous fish genomes to be sequenced to this depth. Repetitive sequences, represented mainly by a novel family of short interspersed element-like and long interspersed element-like sequences, account for about 28% of the elephant shark genome. Fragments of approximately 15,000 elephant shark genes reveal specific examples of genes that have been lost differentially during the evolution of tetrapod and teleost fish lineages. Interestingly, the degree of conserved synteny and conserved sequences between the human and elephant shark genomes are higher than that between human and teleost fish genomes. Elephant shark contains putative four Hox clusters indicating that, unlike teleost fish genomes, the elephant shark genome has not experienced an additional whole-genome duplication. These findings underscore the importance of the elephant shark as a critical reference vertebrate genome for comparative analysis of the human and other vertebrate genomes. This study also demonstrates that a survey-sequencing approach can be applied productively for comparative analysis of distantly related vertebrate genomes.

  17. Supplementary Material for: Whole genome sequencing reveals genomic heterogeneity and antibiotic purification in Mycobacterium tuberculosis isolates

    KAUST Repository

    Black, PA; Vos, M. de; Louw, GE; Merwe, RG van der; Dippenaar, A.; Streicher, EM; Abdallah, AM; Sampson, SL; Victor, TC; Dolby, T.; Simpson, JA; Helden, PD van; Warren, RM; Pain, Arnab

    2015-01-01

    Abstract Background Whole genome sequencing has revolutionised the interrogation of mycobacterial genomes. Recent studies have reported conflicting findings on the genomic stability of Mycobacterium tuberculosis during the evolution of drug

  18. Genome sequencing of chimpanzee malaria parasites reveals possible pathways of adaptation to human hosts

    KAUST Repository

    Otto, Thomas D.; Rayner, Julian C.; Bö hme, Ulrike; Pain, Arnab; Spottiswoode, Natasha; Sanders, Mandy; Quail, Michael; Ollomo, Benjamin; Renaud, Franç ois; Thomas, Alan W.; Prugnolle, Franck; Conway, David J.; Newbold, Chris; Berriman, Matthew

    2014-01-01

    related chimpanzee parasite species P. reichenowi, and obtaining partial sequence data from a more distantly related chimpanzee parasite (P. gaboni). The close relationship between P. reichenowi and P. falciparum is emphasized by almost complete

  19. Whole-genome sequencing reveals mutational landscape underlying phenotypic differences between two widespread Chinese cattle breeds

    OpenAIRE

    Xu, Yao; Jiang, Yu; Shi, Tao; Cai, Hanfang; Lan, Xianyong; Zhao, Xin; Plath, Martin; Chen, Hong

    2017-01-01

    Whole-genome sequencing provides a powerful tool to obtain more genetic variability that could produce a range of benefits for cattle breeding industry. Nanyang (Bos indicus) and Qinchuan (Bos taurus) are two important Chinese indigenous cattle breeds with distinct phenotypes. To identify the genetic characteristics responsible for variation in phenotypes between the two breeds, in the present study, we for the first time sequenced the genomes of four Nanyang and four Qinchuan cattle with 10 ...

  20. Deep sequencing reveals a novel closterovirus associated with wild rose leaf rosette disease.

    Science.gov (United States)

    He, Yan; Yang, Zuokun; Hong, Ni; Wang, Guoping; Ning, Guogui; Xu, Wenxing

    2015-06-01

    A bizarre virus-like symptom of a leaf rosette formed by dense small leaves on branches of wild roses (Rosa multiflora Thunb.), designated as 'wild rose leaf rosette disease' (WRLRD), was observed in China. To investigate the presumed causal virus, a wild rose sample affected by WRLRD was subjected to deep sequencing of small interfering RNAs (siRNAs) for a complete survey of the infecting viruses and viroids. The assembly of siRNAs led to the reconstruction of the complete genomes of three known viruses, namely Apple stem grooving virus (ASGV), Blackberry chlorotic ringspot virus (BCRV) and Prunus necrotic ringspot virus (PNRSV), and of a novel virus provisionally named 'rose leaf rosette-associated virus' (RLRaV). Phylogenetic analysis clearly placed RLRaV alongside members of the genus Closterovirus, family Closteroviridae. Genome organization of RLRaV RNA (17,653 nucleotides) showed 13 open reading frames (ORFs), except ORF1 and the quintuple gene block, most of which showed no significant similarities with known viral proteins, but, instead, had detectable identities to fungal or bacterial proteins. Additional novel molecular features indicated that RLRaV seems to be the most complex virus among the known genus members. To our knowledge, this is the first report of WRLRD and its associated closterovirus, as well as two ilarviruses and one capilovirus, infecting wild roses. Our findings present novel information about the closterovirus and the aetiology of this rose disease which should facilitate its control. More importantly, the novel features of RLRaV help to clarify the molecular and evolutionary features of the closterovirus. © 2014 BSPP AND JOHN WILEY & SONS LTD.

  1. An abundance of ubiquitously expressed genes revealed by tissue transcriptome sequence data.

    Directory of Open Access Journals (Sweden)

    Daniel Ramsköld

    2009-12-01

    Full Text Available The parts of the genome transcribed by a cell or tissue reflect the biological processes and functions it carries out. We characterized the features of mammalian tissue transcriptomes at the gene level through analysis of RNA deep sequencing (RNA-Seq data across human and mouse tissues and cell lines. We observed that roughly 8,000 protein-coding genes were ubiquitously expressed, contributing to around 75% of all mRNAs by message copy number in most tissues. These mRNAs encoded proteins that were often intracellular, and tended to be involved in metabolism, transcription, RNA processing or translation. In contrast, genes for secreted or plasma membrane proteins were generally expressed in only a subset of tissues. The distribution of expression levels was broad but fairly continuous: no support was found for the concept of distinct expression classes of genes. Expression estimates that included reads mapping to coding exons only correlated better with qRT-PCR data than estimates which also included 3' untranslated regions (UTRs. Muscle and liver had the least complex transcriptomes, in that they expressed predominantly ubiquitous genes and a large fraction of the transcripts came from a few highly expressed genes, whereas brain, kidney and testis expressed more complex transcriptomes with the vast majority of genes expressed and relatively small contributions from the most expressed genes. mRNAs expressed in brain had unusually long 3'UTRs, and mean 3'UTR length was higher for genes involved in development, morphogenesis and signal transduction, suggesting added complexity of UTR-based regulation for these genes. Our results support a model in which variable exterior components feed into a large, densely connected core composed of ubiquitously expressed intracellular proteins.

  2. Microbiota present in cystic fibrosis lungs as revealed by whole genome sequencing.

    Directory of Open Access Journals (Sweden)

    Philippe M Hauser

    Full Text Available Determination of the precise composition and variation of microbiota in cystic fibrosis lungs is crucial since chronic inflammation due to microorganisms leads to lung damage and ultimately, death. However, this constitutes a major technical challenge. Culturing of microorganisms does not provide a complete representation of a microbiota, even when using culturomics (high-throughput culture. So far, only PCR-based metagenomics have been investigated. However, these methods are biased towards certain microbial groups, and suffer from uncertain quantification of the different microbial domains. We have explored whole genome sequencing (WGS using the Illumina high-throughput technology applied directly to DNA extracted from sputa obtained from two cystic fibrosis patients. To detect all microorganism groups, we used four procedures for DNA extraction, each with a different lysis protocol. We avoided biases due to whole DNA amplification thanks to the high efficiency of current Illumina technology. Phylogenomic classification of the reads by three different methods produced similar results. Our results suggest that WGS provides, in a single analysis, a better qualitative and quantitative assessment of microbiota compositions than cultures and PCRs. WGS identified a high quantity of Haemophilus spp. (patient 1 or Staphylococcus spp. plus Streptococcus spp. (patient 2 together with low amounts of anaerobic (Veillonella, Prevotella, Fusobacterium and aerobic bacteria (Gemella, Moraxella, Granulicatella. WGS suggested that fungal members represented very low proportions of the microbiota, which were detected by cultures and PCRs because of their selectivity. The future increase of reads' sizes and decrease in cost should ensure the usefulness of WGS for the characterisation of microbiota.

  3. Multilocus Sequence Typing Reveals Relevant Genetic Variation and Different Evolutionary Dynamics among Strains of Xanthomonas arboricola pv. juglandis

    Directory of Open Access Journals (Sweden)

    Marco Scortichini

    2010-11-01

    Full Text Available Forty-five Xanthomonas arboricola pv. juglandis (Xaj strains originating from Juglans regia cultivation in different countries were molecularly typed by means of MultiLocus Sequence Typing (MLST, using acnB, gapA, gyrB and rpoD gene fragments. A total of 2.5 kilobases was used to infer the phylogenetic relationship among the strains and possible recombination events. Haplotype diversity, linkage disequilibrium analysis, selection tests, gene flow estimates and codon adaptation index were also assessed. The dendrograms built by maximum likelihood with concatenated nucleotide and amino acid sequences revealed two major and two minor phylotypes. The same haplotype was found in strains originating from different continents, and different haplotypes were found in strains isolated in the same year from the same location. A recombination breakpoint was detected within the rpoD gene fragment. At the pathovar level, the Xaj populations studied here are clonal and under neutral selection. However, four Xaj strains isolated from walnut fruits with apical necrosis are under diversifying selection, suggesting a possible new adaptation. Gene flow estimates do not support the hypothesis of geographic isolation of the strains, even though the genetic diversity between the strains increases as the geographic distance between them increases. A triplet deletion, causing the absence of valine, was found in the rpoD fragment of all 45 Xaj strains when compared with X. axonopodis pv. citri strain 306. The codon adaptation index was high in all four genes studied, indicating a relevant metabolic activity.

  4. Exploring evidence of positive selection reveals genetic basis of meat quality traits in Berkshire pigs through whole genome sequencing.

    Science.gov (United States)

    Jeong, Hyeonsoo; Song, Ki-Duk; Seo, Minseok; Caetano-Anollés, Kelsey; Kim, Jaemin; Kwak, Woori; Oh, Jae-Don; Kim, EuiSoo; Jeong, Dong Kee; Cho, Seoae; Kim, Heebal; Lee, Hak-Kyo

    2015-08-20

    Natural and artificial selection following domestication has led to the existence of more than a hundred pig breeds, as well as incredible variation in phenotypic traits. Berkshire pigs are regarded as having superior meat quality compared to other breeds. As the meat production industry seeks selective breeding approaches to improve profitable traits such as meat quality, information about genetic determinants of these traits is in high demand. However, most of the studies have been performed using trained sensory panel analysis without investigating the underlying genetic factors. Here we investigate the relationship between genomic composition and this phenotypic trait by scanning for signatures of positive selection in whole-genome sequencing data. We generated genomes of 10 Berkshire pigs at a total of 100.6 coverage depth, using the Illumina Hiseq2000 platform. Along with the genomes of 11 Landrace and 13 Yorkshire pigs, we identified genomic variants of 18.9 million SNVs and 3.4 million Indels in the mapped regions. We identified several associated genes related to lipid metabolism, intramuscular fatty acid deposition, and muscle fiber type which attribute to pork quality (TG, FABP1, AKIRIN2, GLP2R, TGFBR3, JPH3, ICAM2, and ERN1) by applying between population statistical tests (XP-EHH and XP-CLR). A statistical enrichment test was also conducted to detect breed specific genetic variation. In addition, de novo short sequence read assembly strategy identified several candidate genes (SLC25A14, IGF1, PI4KA, CACNA1A) as also contributing to lipid metabolism. Results revealed several candidate genes involved in Berkshire meat quality; most of these genes are involved in lipid metabolism and intramuscular fat deposition. These results can provide a basis for future research on the genomic characteristics of Berkshire pigs.

  5. Gene Expression Profiles in Paired Gingival Biopsies from Periodontitis-Affected and Healthy Tissues Revealed by Massively Parallel Sequencing

    Science.gov (United States)

    Båge, Tove; Lagervall, Maria; Jansson, Leif; Lundeberg, Joakim; Yucel-Lindberg, Tülay

    2012-01-01

    Periodontitis is a chronic inflammatory disease affecting the soft tissue and bone that surrounds the teeth. Despite extensive research, distinctive genes responsible for the disease have not been identified. The objective of this study was to elucidate transcriptome changes in periodontitis, by investigating gene expression profiles in gingival tissue obtained from periodontitis-affected and healthy gingiva from the same patient, using RNA-sequencing. Gingival biopsies were obtained from a disease-affected and a healthy site from each of 10 individuals diagnosed with periodontitis. Enrichment analysis performed among uniquely expressed genes for the periodontitis-affected and healthy tissues revealed several regulated pathways indicative of inflammation for the periodontitis-affected condition. Hierarchical clustering of the sequenced biopsies demonstrated clustering according to the degree of inflammation, as observed histologically in the biopsies, rather than clustering at the individual level. Among the top 50 upregulated genes in periodontitis-affected tissues, we investigated two genes which have not previously been demonstrated to be involved in periodontitis. These included interferon regulatory factor 4 and chemokine (C-C motif) ligand 18, which were also expressed at the protein level in gingival biopsies from patients with periodontitis. In conclusion, this study provides a first step towards a quantitative comprehensive insight into the transcriptome changes in periodontitis. We demonstrate for the first time site-specific local variation in gene expression profiles of periodontitis-affected and healthy tissues obtained from patients with periodontitis, using RNA-seq. Further, we have identified novel genes expressed in periodontitis tissues, which may constitute potential therapeutic targets for future treatment strategies of periodontitis. PMID:23029519

  6. Infidelity of SARS-CoV Nsp14-exonuclease mutant virus replication is revealed by complete genome sequencing.

    Directory of Open Access Journals (Sweden)

    Lance D Eckerle

    2010-05-01

    Full Text Available Most RNA viruses lack the mechanisms to recognize and correct mutations that arise during genome replication, resulting in quasispecies diversity that is required for pathogenesis and adaptation. However, it is not known how viruses encoding large viral RNA genomes such as the Coronaviridae (26 to 32 kb balance the requirements for genome stability and quasispecies diversity. Further, the limits of replication infidelity during replication of large RNA genomes and how decreased fidelity impacts virus fitness over time are not known. Our previous work demonstrated that genetic inactivation of the coronavirus exoribonuclease (ExoN in nonstructural protein 14 (nsp14 of murine hepatitis virus results in a 15-fold decrease in replication fidelity. However, it is not known whether nsp14-ExoN is required for replication fidelity of all coronaviruses, nor the impact of decreased fidelity on genome diversity and fitness during replication and passage. We report here the engineering and recovery of nsp14-ExoN mutant viruses of severe acute respiratory syndrome coronavirus (SARS-CoV that have stable growth defects and demonstrate a 21-fold increase in mutation frequency during replication in culture. Analysis of complete genome sequences from SARS-ExoN mutant viral clones revealed unique mutation sets in every genome examined from the same round of replication and a total of 100 unique mutations across the genome. Using novel bioinformatic tools and deep sequencing across the full-length genome following 10 population passages in vitro, we demonstrate retention of ExoN mutations and continued increased diversity and mutational load compared to wild-type SARS-CoV. The results define a novel genetic and bioinformatics model for introduction and identification of multi-allelic mutations in replication competent viruses that will be powerful tools for testing the effects of decreased fidelity and increased quasispecies diversity on viral replication

  7. Nonlinear analysis of river flow time sequences

    Science.gov (United States)

    Porporato, Amilcare; Ridolfi, Luca

    1997-06-01

    Within the field of chaos theory several methods for the analysis of complex dynamical systems have recently been proposed. In light of these ideas we study the dynamics which control the behavior over time of river flow, investigating the existence of a low-dimension deterministic component. The present article follows the research undertaken in the work of Porporato and Ridolfi [1996a] in which some clues as to the existence of chaos were collected. Particular emphasis is given here to the problem of noise and to nonlinear prediction. With regard to the latter, the benefits obtainable by means of the interpolation of the available time series are reported and the remarkable predictive results attained with this nonlinear method are shown.

  8. Transcriptome sequencing of Mycosphaerella fijiensis during association with Musa acuminata reveals candidate pathogenicity genes.

    Science.gov (United States)

    Noar, Roslyn D; Daub, Margaret E

    2016-08-30

    genes with higher expression in infected leaf tissue, suggesting that they may play a role in pathogenicity. For two other scaffolds, no transcripts were detected in either condition, and PCR assays support the hypothesis that at least one of these scaffolds corresponds to a dispensable chromosome that is not required for survival or pathogenicity. Our study revealed major changes in the transcriptome of Mycosphaerella fijiensis, when associating with its host compared to during saprophytic growth in medium. This analysis identified putative pathogenicity genes and also provides support for the existence of dispensable chromosomes in this fungus.

  9. Accident sequence analysis of human-computer interface design

    International Nuclear Information System (INIS)

    Fan, C.-F.; Chen, W.-H.

    2000-01-01

    It is important to predict potential accident sequences of human-computer interaction in a safety-critical computing system so that vulnerable points can be disclosed and removed. We address this issue by proposing a Multi-Context human-computer interaction Model along with its analysis techniques, an Augmented Fault Tree Analysis, and a Concurrent Event Tree Analysis. The proposed augmented fault tree can identify the potential weak points in software design that may induce unintended software functions or erroneous human procedures. The concurrent event tree can enumerate possible accident sequences due to these weak points

  10. The sequence and analysis of a Chinese pig genome

    Directory of Open Access Journals (Sweden)

    Fang Xiaodong

    2012-11-01

    Full Text Available Abstract Background The pig is an economically important food source, amounting to approximately 40% of all meat consumed worldwide. Pigs also serve as an important model organism because of their similarity to humans at the anatomical, physiological and genetic level, making them very useful for studying a variety of human diseases. A pig strain of particular interest is the miniature pig, specifically the Wuzhishan pig (WZSP, as it has been extensively inbred. Its high level of homozygosity offers increased ease for selective breeding for specific traits and a more straightforward understanding of the genetic changes that underlie its biological characteristics. WZSP also serves as a promising means for applications in surgery, tissue engineering, and xenotransplantation. Here, we report the sequencing and analysis of an inbreeding WZSP genome. Results Our results reveal some unique genomic features, including a relatively high level of homozygosity in the diploid genome, an unusual distribution of heterozygosity, an over-representation of tRNA-derived transposable elements, a small amount of porcine endogenous retrovirus, and a lack of type C retroviruses. In addition, we carried out systematic research on gene evolution, together with a detailed investigation of the counterparts of human drug target genes. Conclusion Our results provide the opportunity to more clearly define the genomic character of pig, which could enhance our ability to create more useful pig models.

  11. Analysis of expressed sequence tags from the Ulva prolifera (Chlorophyta)

    Science.gov (United States)

    Niu, Jianfeng; Hu, Haiyan; Hu, Songnian; Wang, Guangce; Peng, Guang; Sun, Song

    2010-01-01

    In 2008, a green tide broke out before the sailing competition of the 29th Olympic Games in Qingdao. The causative species was determined to be Enteromorpha prolifera ( Ulva prolifera O. F. Müller), a familiar green macroalga along the coastline of China. Rapid accumulation of a large biomass of floating U. prolifera prompted research on different aspects of this species. In this study, we constructed a nonnormalized cDNA library from the thalli of U. prolifera and acquired 10 072 high-quality expressed sequence tags (ESTs). These ESTs were assembled into 3 519 nonredundant gene groups, including 1 446 clusters and 2 073 singletons. After annotation with the nr database, a large number of genes were found to be related with chloroplast and ribosomal protein, GO functional classification showed 1 418 ESTs participated in photosynthesis and 1 359 ESTs were responsible for the generation of precursor metabolites and energy. In addition, rather comprehensive carbon fixation pathways were found in U. prolifera using KEGG. Some stress-related and signal transduction-related genes were also found in this study. All the evidences displayed that U. prolifera had substance and energy foundation for the intense photosynthesis and the rapid proliferation. Phylogenetic analysis of cytochrome c oxidase subunit I revealed that this green-tide causative species is most closely affiliated to Pseudendoclonium akinetum (Ulvophyceae).

  12. 3' terminal diversity of MRP RNA and other human noncoding RNAs revealed by deep sequencing.

    Science.gov (United States)

    Goldfarb, Katherine C; Cech, Thomas R

    2013-09-21

    Post-transcriptional 3' end processing is a key component of RNA regulation. The abundant and essential RNA subunit of RNase MRP has been proposed to function in three distinct cellular compartments and therefore may utilize this mode of regulation. Here we employ 3' RACE coupled with high-throughput sequencing to characterize the 3' terminal sequences of human MRP RNA and other noncoding RNAs that form RNP complexes. The 3' terminal sequence of MRP RNA from HEK293T cells has a distinctive distribution of genomically encoded termini (including an assortment of U residues) with a portion of these selectively tagged by oligo(A) tails. This profile contrasts with the relatively homogenous 3' terminus of an in vitro transcribed MRP RNA control and the differing 3' terminal profiles of U3 snoRNA, RNase P RNA, and telomerase RNA (hTR). 3' RACE coupled with deep sequencing provides a valuable framework for the functional characterization of 3' terminal sequences of noncoding RNAs.

  13. 3′ terminal diversity of MRP RNA and other human noncoding RNAs revealed by deep sequencing

    Science.gov (United States)

    2013-01-01

    Background Post-transcriptional 3′ end processing is a key component of RNA regulation. The abundant and essential RNA subunit of RNase MRP has been proposed to function in three distinct cellular compartments and therefore may utilize this mode of regulation. Here we employ 3′ RACE coupled with high-throughput sequencing to characterize the 3′ terminal sequences of human MRP RNA and other noncoding RNAs that form RNP complexes. Results The 3′ terminal sequence of MRP RNA from HEK293T cells has a distinctive distribution of genomically encoded termini (including an assortment of U residues) with a portion of these selectively tagged by oligo(A) tails. This profile contrasts with the relatively homogenous 3′ terminus of an in vitro transcribed MRP RNA control and the differing 3′ terminal profiles of U3 snoRNA, RNase P RNA, and telomerase RNA (hTR). Conclusions 3′ RACE coupled with deep sequencing provides a valuable framework for the functional characterization of 3′ terminal sequences of noncoding RNAs. PMID:24053768

  14. Torque measurements reveal sequence-specific cooperative transitions in supercoiled DNA

    Science.gov (United States)

    Oberstrass, Florian C.; Fernandes, Louis E.; Bryant, Zev

    2012-01-01

    B-DNA becomes unstable under superhelical stress and is able to adopt a wide range of alternative conformations including strand-separated DNA and Z-DNA. Localized sequence-dependent structural transitions are important for the regulation of biological processes such as DNA replication and transcription. To directly probe the effect of sequence on structural transitions driven by torque, we have measured the torsional response of a panel of DNA sequences using single molecule assays that employ nanosphere rotational probes to achieve high torque resolution. The responses of Z-forming d(pGpC)n sequences match our predictions based on a theoretical treatment of cooperative transitions in helical polymers. “Bubble” templates containing 50–100 bp mismatch regions show cooperative structural transitions similar to B-DNA, although less torque is required to disrupt strand–strand interactions. Our mechanical measurements, including direct characterization of the torsional rigidity of strand-separated DNA, establish a framework for quantitative predictions of the complex torsional response of arbitrary sequences in their biological context. PMID:22474350

  15. Food Fish Identification from DNA Extraction through Sequence Analysis

    Science.gov (United States)

    Hallen-Adams, Heather E.

    2015-01-01

    This experiment exposed 3rd and 4th y undergraduates and graduate students taking a course in advanced food analysis to DNA extraction, polymerase chain reaction (PCR), and DNA sequence analysis. Students provided their own fish sample, purchased from local grocery stores, and the class as a whole extracted DNA, which was then subjected to PCR,…

  16. Genetic differences among Haplorchis taichui populations in Indochina revealed by mitochondrial COX1 sequences.

    Science.gov (United States)

    Thaenkham, U; Phuphisut, O; Nuamtanong, S; Yoonuan, T; Sa-Nguankiat, S; Vonghachack, Y; Belizario, V Y; Dung, D T; Dekumyoy, P; Waikagul, J

    2017-09-01

    Haplorchis taichui is an intestinal heterophyid fluke that is pathogenic to humans. It is widely distributed in Asia, with a particularly high prevalence in Indochina. Previous work revealed that the lack of gene flow between three distinct populations of Vietnamese H. taichui can be attributed to their geographic isolation with no interconnected river basins. To test the hypothesis that interconnected river basins allow gene flow between otherwise isolated populations of H. taichui, as previously demonstrated for another trematode, Opisthorchis viverrini, we compared the genetic structures of seven populations of H. taichui from various localities in the lower Mekong Basin, in Thailand and Laos, with those in Vietnam, using the mitochondrial cytochrome c oxidase subunit 1 (COX1) gene. To determine the gene flow between these H. taichui populations, we calculated their phylogenetic relationships, genetic distances and haplotype diversity. Each population showed very low nucleotide diversity at this locus. However, high levels of genetic differentiation between the populations indicated very little gene flow. A phylogenetic analysis divided the populations into four clusters that correlated with the country of origin. The negligible gene flow between the Thai and Laos populations, despite sharing the Mekong Basin, caused us to reject our hypothesis. Our data suggest that the distribution of H. taichui populations was incidentally associated with national borders.

  17. Genetic Diversity of Selected Mangifera Species Revealed by Inter Simple Sequence Repeats Markers

    Directory of Open Access Journals (Sweden)

    Zulhairil Ariffin

    2015-01-01

    Full Text Available ISSR markers were employed to reveal genetic diversity and genetic relatedness among 28 Mangifera accessions collected from Yan (Kedah, Bukit Gantang (Perak, Sibuti (Sarawak, and Papar (Sabah. A total of 198 markers were generated using nine anchored primers and one nonanchored primer. Genetic variation among the 28 accessions of Mangifera species including wild relatives, landraces, and clonal varieties is high, with an average degree of polymorphism of 98% and mean Shannon index, H0=7.50. Analysis on 18 Mangifera indica accessions also showed high degree of polymorphism of 99% and mean Shannon index, H0=5.74. Dice index of genetic similarity ranged from 0.0938 to 0.8046 among the Mangifera species. The dendrogram showed that the Mangifera species were grouped into three main divergent clusters. Cluster 1 comprised 14 accessions from Kedah and Perak. Cluster II and cluster III comprised 14 accessions from Sarawak and Sabah. Meanwhile, the Dice index of genetic similarity for 18 accessions of Mangifera indica ranged from 0.2588 to 0.7742. The dendrogram also showed the 18 accessions of Mangifera indica were grouped into three main clusters. Cluster I comprised 10 landraces of Mangifera indica from Kedah. Cluster II comprised 7 landraces of Mangifera indica followed by Chokanan to form Cluster III.

  18. Analysis and Visualization Tool for Targeted Amplicon Bisulfite Sequencing on Ion Torrent Sequencers.

    Directory of Open Access Journals (Sweden)

    Stephan Pabinger

    Full Text Available Targeted sequencing of PCR amplicons generated from bisulfite deaminated DNA is a flexible, cost-effective way to study methylation of a sample at single CpG resolution and perform subsequent multi-target, multi-sample comparisons. Currently, no platform specific protocol, support, or analysis solution is provided to perform targeted bisulfite sequencing on a Personal Genome Machine (PGM. Here, we present a novel tool, called TABSAT, for analyzing targeted bisulfite sequencing data generated on Ion Torrent sequencers. The workflow starts with raw sequencing data, performs quality assessment, and uses a tailored version of Bismark to map the reads to a reference genome. The pipeline visualizes results as lollipop plots and is able to deduce specific methylation-patterns present in a sample. The obtained profiles are then summarized and compared between samples. In order to assess the performance of the targeted bisulfite sequencing workflow, 48 samples were used to generate 53 different Bisulfite-Sequencing PCR amplicons from each sample, resulting in 2,544 amplicon targets. We obtained a mean coverage of 282X using 1,196,822 aligned reads. Next, we compared the sequencing results of these targets to the methylation level of the corresponding sites on an Illumina 450k methylation chip. The calculated average Pearson correlation coefficient of 0.91 confirms the sequencing results with one of the industry-leading CpG methylation platforms and shows that targeted amplicon bisulfite sequencing provides an accurate and cost-efficient method for DNA methylation studies, e.g., to provide platform-independent confirmation of Illumina Infinium 450k methylation data. TABSAT offers a novel way to analyze data generated by Ion Torrent instruments and can also be used with data from the Illumina MiSeq platform. It can be easily accessed via the Platomics platform, which offers a web-based graphical user interface along with sample and parameter storage

  19. Multilocus Sequence Analysis and rpoB Sequencing of Mycobacterium abscessus (Sensu Lato) Strains▿

    Science.gov (United States)

    Macheras, Edouard; Roux, Anne-Laure; Bastian, Sylvaine; Leão, Sylvia Cardoso; Palaci, Moises; Sivadon-Tardy, Valérie; Gutierrez, Cristina; Richter, Elvira; Rüsch-Gerdes, Sabine; Pfyffer, Gaby; Bodmer, Thomas; Cambau, Emmanuelle; Gaillard, Jean-Louis; Heym, Beate

    2011-01-01

    Mycobacterium abscessus, Mycobacterium bolletii, and Mycobacterium massiliense (Mycobacterium abscessus sensu lato) are closely related species that currently are identified by the sequencing of the rpoB gene. However, recent studies show that rpoB sequencing alone is insufficient to discriminate between these species, and some authors have questioned their current taxonomic classification. We studied here a large collection of M. abscessus (sensu lato) strains by partial rpoB sequencing (752 bp) and multilocus sequence analysis (MLSA). The final MLSA scheme developed was based on the partial sequences of eight housekeeping genes: argH, cya, glpK, gnd, murC, pgm, pta, and purH. The strains studied included the three type strains (M. abscessus CIP 104536T, M. massiliense CIP 108297T, and M. bolletii CIP 108541T) and 120 isolates recovered between 1997 and 2007 in France, Germany, Switzerland, and Brazil. The rpoB phylogenetic tree confirmed the existence of three main clusters, each comprising the type strain of one species. However, divergence values between the M. massiliense and M. bolletii clusters all were below 3% and between the M. abscessus and M. massiliense clusters were from 2.66 to 3.59%. The tree produced using the concatenated MLSA gene sequences (4,071 bp) also showed three main clusters, each comprising the type strain of one species. The M. abscessus cluster had a bootstrap value of 100% and was mostly compact. Bootstrap values for the M. massiliense and M. bolletii branches were much lower (71 and 61%, respectively), with the M. massiliense cluster having a fuzzy aspect. Mean (range) divergence values were 2.17% (1.13 to 2.58%) between the M. abscessus and M. massiliense clusters, 2.37% (1.5 to 2.85%) between the M. abscessus and M. bolletii clusters, and 2.28% (0.86 to 2.68%) between the M. massiliense and M. bolletii clusters. Adding the rpoB sequence to the MLSA-concatenated sequence (total sequence, 4,823 bp) had little effect on the clustering

  20. Multilocus sequence analysis and rpoB sequencing of Mycobacterium abscessus (sensu lato) strains.

    Science.gov (United States)

    Macheras, Edouard; Roux, Anne-Laure; Bastian, Sylvaine; Leão, Sylvia Cardoso; Palaci, Moises; Sivadon-Tardy, Valérie; Gutierrez, Cristina; Richter, Elvira; Rüsch-Gerdes, Sabine; Pfyffer, Gaby; Bodmer, Thomas; Cambau, Emmanuelle; Gaillard, Jean-Louis; Heym, Beate

    2011-02-01

    Mycobacterium abscessus, Mycobacterium bolletii, and Mycobacterium massiliense (Mycobacterium abscessus sensu lato) are closely related species that currently are identified by the sequencing of the rpoB gene. However, recent studies show that rpoB sequencing alone is insufficient to discriminate between these species, and some authors have questioned their current taxonomic classification. We studied here a large collection of M. abscessus (sensu lato) strains by partial rpoB sequencing (752 bp) and multilocus sequence analysis (MLSA). The final MLSA scheme developed was based on the partial sequences of eight housekeeping genes: argH, cya, glpK, gnd, murC, pgm, pta, and purH. The strains studied included the three type strains (M. abscessus CIP 104536(T), M. massiliense CIP 108297(T), and M. bolletii CIP 108541(T)) and 120 isolates recovered between 1997 and 2007 in France, Germany, Switzerland, and Brazil. The rpoB phylogenetic tree confirmed the existence of three main clusters, each comprising the type strain of one species. However, divergence values between the M. massiliense and M. bolletii clusters all were below 3% and between the M. abscessus and M. massiliense clusters were from 2.66 to 3.59%. The tree produced using the concatenated MLSA gene sequences (4,071 bp) also showed three main clusters, each comprising the type strain of one species. The M. abscessus cluster had a bootstrap value of 100% and was mostly compact. Bootstrap values for the M. massiliense and M. bolletii branches were much lower (71 and 61%, respectively), with the M. massiliense cluster having a fuzzy aspect. Mean (range) divergence values were 2.17% (1.13 to 2.58%) between the M. abscessus and M. massiliense clusters, 2.37% (1.5 to 2.85%) between the M. abscessus and M. bolletii clusters, and 2.28% (0.86 to 2.68%) between the M. massiliense and M. bolletii clusters. Adding the rpoB sequence to the MLSA-concatenated sequence (total sequence, 4,823 bp) had little effect on the

  1. An optimum analysis sequence for environmental gamma-ray spectrometry

    Energy Technology Data Exchange (ETDEWEB)

    De la Torre, F.; Rios M, C.; Ruvalcaba A, M. G.; Mireles G, F.; Saucedo A, S.; Davila R, I.; Pinedo, J. L., E-mail: fta777@hotmail.co [Universidad Autonoma de Zacatecas, Centro Regional de Estudis Nucleares, Calle Cipres No. 10, Fracc. La Penuela, 98068 Zacatecas (Mexico)

    2010-10-15

    This work aims to obtain an optimum analysis sequence for environmental gamma-ray spectroscopy by means of Genie 2000 (Canberra). Twenty different analysis sequences were customized using different peak area percentages and different algorithms for: 1) peak finding, and 2) peak area determination, and with or without the use of a library -based on evaluated nuclear data- of common gamma-ray emitters in environmental samples. The use of an optimum analysis sequence with certified nuclear information avoids the problems originated by the significant variations in out-of-date nuclear parameters of commercial software libraries. Interference-free gamma ray energies with absolute emission probabilities greater than 3.75% were included in the customized library. The gamma-ray spectroscopy system (based on a Ge Re-3522 Canberra detector) was calibrated both in energy and shape by means of the IAEA-2002 reference spectra for software intercomparison. To test the performance of the analysis sequences, the IAEA-2002 reference spectrum was used. The z-score and the reduced {chi}{sup 2} criteria were used to determine the optimum analysis sequence. The results show an appreciable variation in the peak area determinations and their corresponding uncertainties. Particularly, the combination of second derivative peak locate with simple peak area integration algorithms provides the greater accuracy. Lower accuracy comes from the combination of library directed peak locate algorithm and Genie's Gamma-M peak area determination. (Author)

  2. An optimum analysis sequence for environmental gamma-ray spectrometry

    International Nuclear Information System (INIS)

    De la Torre, F.; Rios M, C.; Ruvalcaba A, M. G.; Mireles G, F.; Saucedo A, S.; Davila R, I.; Pinedo, J. L.

    2010-10-01

    This work aims to obtain an optimum analysis sequence for environmental gamma-ray spectroscopy by means of Genie 2000 (Canberra). Twenty different analysis sequences were customized using different peak area percentages and different algorithms for: 1) peak finding, and 2) peak area determination, and with or without the use of a library -based on evaluated nuclear data- of common gamma-ray emitters in environmental samples. The use of an optimum analysis sequence with certified nuclear information avoids the problems originated by the significant variations in out-of-date nuclear parameters of commercial software libraries. Interference-free gamma ray energies with absolute emission probabilities greater than 3.75% were included in the customized library. The gamma-ray spectroscopy system (based on a Ge Re-3522 Canberra detector) was calibrated both in energy and shape by means of the IAEA-2002 reference spectra for software intercomparison. To test the performance of the analysis sequences, the IAEA-2002 reference spectrum was used. The z-score and the reduced χ 2 criteria were used to determine the optimum analysis sequence. The results show an appreciable variation in the peak area determinations and their corresponding uncertainties. Particularly, the combination of second derivative peak locate with simple peak area integration algorithms provides the greater accuracy. Lower accuracy comes from the combination of library directed peak locate algorithm and Genie's Gamma-M peak area determination. (Author)

  3. Genotypic Resistance Tests Sequences Reveal the Role of Marginalized Populations in HIV-1 Transmission in Switzerland.

    Science.gov (United States)

    Shilaih, Mohaned; Marzel, Alex; Yang, Wan Lin; Scherrer, Alexandra U; Schüpbach, Jörg; Böni, Jürg; Yerly, Sabine; Hirsch, Hans H; Aubert, Vincent; Cavassini, Matthias; Klimkait, Thomas; Vernazza, Pietro L; Bernasconi, Enos; Furrer, Hansjakob; Günthard, Huldrych F; Kouyos, Roger

    2016-06-14

    Targeting hard-to-reach/marginalized populations is essential for preventing HIV-transmission. A unique opportunity to identify such populations in Switzerland is provided by a database of all genotypic-resistance-tests from Switzerland, including both sequences from the Swiss HIV Cohort Study (SHCS) and non-cohort sequences. A phylogenetic tree was built using 11,127 SHCS and 2,875 Swiss non-SHCS sequences. Demographics were imputed for non-SHCS patients using a phylogenetic proximity approach. Factors associated with non-cohort outbreaks were determined using logistic regression. Non-B subtype (univariable odds-ratio (OR): 1.9; 95% confidence interval (CI): 1.8-2.1), female gender (OR: 1.6; 95% CI: 1.4-1.7), black ethnicity (OR: 1.9; 95% CI: 1.7-2.1) and heterosexual transmission group (OR:1.8; 95% CI: 1.6-2.0), were all associated with underrepresentation in the SHCS. We found 344 purely non-SHCS transmission clusters, however, these outbreaks were small (median 2, maximum 7 patients) with a strong overlap with the SHCS'. 65% of non-SHCS sequences were part of clusters composed of >= 50% SHCS sequences. Our data suggests that marginalized-populations are underrepresented in the SHCS. However, the limited size of outbreaks among non-SHCS patients in-care implies that no major HIV outbreak in Switzerland was missed by the SHCS surveillance. This study demonstrates the potential of sequence data to assess and extend the scope of infectious-disease surveillance.

  4. Bacterial community compositions of coking wastewater treatment plants in steel industry revealed by Illumina high-throughput sequencing.

    Science.gov (United States)

    Ma, Qiao; Qu, Yuanyuan; Shen, Wenli; Zhang, Zhaojing; Wang, Jingwei; Liu, Ziyan; Li, Duanxing; Li, Huijie; Zhou, Jiti

    2015-03-01

    In this study, Illumina high-throughput sequencing was used to reveal the community structures of nine coking wastewater treatment plants (CWWTPs) in China for the first time. The sludge systems exhibited a similar community composition at each taxonomic level. Compared to previous studies, some of the core genera in municipal wastewater treatment plants such as Zoogloea, Prosthecobacter and Gp6 were detected as minor species. Thiobacillus (20.83%), Comamonas (6.58%), Thauera (4.02%), Azoarcus (7.78%) and Rhodoplanes (1.42%) were the dominant genera shared by at least six CWWTPs. The percentages of autotrophic ammonia-oxidizing bacteria and nitrite-oxidizing bacteria were unexpectedly low, which were verified by both real-time PCR and fluorescence in situ hybridization analyses. Hierarchical clustering and canonical correspondence analysis indicated that operation mode, flow rate and temperature might be the key factors in community formation. This study provides new insights into our understanding of microbial community compositions and structures of CWWTPs. Copyright © 2014 Elsevier Ltd. All rights reserved.

  5. Exome sequencing of oral squamous cell carcinoma in users of Arabian snuff reveals novel candidates for driver genes.

    Science.gov (United States)

    Al-Hebshi, Nezar Noor; Li, Shiyong; Nasher, Akram Thabet; El-Setouhy, Maged; Alsanosi, Rashad; Blancato, Jan; Loffredo, Christopher

    2016-07-15

    The study sought to identify genetic aberrations driving oral squamous cell carcinoma (OSCC) development among users of shammah, an Arabian preparation of smokeless tobacco. Twenty archival OSCC samples, 15 of which with a history of shammah exposure, were whole-exome sequenced at an average depth of 127×. Somatic mutations were identified using a novel, matched controls-independent filtration algorithm. CODEX and Exomedepth coupled with a novel, Database of Genomic Variant-based filter were employed to call somatic gene-copy number variations. Significantly mutated genes were identified with Oncodrive FM and the Youn and Simon's method. Candidate driver genes were nominated based on Gene Set Enrichment Analysis. The observed mutational spectrum was similar to that reported by the TCGA project. In addition to confirming known genes of OSCC (TP53, CDKNA2, CASP8, PIK3CA, HRAS, FAT1, TP63, CCND1 and FADD) the analysis identified several candidate novel driver events including mutations of NOTCH3, CSMD3, CRB1, CLTCL1, OSMR and TRPM2, amplification of the proto-oncogenes FOSL1, RELA, TRAF6, MDM2, FRS2 and BAG1, and deletion of the recently described tumor suppressor SMARCC1. Analysis also revealed significantly altered pathways not previously implicated in OSCC including Oncostatin-M signalling pathway, AP-1 and C-MYB transcription networks and endocytosis. There was a trend for higher number of mutations, amplifications and driver events in samples with history of shammah exposure particularly those that tested EBV positive, suggesting an interaction between tobacco exposure and EBV. The work provides further evidence for the genetic heterogeneity of oral cancer and suggests shammah-associated OSCC is characterized by extensive amplification of oncogenes. © 2016 UICC.

  6. Working memory for sequences of temporal durations reveals a volatile single-item store

    Directory of Open Access Journals (Sweden)

    Sanjay G Manohar

    2016-10-01

    Full Text Available When a sequence is held in working memory, different items are retained with differing fidelity. Here we ask whether a sequence of brief time intervals that must be remembered show recency effects, similar to those observed in verbal and visuospatial working memory. It has been suggested that prioritising some items over others can be accounted for by a focus of attention, maintaining some items in a privileged state. We therefore also investigated whether such benefits are vulnerable to disruption by attention or expectation. Participants listened to sequences of one to five tones, of varying durations (200ms to 2s. Subsequently, the length of one of the tones in the sequence had to be reproduced by holding a key. The discrepancy between the reproduced and actual durations quantified the fidelity of memory for auditory durations. Recall precision decreased with the number of items that had to be remembered, and was better for the first and last items of sequences, in line with set-size and serial position effects seen in other modalities. To test whether attentional filtering demands might impair performance, an irrelevant variation in pitch was introduced in some blocks of trials. In those blocks, memory precision was worse for sequences that consisted of only one item, i.e. the smallest memory set size. Thus, when irrelevant information was present, the benefit of having only one item in memory is attenuated. Finally we examined whether expectation could interfere with memory. On half the trials, the number of items in the upcoming sequence was cued. When the number of items was known in advance, performance was paradoxically worse when the sequence consisted of only one item. Thus the benefit of having only one item to remember is stronger when it is unexpectedly the only item. Our results suggest that similar mechanisms are used to hold auditory time durations in working memory, as for visual or verbal stimuli. Further, solitary items were

  7. The BsaHI restriction-modification system: Cloning, sequencing and analysis of conserved motifs

    Directory of Open Access Journals (Sweden)

    Roberts Richard J

    2008-05-01

    Full Text Available Abstract Background Restriction and modification enzymes typically recognise short DNA sequences of between two and eight bases in length. Understanding the mechanism of this recognition represents a significant challenge that we begin to address for the BsaHI restriction-modification system, which recognises the six base sequence GRCGYC. Results The DNA sequences of the genes for the BsaHI methyltransferase, bsaHIM, and restriction endonuclease, bsaHIR, have been determined (GenBank accession #EU386360, cloned and expressed in E. coli. Both the restriction endonuclease and methyltransferase enzymes share significant similarity with a group of 6 other enzymes comprising the restriction-modification systems HgiDI and HgiGI and the putative HindVP, NlaCORFDP, NpuORFC228P and SplZORFNP restriction-modification systems. A sequence alignment of these homologues shows that their amino acid sequences are largely conserved and highlights several motifs of interest. We target one such conserved motif, reading SPERRFD, at the C-terminal end of the bsaHIR gene. A mutational analysis of these amino acids indicates that the motif is crucial for enzymatic activity. Sequence alignment of the methyltransferase gene reveals a short motif within the target recognition domain that is conserved among enzymes recognising the same sequences. Thus, this motif may be used as a diagnostic tool to define the recognition sequences of the cytosine C5 methyltransferases. Conclusion We have cloned and sequenced the BsaHI restriction and modification enzymes. We have identified a region of the R. BsaHI enzyme that is crucial for its activity. Analysis of the amino acid sequence of the BsaHI methyltransferase enzyme led us to propose two new motifs that can be used in the diagnosis of the recognition sequence of the cytosine C5-methyltransferases.

  8. Genome Sequencing Reveals Loci under Artificial Selection that Underlie Disease Phenotypes in the Laboratory Rat

    NARCIS (Netherlands)

    Atanur, Santosh S.; Diaz, Ana Garcia; Maratou, Klio; Sarkis, Allison; Rotival, Maxime; Game, Laurence; Tschannen, Michael R.; Kaisaki, Pamela J.; Otto, Georg W.; Ma, Man Chun John; Keane, Thomas M.; Hummel, Oliver; Saar, Kathrin; Chen, Wei; Guryev, Victor; Gopalakrishnan, Kathirvel; Garrett, Michael R.; Joe, Bina; Citterio, Lorena; Bianchi, Giuseppe; McBride, Martin; Dominiczak, Anna; Adams, David J.; Serikawa, Tadao; Flicek, Paul; Cuppen, Edwin; Hubner, Norbert; Petretto, Enrico; Gauguier, Dominique; Kwitek, Anne; Jacob, Howard; Aitman, Timothy J.

    2013-01-01

    Large numbers of inbred laboratory rat strains have been developed for a range of complex disease phenotypes. To gain insights into the evolutionary pressures underlying selection for these phenotypes, we sequenced the genomes of 27 rat strains, including 11 models of hypertension, diabetes, and

  9. Low diversity in the mitogenome of sperm whales revealed by next-generation sequencing

    Science.gov (United States)

    Alana Alexander; Debbie Steel; Beth Slikas; Kendra Hoekzema; Colm Carraher; Matthew Parks; Richard Cronn; C. Scott Baker

    2012-01-01

    Large population sizes and global distributions generally associate with high mitochondrial DNA control region (CR) diversity. The sperm whale (Physeter macrocephalus) is an exception, showing low CR diversity relative to other cetaceans; however, diversity levels throughout the remainder of the sperm whale mitogenome are unknown. We sequenced 20...

  10. Ultra-deep sequencing reveals the subclonal structure and genomic evolution of oral squamous cell carcinoma

    DEFF Research Database (Denmark)

    Tabatabaeifar, Siavosh; Thomassen, Mads; Larsen, Martin Jakob

    Background: Oral squamous cell carcinoma (OSCC), a subgroup of head and neck squamous cell carcinoma (HNSCC), is primarily caused by alcohol consumption and tobacco use. Recent DNA sequencing studies suggests that HNSCC are very heterogeneous between patients; however the intra-patient subclonal...

  11. 454-sequencing reveals stochastic local reassembly and high disturbance tolerance within arbuscular mycorrhizal fungal communities

    DEFF Research Database (Denmark)

    Lekberg, Karin Ylva Margareta; Schnoor, Tim; Kjøller, Rasmus

    2012-01-01

    unpredictable, with approximately 40%of all sequences within a sample belonging to a single OTU of varying identity. The distribution of two plant species that are often poorly colonized by AMfungi (Dianthus deltoides and Carex arenaria) correlated significantly with the OTU composition, which may indicate...

  12. Genome Microscale Heterogeneity among Wild Potatoes Revealed by Diversity Arrays Technology Marker Sequences

    Directory of Open Access Journals (Sweden)

    Alessandra Traini

    2013-01-01

    Full Text Available Tuber-bearing potato species possess several genes that can be exploited to improve the genetic background of the cultivated potato Solanum tuberosum. Among them, S. bulbocastanum and S. commersonii are well known for their strong resistance to environmental stresses. However, scant information is available for these species in terms of genome organization, gene function, and regulatory networks. Consequently, genomic tools to assist breeding are meager, and efficient exploitation of these species has been limited so far. In this paper, we employed the reference genome sequences from cultivated potato and tomato and a collection of sequences of 1,423 potato Diversity Arrays Technology (DArT markers that show polymorphic representation across the genomes of S. bulbocastanum and/or S. commersonii genotypes. Our results highlighted microscale genome sequence heterogeneity that may play a significant role in functional and structural divergence between related species. Our analytical approach provides knowledge of genome structural and sequence variability that could not be detected by transcriptome and proteome approaches.

  13. Genome Microscale Heterogeneity among Wild Potatoes Revealed by Diversity Arrays Technology Marker Sequences.

    Science.gov (United States)

    Traini, Alessandra; Iorizzo, Massimo; Mann, Harpartap; Bradeen, James M; Carputo, Domenico; Frusciante, Luigi; Chiusano, Maria Luisa

    2013-01-01

    Tuber-bearing potato species possess several genes that can be exploited to improve the genetic background of the cultivated potato Solanum tuberosum. Among them, S. bulbocastanum and S. commersonii are well known for their strong resistance to environmental stresses. However, scant information is available for these species in terms of genome organization, gene function, and regulatory networks. Consequently, genomic tools to assist breeding are meager, and efficient exploitation of these species has been limited so far. In this paper, we employed the reference genome sequences from cultivated potato and tomato and a collection of sequences of 1,423 potato Diversity Arrays Technology (DArT) markers that show polymorphic representation across the genomes of S. bulbocastanum and/or S. commersonii genotypes. Our results highlighted microscale genome sequence heterogeneity that may play a significant role in functional and structural divergence between related species. Our analytical approach provides knowledge of genome structural and sequence variability that could not be detected by transcriptome and proteome approaches.

  14. Sequencing Chromosomal Abnormalities Reveals Neurodevelopmental Loci that Confer Risk across Diagnostic Boundaries

    DEFF Research Database (Denmark)

    Talkowski, Michael E.; Rosenfeld, Jill A.; Blumenthal, Ian

    2012-01-01

    Sequencing of balanced chromosomal abnormalities, combined with convergent genomic studies of gene expression, copy-number variation, and genome-wide association, identifies 22 new loci that contribute to autism and related neurodevelopmental disorders. These data support a polygenic risk model...

  15. High diversity of picornaviruses in rats from different continents revealed by deep sequencing

    DEFF Research Database (Denmark)

    Arn Hansen, Thomas; Mollerup, Sarah; Nguyen, Nam-Phuong

    2016-01-01

    Outbreaks of zoonotic diseases in humans and livestock are not uncommon, and an important component in containment of such emerging viral diseases is rapid and reliable diagnostics. Such methods are often PCR-based and hence require the availability of sequence data from the pathogen. Rattus norv...

  16. High diversity of picornaviruses in rats from different continents revealed by deep sequencing

    DEFF Research Database (Denmark)

    Arn Hansen, Thomas; Mollerup, Sarah; Nguyen, Nam-Phuong

    2016-01-01

    Outbreaks of zoonotic diseases in humans and livestock are not uncommon, and an important component in containment of such emerging viral diseases is rapid and reliable diagnostics. Such methods are often PCR-based and hence require the availability of sequence data from the pathogen. Rattus...

  17. Oil palm genome sequence reveals divergence of interfertile species in old and new worlds

    Science.gov (United States)

    Singh, Rajinder; Ong-Abdullah, Meilina; Low, Eng-Ti Leslie; Manaf, Mohamad Arif Abdul; Rosli, Rozana; Nookiah, Rajanaidu; Ooi, Leslie Cheng-Li; Ooi, Siew–Eng; Chan, Kuang-Lim; Halim, Mohd Amin; Azizi, Norazah; Nagappan, Jayanthi; Bacher, Blaire; Lakey, Nathan; Smith, Steven W; He, Dong; Hogan, Michael; Budiman, Muhammad A; Lee, Ernest K; DeSalle, Rob; Kudrna, David; Goicoechea, Jose Louis; Wing, Rod; Wilson, Richard K; Fulton, Robert S; Ordway, Jared M; Martienssen, Robert A; Sambanthamurthi, Ravigadevi

    2013-01-01

    Oil palm is the most productive oil-bearing crop. Planted on only 5% of the total vegetable oil acreage, palm oil accounts for 33% of vegetable oil, and 45% of edible oil worldwide, but increased cultivation competes with dwindling rainforest reserves. We report the 1.8 gigabase (Gb) genome sequence of the African oil palm Elaeis guineensis, the predominant source of worldwide oil production. 1.535 Gb of assembled sequence and transcriptome data from 30 tissue types were used to predict at least 34,802 genes, including oil biosynthesis genes and homologues of WRINKLED1 (WRI1), and other transcriptional regulators1, which are highly expressed in the kernel. We also report the draft sequence of the S. American oil palm Elaeis oleifera, which has the same number of chromosomes (2n=32) and produces fertile interspecific hybrids with E. guineensis2, but appears to have diverged in the new world. Segmental duplications of chromosome arms define the palaeotetraploid origin of palm trees. The oil palm sequence enables the discovery of genes for important traits as well as somaclonal epigenetic alterations which restrict the use of clones in commercial plantings3, and thus helps achieve sustainability for biofuels and edible oils, reducing the rainforest footprint of this tropical plantation crop. PMID:23883927

  18. Comment on "Protein sequences from mastodon and Tyrannosaurus rex revealed by mass spectrometry".

    Science.gov (United States)

    Pevzner, Pavel A; Kim, Sangtae; Ng, Julio

    2008-08-22

    Asara et al. (Reports, 13 April 2007, p. 280) reported sequencing of Tyrannosaurus rex proteins and used them to establish the evolutionary relationships between birds and dinosaurs. We argue that the reported T. rex peptides may represent statistical artifacts and call for complete data release to enable experimental and computational verification of their findings.

  19. Deep-sequencing revealed Citrus bark cracking viroid (CBCVd) as a highly aggressive pathogen on hop

    Czech Academy of Sciences Publication Activity Database

    Jakše, J.; Radišek, S.; Pokorn, T.; Matoušek, Jaroslav; Javornik, B.

    2015-01-01

    Roč. 64, č. 4 (2015), s. 831-842 ISSN 0032-0862 R&D Projects: GA MŠk(CZ) LH14255 Institutional support: RVO:60077344 Keywords : Bioinformatic * Citrus bark cracking viroid * Hop * Next-generation sequencing Subject RIV: EB - Genetics ; Molecular Biology Impact factor: 2.383, year: 2015

  20. Sequence exploration reveals information bias among molecular markers used in phylogenetic reconstruction for Colletotrichum species.

    Science.gov (United States)

    Rampersad, Sephra N; Hosein, Fazeeda N; Carrington, Christine Vf

    2014-01-01

    The Colletotrichum gloeosporioides species complex is among the most destructive fungal plant pathogens in the world, however, identification of isolates of quarantine importance to the intra-specific level is confounded by a number of factors that affect phylogenetic reconstruction. Information bias and quality parameters were investigated to determine whether nucleotide sequence alignments and phylogenetic trees accurately reflect the genetic diversity and phylogenetic relatedness of individuals. Sequence exploration of GAPDH, ACT, TUB2 and ITS markers indicated that the query sequences had different patterns of nucleotide substitution but were without evidence of base substitution saturation. Regions of high entropy were much more dispersed in the ACT and GAPDH marker alignments than for the ITS and TUB2 markers. A discernible bimodal gap in the genetic distance frequency histograms was produced for the ACT and GAPDH markers which indicated successful separation of intra- and inter-specific sequences in the data set. Overall, analyses indicated clear differences in the ability of these markers to phylogenetically separate individuals to the intra-specific level which coincided with information bias.

  1. Community structure of arbuscular mycorrhizal fungi in undisturbed vegetation revealed by analyses of LSU rdna sequences

    DEFF Research Database (Denmark)

    Rosendahl, Søren; Holtgrewe-Stukenbrock, Eva

    2004-01-01

    Arbuscular mycorrhizal fungi (AMF) form a mutualistic symbiosis with plant roots and are found in most ecosystems. In this study the community structure of AMF in a clade of the genus Glomus was examined in undisturbed costal grassland using LSU rDNA sequences amplified from roots of Hieracium...

  2. Automatic analysis of the 2015 Gorkha earthquake aftershock sequence.

    Science.gov (United States)

    Baillard, C.; Lyon-Caen, H.; Bollinger, L.; Rietbrock, A.; Letort, J.; Adhikari, L. B.

    2016-12-01

    The Mw 7.8 Gorkha earthquake, that partially ruptured the Main Himalayan Thrust North of Kathmandu on the 25th April 2015, was the largest and most catastrophic earthquake striking Nepal since the great M8.4 1934 earthquake. This mainshock was followed by multiple aftershocks, among them, two notable events that occurred on the 12th May with magnitudes of 7.3 Mw and 6.3 Mw. Due to these recent events it became essential for the authorities and for the scientific community to better evaluate the seismic risk in the region through a detailed analysis of the earthquake catalog, amongst others, the spatio-temporal distribution of the Gorkha aftershock sequence. Here we complement this first study by doing a microseismic study using seismic data coming from the eastern part of the Nepalese Seismological Center network associated to one broadband station in Everest. Our primary goal is to deliver an accurate catalog of the aftershock sequence. Due to the exceptional number of events detected we performed an automatic picking/locating procedure which can be splitted in 4 steps: 1) Coarse picking of the onsets using a classical STA/LTA picker, 2) phase association of picked onsets to detect and declare seismic events, 3) Kurtosis pick refinement around theoretical arrival times to increase picking and location accuracy and, 4) local magnitude calculation based amplitude of waveforms. This procedure is time efficient ( 1 sec/event), reduces considerably the location uncertainties ( 2 to 5 km errors) and increases the number of events detected compared to manual processing. Indeed, the automatic detection rate is 10 times higher than the manual detection rate. By comparing to the USGS catalog we were able to give a new attenuation law to compute local magnitudes in the region. A detailed analysis of the seismicity shows a clear migration toward the east of the region and a sudden decrease of seismicity 100 km east of Kathmandu which may reveal the presence of a tectonic

  3. Combining genomic sequencing methods to explore viral diversity and reveal potential virus-host interactions

    Directory of Open Access Journals (Sweden)

    Cheryl-Emiliane Tien Chow

    2015-04-01

    Full Text Available Viral diversity and virus-host interactions in oxygen-starved regions of the ocean, also known as oxygen minimum zones (OMZs, remain relatively unexplored. Microbial community metabolism in OMZs alters nutrient and energy flow through marine food webs, resulting in biological nitrogen loss and greenhouse gas production. Thus, viruses infecting OMZ microbes have the potential to modulate community metabolism with resulting feedback on ecosystem function. Here, we describe viral communities inhabiting oxic surface (10m and oxygen-starved basin (200m waters of Saanich Inlet, a seasonally anoxic fjord on the coast of Vancouver Island, British Columbia using viral metagenomics and complete viral fosmid sequencing on samples collected between April 2007 and April 2010. Of 6459 open reading frames (ORFs predicted across all 34 viral fosmids, 77.6% (n=5010 had no homology to reference viral genomes. These fosmids recruited a higher proportion of viral metagenomic sequences from Saanich Inlet than from nearby northeastern subarctic Pacific Ocean (Line P waters, indicating differences in the viral communities between coastal and open ocean locations. While functional annotations of fosmid ORFs were limited, recruitment to NCBI’s non-redundant ‘nr’ database and publicly available single-cell genomes identified putative viruses infecting marine thaumarchaeal and SUP05 proteobacteria to provide potential host linkages with relevance to coupled biogeochemical cycling processes in OMZ waters. Taken together, these results highlight the power of coupled analyses of multiple sequence data types, such as viral metagenomic and fosmid sequence data with prokaryotic single cell genomes, to chart viral diversity, elucidate genomic and ecological contexts for previously unclassifiable viral sequences, and identify novel host interactions in natural and engineered ecosystems.

  4. Whole exome sequencing reveals concomitant mutations of multiple FA genes in individual Fanconi anemia patients.

    Science.gov (United States)

    Chang, Lixian; Yuan, Weiping; Zeng, Huimin; Zhou, Quanquan; Wei, Wei; Zhou, Jianfeng; Li, Miaomiao; Wang, Xiaomin; Xu, Mingjiang; Yang, Fengchun; Yang, Yungui; Cheng, Tao; Zhu, Xiaofan

    2014-05-15

    Fanconi anemia (FA) is a rare inherited genetic syndrome with highly variable clinical manifestations. Fifteen genetic subtypes of FA have been identified. Traditional complementation tests for grouping studies have been used generally in FA patients and in stepwise methods to identify the FA type, which can result in incomplete genetic information from FA patients. We diagnosed five pediatric patients with FA based on clinical manifestations, and we performed exome sequencing of peripheral blood specimens from these patients and their family members. The related sequencing data were then analyzed by bioinformatics, and the FANC gene mutations identified by exome sequencing were confirmed by PCR re-sequencing. Homozygous and compound heterozygous mutations of FANC genes were identified in all of the patients. The FA subtypes of the patients included FANCA, FANCM and FANCD2. Interestingly, four FA patients harbored multiple mutations in at least two FA genes, and some of these mutations have not been previously reported. These patients' clinical manifestations were vastly different from each other, as were their treatment responses to androstanazol and prednisone. This finding suggests that heterozygous mutation(s) in FA genes could also have diverse biological and/or pathophysiological effects on FA patients or FA gene carriers. Interestingly, we were not able to identify de novo mutations in the genes implicated in DNA repair pathways when the sequencing data of patients were compared with those of their parents. Our results indicate that Chinese FA patients and carriers might have higher and more complex mutation rates in FANC genes than have been conventionally recognized. Testing of the fifteen FANC genes in FA patients and their family members should be a regular clinical practice to determine the optimal care for the individual patient, to counsel the family and to obtain a better understanding of FA pathophysiology.

  5. Hybrid sequencing approach applied to human fecal metagenomic clone libraries revealed clones with potential biotechnological applications.

    Science.gov (United States)

    Džunková, Mária; D'Auria, Giuseppe; Pérez-Villarroya, David; Moya, Andrés

    2012-01-01

    Natural environments represent an incredible source of microbial genetic diversity. Discovery of novel biomolecules involves biotechnological methods that often require the design and implementation of biochemical assays to screen clone libraries. However, when an assay is applied to thousands of clones, one may eventually end up with very few positive clones which, in most of the cases, have to be "domesticated" for downstream characterization and application, and this makes screening both laborious and expensive. The negative clones, which are not considered by the selected assay, may also have biotechnological potential; however, unfortunately they would remain unexplored. Knowledge of the clone sequences provides important clues about potential biotechnological application of the clones in the library; however, the sequencing of clones one-by-one would be very time-consuming and expensive. In this study, we characterized the first metagenomic clone library from the feces of a healthy human volunteer, using a method based on 454 pyrosequencing coupled with a clone-by-clone Sanger end-sequencing. Instead of whole individual clone sequencing, we sequenced 358 clones in a pool. The medium-large insert (7-15 kb) cloning strategy allowed us to assemble these clones correctly, and to assign the clone ends to maintain the link between the position of a living clone in the library and the annotated contig from the 454 assembly. Finally, we found several open reading frames (ORFs) with previously described potential medical application. The proposed approach allows planning ad-hoc biochemical assays for the clones of interest, and the appropriate sub-cloning strategy for gene expression in suitable vectors/hosts.

  6. Hybrid sequencing approach applied to human fecal metagenomic clone libraries revealed clones with potential biotechnological applications.

    Directory of Open Access Journals (Sweden)

    Mária Džunková

    Full Text Available Natural environments represent an incredible source of microbial genetic diversity. Discovery of novel biomolecules involves biotechnological methods that often require the design and implementation of biochemical assays to screen clone libraries. However, when an assay is applied to thousands of clones, one may eventually end up with very few positive clones which, in most of the cases, have to be "domesticated" for downstream characterization and application, and this makes screening both laborious and expensive. The negative clones, which are not considered by the selected assay, may also have biotechnological potential; however, unfortunately they would remain unexplored. Knowledge of the clone sequences provides important clues about potential biotechnological application of the clones in the library; however, the sequencing of clones one-by-one would be very time-consuming and expensive. In this study, we characterized the first metagenomic clone library from the feces of a healthy human volunteer, using a method based on 454 pyrosequencing coupled with a clone-by-clone Sanger end-sequencing. Instead of whole individual clone sequencing, we sequenced 358 clones in a pool. The medium-large insert (7-15 kb cloning strategy allowed us to assemble these clones correctly, and to assign the clone ends to maintain the link between the position of a living clone in the library and the annotated contig from the 454 assembly. Finally, we found several open reading frames (ORFs with previously described potential medical application. The proposed approach allows planning ad-hoc biochemical assays for the clones of interest, and the appropriate sub-cloning strategy for gene expression in suitable vectors/hosts.

  7. Phylogenetic inferences of Nepenthes species in Peninsular Malaysia revealed by chloroplast (trnL intron) and nuclear (ITS) DNA sequences.

    Science.gov (United States)

    Bunawan, Hamidun; Yen, Choong Chee; Yaakop, Salmah; Noor, Normah Mohd

    2017-01-26

    The chloroplastic trnL intron and the nuclear internal transcribed spacer (ITS) region were sequenced for 11 Nepenthes species recorded in Peninsular Malaysia to examine their phylogenetic relationship and to evaluate the usage of trnL intron and ITS sequences for phylogenetic reconstruction of this genus. Phylogeny reconstruction was carried out using neighbor-joining, maximum parsimony and Bayesian analyses. All the trees revealed two major clusters, a lowland group consisting of N. ampullaria, N. mirabilis, N. gracilis and N. rafflesiana, and another containing both intermediately distributed species (N. albomarginata and N. benstonei) and four highland species (N. sanguinea, N. macfarlanei, N. ramispina and N. alba). The trnL intron and ITS sequences proved to provide phylogenetic informative characters for deriving a phylogeny of Nepenthes species in Peninsular Malaysia. To our knowledge, this is the first molecular phylogenetic study of Nepenthes species occurring along an altitudinal gradient in Peninsular Malaysia.

  8. Whole genome sequencing reveals complex evolution patterns of multidrug-resistant Mycobacterium tuberculosis Beijing strains in patients.

    Directory of Open Access Journals (Sweden)

    Matthias Merker

    Full Text Available Multidrug-resistant (MDR Mycobacterium tuberculosis complex (MTBC strains represent a major threat for tuberculosis (TB control. Treatment of MDR-TB patients is long and less effective, resulting in a significant number of treatment failures. The development of further resistances leads to extensively drug-resistant (XDR variants. However, data on the individual reasons for treatment failure, e.g. an induced mutational burst, and on the evolution of bacteria in the patient are only sparsely available. To address this question, we investigated the intra-patient evolution of serial MTBC isolates obtained from three MDR-TB patients undergoing longitudinal treatment, finally leading to XDR-TB. Sequential isolates displayed identical IS6110 fingerprint patterns, suggesting the absence of exogenous re-infection. We utilized whole genome sequencing (WGS to screen for variations in three isolates from Patient A and four isolates from Patient B and C, respectively. Acquired polymorphisms were subsequently validated in up to 15 serial isolates by Sanger sequencing. We determined eight (Patient A and nine (Patient B polymorphisms, which occurred in a stepwise manner during the course of the therapy and were linked to resistance or a potential compensatory mechanism. For both patients, our analysis revealed the long-term co-existence of clonal subpopulations that displayed different drug resistance allele combinations. Out of these, the most resistant clone was fixed in the population. In contrast, baseline and follow-up isolates of Patient C were distinguished each by eleven unique polymorphisms, indicating an exogenous re-infection with an XDR strain not detected by IS6110 RFLP typing. Our study demonstrates that intra-patient microevolution of MDR-MTBC strains under longitudinal treatment is more complex than previously anticipated. However, a mutator phenotype was not detected. The presence of different subpopulations might confound phenotypic and

  9. Genetic mutation analysis of human gastric adenocarcinomas using ion torrent sequencing platform.

    Directory of Open Access Journals (Sweden)

    Zhi Xu

    Full Text Available Gastric cancer is the one of the major causes of cancer-related death, especially in Asia. Gastric adenocarcinoma, the most common type of gastric cancer, is heterogeneous and its incidence and cause varies widely with geographical regions, gender, ethnicity, and diet. Since unique mutations have been observed in individual human cancer samples, identification and characterization of the molecular alterations underlying individual gastric adenocarcinomas is a critical step for developing more effective, personalized therapies. Until recently, identifying genetic mutations on an individual basis by DNA sequencing remained a daunting task. Recent advances in new next-generation DNA sequencing technologies, such as the semiconductor-based Ion Torrent sequencing platform, makes DNA sequencing cheaper, faster, and more reliable. In this study, we aim to identify genetic mutations in the genes which are targeted by drugs in clinical use or are under development in individual human gastric adenocarcinoma samples using Ion Torrent sequencing. We sequenced 737 loci from 45 cancer-related genes in 238 human gastric adenocarcinoma samples using the Ion Torrent Ampliseq Cancer Panel. The sequencing analysis revealed a high occurrence of mutations along the TP53 locus (9.7% in our sample set. Thus, this study indicates the utility of a cost and time efficient tool such as Ion Torrent sequencing to screen cancer mutations for the development of personalized cancer therapy.

  10. Sequence and phylogenetic analysis of chicken anaemia virus obtained from backyard and commercial chickens in Nigeria.

    Science.gov (United States)

    Oluwayelu, D O; Todd, D; Olaleye, O D

    2008-12-01

    This work reports the first molecular analysis study of chicken anaemia virus (CAV) in backyard chickens in Africa using molecular cloning and sequence analysis to characterize CAV strains obtained from commercial chickens and Nigerian backyard chickens. Partial VP1 gene sequences were determined for three CAVs from commercial chickens and for six CAV variants present in samples from a backyard chicken. Multiple alignment analysis revealed that the 6% and 4% nucleotide diversity obtained respectively for the commercial and backyard chicken strains translated to only 2% amino acid diversity for each breed. Overall, the amino acid composition of Nigerian CAVs was found to be highly conserved. Since the partial VP1 gene sequence of two backyard chicken cloned CAV strains (NGR/CI-8 and NGR/CI-9) were almost identical and evolutionarily closely related to the commercial chicken strains NGR-1, and NGR-4 and NGR-5, respectively, we concluded that CAV infections had crossed the farm boundary.

  11. Whole-exome sequencing reveals the spectrum of gene mutations and the clonal evolution patterns in paediatric acute myeloid leukaemia.

    Science.gov (United States)

    Shiba, Norio; Yoshida, Kenichi; Shiraishi, Yuichi; Okuno, Yusuke; Yamato, Genki; Hara, Yusuke; Nagata, Yasunobu; Chiba, Kenichi; Tanaka, Hiroko; Terui, Kiminori; Kato, Motohiro; Park, Myoung-Ja; Ohki, Kentaro; Shimada, Akira; Takita, Junko; Tomizawa, Daisuke; Kudo, Kazuko; Arakawa, Hirokazu; Adachi, Souichi; Taga, Takashi; Tawa, Akio; Ito, Etsuro; Horibe, Keizo; Sanada, Masashi; Miyano, Satoru; Ogawa, Seishi; Hayashi, Yasuhide

    2016-11-01

    Acute myeloid leukaemia (AML) is a molecularly and clinically heterogeneous disease. Targeted sequencing efforts have identified several mutations with diagnostic and prognostic values in KIT, NPM1, CEBPA and FLT3 in both adult and paediatric AML. In addition, massively parallel sequencing enabled the discovery of recurrent mutations (i.e. IDH1/2 and DNMT3A) in adult AML. In this study, whole-exome sequencing (WES) of 22 paediatric AML patients revealed mutations in components of the cohesin complex (RAD21 and SMC3), BCORL1 and ASXL2 in addition to previously known gene mutations. We also revealed intratumoural heterogeneities in many patients, implicating multiple clonal evolution events in the development of AML. Furthermore, targeted deep sequencing in 182 paediatric AML patients identified three major categories of recurrently mutated genes: cohesion complex genes [STAG2, RAD21 and SMC3 in 17 patients (8·3%)], epigenetic regulators [ASXL1/ASXL2 in 17 patients (8·3%), BCOR/BCORL1 in 7 patients (3·4%)] and signalling molecules. We also performed WES in four patients with relapsed AML. Relapsed AML evolved from one of the subclones at the initial phase and was accompanied by many additional mutations, including common driver mutations that were absent or existed only with lower allele frequency in the diagnostic samples, indicating a multistep process causing leukaemia recurrence. © 2016 John Wiley & Sons Ltd.

  12. Atypical case of Wolfram syndrome revealed through targeted exome sequencing in a patient with suspected mitochondrial disease.

    Science.gov (United States)

    Lieber, Daniel S; Vafai, Scott B; Horton, Laura C; Slate, Nancy G; Liu, Shangtao; Borowsky, Mark L; Calvo, Sarah E; Schmahmann, Jeremy D; Mootha, Vamsi K

    2012-01-06

    Mitochondrial diseases comprise a diverse set of clinical disorders that affect multiple organ systems with varying severity and age of onset. Due to their clinical and genetic heterogeneity, these diseases are difficult to diagnose. We have developed a targeted exome sequencing approach to improve our ability to properly diagnose mitochondrial diseases and apply it here to an individual patient. Our method targets mitochondrial DNA (mtDNA) and the exons of 1,600 nuclear genes involved in mitochondrial biology or Mendelian disorders with multi-system phenotypes, thereby allowing for simultaneous evaluation of multiple disease loci. Targeted exome sequencing was performed on a patient initially suspected to have a mitochondrial disorder. The patient presented with diabetes mellitus, diffuse brain atrophy, autonomic neuropathy, optic nerve atrophy, and a severe amnestic syndrome. Further work-up revealed multiple heteroplasmic mtDNA deletions as well as profound thiamine deficiency without a clear nutritional cause. Targeted exome sequencing revealed a homozygous c.1672C > T (p.R558C) missense mutation in exon 8 of WFS1 that has previously been reported in a patient with Wolfram syndrome. This case demonstrates how clinical application of next-generation sequencing technology can enhance the diagnosis of patients suspected to have rare genetic disorders. Furthermore, the finding of unexplained thiamine deficiency in a patient with Wolfram syndrome suggests a potential link between WFS1 biology and thiamine metabolism that has implications for the clinical management of Wolfram syndrome patients.

  13. Atypical case of Wolfram syndrome revealed through targeted exome sequencing in a patient with suspected mitochondrial disease

    Directory of Open Access Journals (Sweden)

    Lieber Daniel S

    2012-01-01

    Full Text Available Abstract Background Mitochondrial diseases comprise a diverse set of clinical disorders that affect multiple organ systems with varying severity and age of onset. Due to their clinical and genetic heterogeneity, these diseases are difficult to diagnose. We have developed a targeted exome sequencing approach to improve our ability to properly diagnose mitochondrial diseases and apply it here to an individual patient. Our method targets mitochondrial DNA (mtDNA and the exons of 1,600 nuclear genes involved in mitochondrial biology or Mendelian disorders with multi-system phenotypes, thereby allowing for simultaneous evaluation of multiple disease loci. Case Presentation Targeted exome sequencing was performed on a patient initially suspected to have a mitochondrial disorder. The patient presented with diabetes mellitus, diffuse brain atrophy, autonomic neuropathy, optic nerve atrophy, and a severe amnestic syndrome. Further work-up revealed multiple heteroplasmic mtDNA deletions as well as profound thiamine deficiency without a clear nutritional cause. Targeted exome sequencing revealed a homozygous c.1672C > T (p.R558C missense mutation in exon 8 of WFS1 that has previously been reported in a patient with Wolfram syndrome. Conclusion This case demonstrates how clinical application of next-generation sequencing technology can enhance the diagnosis of patients suspected to have rare genetic disorders. Furthermore, the finding of unexplained thiamine deficiency in a patient with Wolfram syndrome suggests a potential link between WFS1 biology and thiamine metabolism that has implications for the clinical management of Wolfram syndrome patients.

  14. Genetic variation in Rhodomyrtus tomentosa (Kemunting) populations from Malaysia as revealed by inter-simple sequence repeat markers.

    Science.gov (United States)

    Hue, T S; Abdullah, T L; Abdullah, N A P; Sinniah, U R

    2015-12-14

    Kemunting (Rhodomyrtus tomentosa) from the Myrtaceae family, is native to Malaysia. It is widely used in traditional medicine to treat various illnesses and possesses significant antibacterial properties. In addition, it has great potential as ornamental in landscape design. Genetic variability studies are important for the rational management and conservation of genetic material. In the present study, inter-simple sequence repeat markers were used to assess the genetic diversity of 18 R. tomentosa populations collected from ten states of Peninsular Malaysia. The 11 primers selected generated 173 bands that ranged in size from 1.6 kb to 130 bp, which corresponded to an average of 15.73 bands per primer. Of these bands, 97.69% (169 in total) were polymorphic. High genetic diversity was documented at the species level (H(T) = 0.2705; I = 0.3973; PPB = 97.69%) but there was a low diversity at population level (H(S) = 0.0073; I = 0 .1085; PPB = 20.14%). The high level of genetic differentiation revealed by G(ST) (73%) and analysis of molecular variance (63%), together with the limited gene flow among population (N(m) = 0.1851), suggests that the populations examined are isolated. Results from an unweighted pair group method with arithmetic mean dendrogram and principal coordinate analysis clearly grouped the populations into two geographic groups. This clear grouping can also be demonstrated by the significant Mantel test (r = 0.581, P = 0.001). We recommend that all the R. tomentosa populations be preserved in conservation program.

  15. RNA deep sequencing reveals novel candidate genes and polymorphisms in boar testis and liver tissues with divergent androstenone levels.

    Directory of Open Access Journals (Sweden)

    Asep Gunawan

    Full Text Available Boar taint is an unpleasant smell and taste of pork meat derived from some entire male pigs. The main causes of boar taint are the two compounds androstenone (5α-androst-16-en-3-one and skatole (3-methylindole. It is crucial to understand the genetic mechanism of boar taint to select pigs for lower androstenone levels and thus reduce boar taint. The aim of the present study was to investigate transcriptome differences in boar testis and liver tissues with divergent androstenone levels using RNA deep sequencing (RNA-Seq. The total number of reads produced for each testis and liver sample ranged from 13,221,550 to 33,206,723 and 12,755,487 to 46,050,468, respectively. In testis samples 46 genes were differentially regulated whereas 25 genes showed differential expression in the liver. The fold change values ranged from -4.68 to 2.90 in testis samples and -2.86 to 3.89 in liver samples. Differentially regulated genes in high androstenone testis and liver samples were enriched in metabolic processes such as lipid metabolism, small molecule biochemistry and molecular transport. This study provides evidence for transcriptome profile and gene polymorphisms of boars with divergent androstenone level using RNA-Seq technology. Digital gene expression analysis identified candidate genes in flavin monooxygenease family, cytochrome P450 family and hydroxysteroid dehydrogenase family. Moreover, polymorphism and association analysis revealed mutation in IRG6, MX1, IFIT2, CYP7A1, FMO5 and KRT18 genes could be potential candidate markers for androstenone levels in boars. Further studies are required for proving the role of candidate genes to be used in genomic selection against boar taint in pig breeding programs.

  16. Single-cell RNA sequencing reveals metallothionein heterogeneity during hESC differentiation to definitive endoderm

    Directory of Open Access Journals (Sweden)

    Junjie Lu

    2018-04-01

    Full Text Available Differentiation of human pluripotent stem cells towards definitive endoderm (DE is the critical first step for generating cells comprising organs such as the gut, liver, pancreas and lung. This in-vitro differentiation process generates a heterogeneous population with a proportion of cells failing to differentiate properly and maintaining expression of pluripotency factors such as Oct4. RNA sequencing of single cells collected at four time points during a 4-day DE differentiation identified high expression of metallothionein genes in the residual Oct4-positive cells that failed to differentiate to DE. Using X-ray fluorescence microscopy and multi-isotope mass spectrometry, we discovered that high intracellular zinc level corresponds with persistent Oct4 expression and failure to differentiate. This study improves our understanding of the cellular heterogeneity during in-vitro directed differentiation and provides a valuable resource to improve DE differentiation efficiency. Keywords: hPSC, Differentiation, Definitive endoderm, Heterogeneity, Single cell, RNA sequencing

  17. Deep sequencing reveals persistence of cell-associated mumps vaccine virus in chronic encephalitis.

    Science.gov (United States)

    Morfopoulou, Sofia; Mee, Edward T; Connaughton, Sarah M; Brown, Julianne R; Gilmour, Kimberly; Chong, W K 'Kling'; Duprex, W Paul; Ferguson, Deborah; Hubank, Mike; Hutchinson, Ciaran; Kaliakatsos, Marios; McQuaid, Stephen; Paine, Simon; Plagnol, Vincent; Ruis, Christopher; Virasami, Alex; Zhan, Hong; Jacques, Thomas S; Schepelmann, Silke; Qasim, Waseem; Breuer, Judith

    2017-01-01

    Routine childhood vaccination against measles, mumps and rubella has virtually abolished virus-related morbidity and mortality. Notwithstanding this, we describe here devastating neurological complications associated with the detection of live-attenuated mumps virus Jeryl Lynn (MuV JL5 ) in the brain of a child who had undergone successful allogeneic transplantation for severe combined immunodeficiency (SCID). This is the first confirmed report of MuV JL5 associated with chronic encephalitis and highlights the need to exclude immunodeficient individuals from immunisation with live-attenuated vaccines. The diagnosis was only possible by deep sequencing of the brain biopsy. Sequence comparison of the vaccine batch to the MuV JL5 isolated from brain identified biased hypermutation, particularly in the matrix gene, similar to those found in measles from cases of SSPE. The findings provide unique insights into the pathogenesis of paramyxovirus brain infections.

  18. Sequence diversity in haloalkane dehalogenases, as revealed by PCR using family-specific primers

    Czech Academy of Sciences Publication Activity Database

    Kotík, Michael; Faměrová, Veronika

    2012-01-01

    Roč. 88, č. 2 (2012), s. 212-217 ISSN 0167-7012 R&D Projects: GA ČR GAP504/10/0137; GA ČR GAP207/10/0135 Institutional research plan: CEZ:AV0Z50200510 Keywords : Dehalogenation * Consensus sequence * Degenerate PCR primer Subject RIV: EE - Microbiology, Virology Impact factor: 2.161, year: 2012

  19. Genetic signatures of adaptation revealed from transcriptome sequencing of Arctic and red foxes

    OpenAIRE

    Kumar, Vikas; Kutschera, Verena E.; Nilsson, Maria A.; Janke, Axel

    2015-01-01

    Background The genus Vulpes (true foxes) comprises numerous species that inhabit a wide range of habitats and climatic conditions, including one species, the Arctic fox (Vulpes lagopus) which is adapted to the arctic region. A close relative to the Arctic fox, the red fox (Vulpes vulpes), occurs in subarctic to subtropical habitats. To study the genetic basis of their adaptations to different environments, transcriptome sequences from two Arctic foxes and one red fox individual were generated...

  20. Bisulfite sequencing reveals that Aspergillus flavus holds a hollow in DNA methylation.

    Directory of Open Access Journals (Sweden)

    Si-Yang Liu

    Full Text Available Aspergillus flavus first gained scientific attention for its production of aflatoxin. The underlying regulation of aflatoxin biosynthesis has been serving as a theoretical model for biosynthesis of other microbial secondary metabolites. Nevertheless, for several decades, the DNA methylation status, one of the important epigenomic modifications involved in gene regulation, in A. flavus remains to be controversial. Here, we applied bisulfite sequencing in conjunction with a biological replicate strategy to investigate the DNA methylation profiling of A. flavus genome. Both the bisulfite sequencing data and the methylome comparisons with other fungi confirm that the DNA methylation level of this fungus is negligible. Further investigation into the DNA methyltransferase of Aspergillus uncovers its close relationship with RID-like enzymes as well as its divergence with the methyltransferase of species with validated DNA methylation. The lack of repeat contents of the A. flavus' genome and the high RIP-index of the small amount of remanent repeat potentially support our speculation that DNA methylation may be absent in A. flavus or that it may possess de novo DNA methylation which occurs very transiently during the obscure sexual stage of this fungal species. This work contributes to our understanding on the DNA methylation status of A. flavus, as well as reinforces our views on the DNA methylation in fungal species. In addition, our strategy of applying bisulfite sequencing to DNA methylation detection in species with low DNA methylation may serve as a reference for later scientific investigations in other hypomethylated species.

  1. High diversity of picornaviruses in rats from different continents revealed by deep sequencing.

    Science.gov (United States)

    Hansen, Thomas Arn; Mollerup, Sarah; Nguyen, Nam-Phuong; White, Nicole E; Coghlan, Megan; Alquezar-Planas, David E; Joshi, Tejal; Jensen, Randi Holm; Fridholm, Helena; Kjartansdóttir, Kristín Rós; Mourier, Tobias; Warnow, Tandy; Belsham, Graham J; Bunce, Michael; Willerslev, Eske; Nielsen, Lars Peter; Vinner, Lasse; Hansen, Anders Johannes

    2016-08-17

    Outbreaks of zoonotic diseases in humans and livestock are not uncommon, and an important component in containment of such emerging viral diseases is rapid and reliable diagnostics. Such methods are often PCR-based and hence require the availability of sequence data from the pathogen. Rattus norvegicus (R. norvegicus) is a known reservoir for important zoonotic pathogens. Transmission may be direct via contact with the animal, for example, through exposure to its faecal matter, or indirectly mediated by arthropod vectors. Here we investigated the viral content in rat faecal matter (n=29) collected from two continents by analyzing 2.2 billion next-generation sequencing reads derived from both DNA and RNA. Among other virus families, we found sequences from members of the Picornaviridae to be abundant in the microbiome of all the samples. Here we describe the diversity of the picornavirus-like contigs including near-full-length genomes closely related to the Boone cardiovirus and Theiler's encephalomyelitis virus. From this study, we conclude that picornaviruses within R. norvegicus are more diverse than previously recognized. The virome of R. norvegicus should be investigated further to assess the full potential for zoonotic virus transmission.

  2. Selfish supernumerary chromosome reveals its origin as a mosaic of host genome and organellar sequences.

    Science.gov (United States)

    Martis, Mihaela Maria; Klemme, Sonja; Banaei-Moghaddam, Ali Mohammad; Blattner, Frank R; Macas, Jiří; Schmutzer, Thomas; Scholz, Uwe; Gundlach, Heidrun; Wicker, Thomas; Šimková, Hana; Novák, Petr; Neumann, Pavel; Kubaláková, Marie; Bauer, Eva; Haseneyer, Grit; Fuchs, Jörg; Doležel, Jaroslav; Stein, Nils; Mayer, Klaus F X; Houben, Andreas

    2012-08-14

    Supernumerary B chromosomes are optional additions to the basic set of A chromosomes, and occur in all eukaryotic groups. They differ from the basic complement in morphology, pairing behavior, and inheritance and are not required for normal growth and development. The current view is that B chromosomes are parasitic elements comparable to selfish DNA, like transposons. In contrast to transposons, they are autonomously inherited independent of the host genome and have their own mechanisms of mitotic or meiotic drive. Although B chromosomes were first described a century ago, little is known about their origin and molecular makeup. The widely accepted view is that they are derived from fragments of A chromosomes and/or generated in response to interspecific hybridization. Through next-generation sequencing of sorted A and B chromosomes, we show that B chromosomes of rye are rich in gene-derived sequences, allowing us to trace their origin to fragments of A chromosomes, with the largest parts corresponding to rye chromosomes 3R and 7R. Compared with A chromosomes, B chromosomes were also found to accumulate large amounts of specific repeats and insertions of organellar DNA. The origin of rye B chromosomes occurred an estimated ∼1.1-1.3 Mya, overlapping in time with the onset of the genus Secale (1.7 Mya). We propose a comprehensive model of B chromosome evolution, including its origin by recombination of several A chromosomes followed by capturing of additional A-derived and organellar sequences and amplification of B-specific repeats.

  3. Metabolic diversity and ecological niches of Achromatium populations revealed with single-cell genomic sequencing

    Directory of Open Access Journals (Sweden)

    Muammar eMansor

    2015-08-01

    Full Text Available Large, sulfur-cycling, calcite-precipitating bacteria in the genus Achromatium represent a significant proportion of bacterial communities near sediment-water interfaces throughout the world. Our understanding of their potentially crucial roles in calcium, carbon, sulfur, nitrogen, and iron cycling is limited because they have not been cultured or sequenced using environmental genomics approaches to date. We utilized single-cell genomic sequencing to obtain one incomplete and two nearly complete draft genomes for Achromatium collected at Warm Mineral Springs, FL. Based on 16S rRNA gene sequences, the three cells represent distinct and relatively distant Achromatium populations (91-92% identity. The draft genomes encode key genes involved in sulfur and hydrogen oxidation; oxygen, nitrogen and polysulfide respiration; carbon and nitrogen fixation; organic carbon assimilation and storage; chemotaxis; twitching motility; antibiotic resistance; and membrane transport. Known genes for iron and manganese energy metabolism were not detected. The presence of pyrophosphatase and vacuolar (V-type ATPases, which are generally rare in bacterial genomes, suggests a role for these enzymes in calcium transport, proton pumping, and/or energy generation in the membranes of calcite-containing inclusions.

  4. Sequencing of bovine herpesvirus 4 v.test strain reveals important genome features

    Directory of Open Access Journals (Sweden)

    Gillet Laurent

    2011-08-01

    Full Text Available Abstract Background Bovine herpesvirus 4 (BoHV-4 is a useful model for the human pathogenic gammaherpesviruses Epstein-Barr virus and Kaposi's Sarcoma-associated Herpesvirus. Although genome manipulations of this virus have been greatly facilitated by the cloning of the BoHV-4 V.test strain as a Bacterial Artificial Chromosome (BAC, the lack of a complete genome sequence for this strain limits its experimental use. Methods In this study, we have determined the complete sequence of BoHV-4 V.test strain by a pyrosequencing approach. Results The long unique coding region (LUR consists of 108,241 bp encoding at least 79 open reading frames and is flanked by several polyrepetitive DNA units (prDNA. As previously suggested, we showed that the prDNA unit located at the left prDNA-LUR junction (prDNA-G differs from the other prDNA units (prDNA-inner. Namely, the prDNA-G unit lacks the conserved pac-2 cleavage and packaging signal in its right terminal region. Based on the mechanisms of cleavage and packaging of herpesvirus genomes, this feature implies that only genomes bearing left and right end prDNA units are encapsulated into virions. Conclusions In this study, we have determined the complete genome sequence of the BAC-cloned BoHV-4 V.test strain and identified genome organization features that could be important in other herpesviruses.

  5. DNA Barcoding: Amplification and sequence analysis of rbcl and matK genome regions in three divergent plant species

    Directory of Open Access Journals (Sweden)

    Javed Iqbal Wattoo

    2016-11-01

    Full Text Available Background: DNA barcoding is a novel method of species identification based on nucleotide diversity of conserved sequences. The establishment and refining of plant DNA barcoding systems is more challenging due to high genetic diversity among different species. Therefore, targeting the conserved nuclear transcribed regions would be more reliable for plant scientists to reveal genetic diversity, species discrimination and phylogeny. Methods: In this study, we amplified and sequenced the chloroplast DNA regions (matk+rbcl of Solanum nigrum, Euphorbia helioscopia and Dalbergia sissoo to study the functional annotation, homology modeling and sequence analysis to allow a more efficient utilization of these sequences among different plant species. These three species represent three families; Solanaceae, Euphorbiaceae and Fabaceae respectively. Biological sequence homology and divergence of amplified sequences was studied using Basic Local Alignment Tool (BLAST. Results: Both primers (matk+rbcl showed good amplification in three species. The sequenced regions reveled conserved genome information for future identification of different medicinal plants belonging to these species. The amplified conserved barcodes revealed different levels of biological homology after sequence analysis. The results clearly showed that the use of these conserved DNA sequences as barcode primers would be an accurate way for species identification and discrimination. Conclusion: The amplification and sequencing of conserved genome regions identified a novel sequence of matK in native species of Solanum nigrum. The findings of the study would be applicable in medicinal industry to establish DNA based identification of different medicinal plant species to monitor adulteration.

  6. De novo transcriptome sequencing and sequence analysis of the malaria vector Anopheles sinensis (Diptera: Culicidae)

    Science.gov (United States)

    2014-01-01

    Background Anopheles sinensis is the major malaria vector in China and Southeast Asia. Vector control is one of the most effective measures to prevent malaria transmission. However, there is little transcriptome information available for the malaria vector. To better understand the biological basis of malaria transmission and to develop novel and effective means of vector control, there is a need to build a transcriptome dataset for functional genomics analysis by large-scale RNA sequencing (RNA-seq). Methods To provide a more comprehensive and complete transcriptome of An. sinensis, eggs, larvae, pupae, male adults and female adults RNA were pooled together for cDNA preparation, sequenced using the Illumina paired-end sequencing technology and assembled into unigenes. These unigenes were then analyzed in their genome mapping, functional annotation, homology, codon usage bias and simple sequence repeats (SSRs). Results Approximately 51.6 million clean reads were obtained, trimmed, and assembled into 38,504 unigenes with an average length of 571 bp, an N50 of 711 bp, and an average GC content 51.26%. Among them, 98.4% of unigenes could be mapped onto the reference genome, and 69% of unigenes could be annotated with known biological functions. Homology analysis identified certain numbers of An. sinensis unigenes that showed homology or being putative 1:1 orthologues with genomes of other Dipteran species. Codon usage bias was analyzed and 1,904 SSRs were detected, which will provide effective molecular markers for the population genetics of this species. Conclusions Our data and analysis provide the most comprehensive transcriptomic resource and characteristics currently available for An. sinensis, and will facilitate genetic, genomic studies, and further vector control of An. sinensis. PMID:25000941

  7. Sequence analysis corresponding to the PPE and PE proteins in ...

    Indian Academy of Sciences (India)

    Unknown

    AB repeats; Mycobacterium tuberculosis genome; PE-PPE domain; PPE, PE proteins; sequence analysis; surface antigens. J. Biosci. | Vol. ... bacterium tuberculosis genomes resulted in the identification of a previously uncharacterized 225 amino acid- ...... Vega Lopez F, Brooks L A, Dockrell H M, De Smet K A,. Thompson ...

  8. Molecular cloning, expression analysis and sequence prediction of ...

    African Journals Online (AJOL)

    CCAAT/enhancer-binding protein beta as an essential transcriptional factor, regulates the differentiation of adipocytes and the deposition of fat. Herein, we cloned the whole open reading frame (ORF) of bovine C/EBPβ gene and analyzed its putative protein structures via DNA cloning and sequence analysis. Then, the ...

  9. Multilocus sequence analysis of phytopathogenic species of the genus Streptomyces

    Science.gov (United States)

    The identification and classification of species within the genus Streptomyces is difficult because there are presently 576 validly described species and this number increases every year. The value of the application of multilocus sequence analysis scheme to the systematics of Streptomyces species h...

  10. Sequence symmetry analysis in pharmacovigilance and pharmacoepidemiologic studies

    DEFF Research Database (Denmark)

    Lai, Edward Chia Cheng; Pratt, Nicole; Hsieh, Cheng Yang

    2017-01-01

    Sequence symmetry analysis (SSA) is a method for detecting adverse drug events by utilizing computerized claims data. The method has been increasingly used to investigate safety concerns of medications and as a pharmacovigilance tool to identify unsuspected side effects. Validation studies have i...

  11. DNAApp: a mobile application for sequencing data analysis.

    Science.gov (United States)

    Nguyen, Phi-Vu; Verma, Chandra Shekhar; Gan, Samuel Ken-En

    2014-11-15

    There have been numerous applications developed for decoding and visualization of ab1 DNA sequencing files for Windows and MAC platforms, yet none exists for the increasingly popular smartphone operating systems. The ability to decode sequencing files cannot easily be carried out using browser accessed Web tools. To overcome this hurdle, we have developed a new native app called DNAApp that can decode and display ab1 sequencing file on Android and iOS. In addition to in-built analysis tools such as reverse complementation, protein translation and searching for specific sequences, we have incorporated convenient functions that would facilitate the harnessing of online Web tools for a full range of analysis. Given the high usage of Android/iOS tablets and smartphones, such bioinformatics apps would raise productivity and facilitate the high demand for analyzing sequencing data in biomedical research. The Android version of DNAApp is available in Google Play Store as 'DNAApp', and the iOS version is available in the App Store. More details on the app can be found at www.facebook.com/APDLab; www.bii.a-star.edu.sg/research/trd/apd.php The DNAApp user guide is available at http://tinyurl.com/DNAAppuser, and a video tutorial is available on Google Play Store and App Store, as well as on the Facebook page. samuelg@bii.a-star.edu.sg. © The Author 2014. Published by Oxford University Press.

  12. DNAApp: a mobile application for sequencing data analysis

    Science.gov (United States)

    Nguyen, Phi-Vu; Verma, Chandra Shekhar; Gan, Samuel Ken-En

    2014-01-01

    Summary: There have been numerous applications developed for decoding and visualization of ab1 DNA sequencing files for Windows and MAC platforms, yet none exists for the increasingly popular smartphone operating systems. The ability to decode sequencing files cannot easily be carried out using browser accessed Web tools. To overcome this hurdle, we have developed a new native app called DNAApp that can decode and display ab1 sequencing file on Android and iOS. In addition to in-built analysis tools such as reverse complementation, protein translation and searching for specific sequences, we have incorporated convenient functions that would facilitate the harnessing of online Web tools for a full range of analysis. Given the high usage of Android/iOS tablets and smartphones, such bioinformatics apps would raise productivity and facilitate the high demand for analyzing sequencing data in biomedical research. Availability and implementation: The Android version of DNAApp is available in Google Play Store as ‘DNAApp’, and the iOS version is available in the App Store. More details on the app can be found at www.facebook.com/APDLab; www.bii.a-star.edu.sg/research/trd/apd.php The DNAApp user guide is available at http://tinyurl.com/DNAAppuser, and a video tutorial is available on Google Play Store and App Store, as well as on the Facebook page. Contact: samuelg@bii.a-star.edu.sg PMID:25095882

  13. Long-read sequencing data analysis for yeasts.

    Science.gov (United States)

    Yue, Jia-Xing; Liti, Gianni

    2018-06-01

    Long-read sequencing technologies have become increasingly popular due to their strengths in resolving complex genomic regions. As a leading model organism with small genome size and great biotechnological importance, the budding yeast Saccharomyces cerevisiae has many isolates currently being sequenced with long reads. However, analyzing long-read sequencing data to produce high-quality genome assembly and annotation remains challenging. Here, we present a modular computational framework named long-read sequencing data analysis for yeasts (LRSDAY), the first one-stop solution that streamlines this process. Starting from the raw sequencing reads, LRSDAY can produce chromosome-level genome assembly and comprehensive genome annotation in a highly automated manner with minimal manual intervention, which is not possible using any alternative tool available to date. The annotated genomic features include centromeres, protein-coding genes, tRNAs, transposable elements (TEs), and telomere-associated elements. Although tailored for S. cerevisiae, we designed LRSDAY to be highly modular and customizable, making it adaptable to virtually any eukaryotic organism. When applying LRSDAY to an S. cerevisiae strain, it takes ∼41 h to generate a complete and well-annotated genome from ∼100× Pacific Biosciences (PacBio) running the basic workflow with four threads. Basic experience working within the Linux command-line environment is recommended for carrying out the analysis using LRSDAY.

  14. Tumor transcriptome sequencing reveals allelic expression imbalances associated with copy number alterations.

    Directory of Open Access Journals (Sweden)

    Brian B Tuch

    Full Text Available Due to growing throughput and shrinking cost, massively parallel sequencing is rapidly becoming an attractive alternative to microarrays for the genome-wide study of gene expression and copy number alterations in primary tumors. The sequencing of transcripts (RNA-Seq should offer several advantages over microarray-based methods, including the ability to detect somatic mutations and accurately measure allele-specific expression. To investigate these advantages we have applied a novel, strand-specific RNA-Seq method to tumors and matched normal tissue from three patients with oral squamous cell carcinomas. Additionally, to better understand the genomic determinants of the gene expression changes observed, we have sequenced the tumor and normal genomes of one of these patients. We demonstrate here that our RNA-Seq method accurately measures allelic imbalance and that measurement on the genome-wide scale yields novel insights into cancer etiology. As expected, the set of genes differentially expressed in the tumors is enriched for cell adhesion and differentiation functions, but, unexpectedly, the set of allelically imbalanced genes is also enriched for these same cancer-related functions. By comparing the transcriptomic perturbations observed in one patient to his underlying normal and tumor genomes, we find that allelic imbalance in the tumor is associated with copy number mutations and that copy number mutations are, in turn, strongly associated with changes in transcript abundance. These results support a model in which allele-specific deletions and duplications drive allele-specific changes in gene expression in the developing tumor.

  15. Revised Mimivirus major capsid protein sequence reveals intron-containing gene structure and extra domain

    Directory of Open Access Journals (Sweden)

    Suzan-Monti Marie

    2009-05-01

    Full Text Available Abstract Background Acanthamoebae polyphaga Mimivirus (APM is the largest known dsDNA virus. The viral particle has a nearly icosahedral structure with an internal capsid shell surrounded with a dense layer of fibrils. A Capsid protein sequence, D13L, was deduced from the APM L425 coding gene and was shown to be the most abundant protein found within the viral particle. However this protein remained poorly characterised until now. A revised protein sequence deposited in a database suggested an additional N-terminal stretch of 142 amino acids missing from the original deduced sequence. This result led us to investigate the L425 gene structure and the biochemical properties of the complete APM major Capsid protein. Results This study describes the full length 3430 bp Capsid coding gene and characterises the 593 amino acids long corresponding Capsid protein 1. The recombinant full length protein allowed the production of a specific monoclonal antibody able to detect the Capsid protein 1 within the viral particle. This protein appeared to be post-translationnally modified by glycosylation and phosphorylation. We proposed a secondary structure prediction of APM Capsid protein 1 compared to the Capsid protein structure of Paramecium Bursaria Chlorella Virus 1, another member of the Nucleo-Cytoplasmic Large DNA virus family. Conclusion The characterisation of the full length L425 Capsid coding gene of Acanthamoebae polyphaga Mimivirus provides new insights into the structure of the main Capsid protein. The production of a full length recombinant protein will be useful for further structural studies.

  16. SMRT Sequencing Revealed Mitogenome Characteristics and Mitogenome-Wide DNA Modification Pattern in Ophiocordyceps sinensis.

    Science.gov (United States)

    Kang, Xincong; Hu, Liqin; Shen, Pengyuan; Li, Rui; Liu, Dongbo

    2017-01-01

    Single molecule, real-time (SMRT) sequencing was used to characterize mitochondrial (mt) genome of Ophiocordyceps sinensis and to analyze the mt genome-wide pattern of epigenetic DNA modification. The complete mt genome of O. sinensis , with a size of 157,539 bp, is the fourth largest Ascomycota mt genome sequenced to date. It contained 14 conserved protein-coding genes (PCGs), 1 intronic protein rps3 , 27 tRNAs and 2 rRNA subunits, which are common characteristics of the known mt genomes in Hypocreales. A phylogenetic tree inferred from 14 PCGs in Pezizomycotina fungi supports O. sinensis as most closely related to Hirsutella rhossiliensis in Ophiocordycipitaceae. A total of 36 sequence sites in rps3 were under positive selection, with dN/dS >1 in the 20 compared fungi. Among them, 16 sites were statistically significant. In addition, the mt genome-wide base modification pattern of O. sinensis was determined in this study, especially DNA methylation. The methylations were located in coding and uncoding regions of mt PCGs in O. sinensis , and might be closely related to the expression of PCGs or the binding affinity of transcription factor A to mtDNA. Consequently, these methylations may affect the enzymatic activity of oxidative phosphorylation and then the mt respiratory rate; or they may influence mt biogenesis. Therefore, methylations in the mitogenome of O. sinensis might be a genetic feature to adapt to the cold and low PO 2 environment at high altitude, where O. sinensis is endemic. This is the first report on epigenetic modifications in a fungal mt genome.

  17. Mitochondrial and nuclear sequence polymorphisms reveal geographic structuring in Amazonian populations of Echinococcus vogeli (Cestoda: Taeniidae).

    Science.gov (United States)

    Santos, Guilherme B; Soares, Manoel do C P; de F Brito, Elisabete M; Rodrigues, André L; Siqueira, Nilton G; Gomes-Gouvêa, Michele S; Alves, Max M; Carneiro, Liliane A; Malheiros, Andreza P; Póvoa, Marinete M; Zaha, Arnaldo; Haag, Karen L

    2012-12-01

    To date, nothing is known about the genetic diversity of the Echinococcus neotropical species, Echinococcus vogeli and Echinococcus oligarthrus. Here we used mitochondrial and nuclear DNA sequence polymorphisms to uncover the genetic structure, transmission and history of E. vogeli in the Brazilian Amazon, based on a sample of 38 isolates obtained from human and wild animal hosts. We confirm that the parasite is partially synanthropic and show that its populations are diverse. Furthermore, significant geographical structuring is found, with western and eastern populations being genetically divergent. Copyright © 2012 Australian Society for Parasitology Inc. Published by Elsevier Ltd. All rights reserved.

  18. Selfish supernumerary chromosome reveals its origin as a mosaic of host genome and organellar sequences

    Czech Academy of Sciences Publication Activity Database

    Martis, M.M.; Klemme, S.; Banaei-Moghaddam, A.M.; Blattner, F.R.; Macas, Jiří; Schmutzer, T.; Scholz, U.; Gundlach, H.; Wicker, T.; Šimková, Hana; Novák, Petr; Neumann, Pavel; Kubaláková, Marie; Bauer, E.; Haseneyer, G.; Fuchs, J.; Doležel, Jaroslav; Stein, N.; Mayer, K.F.X.; Houben, A.

    2012-01-01

    Roč. 109, č. 33 (2012), s. 13343-13346 ISSN 0027-8424 R&D Projects: GA ČR GBP501/12/G090; GA MŠk(CZ) OC10037 Institutional research plan: CEZ:AV0Z50510513 Institutional support: RVO:60077344 ; RVO:61389030 Keywords : FULL-LENGTH CDNAS * SECALE-CEREALE L. * B-CHROMOSOMES * REPETITIVE SEQUENCES Subject RIV: EB - Genetics ; Molecular Biology; EB - Genetics ; Molecular Biology (UEB-Q) Impact factor: 9.737, year: 2012

  19. Exome sequencing reveals VCP mutations as a cause of familial ALS

    OpenAIRE

    Johnson, Janel O.; Mandrioli, Jessica; Benatar, Michael; Abramzon, Yevgeniya; Van Deerlin, Vivianna M.; Trojanowski, John Q.; Gibbs, J Raphael; Brunetti, Maura; Gronka, Susan; Wuu, Joanne; Ding, Jinhui; McCluskey, Leo; Martinez-Lage, Maria; Falcone, Dana; Hernandez, Dena G.

    2010-01-01

    Using exome sequencing, we identified a p.R191Q amino acid change in the valosin-containing protein (VCP) gene in an Italian family with autosomal dominantly inherited amyotrophic lateral sclerosis (ALS). Mutations in VCP have previously been identified in families with Inclusion Body Myopathy, Paget’s disease and Frontotemporal Dementia (IBMPFD). Screening of VCP in a cohort of 210 familial ALS cases and 78 autopsy-proven ALS cases identified four additional mutations including a p.R155H mut...

  20. Molecular footprints of domestication and improvement in soybean revealed by whole genome re-sequencing

    DEFF Research Database (Denmark)

    Li, Ying-hui; Zhao, Shan-cen; Ma, Jian-xin

    2013-01-01

    and genetic improvement were identified.CONCLUSIONS:Given the uniqueness of the soybean germplasm sequenced, this study drew a clear picture of human-mediated evolution of the soybean genomes. The genomic resources and information provided by this study would also facilitate the discovery of genes......BACKGROUND:Artificial selection played an important role in the origin of modern Glycine max cultivars from the wild soybean Glycine soja. To elucidate the consequences of artificial selection accompanying the domestication and modern improvement of soybean, 25 new and 30 published whole-genome re...

  1. RNA-Based Amplicon Sequencing Reveals Microbiota Development during Ripening of Artisanal versus Industrial Lard d'Arnad.

    Science.gov (United States)

    Ferrocino, Ilario; Bellio, Alberto; Romano, Angelo; Macori, Guerrino; Rantsiou, Kalliopi; Decastelli, Lucia; Cocolin, Luca

    2017-08-15

    Valle d'Aosta Lard d'Arnad is a protected designation of origin (PDO) product produced from fat of the shoulder and back of heavy pigs. Its manufacturing process can be very diverse, especially regarding the maturation temperature and the NaCl concentration used for the brine; thereby, the main goal of this study was to investigate the impact of those parameters on the microbiota developed during curing and ripening. Three farms producing Lard d'Arnad were selected. Two plants, reflecting the industrial process characterized either by low maturation temperature (plant A [10% NaCl, 2°C]) or by using a low NaCl concentration (plant B [2.5% NaCl, 4°C]), were selected, while the third was characterized by an artisanal process (plant C [30% NaCl, 8°C]). Lard samples were obtained at time 0 and after 7, 15, 30, 60, and 90 days of maturation. From each plant, 3 independent lots were analyzed. The diversity of live microbiota was evaluated by using classical plate counts and amplicon target sequencing of small subunit (SSU) rRNA. The main taxa identified by sequencing were Acinetobacter johnsonii , Psychrobacter , Staphylococcus equorum , Staphylococcus sciuri , Pseudomonas fragi , Brochothrix , Halomonas , and Vibrio , and differences in their relative abundances distinguished samples from the individual plants. The composition of the microbiota was more similar among plants A and B, and it was characterized by the higher presence of taxa recognized as undesired bacteria in food-processing environments. Oligotype analysis of Halomonas and Acinetobacter revealed the presence of several characteristic oligotypes associated with A and B samples. IMPORTANCE Changes in the food production process can drastically affect the microbial community structure, with a possible impact on the final characteristics of the products. The industrial processes of Lard d'Arnad production are characterized by a reduction in the salt concentration in the brines to address a consumer demand

  2. Characteristics of the tomato chromoplast revealed by proteomic analysis

    OpenAIRE

    Barsan, Cristina; Sanchez-Bel, Paloma; Rombaldi, César Valmor; Egea, Isabel; Rossignol, Michel; Kuntz, Marcel; Zouine, Mohamed; Latché, Alain; Bouzayen, Mondher; Pech, Jean-Claude

    2010-01-01

    Chromoplasts are non-photosynthetic specialized plastids that are important in ripening tomato fruit (Solanum lycopersicum) since, among other functions, they are the site of accumulation of coloured compounds. Analysis of the proteome of red fruit chromoplasts revealed the presence of 988 proteins corresponding to 802 Arabidopsis unigenes, among which 209 had not been listed so far in plastidial databanks. These data revealed several features of the chromoplast. Proteins of lipid metabolism ...

  3. Comparative sequence analyses of the major quantitative trait locus phosphorus uptake 1 (Pup1) reveal a complex genetic structure.

    Science.gov (United States)

    Heuer, Sigrid; Lu, Xiaochun; Chin, Joong Hyoun; Tanaka, Juan Pariasca; Kanamori, Hiroyuki; Matsumoto, Takashi; De Leon, Teresa; Ulat, Victor Jun; Ismail, Abdelbagi M; Yano, Masahiro; Wissuwa, Matthias

    2009-06-01

    The phosphorus uptake 1 (Pup1) locus was identified as a major quantitative trait locus (QTL) for tolerance of phosphorus deficiency in rice. Near-isogenic lines with the Pup1 region from tolerant donor parent Kasalath typically show threefold higher phosphorus uptake and grain yield in phosphorus-deficient field trials than the intolerant parent Nipponbare. In this study, we report the fine mapping of the Pup1 locus to the long arm of chromosome 12 (15.31-15.47 Mb). Genes in the region were initially identified on the basis of the Nipponbare reference genome, but did not reveal any obvious candidate genes related to phosphorus uptake. Kasalath BAC clones were therefore sequenced and revealed a 278-kbp sequence significantly different from the syntenic regions in Nipponbare (145 kb) and in the indica reference genome of 93-11 (742 kbp). Size differences are caused by large insertions or deletions (INDELs), and an exceptionally large number of retrotransposon and transposon-related elements (TEs) present in all three sequences (45%-54%). About 46 kb of the Kasalath sequence did not align with the entire Nipponbare genome, and only three Nipponbare genes (fatty acid alpha-dioxygenase, dirigent protein and aspartic proteinase) are highly conserved in Kasalath. Two Nipponbare genes (expressed proteins) might have evolved by at least three TE integrations in an ancestor gene that is still present in Kasalath. Several predicted Kasalath genes are novel or unknown genes that are mainly located within INDEL regions. Our results highlight the importance of sequencing QTL regions in the respective donor parent, as important genes might not be present in the current reference genomes.

  4. Genomic localization, sequence analysis, and transcription of the putative human cytomegalovirus DNA polymerase gene

    International Nuclear Information System (INIS)

    Heilbronn, T.; Jahn, G.; Buerkle, A.; Freese, U.K.; Fleckenstein, B.; Zur Hausen, H.

    1987-01-01

    The human cytomegalovirus (HCMV)-induced DNA polymerase has been well characterized biochemically and functionally, but its genomic location has not yet been assigned. To identify the coding sequence, cross-hybridization with the herpes simplex virus type 1 (HSV-1) polymerase gene was used, as suggested by the close similarity of the herpes group virus-induced DNA polymerases to the HCMV DNA polymerase. A cosmid and plasmid library of the entire HCMV genome was screened with the BamHI Q fragment of HSF-1 at different stringency conditions. One PstI-HincII restriction fragment of 850 base pairs mapping within the EcoRI M fragment of HCMV cross-hybridized at T/sub m/ - 25/degrees/C. Sequence analysis revealed one open reading frame spanning the entire sequence. The amino acid sequence showed a highly conserved domain of 133 amino acids shared with the HSV and putative Esptein-Barr virus polymerase sequences. This domain maps within the C-terminal part of the HSV polymerase gene, which has been suggested to contain part of the catalytic center of the enzyme. Transcription analysis revealed one 5.4-kilobase early transcript in the sense orientation with respect to the open reading frame identified. This transcript appears to code for the 140-kilodalton HCMV polymerase protein

  5. Penicillium simile sp. nov. revealed by morphological and phylogenetic analysis.

    Science.gov (United States)

    Davolos, Domenico; Pietrangeli, Biancamaria; Persiani, Anna Maria; Maggi, Oriana

    2012-02-01

    The morphology of three phenetically identical Penicillium isolates, collected from the bioaerosol in a restoration laboratory in Italy, displayed macro- and microscopic characteristics that were similar though not completely ascribable to Penicillium raistrickii. For this reason, a phylogenetic approach based on DNA sequencing analysis was performed to establish both the taxonomic status and the evolutionary relationships of these three peculiar isolates in relation to previously described species of the genus Penicillium. We used four nuclear loci (both rRNA and protein coding genes) that have previously proved useful for the molecular investigation of taxa belonging to the genus Penicillium at various evolutionary levels. The internal transcribed spacer region (ITS1-5.8S-ITS2), domains D1 and D2 of the 28S rDNA, a region of the tubulin beta chain gene (benA) and part of the calmodulin gene (cmd) were amplified by PCR and sequenced. Analysis of the rRNA genes and of the benA and cmd sequence data indicates the presence of three isogenic isolates belonging to a genetically distinct species of the genus Penicillium, here described and named Penicillium simile sp. nov. (ATCC MYA-4591(T)  = CBS 129191(T)). This novel species is phylogenetically different from P. raistrickii and other related species of the genus Penicillium (e.g. Penicillium scabrosum), from which it can be distinguished on the basis of morphological trait analysis.

  6. Ultra Deep Sequencing of a Baculovirus Population Reveals Widespread Genomic Variations

    Directory of Open Access Journals (Sweden)

    Aurélien Chateigner

    2015-07-01

    Full Text Available Viruses rely on widespread genetic variation and large population size for adaptation. Large DNA virus populations are thought to harbor little variation though natural populations may be polymorphic. To measure the genetic variation present in a dsDNA virus population, we deep sequenced a natural strain of the baculovirus Autographa californica multiple nucleopolyhedrovirus. With 124,221X average genome coverage of our 133,926 bp long consensus, we could detect low frequency mutations (0.025%. K-means clustering was used to classify the mutations in four categories according to their frequency in the population. We found 60 high frequency non-synonymous mutations under balancing selection distributed in all functional classes. These mutants could alter viral adaptation dynamics, either through competitive or synergistic processes. Lastly, we developed a technique for the delimitation of large deletions in next generation sequencing data. We found that large deletions occur along the entire viral genome, with hotspots located in homologous repeat regions (hrs. Present in 25.4% of the genomes, these deletion mutants presumably require functional complementation to complete their infection cycle. They might thus have a large impact on the fitness of the baculovirus population. Altogether, we found a wide breadth of genomic variation in the baculovirus population, suggesting it has high adaptive potential.

  7. Landscape of Infiltrating T Cells in Liver Cancer Revealed by Single-Cell Sequencing.

    Science.gov (United States)

    Zheng, Chunhong; Zheng, Liangtao; Yoo, Jae-Kwang; Guo, Huahu; Zhang, Yuanyuan; Guo, Xinyi; Kang, Boxi; Hu, Ruozhen; Huang, Julie Y; Zhang, Qiming; Liu, Zhouzerui; Dong, Minghui; Hu, Xueda; Ouyang, Wenjun; Peng, Jirun; Zhang, Zemin

    2017-06-15

    Systematic interrogation of tumor-infiltrating lymphocytes is key to the development of immunotherapies and the prediction of their clinical responses in cancers. Here, we perform deep single-cell RNA sequencing on 5,063 single T cells isolated from peripheral blood, tumor, and adjacent normal tissues from six hepatocellular carcinoma patients. The transcriptional profiles of these individual cells, coupled with assembled T cell receptor (TCR) sequences, enable us to identify 11 T cell subsets based on their molecular and functional properties and delineate their developmental trajectory. Specific subsets such as exhausted CD8 + T cells and Tregs are preferentially enriched and potentially clonally expanded in hepatocellular carcinoma (HCC), and we identified signature genes for each subset. One of the genes, layilin, is upregulated on activated CD8 + T cells and Tregs and represses the CD8 + T cell functions in vitro. This compendium of transcriptome data provides valuable insights and a rich resource for understanding the immune landscape in cancers. Copyright © 2017 Elsevier Inc. All rights reserved.

  8. Extracellular DNA amplicon sequencing reveals high levels of benthic eukaryotic diversity in the central Red Sea

    KAUST Repository

    Pearman, John K.

    2015-11-01

    The present study aims to characterize the benthic eukaryotic biodiversity patterns at a coarse taxonomic level in three areas of the central Red Sea (a lagoon, an offshore area in Thuwal and a shallow coastal area near Jeddah) based on extracellular DNA. High-throughput amplicon sequencing targeting the V9 region of the 18S rRNA gene was undertaken for 32 sediment samples. High levels of alpha-diversity were detected with 16,089 operational taxonomic units (OTUs) being identified. The majority of the OTUs were assigned to Metazoa (29.2%), Alveolata (22.4%) and Stramenopiles (17.8%). Stramenopiles (Diatomea) and Alveolata (Ciliophora) were frequent in a lagoon and in shallower coastal stations, whereas metazoans (Arthropoda: Maxillopoda) were dominant in deeper offshore stations. Only 24.6% of total OTUs were shared among all areas. Beta-diversity was generally lower between the lagoon and Jeddah (nearshore) than between either of those and the offshore area, suggesting a nearshore–offshore biodiversity gradient. The current approach allowed for a broad-range of benthic eukaryotic biodiversity to be analysed with significantly less labour than would be required by other traditional taxonomic approaches. Our findings suggest that next generation sequencing techniques have the potential to provide a fast and standardised screening of benthic biodiversity at large spatial and temporal scales.

  9. Whole-exome sequencing reveals a rare interferon gamma receptor 1 mutation associated with myasthenia gravis.

    Science.gov (United States)

    Qi, Guoyan; Liu, Peng; Gu, Shanshan; Yang, Hongxia; Dong, Huimin; Xue, Yinping

    2018-04-01

    Our study is aimed to explore the underlying genetic basis of myasthenia gravis. We collected a Chinese pedigree with myasthenia gravis, and whole-exome sequencing was performed on the two affected siblings and their parents. The candidate pathogenic gene was identified by bioinformatics filtering, which was further verified by Sanger sequencing. The homozygous mutation c.G40A (p.V14M) in interferon gamma receptor 1was identified. Moreover, the mutation was also detected in 3 cases of 44 sporadic myasthenia gravis patients. The p.V14M substitution in interferon gamma receptor 1 may affect the signal peptide function and the translocation on cell membrane, which could disrupt the binding of the ligand of interferon gamma and antibody production, contributing to myasthenia gravis susceptibility. We discovered that a rare variant c.G40A in interferon gamma receptor 1 potentially contributes to the myasthenia gravis pathogenesis. Further functional studies are needed to confirm the effect of the interferon gamma receptor 1 on the myasthenia gravis phenotype.

  10. An Ambystoma mexicanum EST sequencing project: analysis of 17,352 expressed sequence tags from embryonic and regenerating blastema cDNA libraries

    Science.gov (United States)

    Habermann, Bianca; Bebin, Anne-Gaelle; Herklotz, Stephan; Volkmer, Michael; Eckelt, Kay; Pehlke, Kerstin; Epperlein, Hans Henning; Schackert, Hans Konrad; Wiebe, Glenis; Tanaka, Elly M

    2004-01-01

    Background The ambystomatid salamander, Ambystoma mexicanum (axolotl), is an important model organism in evolutionary and regeneration research but relatively little sequence information has so far been available. This is a major limitation for molecular studies on caudate development, regeneration and evolution. To address this lack of sequence information we have generated an expressed sequence tag (EST) database for A. mexicanum. Results Two cDNA libraries, one made from stage 18-22 embryos and the other from day-6 regenerating tail blastemas, generated 17,352 sequences. From the sequenced ESTs, 6,377 contigs were assembled that probably represent 25% of the expressed genes in this organism. Sequence comparison revealed significant homology to entries in the NCBI non-redundant database. Further examination of this gene set revealed the presence of genes involved in important cell and developmental processes, including cell proliferation, cell differentiation and cell-cell communication. On the basis of these data, we have performed phylogenetic analysis of key cell-cycle regulators. Interestingly, while cell-cycle proteins such as the cyclin B family display expected evolutionary relationships, the cyclin-dependent kinase inhibitor 1 gene family shows an unusual evolutionary behavior among the amphibians. Conclusions Our analysis reveals the importance of a comprehensive sequence set from a representative of the Caudata and illustrates that the EST sequence database is a rich source of molecular, developmental and regeneration studies. To aid in data mining, the ESTs have been organized into an easily searchable database that is freely available online. PMID:15345051

  11. Construction of an integrated database to support genomic sequence analysis

    Energy Technology Data Exchange (ETDEWEB)

    Gilbert, W.; Overbeek, R.

    1994-11-01

    The central goal of this project is to develop an integrated database to support comparative analysis of genomes including DNA sequence data, protein sequence data, gene expression data and metabolism data. In developing the logic-based system GenoBase, a broader integration of available data was achieved due to assistance from collaborators. Current goals are to easily include new forms of data as they become available and to easily navigate through the ensemble of objects described within the database. This report comments on progress made in these areas.

  12. De novo Transcriptome Sequencing Reveals a Considerable Bias in the Incidence of Simple Sequence Repeats towards the Downstream of ‘Pre-miRNAs’ of Black Pepper

    Science.gov (United States)

    Joy, Nisha; Asha, Srinivasan; Mallika, Vijayan; Soniya, Eppurathu Vasudevan

    2013-01-01

    Next generation sequencing has an advantageon transformational development of species with limited available sequence data as it helps to decode the genome and transcriptome. We carried out the de novo sequencing using illuminaHiSeq™ 2000 to generate the first leaf transcriptome of black pepper (Piper nigrum L.), an important spice variety native to South India and also grown in other tropical regions. Despite the economic and biochemical importance of pepper, a scientifically rigorous study at the molecular level is far from complete due to lack of sufficient sequence information and cytological complexity of its genome. The 55 million raw reads obtained, when assembled using Trinity program generated 2,23,386 contigs and 1,28,157 unigenes. Reports suggest that the repeat-rich genomic regions give rise to small non-coding functional RNAs. MicroRNAs (miRNAs) are the most abundant type of non-coding regulatory RNAs. In spite of the widespread research on miRNAs, little is known about the hair-pin precursors of miRNAs bearing Simple Sequence Repeats (SSRs). We used the array of transcripts generated, for the in silico prediction and detection of ‘43 pre-miRNA candidates bearing different types of SSR motifs’. The analysis identified 3913 different types of SSR motifs with an average of one SSR per 3.04 MB of thetranscriptome. About 0.033% of the transcriptome constituted ‘pre-miRNA candidates bearing SSRs’. The abundance, type and distribution of SSR motifs studied across the hair-pin miRNA precursors, showed a significant bias in the position of SSRs towards the downstream of predicted ‘pre-miRNA candidates’. The catalogue of transcripts identified, together with the demonstration of reliable existence of SSRs in the miRNA precursors, permits future opportunities for understanding the genetic mechanism of black pepper and likely functions of ‘tandem repeats’ in miRNAs. PMID:23469176

  13. Mapping the transcription start points of the Staphylococcus aureus eap, emp, and vwb promoters reveals a conserved octanucleotide sequence that is essential for expression of these genes.

    Science.gov (United States)

    Harraghy, Niamh; Homerova, Dagmar; Herrmann, Mathias; Kormanec, Jan

    2008-01-01

    Mapping the transcription start points of the eap, emp, and vwb promoters revealed a conserved octanucleotide sequence (COS). Deleting this sequence abolished the expression of eap, emp, and vwb. However, electrophoretic mobility shift assays gave no evidence that this sequence was a binding site for SarA or SaeR, known regulators of eap and emp.

  14. RNA deep sequencing reveals differential microRNA expression during development of sea urchin and sea star.

    Directory of Open Access Journals (Sweden)

    Sabah Kadri

    Full Text Available microRNAs (miRNAs are small (20-23 nt, non-coding single stranded RNA molecules that act as post-transcriptional regulators of mRNA gene expression. They have been implicated in regulation of developmental processes in diverse organisms. The echinoderms, Strongylocentrotus purpuratus (sea urchin and Patiria miniata (sea star are excellent model organisms for studying development with well-characterized transcriptional networks. However, to date, nothing is known about the role of miRNAs during development in these organisms, except that the genes that are involved in the miRNA biogenesis pathway are expressed during their developmental stages. In this paper, we used Illumina Genome Analyzer (Illumina, Inc. to sequence small RNA libraries in mixed stage population of embryos from one to three days after fertilization of sea urchin and sea star (total of 22,670,000 reads. Analysis of these data revealed the miRNA populations in these two species. We found that 47 and 38 known miRNAs are expressed in sea urchin and sea star, respectively, during early development (32 in common. We also found 13 potentially novel miRNAs in the sea urchin embryonic library. miRNA expression is generally conserved between the two species during development, but 7 miRNAs are highly expressed in only one species. We expect that our two datasets will be a valuable resource for everyone working in the field of developmental biology and the regulatory networks that affect it. The computational pipeline to analyze Illumina reads is available at http://www.benoslab.pitt.edu/services.html.

  15. RNA Deep Sequencing Reveals Differential MicroRNA Expression during Development of Sea Urchin and Sea Star

    Science.gov (United States)

    Kadri, Sabah; Hinman, Veronica F.; Benos, Panayiotis V.

    2011-01-01

    microRNAs (miRNAs) are small (20–23 nt), non-coding single stranded RNA molecules that act as post-transcriptional regulators of mRNA gene expression. They have been implicated in regulation of developmental processes in diverse organisms. The echinoderms, Strongylocentrotus purpuratus (sea urchin) and Patiria miniata (sea star) are excellent model organisms for studying development with well-characterized transcriptional networks. However, to date, nothing is known about the role of miRNAs during development in these organisms, except that the genes that are involved in the miRNA biogenesis pathway are expressed during their developmental stages. In this paper, we used Illumina Genome Analyzer (Illumina, Inc.) to sequence small RNA libraries in mixed stage population of embryos from one to three days after fertilization of sea urchin and sea star (total of 22,670,000 reads). Analysis of these data revealed the miRNA populations in these two species. We found that 47 and 38 known miRNAs are expressed in sea urchin and sea star, respectively, during early development (32 in common). We also found 13 potentially novel miRNAs in the sea urchin embryonic library. miRNA expression is generally conserved between the two species during development, but 7 miRNAs are highly expressed in only one species. We expect that our two datasets will be a valuable resource for everyone working in the field of developmental biology and the regulatory networks that affect it. The computational pipeline to analyze Illumina reads is available at http://www.benoslab.pitt.edu/services.html. PMID:22216218

  16. Temporal sequencing of throughfall drop generation as revealed by use of a large-scale rainfall simulator

    Science.gov (United States)

    Nanko, K.; Levia, D. F., Jr.; Iida, S.; SUN, X.; Shinohara, Y.; Sakai, N.

    2017-12-01

    Scientists have been interested in throughfall drop size and its distribution because of its importance to soil erosion and the forest water balance. An indoor experiment was employed to deepen our understanding of throughfall drop generation processes to promote better management of forested ecosystems. The indoor experiment provides a unique opportunity to examine an array of constant rainfall intensities that are ideal conditions to pick up the effect of changing intensities and not found in the fields. Throughfall drop generation was examined for three species- Cryptomeria japonica D. Don (Japanese cedar), Chamaecyparis obtusa (Siebold & Zucc.) Endl. (Japanese cypress), and Zelkova serrata Thunb. (Japanese zelkova)- under both leafed and leafless conditions in the large-scale rainfall simulator in the National Research Institute for Earth Science and Disaster Resilience (Tsukuba, Japan) at varying rainfall intensities ranging from15 to 100 mm h-1. Drop size distributions of the applied rainfall and throughfall were measured simultaneously by 20 laser disdrometers. Utilizing the drop size dataset, throughfall was separated into three components: free throughfall, canopy drip, and splash throughfall. The temporal sequencing of the throughfall components were analyzed on a 1-min interval during each experimental run. The throughfall component percentage and drop size of canopy drip differed among tree species and rainfall intensities and by elapsed time from the beginning of the rainfall event. Preliminary analysis revealed that the time differences to produce branch drip as compared to leaf (or needle) drip was partly due to differential canopy wet-up processes and the disappearance of branch drips due to canopy saturation, leading to dissimilar throughfall drop size distributions beneath the various tree species examined. This research was supported by JSPS Invitation Fellowship for Research in Japan (Grant No.: S16088) and JSPS KAKENHI (Grant No.: JP15H05626).

  17. Metatranscriptome Sequencing Reveals Insights into the Gene Expression and Functional Potential of Rumen Wall Bacteria

    Directory of Open Access Journals (Sweden)

    Evelyne Mann

    2018-01-01

    Full Text Available Microbiota of the rumen wall constitute an important niche of rumen microbial ecology and their composition has been elucidated in different ruminants during the last years. However, the knowledge about the function of rumen wall microbes is still limited. Rumen wall biopsies were taken from three fistulated dairy cows under a standard forage-based diet and after 4 weeks of high concentrate feeding inducing a subacute rumen acidosis (SARA. Extracted RNA was used for metatranscriptome sequencing using Illumina HiSeq sequencing technology. The gene expression of the rumen wall microbial community was analyzed by mapping 35 million sequences against the Kyoto Encyclopedia for Genes and Genomes (KEGG database and determining differentially expressed genes. A total of 1,607 functional features were assigned with high expression of genes involved in central metabolism, galactose, starch and sucrose metabolism. The glycogen phosphorylase (EC:2.4.1.1 which degrades (1->4-alpha-D-glucans was among the highest expressed genes being transcribed by 115 bacterial genera. Energy metabolism genes were also highly expressed, including the pyruvate orthophosphate dikinase (EC:2.7.9.1 involved in pyruvate metabolism, which was covered by 177 genera. Nitrogen metabolism genes, in particular glutamate dehydrogenase (EC:1.4.1.4, glutamine synthetase (EC:6.3.1.2 and glutamate synthase (EC:1.4.1.13, EC:1.4.1.14 were also found to be highly expressed and prove rumen wall microbiota to be actively involved in providing host-relevant metabolites for exchange across the rumen wall. In addition, we found all four urease subunits (EC:3.5.1.5 transcribed by members of the genera Flavobacterium, Corynebacterium, Helicobacter, Clostridium, and Bacillus, and the dissimilatory sulfate reductase (EC 1.8.99.5 dsrABC, which is responsible for the reduction of sulfite to sulfide. We also provide in situ evidence for cellulose and cellobiose degradation, a key step in fiber-rich feed

  18. Analysis of Sequence Diagram Layout in Advanced UML Modelling Tools

    Directory of Open Access Journals (Sweden)

    Ņikiforova Oksana

    2016-05-01

    Full Text Available System modelling using Unified Modelling Language (UML is the task that should be solved for software development. The more complex software becomes the higher requirements are stated to demonstrate the system to be developed, especially in its dynamic aspect, which in UML is offered by a sequence diagram. To solve this task, the main attention is devoted to the graphical presentation of the system, where diagram layout plays the central role in information perception. The UML sequence diagram due to its specific structure is selected for a deeper analysis on the elements’ layout. The authors research represents the abilities of modern UML modelling tools to offer automatic layout of the UML sequence diagram and analyse them according to criteria required for the diagram perception.

  19. Network clustering coefficient approach to DNA sequence analysis

    Energy Technology Data Exchange (ETDEWEB)

    Gerhardt, Guenther J.L. [Universidade Federal do Rio Grande do Sul-Hospital de Clinicas de Porto Alegre, Rua Ramiro Barcelos 2350/sala 2040/90035-003 Porto Alegre (Brazil); Departamento de Fisica e Quimica da Universidade de Caxias do Sul, Rua Francisco Getulio Vargas 1130, 95001-970 Caxias do Sul (Brazil); Lemke, Ney [Programa Interdisciplinar em Computacao Aplicada, Unisinos, Av. Unisinos, 950, 93022-000 Sao Leopoldo, RS (Brazil); Corso, Gilberto [Departamento de Biofisica e Farmacologia, Centro de Biociencias, Universidade Federal do Rio Grande do Norte, Campus Universitario, 59072 970 Natal, RN (Brazil)]. E-mail: corso@dfte.ufrn.br

    2006-05-15

    In this work we propose an alternative DNA sequence analysis tool based on graph theoretical concepts. The methodology investigates the path topology of an organism genome through a triplet network. In this network, triplets in DNA sequence are vertices and two vertices are connected if they occur juxtaposed on the genome. We characterize this network topology by measuring the clustering coefficient. We test our methodology against two main bias: the guanine-cytosine (GC) content and 3-bp (base pairs) periodicity of DNA sequence. We perform the test constructing random networks with variable GC content and imposed 3-bp periodicity. A test group of some organisms is constructed and we investigate the methodology in the light of the constructed random networks. We conclude that the clustering coefficient is a valuable tool since it gives information that is not trivially contained in 3-bp periodicity neither in the variable GC content.

  20. Evolutionary analysis of hepatitis C virus gene sequences from 1953

    Science.gov (United States)

    Gray, Rebecca R.; Tanaka, Yasuhito; Takebe, Yutaka; Magiorkinis, Gkikas; Buskell, Zelma; Seeff, Leonard; Alter, Harvey J.; Pybus, Oliver G.

    2013-01-01

    Reconstructing the transmission history of infectious diseases in the absence of medical or epidemiological records often relies on the evolutionary analysis of pathogen genetic sequences. The precision of evolutionary estimates of epidemic history can be increased by the inclusion of sequences derived from ‘archived’ samples that are genetically distinct from contemporary strains. Historical sequences are especially valuable for viral pathogens that circulated for many years before being formally identified, including HIV and the hepatitis C virus (HCV). However, surprisingly few HCV isolates sampled before discovery of the virus in 1989 are currently available. Here, we report and analyse two HCV subgenomic sequences obtained from infected individuals in 1953, which represent the oldest genetic evidence of HCV infection. The pairwise genetic diversity between the two sequences indicates a substantial period of HCV transmission prior to the 1950s, and their inclusion in evolutionary analyses provides new estimates of the common ancestor of HCV in the USA. To explore and validate the evolutionary information provided by these sequences, we used a new phylogenetic molecular clock method to estimate the date of sampling of the archived strains, plus the dates of four more contemporary reference genomes. Despite the short fragments available, we conclude that the archived sequences are consistent with a proposed sampling date of 1953, although statistical uncertainty is large. Our cross-validation analyses suggest that the bias and low statistical power observed here likely arise from a combination of high evolutionary rate heterogeneity and an unstructured, star-like phylogeny. We expect that attempts to date other historical viruses under similar circumstances will meet similar problems. PMID:23938759

  1. Small RNA sequencing reveals metastasis-related microRNAs in lung adenocarcinoma

    DEFF Research Database (Denmark)

    Daugaard, Iben; Venø, Morten T.; Yan, Yan

    2017-01-01

    The majority of lung cancer deaths are caused by metastatic disease. MicroRNAs (miRNAs) are posttranscriptional regulators of gene expression and miRNA dysregulation can contribute to metastatic progression. Here, small RNA sequencing was used to profile the miRNA and piwi-interacting RNA (piRNA......) transcriptomes in relation to lung cancer metastasis. RNA-seq was performed using RNA extracted from formalin-fixed paraffin embedded (FFPE) lung adenocarcinomas (LAC) and brain metastases from 8 patients, and LACs from 8 patients without detectable metastatic disease. Impact on miRNA and piRNA transcriptomes...... was subtle with 9 miRNAs and 8 piRNAs demonstrating differential expression between metastasizing and non-metastasizing LACs. For piRNAs, decreased expression of piR-57125 was the most significantly associated with distant metastasis. Validation by RT-qPCR in a LAC cohort comprising 52 patients confirmed...

  2. Penicillium arizonense, a new, genome sequenced fungal species, reveals a high chemical diversity in secreted metabolites

    DEFF Research Database (Denmark)

    Grijseels, Sietske; Nielsen, Jens Christian; Randelovic, Milica

    2016-01-01

    A new soil-borne species belonging to the Penicillium section Canescentia is described, Penicillium arizonense sp. nov. (type strain CBS 141311T = IBT 12289T). The genome was sequenced and assembled into 33.7 Mb containing 12,502 predicted genes. A phylogenetic assessment based on marker genes...... confirmed the grouping of P. arizonense within section Canescentia. Compared to related species, P. arizonense proved to encode a high number of proteins involved in carbohydrate metabolism, in particular hemicellulases. Mining the genome for genes involved in secondary metabolite biosynthesis resulted...... of biosynthetic gene clusters in P. arizonense responsible for the synthesis of all detected compounds except curvulinic acid. The capacity to produce biomass degrading enzymes and the identification of a high chemical diversity in secreted bioactive secondary metabolites, offers a broad range of potential...

  3. Deep Sequencing Reveals Uncharted Isoform Heterogeneity of the Protein-Coding Transcriptome in Cerebral Ischemia.

    Science.gov (United States)

    Bhattarai, Sunil; Aly, Ahmed; Garcia, Kristy; Ruiz, Diandra; Pontarelli, Fabrizio; Dharap, Ashutosh

    2018-06-03

    Gene expression in cerebral ischemia has been a subject of intense investigations for several years. Studies utilizing probe-based high-throughput methodologies such as microarrays have contributed significantly to our existing knowledge but lacked the capacity to dissect the transcriptome in detail. Genome-wide RNA-sequencing (RNA-seq) enables comprehensive examinations of transcriptomes for attributes such as strandedness, alternative splicing, alternative transcription start/stop sites, and sequence composition, thus providing a very detailed account of gene expression. Leveraging this capability, we conducted an in-depth, genome-wide evaluation of the protein-coding transcriptome of the adult mouse cortex after transient focal ischemia at 6, 12, or 24 h of reperfusion using RNA-seq. We identified a total of 1007 transcripts at 6 h, 1878 transcripts at 12 h, and 1618 transcripts at 24 h of reperfusion that were significantly altered as compared to sham controls. With isoform-level resolution, we identified 23 splice variants arising from 23 genes that were novel mRNA isoforms. For a subset of genes, we detected reperfusion time-point-dependent splice isoform switching, indicating an expression and/or functional switch for these genes. Finally, for 286 genes across all three reperfusion time-points, we discovered multiple, distinct, simultaneously expressed and differentially altered isoforms per gene that were generated via alternative transcription start/stop sites. Of these, 165 isoforms derived from 109 genes were novel mRNAs. Together, our data unravel the protein-coding transcriptome of the cerebral cortex at an unprecedented depth to provide several new insights into the flexibility and complexity of stroke-related gene transcription and transcript organization.

  4. Multi locus sequence typing of Chlamydia reveals an association between Chlamydia psittaci genotypes and host species.

    Science.gov (United States)

    Pannekoek, Yvonne; Dickx, Veerle; Beeckman, Delphine S A; Jolley, Keith A; Keijzers, Wendy C; Vretou, Evangelia; Maiden, Martin C J; Vanrompay, Daisy; van der Ende, Arie

    2010-12-02

    Chlamydia comprises a group of obligate intracellular bacterial parasites responsible for a variety of diseases in humans and animals, including several zoonoses. Chlamydia trachomatis causes diseases such as trachoma, urogenital infection and lymphogranuloma venereum with severe morbidity. Chlamydia pneumoniae is a common cause of community-acquired respiratory tract infections. Chlamydia psittaci, causing zoonotic pneumonia in humans, is usually hosted by birds, while Chlamydia abortus, causing abortion and fetal death in mammals, including humans, is mainly hosted by goats and sheep. We used multi-locus sequence typing to asses the population structure of Chlamydia. In total, 132 Chlamydia isolates were analyzed, including 60 C. trachomatis, 18 C. pneumoniae, 16 C. abortus, 34 C. psittaci and one of each of C. pecorum, C. caviae, C. muridarum and C. felis. Cluster analyses utilizing the Neighbour-Joining algorithm with the maximum composite likelihood model of concatenated sequences of 7 housekeeping fragments showed that C. psittaci 84/2334 isolated from a parrot grouped together with the C. abortus isolates from goats and sheep. Cluster analyses of the individual alleles showed that in all instances C. psittaci 84/2334 formed one group with C. abortus. Moving 84/2334 from the C. psittaci group to the C. abortus group resulted in a significant increase in the number of fixed differences and elimination of the number of shared mutations between C. psittaci and C. abortus. C. psittaci M56 from a muskrat branched separately from the main group of C. psittaci isolates. C. psittaci genotypes appeared to be associated with host species. The phylogenetic tree of C. psittaci did not follow that of its host bird species, suggesting host species jumps. In conclusion, we report for the first time an association between C. psittaci genotypes with host species.

  5. Multi locus sequence typing of Chlamydia reveals an association between Chlamydia psittaci genotypes and host species.

    Directory of Open Access Journals (Sweden)

    Yvonne Pannekoek

    2010-12-01

    Full Text Available Chlamydia comprises a group of obligate intracellular bacterial parasites responsible for a variety of diseases in humans and animals, including several zoonoses. Chlamydia trachomatis causes diseases such as trachoma, urogenital infection and lymphogranuloma venereum with severe morbidity. Chlamydia pneumoniae is a common cause of community-acquired respiratory tract infections. Chlamydia psittaci, causing zoonotic pneumonia in humans, is usually hosted by birds, while Chlamydia abortus, causing abortion and fetal death in mammals, including humans, is mainly hosted by goats and sheep. We used multi-locus sequence typing to asses the population structure of Chlamydia. In total, 132 Chlamydia isolates were analyzed, including 60 C. trachomatis, 18 C. pneumoniae, 16 C. abortus, 34 C. psittaci and one of each of C. pecorum, C. caviae, C. muridarum and C. felis. Cluster analyses utilizing the Neighbour-Joining algorithm with the maximum composite likelihood model of concatenated sequences of 7 housekeeping fragments showed that C. psittaci 84/2334 isolated from a parrot grouped together with the C. abortus isolates from goats and sheep. Cluster analyses of the individual alleles showed that in all instances C. psittaci 84/2334 formed one group with C. abortus. Moving 84/2334 from the C. psittaci group to the C. abortus group resulted in a significant increase in the number of fixed differences and elimination of the number of shared mutations between C. psittaci and C. abortus. C. psittaci M56 from a muskrat branched separately from the main group of C. psittaci isolates. C. psittaci genotypes appeared to be associated with host species. The phylogenetic tree of C. psittaci did not follow that of its host bird species, suggesting host species jumps. In conclusion, we report for the first time an association between C. psittaci genotypes with host species.

  6. Whole-genome sequencing and genetic variant analysis of a Quarter Horse mare.

    KAUST Repository

    Doan, Ryan; Cohen, Noah D; Sawyer, Jason; Ghaffari, Noushin; Johnson, Charlie D; Dindot, Scott V

    2012-01-01

    BACKGROUND: The catalog of genetic variants in the horse genome originates from a few select animals, the majority originating from the Thoroughbred mare used for the equine genome sequencing project. The purpose of this study was to identify genetic variants, including single nucleotide polymorphisms (SNPs), insertion/deletion polymorphisms (INDELs), and copy number variants (CNVs) in the genome of an individual Quarter Horse mare sequenced by next-generation sequencing. RESULTS: Using massively parallel paired-end sequencing, we generated 59.6 Gb of DNA sequence from a Quarter Horse mare resulting in an average of 24.7X sequence coverage. Reads were mapped to approximately 97% of the reference Thoroughbred genome. Unmapped reads were de novo assembled resulting in 19.1 Mb of new genomic sequence in the horse. Using a stringent filtering method, we identified 3.1 million SNPs, 193 thousand INDELs, and 282 CNVs. Genetic variants were annotated to determine their impact on gene structure and function. Additionally, we genotyped this Quarter Horse for mutations of known diseases and for variants associated with particular traits. Functional clustering analysis of genetic variants revealed that most of the genetic variation in the horse's genome was enriched in sensory perception, signal transduction, and immunity and defense pathways. CONCLUSIONS: This is the first sequencing of a horse genome by next-generation sequencing and the first genomic sequence of an individual Quarter Horse mare. We have increased the catalog of genetic variants for use in equine genomics by the addition of novel SNPs, INDELs, and CNVs. The genetic variants described here will be a useful resource for future studies of genetic variation regulating performance traits and diseases in equids.

  7. Whole-genome sequencing and genetic variant analysis of a Quarter Horse mare.

    KAUST Repository

    Doan, Ryan

    2012-02-17

    BACKGROUND: The catalog of genetic variants in the horse genome originates from a few select animals, the majority originating from the Thoroughbred mare used for the equine genome sequencing project. The purpose of this study was to identify genetic variants, including single nucleotide polymorphisms (SNPs), insertion/deletion polymorphisms (INDELs), and copy number variants (CNVs) in the genome of an individual Quarter Horse mare sequenced by next-generation sequencing. RESULTS: Using massively parallel paired-end sequencing, we generated 59.6 Gb of DNA sequence from a Quarter Horse mare resulting in an average of 24.7X sequence coverage. Reads were mapped to approximately 97% of the reference Thoroughbred genome. Unmapped reads were de novo assembled resulting in 19.1 Mb of new genomic sequence in the horse. Using a stringent filtering method, we identified 3.1 million SNPs, 193 thousand INDELs, and 282 CNVs. Genetic variants were annotated to determine their impact on gene structure and function. Additionally, we genotyped this Quarter Horse for mutations of known diseases and for variants associated with particular traits. Functional clustering analysis of genetic variants revealed that most of the genetic variation in the horse\\'s genome was enriched in sensory perception, signal transduction, and immunity and defense pathways. CONCLUSIONS: This is the first sequencing of a horse genome by next-generation sequencing and the first genomic sequence of an individual Quarter Horse mare. We have increased the catalog of genetic variants for use in equine genomics by the addition of novel SNPs, INDELs, and CNVs. The genetic variants described here will be a useful resource for future studies of genetic variation regulating performance traits and diseases in equids.

  8. Genetic Diversity of Selected Mangifera Species Revealed by Inter Simple Sequence Repeats Markers

    OpenAIRE

    Ariffin, Zulhairil; Md Sah, Muhammad Shafie; Idris, Salma; Hashim, Nuradni

    2015-01-01

    ISSR markers were employed to reveal genetic diversity and genetic relatedness among 28 Mangifera accessions collected from Yan (Kedah), Bukit Gantang (Perak), Sibuti (Sarawak), and Papar (Sabah). A total of 198 markers were generated using nine anchored primers and one nonanchored primer. Genetic variation among the 28 accessions of Mangifera species including wild relatives, landraces, and clonal varieties is high, with an average degree of polymorphism of 98% and mean Shannon index, H0=7.5...

  9. Unexpected allelic heterogeneity and spectrum of mutations in Fowler syndrome revealed by next-generation exome sequencing.

    Science.gov (United States)

    Lalonde, Emilie; Albrecht, Steffen; Ha, Kevin C H; Jacob, Karine; Bolduc, Nathalie; Polychronakos, Constantin; Dechelotte, Pierre; Majewski, Jacek; Jabado, Nada

    2010-08-01

    Protein coding genes constitute approximately 1% of the human genome but harbor 85% of the mutations with large effects on disease-related traits. Therefore, efficient strategies for selectively sequencing complete coding regions (i.e., "whole exome") have the potential to contribute our understanding of human diseases. We used a method for whole-exome sequencing coupling Agilent whole-exome capture to the Illumina DNA-sequencing platform, and investigated two unrelated fetuses from nonconsanguineous families with Fowler Syndrome (FS), a stereotyped phenotype lethal disease. We report novel germline mutations in feline leukemia virus subgroup C cellular-receptor-family member 2, FLVCR2, which has recently been shown to cause FS. Using this technology, we identified three types of genetic abnormalities: point-mutations, insertions-deletions, and intronic splice-site changes (first pathogenic report using this technology), in the fetuses who both were compound heterozygotes for the disease. Although revealing a high level of allelic heterogeneity and mutational spectrum in FS, this study further illustrates the successful application of whole-exome sequencing to uncover genetic defects in rare Mendelian disorders. Of importance, we show that we can identify genes underlying rare, monogenic and recessive diseases using a limited number of patients (n=2), in the absence of shared genetic heritage and in the presence of allelic heterogeneity.

  10. Genomewide variation in an introgression line of rice-Zizania revealed by whole-genome re-sequencing.

    Directory of Open Access Journals (Sweden)

    Zhen-Hui Wang

    Full Text Available BACKGROUND: Hybridization between genetically diverged organisms is known as an important avenue that drives plant genome evolution. The possible outcomes of hybridization would be the occurrences of genetic instabilities in the resultant hybrids. It remained under-investigated however whether pollination by alien pollens of a closely related but sexually "incompatible" species could evoke genomic changes and to what extent it may result in phenotypic novelties in the derived progenies. METHODOLOGY/PRINCIPAL FINDINGS: In this study, we have re-sequenced the genomes of Oryza sativa ssp. japonica cv. Matsumae and one of its derived introgressant RZ35 that was obtained from an introgressive hybridization between Matsumae and Zizanialatifolia Griseb. in general, 131 millions 90 base pair (bp paired-end reads were generated which covered 13.2 and 21.9 folds of the Matsumae and RZ35 genomes, respectively. Relative to Matsumae, a total of 41,724 homozygous single nucleotide polymorphisms (SNPs and 17,839 homozygous insertions/deletions (indels were identified in RZ35, of which 3,797 SNPs were nonsynonymous mutations. Furthermore, rampant mobilization of transposable elements (TEs was found in the RZ35 genome. The results of pathogen inoculation revealed that RZ35 exhibited enhanced resistance to blast relative to Matsumae. Notably, one nonsynonymous mutation was found in the known blast resistance gene Pid3/Pi25 and real-time quantitative (q RT-PCR analysis revealed constitutive up-regulation of its expression, suggesting both altered function and expression of Pid3/Pi25 may be responsible for the enhanced resistance to rice blast by RZ35. CONCLUSIONS/SIGNIFICANCE: Our results demonstrate that introgressive hybridization by Zizania has provoked genomewide, extensive genomic changes in the rice genome, and some of which have resulted in important phenotypic novelties. These findings suggest that introgressive hybridization by alien pollens of even a

  11. Using SQL Databases for Sequence Similarity Searching and Analysis.

    Science.gov (United States)

    Pearson, William R; Mackey, Aaron J

    2017-09-13

    Relational databases can integrate diverse types of information and manage large sets of similarity search results, greatly simplifying genome-scale analyses. By focusing on taxonomic subsets of sequences, relational databases can reduce the size and redundancy of sequence libraries and improve the statistical significance of homologs. In addition, by loading similarity search results into a relational database, it becomes possible to explore and summarize the relationships between all of the proteins in an organism and those in other biological kingdoms. This unit describes how to use relational databases to improve the efficiency of sequence similarity searching and demonstrates various large-scale genomic analyses of homology-related data. It also describes the installation and use of a simple protein sequence database, seqdb_demo, which is used as a basis for the other protocols. The unit also introduces search_demo, a database that stores sequence similarity search results. The search_demo database is then used to explore the evolutionary relationships between E. coli proteins and proteins in other organisms in a large-scale comparative genomic analysis. © 2017 by John Wiley & Sons, Inc. Copyright © 2017 John Wiley & Sons, Inc.

  12. Now And Next Generation Sequencing Techniques: Future of Sequence Analysis using Cloud Computing

    Directory of Open Access Journals (Sweden)

    Radhe Shyam Thakur

    2012-12-01

    Full Text Available Advancements in the field of sequencing techniques resulted in the huge sequenced data to be produced at a very faster rate. It is going cumbersome for the datacenter to maintain the databases. Data mining and sequence analysis approaches needs to analyze the databases several times to reach any efficient conclusion. To cope with such overburden on computer resources and to reach efficient and effective conclusions quickly, the virtualization of the resources and computation on pay as you go concept was introduced and termed as cloud computing. The datacenter’s hardware and software is collectively known as cloud which when available publicly is termed as public cloud. The datacenter’s resources are provided in a virtual mode to the clients via a service provider like Amazon, Google and Joyent which charges on pay as you go manner. The workload is shifted to the provider which is maintained by the required hardware and software upgradation. The service provider manages it by upgrading the requirements in the virtual mode. Basically a virtual environment is created according to the need of the user by taking permission from datacenter via internet, the task is performed and the environment is deleted after the task is over. In this discussion, we are focusing on the basics of cloud computing, the prerequisites and overall working of clouds. Furthermore, briefly the applications of cloud computing in biological systems, especially in comparative genomics, genome informatics and SNP detection with reference to traditional workflow are discussed.

  13. Now and next-generation sequencing techniques: future of sequence analysis using cloud computing.

    Science.gov (United States)

    Thakur, Radhe Shyam; Bandopadhyay, Rajib; Chaudhary, Bratati; Chatterjee, Sourav

    2012-01-01

    Advances in the field of sequencing techniques have resulted in the greatly accelerated production of huge sequence datasets. This presents immediate challenges in database maintenance at datacenters. It provides additional computational challenges in data mining and sequence analysis. Together these represent a significant overburden on traditional stand-alone computer resources, and to reach effective conclusions quickly and efficiently, the virtualization of the resources and computation on a pay-as-you-go concept (together termed "cloud computing") has recently appeared. The collective resources of the datacenter, including both hardware and software, can be available publicly, being then termed a public cloud, the resources being provided in a virtual mode to the clients who pay according to the resources they employ. Examples of public companies providing these resources include Amazon, Google, and Joyent. The computational workload is shifted to the provider, which also implements required hardware and software upgrades over time. A virtual environment is created in the cloud corresponding to the computational and data storage needs of the user via the internet. The task is then performed, the results transmitted to the user, and the environment finally deleted after all tasks are completed. In this discussion, we focus on the basics of cloud computing, and go on to analyze the prerequisites and overall working of clouds. Finally, the applications of cloud computing in biological systems, particularly in comparative genomics, genome informatics, and SNP detection are discussed with reference to traditional workflows.

  14. SEQUENCING AND SEQUENCE ANALYSIS OF MYOSTATIN GENE IN THE EXON 1 OF THE CAMEL (CAMELUS DROMEDARIUS

    Directory of Open Access Journals (Sweden)

    M. G. SHAH, A. S. QURESHI1, M. REISSMANN2 AND H. J. SCHWARTZ3

    2006-10-01

    Full Text Available Myostatin, also called growth differentiation factor-8 (GDF-8, is a member of the mammalian growth transforming family (TGF-beta superfamily, which is expressed specifically in developing an adult skeletal muscle. Muscular hypertrophy allele (mh allele in the double muscle breeds involved mutation within the myostatin gene. Genomic DNA was isolated from the camel hair using NucleoSpin Tissue kit. Two animals of each of the six breeds namely, Marecha, Dhatti, Larri, Kohi, Sakrai and Cambelpuri were used for sequencing. For PCR amplification of the gene, a primer pair was designed from homolog regions of already published sequences of farm animals from GenBank. Results showed that camel myostatin possessed more than 90% homology with that of cattle, sheep and pig. Camel formed separate cluster from the pig in spite of having high homology (98% and showed 94% homology with cattle and sheep as reported in literature. Sequence analysis of the PCR amplified part of exon 1 (256 bp of the camel myostatin was identical among six camel breeds.

  15. Genomic library screening for viruses from the human dental plaque revealed pathogen-specific lytic phage sequences.

    Science.gov (United States)

    Al-Jarbou, Ahmed Nasser

    2012-01-01

    Bacterial pathogenesis presents an astounding arsenal of virulence factors that allow them to conquer many different niches throughout the course of infection. Principally fascinating is the fact that some bacterial species are able to induce different diseases by expression of different combinations of virulence factors. Nevertheless, studies aiming at screening for the presence of bacteriophages in humans have been limited. Such screening procedures would eventually lead to identification of phage-encoded properties that impart increased bacterial fitness and/or virulence in a particular niche, and hence, would potentially be used to reverse the course of bacterial infections. As the human oral cavity represents a rich and dynamic ecosystem for several upper respiratory tract pathogens. However, little is known about virus diversity in human dental plaque which is an important reservoir. We applied the culture-independent approach to characterize virus diversity in human dental plaque making a library from a virus DNA fraction amplified using a multiple displacement method and sequenced 80 clones. The resulting sequence showed 44% significant identities to GenBank databases by TBLASTX analysis. TBLAST homology comparisons showed that 66% was viral; 18% eukarya; 10% bacterial; 6% mobile elements. These sequences were sorted into 6 contigs and 45 single sequences in which 4 contigs and a single sequence showed significant identity to a small region of a putative prophage in the Corynebacterium diphtheria genome. These findings interestingly highlight the uniqueness of over half of the sequences, whilst the dominance of a pathogen-specific prophage sequences imply their role in virulence.

  16. High-resolution whole-genome sequencing reveals that specific chromatin domains from most human chromosomes associate with nucleoli.

    Science.gov (United States)

    van Koningsbruggen, Silvana; Gierlinski, Marek; Schofield, Pietá; Martin, David; Barton, Geoffey J; Ariyurek, Yavuz; den Dunnen, Johan T; Lamond, Angus I

    2010-11-01

    The nuclear space is mostly occupied by chromosome territories and nuclear bodies. Although this organization of chromosomes affects gene function, relatively little is known about the role of nuclear bodies in the organization of chromosomal regions. The nucleolus is the best-studied subnuclear structure and forms around the rRNA repeat gene clusters on the acrocentric chromosomes. In addition to rDNA, other chromatin sequences also surround the nucleolar surface and may even loop into the nucleolus. These additional nucleolar-associated domains (NADs) have not been well characterized. We present here a whole-genome, high-resolution analysis of chromatin endogenously associated with nucleoli. We have used a combination of three complementary approaches, namely fluorescence comparative genome hybridization, high-throughput deep DNA sequencing and photoactivation combined with time-lapse fluorescence microscopy. The data show that specific sequences from most human chromosomes, in addition to the rDNA repeat units, associate with nucleoli in a reproducible and heritable manner. NADs have in common a high density of AT-rich sequence elements, low gene density and a statistically significant enrichment in transcriptionally repressed genes. Unexpectedly, both the direct DNA sequencing and fluorescence photoactivation data show that certain chromatin loci can specifically associate with either the nucleolus, or the nuclear envelope.

  17. Context based computational analysis and characterization of ARS consensus sequences (ACS of Saccharomyces cerevisiae genome

    Directory of Open Access Journals (Sweden)

    Vinod Kumar Singh

    2016-09-01

    Full Text Available Genome-wide experimental studies in Saccharomyces cerevisiae reveal that autonomous replicating sequence (ARS requires an essential consensus sequence (ACS for replication activity. Computational studies identified thousands of ACS like patterns in the genome. However, only a few hundreds of these sites act as replicating sites and the rest are considered as dormant or evolving sites. In a bid to understand the sequence makeup of replication sites, a content and context-based analysis was performed on a set of replicating ACS sequences that binds to origin-recognition complex (ORC denoted as ORC-ACS and non-replicating ACS sequences (nrACS, that are not bound by ORC. In this study, DNA properties such as base composition, correlation, sequence dependent thermodynamic and DNA structural profiles, and their positions have been considered for characterizing ORC-ACS and nrACS. Analysis reveals that ORC-ACS depict marked differences in nucleotide composition and context features in its vicinity compared to nrACS. Interestingly, an A-rich motif was also discovered in ORC-ACS sequences within its nucleosome-free region. Profound changes in the conformational features, such as DNA helical twist, inclination angle and stacking energy between ORC-ACS and nrACS were observed. Distribution of ACS motifs in the non-coding segments points to the locations of ORC-ACS which are found far away from the adjacent gene start position compared to nrACS thereby enabling an accessible environment for ORC-proteins. Our attempt is novel in considering the contextual view of ACS and its flanking region along with nucleosome positioning in the S. cerevisiae genome and may be useful for any computational prediction scheme.

  18. An Imaging And Graphics Workstation For Image Sequence Analysis

    Science.gov (United States)

    Mostafavi, Hassan

    1990-01-01

    This paper describes an application-specific engineering workstation designed and developed to analyze imagery sequences from a variety of sources. The system combines the software and hardware environment of the modern graphic-oriented workstations with the digital image acquisition, processing and display techniques. The objective is to achieve automation and high throughput for many data reduction tasks involving metric studies of image sequences. The applications of such an automated data reduction tool include analysis of the trajectory and attitude of aircraft, missile, stores and other flying objects in various flight regimes including launch and separation as well as regular flight maneuvers. The workstation can also be used in an on-line or off-line mode to study three-dimensional motion of aircraft models in simulated flight conditions such as wind tunnels. The system's key features are: 1) Acquisition and storage of image sequences by digitizing real-time video or frames from a film strip; 2) computer-controlled movie loop playback, slow motion and freeze frame display combined with digital image sharpening, noise reduction, contrast enhancement and interactive image magnification; 3) multiple leading edge tracking in addition to object centroids at up to 60 fields per second from both live input video or a stored image sequence; 4) automatic and manual field-of-view and spatial calibration; 5) image sequence data base generation and management, including the measurement data products; 6) off-line analysis software for trajectory plotting and statistical analysis; 7) model-based estimation and tracking of object attitude angles; and 8) interface to a variety of video players and film transport sub-systems.

  19. Multilocus sequence analysis of Treponema denticola strains of diverse origin

    Directory of Open Access Journals (Sweden)

    Mo Sisu

    2013-02-01

    Full Text Available Abstract Background The oral spirochete bacterium Treponema denticola is associated with both the incidence and severity of periodontal disease. Although the biological or phenotypic properties of a significant number of T. denticola isolates have been reported in the literature, their genetic diversity or phylogeny has never been systematically investigated. Here, we describe a multilocus sequence analysis (MLSA of 20 of the most highly studied reference strains and clinical isolates of T. denticola; which were originally isolated from subgingival plaque samples taken from subjects from China, Japan, the Netherlands, Canada and the USA. Results The sequences of the 16S ribosomal RNA gene, and 7 conserved protein-encoding genes (flaA, recA, pyrH, ppnK, dnaN, era and radC were successfully determined for each strain. Sequence data was analyzed using a variety of bioinformatic and phylogenetic software tools. We found no evidence of positive selection or DNA recombination within the protein-encoding genes, where levels of intraspecific sequence polymorphism varied from 18.8% (flaA to 8.9% (dnaN. Phylogenetic analysis of the concatenated protein-encoding gene sequence data (ca. 6,513 nucleotides for each strain using Bayesian and maximum likelihood approaches indicated that the T. denticola strains were monophyletic, and formed 6 well-defined clades. All analyzed T. denticola strains appeared to have a genetic origin distinct from that of ‘Treponema vincentii’ or Treponema pallidum. No specific geographical relationships could be established; but several strains isolated from different continents appear to be closely related at the genetic level. Conclusions Our analyses indicate that previous biological and biophysical investigations have predominantly focused on a subset of T. denticola strains with a relatively narrow range of genetic diversity. Our methodology and results establish a genetic framework for the discrimination and phylogenetic

  20. Sirius PSB: a generic system for analysis of biological sequences.

    Science.gov (United States)

    Koh, Chuan Hock; Lin, Sharene; Jedd, Gregory; Wong, Limsoon

    2009-12-01

    Computational tools are essential components of modern biological research. For example, BLAST searches can be used to identify related proteins based on sequence homology, or when a new genome is sequenced, prediction models can be used to annotate functional sites such as transcription start sites, translation initiation sites and polyadenylation sites and to predict protein localization. Here we present Sirius Prediction Systems Builder (PSB), a new computational tool for sequence analysis, classification and searching. Sirius PSB has four main operations: (1) Building a classifier, (2) Deploying a classifier, (3) Search for proteins similar to query proteins, (4) Preliminary and post-prediction analysis. Sirius PSB supports all these operations via a simple and interactive graphical user interface. Besides being a convenient tool, Sirius PSB has also introduced two novelties in sequence analysis. Firstly, genetic algorithm is used to identify interesting features in the feature space. Secondly, instead of the conventional method of searching for similar proteins via sequence similarity, we introduced searching via features' similarity. To demonstrate the capabilities of Sirius PSB, we have built two prediction models - one for the recognition of Arabidopsis polyadenylation sites and another for the subcellular localization of proteins. Both systems are competitive against current state-of-the-art models based on evaluation of public datasets. More notably, the time and effort required to build each model is greatly reduced with the assistance of Sirius PSB. Furthermore, we show that under certain conditions when BLAST is unable to find related proteins, Sirius PSB can identify functionally related proteins based on their biophysical similarities. Sirius PSB and its related supplements are available at: http://compbio.ddns.comp.nus.edu.sg/~sirius.

  1. Whole-genome sequencing reveals the mechanisms for evolution of streptomycin resistance in Lactobacillus plantarum.

    Science.gov (United States)

    Zhang, Fuxin; Gao, Jiayuan; Wang, Bini; Huo, Dongxue; Wang, Zhaoxia; Zhang, Jiachao; Shao, Yuyu

    2018-04-01

    In this research, we investigated the evolution of streptomycin resistance in Lactobacillus plantarum ATCC14917, which was passaged in medium containing a gradually increasing concentration of streptomycin. After 25 d, the minimum inhibitory concentration (MIC) of L. plantarum ATCC14917 had reached 131,072 µg/mL, which was 8,192-fold higher than the MIC of the original parent isolate. The highly resistant L. plantarum ATCC14917 isolate was then passaged in antibiotic-free medium to determine the stability of resistance. The MIC value of the L. plantarum ATCC14917 isolate decreased to 2,048 µg/mL after 35 d but remained constant thereafter, indicating that resistance was irreversible even in the absence of selection pressure. Whole-genome sequencing of parent isolates, control isolates, and isolates following passage was used to study the resistance mechanism of L. plantarum ATCC14917 to streptomycin and adaptation in the presence and absence of selection pressure. Five mutated genes (single nucleotide polymorphisms and structural variants) were verified in highly resistant L. plantarum ATCC14917 isolates, which were related to ribosomal protein S12, LPXTG-motif cell wall anchor domain protein, LrgA family protein, Ser/Thr phosphatase family protein, and a hypothetical protein that may correlate with resistance to streptomycin. After passage in streptomycin-free medium, only the mutant gene encoding ribosomal protein S12 remained; the other 4 mutant genes had reverted to the wild type as found in the parent isolate. Although the MIC value of L. plantarum ATCC14917 was reduced in the absence of selection pressure, it remained 128-fold higher than the MIC value of the parent isolate, indicating that ribosomal protein S12 may play an important role in streptomycin resistance. Using the mobile elements database, we demonstrated that streptomycin resistance-related genes in L. plantarum ATCC14917 were not located on mobile elements. This research offers a way of

  2. Genome sequencing reveals metabolic and cellular interdependence in an amoeba-kinetoplastid symbiosis

    Czech Academy of Sciences Publication Activity Database

    Tanifuji, G.; Cenci, U.; Moog, D.; Dean, S.; Nakayama, T.; David, Vojtěch; Fiala, Ivan; Curtis, B.A.; Sibbald, S. J.; Onodera, N. T.; Colp, M.; Flegontov, Pavel; Johnson-MacKinnon, J.; McPhee, M.; Inagaki, Y.; Hashimoto, T.; Kelly, S.; Gull, K.; Lukeš, Julius; Archibald, J.M.

    2017-01-01

    Roč. 7, SEP 15 (2017), č. článku 11688. ISSN 2045-2322 R&D Projects: GA ČR(CZ) GA14-23986S; GA MŠk LL1601 Institutional support: RVO:60077344 Keywords : trypanosoma-brucei reveals * hidden markov model * neoparamoeba-pemaquidensis * gill disease * phylogenetic analyses * ichthyobodo-necator * gene prediction * host control * evolution * proteomics Subject RIV: EB - Genetics ; Molecular Biology OBOR OECD: Biochemistry and molecular biology Impact factor: 4.259, year: 2016

  3. Structure and Sequence Analyses of Clustered Protocadherins Reveal Antiparallel Interactions that Mediate Homophilic Specificity.

    Science.gov (United States)

    Nicoludis, John M; Lau, Sze-Yi; Schärfe, Charlotta P I; Marks, Debora S; Weihofen, Wilhelm A; Gaudet, Rachelle

    2015-11-03

    Clustered protocadherin (Pcdh) proteins mediate dendritic self-avoidance in neurons via specific homophilic interactions in their extracellular cadherin (EC) domains. We determined crystal structures of EC1-EC3, containing the homophilic specificity-determining region, of two mouse clustered Pcdh isoforms (PcdhγA1 and PcdhγC3) to investigate the nature of the homophilic interaction. Within the crystal lattices, we observe antiparallel interfaces consistent with a role in trans cell-cell contact. Antiparallel dimerization is supported by evolutionary correlations. Two interfaces, located primarily on EC2-EC3, involve distinctive clustered Pcdh structure and sequence motifs, lack predicted glycosylation sites, and contain residues highly conserved in orthologs but not paralogs, pointing toward their biological significance as homophilic interaction interfaces. These two interfaces are similar yet distinct, reflecting a possible difference in interaction architecture between clustered Pcdh subfamilies. These structures initiate a molecular understanding of clustered Pcdh assemblies that are required to produce functional neuronal networks. Copyright © 2015 Elsevier Ltd. All rights reserved.

  4. Exomic sequencing of immune-related genes reveals novel candidate variants associated with alopecia universalis.

    Directory of Open Access Journals (Sweden)

    Seungbok Lee

    Full Text Available Alopecia areata (AA is a common autoimmune disorder mostly presented as round patches of hair loss and subclassified into alopecia totalis/alopecia universalis (AT/AU based on the area of alopecia. Although AA is relatively common, only 5% of AA patients progress to AT/AU, which affect the whole scalp and whole body respectively. To determine genetic determinants of this orphan disease, we undertook whole-exome sequencing of 6 samples from AU patients, and 26 variants in immune-related genes were selected as candidates. When an additional 14 AU samples were genotyped for these candidates, 6 of them remained at the level of significance in comparison with 155 Asian controls (p<1.92×10(-3. Linkage disequilibrium was observed between some of the most significant SNPs, including rs41559420 of HLA-DRB5 (p<0.001, OR 44.57 and rs28362679 of BTNL2 (p<0.001, OR 30.21. While BTNL2 was reported as a general susceptibility gene of AA previously, HLA-DRB5 has not been implicated in AA. In addition, we found several genetic variants in novel genes (HLA-DMB, TLR1, and PMS2 and discovered an additional locus on HLA-A, a known susceptibility gene of AA. This study provides further evidence for the association of previously reported genes with AA and novel findings such as HLA-DRB5, which might represent a hidden culprit gene for AU.

  5. Deep sequencing of the Camellia chekiangoleosa transcriptome revealed candidate genes for anthocyanin biosynthesis.

    Science.gov (United States)

    Wang, Zhong-Wei; Jiang, Cong; Wen, Qiang; Wang, Na; Tao, Yuan-Yuan; Xu, Li-An

    2014-03-15

    Camellia chekiangoleosa is an important species of genus Camellia. It provides high-quality edible oil and has great ornamental value. The flowers are big and red which bloom between February and March. Flower pigmentation is closely related to the accumulation of anthocyanin. Although anthocyanin biosynthesis has been studied extensively in herbaceous plants, little molecular information on the anthocyanin biosynthesis pathway of C. chekiangoleosa is yet known. In the present study, a cDNA library was constructed to obtain detailed and general data from the flowers of C. chekiangoleosa. To explore the transcriptome of C. chekiangoleosa and investigate genes involved in anthocyanin biosynthesis, a 454 GS FLX Titanium platform was used to generate an EST dataset. About 46,279 sequences were obtained, and 24,593 (53.1%) were annotated. Using Blast search against the AGRIS, 1740 unigenes were found homologous to 599 Arabidopsis transcription factor genes. Based on the transcriptome dataset, nine anthocyanin biosynthesis pathway genes (PAL, CHS1, CHS2, CHS3, CHI, F3H, DFR, ANS, and UFGT) were identified and cloned. The spatio-temporal expression patterns of these genes were also analyzed using quantitative real-time polymerase chain reaction. The study results not only enrich the gene resource but also provide valuable information for further studies concerning anthocyanin biosynthesis. Copyright © 2014 Elsevier B.V. All rights reserved.

  6. Whole genome sequencing revealed host adaptation-focused genomic plasticity of pathogenic Leptospira

    Science.gov (United States)

    Xu, Yinghua; Zhu, Yongzhang; Wang, Yuezhu; Chang, Yung-Fu; Zhang, Ying; Jiang, Xiugao; Zhuang, Xuran; Zhu, Yongqiang; Zhang, Jinlong; Zeng, Lingbing; Yang, Minjun; Li, Shijun; Wang, Shengyue; Ye, Qiang; Xin, Xiaofang; Zhao, Guoping; Zheng, Huajun; Guo, Xiaokui; Wang, Junzhi

    2016-01-01

    Leptospirosis, caused by pathogenic Leptospira spp., has recently been recognized as an emerging infectious disease worldwide. Despite its severity and global importance, knowledge about the molecular pathogenesis and virulence evolution of Leptospira spp. remains limited. Here we sequenced and analyzed 102 isolates representing global sources. A high genomic variability were observed among different Leptospira species, which was attributed to massive gene gain and loss events allowing for adaptation to specific niche conditions and changing host environments. Horizontal gene transfer and gene duplication allowed the stepwise acquisition of virulence factors in pathogenic Leptospira evolved from a recent common ancestor. More importantly, the abundant expansion of specific virulence-related protein families, such as metalloproteases-associated paralogs, were exclusively identified in pathogenic species, reflecting the importance of these protein families in the pathogenesis of leptospirosis. Our observations also indicated that positive selection played a crucial role on this bacteria adaptation to hosts. These novel findings may lead to greater understanding of the global diversity and virulence evolution of Leptospira spp. PMID:26833181

  7. Genomic DNA sequences from mastodon and woolly mammoth reveal deep speciation of forest and savanna elephants.

    Directory of Open Access Journals (Sweden)

    Nadin Rohland

    2010-12-01

    Full Text Available To elucidate the history of living and extinct elephantids, we generated 39,763 bp of aligned nuclear DNA sequence across 375 loci for African savanna elephant, African forest elephant, Asian elephant, the extinct American mastodon, and the woolly mammoth. Our data establish that the Asian elephant is the closest living relative of the extinct mammoth in the nuclear genome, extending previous findings from mitochondrial DNA analyses. We also find that savanna and forest elephants, which some have argued are the same species, are as or more divergent in the nuclear genome as mammoths and Asian elephants, which are considered to be distinct genera, thus resolving a long-standing debate about the appropriate taxonomic classification of the African elephants. Finally, we document a much larger effective population size in forest elephants compared with the other elephantid taxa, likely reflecting species differences in ancient geographic structure and range and differences in life history traits such as variance in male reproductive success.

  8. Multilocus sequence typing reveals two evolutionary lineages of Acidovorax avenae subsp. citrulli.

    Science.gov (United States)

    Feng, Jianjun; Schuenzel, Erin L; Li, Jianqiang; Schaad, Norman W

    2009-08-01

    Acidovorax avenae subsp. citrulli, causal agent of bacterial fruit blotch, has caused considerable damage to the watermelon and melon industry in China and the United States. Understanding the emergence and spread of this pathogen is important for controlling the disease. To build a fingerprinting database for reliable identification and tracking of strains of A. avenae subsp. citrulli, a multilocus sequence typing (MLST) scheme was developed using seven conserved loci. The study included 8 original strains from the 1978 description of A. avenae subsp. citrulli, 51 from China, and 34 from worldwide collections. Two major clonal complexes (CCs), CC1 and CC2, were identified within A. avenae subsp. citrulli; 48 strains typed as CC1 and 45 as CC2. All eight original 1978 strains isolated from watermelon and melon grouped in CC1. CC2 strains were predominant in the worldwide collection and all but five were isolated from watermelon. In China, a major seed producer for melon and watermelon, the predominant strains were CC1 and were found nearly equally on melon and watermelon.

  9. Modeling of the Ebola Virus Delta Peptide Reveals a Potential Lytic Sequence Motif

    Directory of Open Access Journals (Sweden)

    William R. Gallaher

    2015-01-01

    Full Text Available Filoviruses, such as Ebola and Marburg viruses, cause severe outbreaks of human infection, including the extensive epidemic of Ebola virus disease (EVD in West Africa in 2014. In the course of examining mutations in the glycoprotein gene associated with 2014 Ebola virus (EBOV sequences, a differential level of conservation was noted between the soluble form of glycoprotein (sGP and the full length glycoprotein (GP, which are both encoded by the GP gene via RNA editing. In the region of the proteins encoded after the RNA editing site sGP was more conserved than the overlapping region of GP when compared to a distant outlier species, Tai Forest ebolavirus. Half of the amino acids comprising the “delta peptide”, a 40 amino acid carboxy-terminal fragment of sGP, were identical between otherwise widely divergent species. A lysine-rich amphipathic peptide motif was noted at the carboxyl terminus of delta peptide with high structural relatedness to the cytolytic peptide of the non-structural protein 4 (NSP4 of rotavirus. EBOV delta peptide is a candidate viroporin, a cationic pore-forming peptide, and may contribute to EBOV pathogenesis.

  10. The identification of FANCD2 DNA binding domains reveals nuclear localization sequences.

    Science.gov (United States)

    Niraj, Joshi; Caron, Marie-Christine; Drapeau, Karine; Bérubé, Stéphanie; Guitton-Sert, Laure; Coulombe, Yan; Couturier, Anthony M; Masson, Jean-Yves

    2017-08-21

    Fanconi anemia (FA) is a recessive genetic disorder characterized by congenital abnormalities, progressive bone-marrow failure, and cancer susceptibility. The FA pathway consists of at least 21 FANC genes (FANCA-FANCV), and the encoded protein products interact in a common cellular pathway to gain resistance against DNA interstrand crosslinks. After DNA damage, FANCD2 is monoubiquitinated and accumulates on chromatin. FANCD2 plays a central role in the FA pathway, using yet unidentified DNA binding regions. By using synthetic peptide mapping and DNA binding screen by electromobility shift assays, we found that FANCD2 bears two major DNA binding domains predominantly consisting of evolutionary conserved lysine residues. Furthermore, one domain at the N-terminus of FANCD2 bears also nuclear localization sequences for the protein. Mutations in the bifunctional DNA binding/NLS domain lead to a reduction in FANCD2 monoubiquitination and increase in mitomycin C sensitivity. Such phenotypes are not fully rescued by fusion with an heterologous NLS, which enable separation of DNA binding and nuclear import functions within this domain that are necessary for FANCD2 functions. Collectively, our results enlighten the importance of DNA binding and NLS residues in FANCD2 to activate an efficient FA pathway. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  11. DNA interaction with platinum-based cytostatics revealed by DNA sequencing.

    Science.gov (United States)

    Smerkova, Kristyna; Vaculovic, Tomas; Vaculovicova, Marketa; Kynicky, Jindrich; Brtnicky, Martin; Eckschlager, Tomas; Stiborova, Marie; Hubalek, Jaromir; Adam, Vojtech

    2017-12-15

    The main mechanism of action of platinum-based cytostatic drugs - cisplatin, oxaliplatin and carboplatin - is the formation of DNA cross-links, which restricts the transcription due to the disability of DNA to enter the active site of the polymerase. The polymerase chain reaction (PCR) was employed as a simplified model of the amplification process in the cell nucleus. PCR with fluorescently labelled dideoxynucleotides commonly employed for DNA sequencing was used to monitor the effect of platinum-based cytostatics on DNA in terms of decrease in labeling efficiency dependent on a presence of the DNA-drug cross-link. It was found that significantly different amounts of the drugs - cisplatin (0.21 μg/mL), oxaliplatin (5.23 μg/mL), and carboplatin (71.11 μg/mL) - were required to cause the same quenching effect (50%) on the fluorescent labelling of 50 μg/mL of DNA. Moreover, it was found that even though the amounts of the drugs was applied to the reaction mixture differing by several orders of magnitude, the amount of incorporated platinum, quantified by inductively coupled plasma mass spectrometry, was in all cases at the level of tenths of μg per 5 μg of DNA. Copyright © 2017 Elsevier Inc. All rights reserved.

  12. The complete genome sequence of hyperthermophile Dictyoglomus turgidum DSM 6724™ reveals a specialized carbohydrate fermentor

    Directory of Open Access Journals (Sweden)

    Phillip Brumm

    2016-12-01

    Full Text Available Here we report the complete genome sequence of the chemoorganotrophic, extremely thermophilic bacterium, Dictyoglomus turgidum, which is a Gram negative, strictly anaerobic bacterium. D. turgidum and D. thermophilum together form the Dictyoglomi phylum. The two Dictyoglomus genomes are highly syntenic, and both are distantly related to Caldicellulosiruptor spp. D. turgidum is able to grow on a wide variety of polysaccharide substrates due to significant genomic commitment to glycosyl hydrolases, sixteen of which were cloned and expressed in our study. The GH5, GH10 and GH42 enzymes characterized in this study suggest that D. turgidum can utilize most plant-based polysaccharides except crystalline cellulose. The DNA polymerase I enzyme was also expressed and characterized. The pure enzyme showed improved amplification of long PCR targets compared to Taq polymerase. The genome contains a full complement of DNA modifying enzymes, and an unusually high copy number (4 of a new, ancestral family of polB type nucleotidyltransferases designated as MNT (minimal nucleotidyltransferases. Considering its optimal growth at 72ºC, D. turgidum has an anomalously low G+C content of 39.9% that may account for the presence of reverse gyrase, usually associated with hyperthermophiles.

  13. Transcriptome sequencing of two phenotypic mosaic Eucalyptus trees reveals large scale transcriptome re-modelling.

    Directory of Open Access Journals (Sweden)

    Amanda Padovan

    Full Text Available Phenotypic mosaic trees offer an ideal system for studying differential gene expression. We have investigated two mosaic eucalypt trees from two closely related species (Eucalyptus melliodora and E. sideroxylon, which each support two types of leaves: one part of the canopy is resistant to insect herbivory and the remaining leaves are susceptible. Driving this ecological distinction are differences in plant secondary metabolites. We used these phenotypic mosaics to investigate genome wide patterns of foliar gene expression with the aim of identifying patterns of differential gene expression and the somatic mutation(s that lead to this phenotypic mosaicism. We sequenced the mRNA pool from leaves of the resistant and susceptible ecotypes from both mosaic eucalypts using the Illumina HiSeq 2000 platform. We found large differences in pathway regulation and gene expression between the ecotypes of each mosaic. The expression of the genes in the MVA and MEP pathways is reflected by variation in leaf chemistry, however this is not the case for the terpene synthases. Apart from the terpene biosynthetic pathway, there are several other metabolic pathways that are differentially regulated between the two ecotypes, suggesting there is much more phenotypic diversity than has been described. Despite the close relationship between the two species, they show large differences in the global patterns of gene and pathway regulation.

  14. Molecular phylogenetic lineage of Plagiopogon and Askenasia (Protozoa, Ciliophora) revealed by their gene sequences

    Science.gov (United States)

    Liu, An; Yi, Zhenzhen; Lin, Xiaofeng; Hu, Xiaozhong; Al-Farraj, Saleh A.; Al-Rasheid, Khaled A. S.

    2015-08-01

    Prostomates and haptorians are two basal groups of ciliates with limited morphological characteristics available for taxonomy. Morphologically, the structures used to identify prostomates and haptorians are similar or even identical, which generate heavy taxonomic and phylogenetic confusion. In present work, phylogenetic positions lineage of two rare genera, Plagiopogon and Askenasia, were investigated. Three genes including small subunit ribosomal RNA gene (hereafter SSU rDNA), internal transcribed spacer region (ITS region), and large subunit ribosomal RNA gene (LSU rDNA) were analyzed, 10 new sequences five species each. Our findings included 1) class Prostomatea and order Haptorida are multiphyletic; 2) it may not be appropriate to place order Cyclotrichiida in subclass Haptoria, and the systematic lineage of order Cyclotrichiida needs to be verified further; 3) genus Plagiopogon branches consistently within a clade covering most prostomes and is basal of clade Colepidae, implying its close lineage to Prostomatea; and 4) Askenasia is phylogenetically distant from the subclass Haptoria but close to classes Prostomatea, Plagiopylea and Oligohymenophorea. We supposed that the toxicyst of Askenasia may be close to taxa of prostomes instead of haptorians, and the dorsal brush is a more typical morphological characteristics of haptorians than toxicysts.

  15. Complete motif analysis of sequence requirements for translation initiation at non-AUG start codons.

    Science.gov (United States)

    Diaz de Arce, Alexander J; Noderer, William L; Wang, Clifford L

    2018-01-25

    The initiation of mRNA translation from start codons other than AUG was previously believed to be rare and of relatively low impact. More recently, evidence has suggested that as much as half of all translation initiation utilizes non-AUG start codons, codons that deviate from AUG by a single base. Furthermore, non-AUG start codons have been shown to be involved in regulation of expression and disease etiology. Yet the ability to gauge expression based on the sequence of a translation initiation site (start codon and its flanking bases) has been limited. Here we have performed a comprehensive analysis of translation initiation sites that utilize non-AUG start codons. By combining genetic-reporter, cell-sorting, and high-throughput sequencing technologies, we have analyzed the expression associated with all possible variants of the -4 to +4 positions of non-AUG translation initiation site motifs. This complete motif analysis revealed that 1) with the right sequence context, certain non-AUG start codons can generate expression comparable to that of AUG start codons, 2) sequence context affects each non-AUG start codon differently, and 3) initiation at non-AUG start codons is highly sensitive to changes in the flanking sequences. Complete motif analysis has the potential to be a key tool for experimental and diagnostic genomics. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  16. RAPD and Internal Transcribed Spacer Sequence Analyses Reveal Zea nicaraguensis as a Section Luxuriantes Species Close to Zea luxurians

    Science.gov (United States)

    Wang, Pei; Lu, Yanli; Zheng, Mingmin; Rong, Tingzhao; Tang, Qilin

    2011-01-01

    Genetic relationship of a newly discovered teosinte from Nicaragua, Zea nicaraguensis with waterlogging tolerance, was determined based on randomly amplified polymorphic DNA (RAPD) markers and the internal transcribed spacer (ITS) sequences of nuclear ribosomal DNA using 14 accessions from Zea species. RAPD analysis showed that a total of 5,303 fragments were produced by 136 random decamer primers, of which 84.86% bands were polymorphic. RAPD-based UPGMA analysis demonstrated that the genus Zea can be divided into section Luxuriantes including Zea diploperennis, Zea luxurians, Zea perennis and Zea nicaraguensis, and section Zea including Zea mays ssp. mexicana, Zea mays ssp. parviglumis, Zea mays ssp. huehuetenangensis and Zea mays ssp. mays. ITS sequence analysis showed the lengths of the entire ITS region of the 14 taxa in Zea varied from 597 to 605 bp. The average GC content was 67.8%. In addition to the insertion/deletions, 78 variable sites were recorded in the total ITS region with 47 in ITS1, 5 in 5.8S, and 26 in ITS2. Sequences of these taxa were analyzed with neighbor-joining (NJ) and maximum parsimony (MP) methods to construct the phylogenetic trees, selecting Tripsacum dactyloides L. as the outgroup. The phylogenetic relationships of Zea species inferred from the ITS sequences are highly concordant with the RAPD evidence that resolved two major subgenus clades. Both RAPD and ITS sequence analyses indicate that Zea nicaraguensis is more closely related to Zea luxurians than the other teosintes and cultivated maize, which should be regarded as a section Luxuriantes species. PMID:21525982

  17. High throughput sequencing of small RNA component of leaves and inflorescence revealed conserved and novel miRNAs as well as phasiRNA loci in chickpea.

    Science.gov (United States)

    Srivastava, Sangeeta; Zheng, Yun; Kudapa, Himabindu; Jagadeeswaran, Guru; Hivrale, Vandana; Varshney, Rajeev K; Sunkar, Raman