bicolor genome database: Topics by WorldWideScience.org

Sample records for bicolor genome database

Survey and analysis of simple sequence repeats in the Laccaria bicolor genome, with development of microsatellite markers

Energy Technology Data Exchange (ETDEWEB)

Labbe, Jessy L [ORNL]; Murat, Claude [INRA, Nancy, France]; Morin, Emmanuelle [INRA, Nancy, France]; Le Tacon, F [UMR, France]; Martin, Francis [INRA, Nancy, France]

2011-01-01

It is becoming clear that simple sequence repeats (SSRs) play a significant role in fungal genome organization, and they are a large source of genetic markers for population genetics and meiotic maps. We identified SSRs in the Laccaria bicolor genome by in silico survey and analyzed their distribution in the different genomic regions. We also compared the abundance and distribution of SSRs in L. bicolor with those of the following fungal genomes: Phanerochaete chrysosporium, Coprinopsis cinerea, Ustilago maydis, Cryptococcus neoformans, Aspergillus nidulans, Magnaporthe grisea, Neurospora crassa and Saccharomyces cerevisiae. Using the MISA computer program, we detected 277,062 SSRs in the L. bicolor genome representing 8% of the assembled genomic sequence. Among the analyzed basidiomycetes, L. bicolor exhibited the highest SSR density although no correlation between relative abundance and the genome sizes was observed. In most genomes the short motifs (mono- to trinucleotides) were more abundant than the longer repeated SSRs. Generally, in each organism, the occurrence, relative abundance, and relative density of SSRs decreased as the repeat unit increased. Furthermore, each organism had its own common and longest SSRs. In the L. bicolor genome, most of the SSRs were located in intergenic regions (73.3%) and the highest SSR density was observed in transposable elements (TEs; 6,706 SSRs/Mb). However, 81% of the protein-coding genes contained SSRs in their exons, suggesting that SSR polymorphism may alter gene phenotypes. Within a L. bicolor offspring, sequence polymorphism of 78 SSRs was mainly detected in non-TE intergenic regions. Unlike previously developed microsatellite markers, these new ones are spread throughout the genome; these markers could have immediate applications in population genetics.
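The in silico SSR survey described in this abstract can be illustrated with a small sketch. This is not the MISA program itself; it is a minimal regex-based scan for perfect repeats, and the function names and minimum repeat thresholds below are hypothetical choices made for illustration, not the settings used in the study.

```python
import re

# Hypothetical thresholds (assumption): minimum number of repeat units
# required for each motif length, from mono- to hexanucleotides.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(sequence, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    sequence = sequence.upper()
    hits = []
    for unit_len, min_count in sorted(min_repeats.items()):
        # One motif of unit_len bases followed by at least min_count - 1 copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_count - 1))
        for match in pattern.finditer(sequence):
            motif = match.group(1)
            # Skip motifs that are themselves repeats of a shorter unit
            # (for example "AA" or "ATAT"), so each SSR is reported once.
            if any(motif == motif[:k] * (unit_len // k)
                   for k in range(1, unit_len) if unit_len % k == 0):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return sorted(hits)


def ssr_density_per_mb(hits, sequence_length):
    """Number of SSRs per megabase of analyzed sequence."""
    return len(hits) / sequence_length * 1_000_000
```

Calling `find_ssrs` on each genomic region (intergenic, exon, TE) and dividing by the region length, as in `ssr_density_per_mb`, gives per-region densities of the kind the abstract reports in SSRs/Mb.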
The genome of Laccaria bicolor provides insights into mycorrhizal symbiosis

Energy Technology Data Exchange (ETDEWEB)

Martin, F.; Aerts, A.; Ahren, D.; Brun, A.; Danchin, E. G. J.; Duchaussoy, F.; Gibon, J.; Kohler, A.; Lindquist, E.; Peresa, V.; Salamov, A.; Shapiro, H. J.; Wuyts, J.; Blaudez, D.; Buee, M.; Brokstein, P.; Canback, B.; Cohen, D.; Courty, P. E.; Coutinho, P. M.; Delaruelle, C.; Detter, J. C.; Deveau, A.; DiFazio, S.; Duplessis, S.; Fraissinet-Tachet, L.; Lucic, E.; Frey-Klett, P.; Fourrey, C.; Feussner, I.; Gay, G.; Grimwood, J.; Hoegger, P. J.; Jain, P.; Kilaru, S.; Labbe, J.; Lin, Y. C.; Legue, V.; Le Tacon, F.; Marmeisse, R.; Melayah, D.; Montanini, B.; Muratet, M.; Nehls, U.; Niculita-Hirzel, H.; Secq, M. P. Oudot-Le; Peter, M.; Quesneville, H.; Rajashekar, B.; Reich, M.; Rouhier, N.; Schmutz, J.; Yin, T.; Chalot, M.; Henrissat, B.; Kues, U.; Lucas, S.; Van de Peer, Y.; Podila, G. K.; Polle, A.; Pukkila, P. J.; Richardson, P. M.; Rouze, P.; Sanders, I. R.; Stajich, J. E.; Tunlid, A.; Tuskan, G.; Grigoriev, I. V.

2007-08-10

Mycorrhizal symbioses, the union of roots and soil fungi, are universal in terrestrial ecosystems and may have been fundamental to land colonization by plants [1, 2]. Boreal, temperate and montane forests all depend on ectomycorrhizae [1]. Identification of the primary factors that regulate symbiotic development and metabolic activity will therefore open the door to understanding the role of ectomycorrhizae in plant development and physiology, allowing the full ecological significance of this symbiosis to be explored. Here we report the genome sequence of the ectomycorrhizal basidiomycete Laccaria bicolor (Fig. 1) and highlight gene sets involved in rhizosphere colonization and symbiosis. This 65-megabase genome assembly contains 20,000 predicted protein-encoding genes and a very large number of transposons and repeated sequences. We detected unexpected genomic features, most notably a battery of effector-type small secreted proteins (SSPs) with unknown function, several of which are only expressed in symbiotic tissues. The most highly expressed SSP accumulates in the proliferating hyphae colonizing the host root. The ectomycorrhizae-specific SSPs probably have a decisive role in the establishment of the symbiosis. The unexpected observation that the genome of L. bicolor lacks carbohydrate-active enzymes involved in degradation of plant cell walls, but maintains the ability to degrade non-plant cell wall polysaccharides, reveals the dual saprotrophic and biotrophic lifestyle of the mycorrhizal fungus that enables it to grow within both soil and living plant roots. The predicted gene inventory of the L. bicolor genome, therefore, points to previously unknown mechanisms of symbiosis operating in biotrophic mycorrhizal fungi. The availability of this genome provides an unparalleled opportunity to develop a deeper understanding of the processes by which symbionts interact with plants within their ecosystem to perform vital functions in the carbon and nitrogen cycles that are
Characterization of Transposable Elements in Laccaria bicolor

Energy Technology Data Exchange (ETDEWEB)

Labbe, Jessy L [ORNL]; Murat, Claude [INRA, Nancy, France]; Morin, Emmanuelle [INRA, Nancy, France]; Tuskan, Gerald A [ORNL]; Le Tacon, F [UMR, France]; Martin, Francis [INRA, Nancy, France]

2012-01-01

Background: The publicly available Laccaria bicolor genome sequence has provided a considerable genomic resource allowing systematic identification of transposable elements (TEs) in this symbiotic ectomycorrhizal fungus. Using a TE-specific annotation pipeline we have characterized and analyzed TEs in the L. bicolor S238N-H82 genome. Methodology/Principal Findings: TEs occupy 24% of the 60 Mb L. bicolor genome and represent 25,787 full-length and partial element copies distributed within 172 families. The most abundant elements were Copia-like. TEs are not randomly distributed across the genome, but are tightly nested or clustered. The majority of TEs are ancient except some terminal inverted repeats (TIRs), long terminal repeats (LTRs) and a large retrotransposon derivative (LARD) element. There were three main periods of TE expansion in L. bicolor: the first from 57 to 10 Mya, the second from 5 to 1 Mya and the most recent from 500,000 years ago until now. LTR retrotransposons are closely related to retrotransposons found in another basidiomycete, Coprinopsis cinerea. Conclusions: This analysis represents an initial characterization of TEs in the L. bicolor genome, contributes to genome assembly and to a greater understanding of the role TEs played in genome organization and evolution, and provides a valuable resource for the ongoing Laccaria Pan-Genome project supported by the U.S.-DOE Joint Genome Institute.
Complete chloroplast genome sequences of Hordeum vulgare, Sorghum bicolor and Agrostis stolonifera, and comparative analyses with other grass genomes

Science.gov (United States)

Saski, Christopher; Lee, Seung-Bum; Fjellheim, Siri; Guda, Chittibabu; Jansen, Robert K.; Luo, Hong; Tomkins, Jeffrey; Rognli, Odd Arne; Clarke, Jihong Liu

2009-01-01

Comparisons of complete chloroplast genome sequences of Hordeum vulgare, Sorghum bicolor and Agrostis stolonifera to six published grass chloroplast genomes reveal that gene content and order are similar but two microstructural changes have occurred. First, the expansion of the IR at the SSC/IRa boundary that duplicates a portion of the 5′ end of ndhH is restricted to the three genera of the subfamily Pooideae (Agrostis, Hordeum and Triticum). Second, a 6 bp deletion in ndhK is shared by Agrostis, Hordeum, Oryza and Triticum, and this event supports the sister relationship between the subfamilies Erhartoideae and Pooideae. Repeat analysis identified 19–37 direct and inverted repeats 30 bp or longer with a sequence identity of at least 90%. Seventeen of the 26 shared repeats are found in all the grass chloroplast genomes examined and are located in the same genes or intergenic spacer (IGS) regions. Examination of simple sequence repeats (SSRs) identified 16–21 potential polymorphic SSRs. Five IGS regions have 100% sequence identity among Zea mays, Saccharum officinarum and Sorghum bicolor, whereas no spacer regions were identical among Oryza sativa, Triticum aestivum, H. vulgare and A. stolonifera despite their close phylogenetic relationship. Alignment of EST sequences and DNA coding sequences identified six C–U conversions in both Sorghum bicolor and H. vulgare but only one in A. stolonifera. Phylogenetic trees based on DNA sequences of 61 protein-coding genes of 38 taxa using both maximum parsimony and likelihood methods provide moderate support for a sister relationship between the subfamilies Erhartoideae and Pooideae. PMID:17534593
The Sorghum bicolor genome and the diversification of grasses

Energy Technology Data Exchange (ETDEWEB)

Paterson, Andrew H.; Bowers, John E.; Bruggmann, Remy; Dubchak, Inna; Grimwood, Jane; Gundlach, Heidrun; Haberer, Georg; Hellsten, Uffe; Mitros, Therese; Poliakov, Alexander; Schmutz, Jeremy; Spannagl, Manuel; Tang, Haibao; Wang, Xiyin; Wicker, Thomas; Bharti, Arvind K.; Chapman, Jarrod; Feltus, F. Alex; Gowik, Udo; Grigoriev, Igor V.; Lyons, Eric; Maher, Christopher A.; Martis, Mihaela; Narechania, Apurva; Otillar, Robert P.; Penning, Bryan W.; Salamov, Asaf A.; Wang, Yu; Zhang, Lifang; Carpita, Nicholas C.; Freeling, Michael; Gingle, Alan R.; Hash, C. Thomas; Keller, Beat; Klein, Patricia; Kresovich, Stephen; McCann, Maureen C.; Ming, Ray; Peterson, Daniel G.; Rahman, Mehboob-ur-; Ware, Doreen; Westhoff, Peter; Mayer, Klaus F. X.; Messing, Joachim; Rokhsar, Daniel S.

2008-08-20

Sorghum, an African grass related to sugar cane and maize, is grown for food, feed, fibre and fuel. We present an initial analysis of the ~730-megabase Sorghum bicolor (L.) Moench genome, placing ~98% of genes in their chromosomal context using whole-genome shotgun sequence validated by genetic, physical and syntenic information. Genetic recombination is largely confined to about one-third of the sorghum genome with gene order and density similar to those of rice. Retrotransposon accumulation in recombinationally recalcitrant heterochromatin explains the ~75% larger genome size of sorghum compared with rice. Although gene and repetitive DNA distributions have been preserved since palaeopolyploidization ~70 million years ago, most duplicated gene sets lost one member before the sorghum-rice divergence. Concerted evolution makes one duplicated chromosomal segment appear to be only a few million years old. About 24% of genes are grass-specific and 7% are sorghum-specific. Recent gene and microRNA duplications may contribute to sorghum's drought tolerance.
Genome analysis methods: Sorghum bicolor [PGDBj Registered plant list, Marker list, QTL list, Plant DB link and Genome analysis methods] [Archive]

Lifescience Database Archive (English)

Full Text Available Sorghum bicolor Finished 2n=20 760 Mb 2009 Sanger (Clone-based) 10,717,203 reads 7...30 Mb 8.5x Arachne2 v.20060705 3,304 12,873 BLAST, GenomeScan 34,496 (Sbi1.4) JGI; http://www.phytozome.net/sorghum Sbi1 Sbi1.4 10.1038/nature07723 19189423 ...
The Sequenced Angiosperm Genomes and Genome Databases.

Science.gov (United States)

Chen, Fei; Dong, Wei; Zhang, Jiawei; Guo, Xinyue; Chen, Junhao; Wang, Zhengjia; Lin, Zhenguo; Tang, Haibao; Zhang, Liangsheng

2018-01-01

Angiosperms, the flowering plants, provide the essential resources for human life, such as food, energy, oxygen, and materials. They have also shaped the evolution of humans, animals, and planet Earth. Despite the numerous advances in genome reports and sequencing technologies, no review covers all the released angiosperm genomes and the genome databases for data sharing. Based on the rapid advances and innovations in database reconstruction in the last few years, here we provide a comprehensive review of three major types of angiosperm genome databases: databases for a single species, for a specific angiosperm clade, and for multiple angiosperm species. The scope, tools, and data of each type of database and their features are concisely discussed. The genome databases for a single species or a clade of species are especially popular with specific groups of researchers, while a regularly updated comprehensive database is more powerful for addressing major scientific mysteries at the genome scale. Considering the low coverage of flowering plants in any available database, we propose construction of a comprehensive database to facilitate large-scale comparative studies of angiosperm genomes and to promote collaborative studies of important questions in plant biology.
Genomic and transcriptomic analysis of Laccaria bicolor CAZome reveals insights into polysaccharides remodelling during symbiosis establishment.

Science.gov (United States)

Veneault-Fourrey, Claire; Commun, Carine; Kohler, Annegret; Morin, Emmanuelle; Balestrini, Raffaella; Plett, Jonathan; Danchin, Etienne; Coutinho, Pedro; Wiebenga, Ad; de Vries, Ronald P; Henrissat, Bernard; Martin, Francis

2014-11-01

Ectomycorrhizal fungi, living in forest soils, are microorganisms required to sustain tree growth and productivity. The establishment of the mutualistic interaction with roots to form ectomycorrhiza (ECM) is not well known at the molecular level. In particular, how fungal and plant cell walls are rearranged to establish a fully functional ectomycorrhiza is poorly understood. Nevertheless, it is likely that Carbohydrate Active enZymes (CAZymes) produced by the fungus participate in this process. Genome-wide transcriptome profiling during ECM development was used to examine how the CAZome of Laccaria bicolor is regulated during symbiosis establishment. CAZymes active on the fungal cell wall were upregulated during ECM development, in particular after 4 weeks of contact, when the hyphae surround the root cells and start to colonize the apoplast. We demonstrated that one expansin-like protein, whose expression is specific to symbiotic tissues, localizes within the fungal cell wall. Although the L. bicolor genome contained a constricted repertoire of CAZymes active on cellulose and hemicellulose, these CAZymes were expressed during the first steps of root cell colonization. L. bicolor retained the ability to use homogalacturonan, a pectin-derived substrate, as a carbon source. CAZymes likely involved in pectin hydrolysis were mainly expressed at the stage of a fully mature ECM. Altogether, our data suggest an active remodelling of the fungal cell wall with a possible involvement of expansin during ECM development. By contrast, a soft remodelling of the plant cell wall likely occurs through the loosening of the cellulose microfibrils by AA9 or GH12 CAZymes, and smooth remodelling of the middle lamella through pectin (homogalacturonan) hydrolysis, likely by GH28 and GH12 CAZymes. Copyright © 2014 Elsevier Inc. All rights reserved.
The mitochondrial genome of the stingless bee Melipona bicolor (Hymenoptera, Apidae, Meliponini): sequence, gene organization and a unique tRNA translocation event conserved across the tribe Meliponini

Directory of Open Access Journals (Sweden)

Daniela Silvestre

2008-01-01

At present a complete mtDNA sequence has been reported for only two hymenopterans, the Old World honey bee, Apis mellifera, and the sawfly Perga condei. Among the bees, the tribe Meliponini (stingless bees) is distinguished by its Pantropical distribution, great number of species and large importance as main pollinators in several ecosystems, including the Brazilian rain forest. However, few molecular studies have been conducted on this group of bees and few sequence data from mitochondrial genomes have been described. In this project, we PCR amplified and sequenced 78% of the mitochondrial genome of the stingless bee Melipona bicolor (Apidae, Meliponini). The sequenced region contains all of the 13 mitochondrial protein-coding genes, 18 of 22 tRNA genes, and both rRNA genes (one of them was partially sequenced). We also report the genome organization (gene content and order), gene translation, genetic code, and other molecular features, such as base frequencies, codon usage, gene initiation and termination. We compare these characteristics of M. bicolor to those of the mitochondrial genome of A. mellifera and other insects. A highly biased A+T content is a typical characteristic of the A. mellifera mitochondrial genome and it was even more extreme in that of M. bicolor. Length and compositional differences between M. bicolor and A. mellifera genes were detected and the gene order was compared. Eleven tRNA gene translocations were observed between these two species. This latter finding was surprising, considering the taxonomic proximity of these two bee tribes. The tRNA Lys gene translocation was investigated within Meliponini and showed high conservation across the Pantropical range of the tribe.
Partial structure of the phylloxin gene from the giant monkey frog, Phyllomedusa bicolor: parallel cloning of precursor cDNA and genomic DNA from lyophilized skin secretion.

Science.gov (United States)

Chen, Tianbao; Gagliardo, Ron; Walker, Brian; Zhou, Mei; Shaw, Chris

2005-12-01

Phylloxin is a novel prototype antimicrobial peptide from the skin of Phyllomedusa bicolor. Here, we describe parallel identification and sequencing of phylloxin precursor transcript (mRNA) and partial gene structure (genomic DNA) from the same sample of lyophilized skin secretion using our recently-described cloning technique. The open-reading frame of the phylloxin precursor was identical in nucleotide sequence to that previously reported and alignment with the nucleotide sequence derived from genomic DNA indicated the presence of a 175 bp intron located in a near identical position to that found in the dermaseptins. The highly-conserved structural organization of skin secretion peptide genes in P. bicolor can thus be extended to include that encoding phylloxin (plx). These data further reinforce our assertion that application of the described methodology can provide robust genomic/transcriptomic/peptidomic data without the need for specimen sacrifice.
The YH database: the first Asian diploid genome database

DEFF Research Database (Denmark)

Li, Guoqing; Ma, Lijia; Song, Chao

2009-01-01

genome consensus. The YH database is currently one of the three personal genome databases, organizing the original data and analysis results in a user-friendly interface, which is an endeavor to achieve fundamental goals for establishing personal medicine. The database is available at http://yh.genomics.org.cn....
Agrobacterium-mediated insertional mutagenesis in the mycorrhizal fungus Laccaria bicolor.

Science.gov (United States)

Stephan, B I; Alvarez Crespo, M C; Kemppainen, M J; Pardo, A G

2017-05-01

Agrobacterium-mediated gene transfer (AMT) is extensively employed as a tool in fungal functional genomics and accordingly, in previous studies we used AMT on a dikaryotic strain of the ectomycorrhizal basidiomycete Laccaria bicolor. The interest in this fungus derives from its capacity to establish a symbiosis with tree roots, thereby playing a major role in nutrient cycling of forest ecosystems. The ectomycorrhizal symbiosis is a highly complex interaction involving many genes from both partners. To advance the functional characterization of fungal genes, AMT was used on a monokaryotic L. bicolor. A collection of over 1200 transgenic strains was produced, of which 200 randomly selected strains were analyzed for their genomic T-DNA insertion patterns. By means of insertional mutagenesis, a number of transgenic strains were obtained displaying differential growth features. Moreover, mating with a compatible strain resulted in dikaryons that retained altered phenotypic features of the transgenic monokaryon. The analysis of the T-DNA integration pattern revealed mostly similar results to those reported in earlier studies, confirming the usefulness of AMT on different genetic backgrounds of L. bicolor. Taken together, our studies demonstrate the great versatility and potential of AMT as a tool for the genetic characterization of L. bicolor.
Mycobacteriophage genome database.

Science.gov (United States)

Joseph, Jerrine; Rajendran, Vasanthi; Hassan, Sameer; Kumar, Vanaja

2011-01-01

The Mycobacteriophage genome database (MGDB) is an exclusive repository of the 64 completely sequenced mycobacteriophages with annotated information. It is a comprehensive compilation of the various gene parameters captured from several databases pooled together to empower mycobacteriophage researchers. The MGDB (Version No. 1.0) comprises 6086 genes from 64 mycobacteriophages classified into 72 families based on the ACLAME database. Manual curation was aided by information available from public databases, which was enriched further by analysis. Its web interface allows browsing as well as querying the classification. The main objective is to collect and organize the complexity inherent to mycobacteriophage protein classification in a rational way. The other objective is to browse the existing and new genomes and describe their functional annotation. The database is available for free at http://mpgdb.ibioinformatics.org/mpgdb.php.
i-Genome: A database to summarize oligonucleotide data in genomes

Directory of Open Access Journals (Sweden)

Chang Yu-Chung

2004-10-01

Background: Information on the occurrence of sequence features in genomes is crucial to comparative genomics, evolutionary analysis, the analysis of regulatory sequences and the quantitative evaluation of sequences. Computing the frequencies and the occurrences of a pattern in complete genomes is time-consuming. Results: The proposed database provides information about sequence features generated by exhaustively computing the sequences of the complete genome. The repetitive elements in the eukaryotic genomes, such as LINEs, SINEs, Alu and LTR, are obtained from Repbase. The database supports various complete genomes including human, yeast, worm, and 128 microbial genomes. Conclusions: This investigation presents and implements an efficient computational approach to accumulate the occurrences of oligonucleotides or patterns in complete genomes. A database is established to maintain the information on the sequence features, including the distributions of oligonucleotides, the gene distribution, the distribution of repetitive elements in genomes and the occurrences of the oligonucleotides. The database can provide a more effective and efficient way to access the repetitive features in genomes.
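The exhaustive oligonucleotide counting that this abstract describes can be sketched as a sliding-window tally. This is an illustrative Python sketch under stated assumptions, not the i-Genome implementation: the function names are hypothetical, and the choice to skip windows containing ambiguous bases is an assumption.

```python
from collections import Counter


def count_oligos(sequence, k):
    """Count every overlapping oligonucleotide of length k in a sequence."""
    sequence = sequence.upper()
    counts = Counter()
    for i in range(len(sequence) - k + 1):
        oligo = sequence[i:i + k]
        # Assumption: windows containing ambiguous bases (e.g. N) are skipped.
        if set(oligo) <= set("ACGT"):
            counts[oligo] += 1
    return counts


def occurrences(sequence, pattern):
    """Return the start positions of all (overlapping) pattern occurrences."""
    sequence, pattern = sequence.upper(), pattern.upper()
    positions = []
    start = sequence.find(pattern)
    while start != -1:
        positions.append(start)
        start = sequence.find(pattern, start + 1)
    return positions
```

Precomputing such counts once per genome and storing them is what would let a database answer frequency queries without rescanning the sequence, which is the time-consuming step the abstract mentions.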
Rat Genome Database (RGD)

Data.gov (United States)

U.S. Department of Health & Human Services — The Rat Genome Database (RGD) is a collaborative effort between leading research institutions involved in rat genetic and genomic research to collect, consolidate,...
MIPS: a database for genomes and protein sequences.

Science.gov (United States)

Mewes, H W; Frishman, D; Güldener, U; Mannhaupt, G; Mayer, K; Mokrejs, M; Morgenstern, B; Münsterkötter, M; Rudd, S; Weil, B

2002-01-01

The Munich Information Center for Protein Sequences (MIPS-GSF, Neuherberg, Germany) continues to provide genome-related information in a systematic way. MIPS supports both national and European sequencing and functional analysis projects, develops and maintains automatically generated and manually annotated genome-specific databases, develops systematic classification schemes for the functional annotation of protein sequences, and provides tools for the comprehensive analysis of protein sequences. This report updates the information on the yeast genome (CYGD), the Neurospora crassa genome (MNCDB), the databases for the comprehensive set of genomes (PEDANT genomes), the database of annotated human EST clusters (HIB), the database of complete cDNAs from the DHGP (German Human Genome Project), as well as the project-specific databases for the GABI (Genome Analysis in Plants) and HNB (Helmholtz-Netzwerk Bioinformatik) networks. The Arabidopsis thaliana database (MATDB), the database of mitochondrial proteins (MITOP) and our contribution to the PIR International Protein Sequence Database have been described elsewhere [Schoof et al. (2002) Nucleic Acids Res., 30, 91-93; Scharfe et al. (2000) Nucleic Acids Res., 28, 155-158; Barker et al. (2001) Nucleic Acids Res., 29, 29-32]. All databases described, the protein analysis tools provided and the detailed descriptions of our projects can be accessed through the MIPS World Wide Web server (http://mips.gsf.de).
GenColors-based comparative genome databases for small eukaryotic genomes.

Science.gov (United States)

Felder, Marius; Romualdi, Alessandro; Petzold, Andreas; Platzer, Matthias; Sühnel, Jürgen; Glöckner, Gernot

2013-01-01

Many sequence data repositories can give a quick and easily accessible overview on genomes and their annotations. Less widespread is the possibility to compare related genomes with each other in a common database environment. We have previously described the GenColors database system (http://gencolors.fli-leibniz.de) and its applications to a number of bacterial genomes such as Borrelia, Legionella, Leptospira and Treponema. This system has an emphasis on genome comparison. It combines data from related genomes and provides the user with an extensive set of visualization and analysis tools. Eukaryote genomes are normally larger than prokaryote genomes and thus pose additional challenges for such a system. We have, therefore, adapted GenColors to also handle larger datasets of small eukaryotic genomes and to display eukaryotic gene structures. Further recent developments include whole genome views, genome list options and, for bacterial genome browsers, the display of horizontal gene transfer predictions. Two new GenColors-based databases for two fungal species (http://fgb.fli-leibniz.de) and for four social amoebas (http://sacgb.fli-leibniz.de) were set up. Both new resources open up a single entry point for related genomes for the amoebozoa and fungal research communities and other interested users. Comparative genomics approaches are greatly facilitated by these resources.
BGD: a database of bat genomes.

Science.gov (United States)

Fang, Jianfei; Wang, Xuan; Mu, Shuo; Zhang, Shuyi; Dong, Dong

2015-01-01

Bats account for ~20% of mammalian species, and are the only mammals with true powered flight. Because of their specialized phenotypic traits, much research has been devoted to examining the evolution of bats. Some whole genome sequences of bats have now been assembled and annotated; however, a uniform resource for the annotated bat genomes is still unavailable. To make the extensive data associated with the bat genomes accessible to the general biological community, we established a Bat Genome Database (BGD). BGD is an open-access, web-available portal that integrates available data on bat genomes and genes. It hosts data from six bat species, including two megabats and four microbats. Users can query the gene annotations using an efficient search engine, and it offers browsable tracks of bat genomes. Furthermore, an easy-to-use phylogenetic analysis tool is provided to facilitate online phylogenetic study of genes. To the best of our knowledge, BGD is the first database of bat genomes. It will extend our understanding of bat evolution and benefit the analysis of bat sequences. BGD is freely available at: http://donglab.ecnu.edu.cn/databases/BatGenome/.
GOBASE: an organelle genome database

OpenAIRE

O'Brien, Emmet A.; Zhang, Yue; Wang, Eric; Marie, Veronique; Badejoko, Wole; Lang, B. Franz; Burger, Gertraud

2008-01-01

The organelle genome database GOBASE, now in its 21st release (June 2008), contains all published mitochondrion-encoded sequences (~913 000) and chloroplast-encoded sequences (~250 000) from a wide range of eukaryotic taxa. For all sequences, information on related genes, exons, introns, gene products and taxonomy is available, as well as selected genome maps and RNA secondary structures. Recent major enhancements to database functionality include: (i) addition of an interface for RNA editing...

Robot bicolor system

Science.gov (United States)

Yamaba, Kazuo

1999-03-01

In robot vision, the most important problem is that the speed of acquiring and analyzing images is lower than the execution speed of the robot. An actual robot color vision system should process images in real time. We hypothesized that this problem might be solved by using the bicolor analysis technique. We have been testing a system which we hope will give us insight into the properties of bicolor vision. The experiment used the red channel of a color CCD camera and an image from a monochromatic camera to duplicate McCann's theory. To mix the two signals together, the mono image is copied into each of the red, green and blue memory banks of the image processing board, and the red image is then added to the red bank. Conversely, pure color images (red, green and blue components) are obtained from the original bicolor images in the novel color system after the scaling factor is added to each RGB image. Our search for a bicolor robot vision system was entirely successful.
Private and Efficient Query Processing on Outsourced Genomic Databases.

Science.gov (United States)

Ghasemi, Reza; Al Aziz, Md Momin; Mohammed, Noman; Dehkordi, Massoud Hadian; Jiang, Xiaoqian

2017-09-01

Applications of genomic studies are spreading rapidly in many domains of science and technology such as healthcare, biomedical research, direct-to-consumer services, and legal and forensic work. However, there are a number of obstacles that make it hard to access and process a big genomic database for these applications. First, genome sequencing is a time-consuming and expensive process. Second, it requires large-scale computation and storage systems to process genomic sequences. Third, genomic databases are often owned by different organizations and are thus not available for public use. The cloud computing paradigm can be leveraged to facilitate the creation and sharing of big genomic databases for these applications. Genomic data owners can outsource their databases to a centralized cloud server to ease access to their databases. However, data owners are reluctant to adopt this model, as it requires outsourcing the data to an untrusted cloud service provider, which may cause data breaches. In this paper, we propose a privacy-preserving model for outsourcing genomic data to a cloud. The proposed model enables query processing while providing privacy protection of genomic databases. Privacy of the individuals is guaranteed by permuting and adding fake genomic records in the database. These techniques allow the cloud to evaluate count and top-k queries securely and efficiently. Experimental results demonstrate that a count and a top-k query over 40 Single Nucleotide Polymorphisms (SNPs) in a database of 20 000 records take around 100 and 150 s, respectively.
Ginseng Genome Database: an open-access platform for genomics of Panax ginseng.

Science.gov (United States)

Jayakodi, Murukarthick; Choi, Beom-Soon; Lee, Sang-Choon; Kim, Nam-Hoon; Park, Jee Young; Jang, Woojong; Lakshmanan, Meiyappan; Mohan, Shobhana V G; Lee, Dong-Yup; Yang, Tae-Jin

2018-04-12

The ginseng (Panax ginseng C.A. Meyer) is a perennial herbaceous plant that has been used in traditional oriental medicine for thousands of years. Ginsenosides, which have significant pharmacological effects on human health, are the foremost bioactive constituents in this plant. Having realized the importance of this plant to humans, an integrated omics resource becomes indispensable to facilitate genomic research, molecular breeding and pharmacological study of this herb. The first draft genome sequences of P. ginseng cultivar "Chunpoong" were reported recently. Here, using the draft genome, transcriptome, and functional annotation datasets of P. ginseng, we have constructed the Ginseng Genome Database (http://ginsengdb.snu.ac.kr/), the first open-access platform to provide comprehensive genomic resources of P. ginseng. The current version of this database provides the most up-to-date draft genome sequence (of approximately 3000 Mbp of scaffold sequences) along with the structural and functional annotations for 59,352 genes and digital expression of genes based on transcriptome data from different tissues, growth stages and treatments. In addition, tools for visualization and the genomic data from various analyses are provided. All data in the database were manually curated and integrated within a user-friendly query page. This database provides valuable resources for a range of research fields related to P. ginseng and other species belonging to the Apiales order as well as for plant research communities in general. The Ginseng Genome Database can be accessed at http://ginsengdb.snu.ac.kr/.
The UCSC Genome Browser Database: 2008 update

DEFF Research Database (Denmark)

Karolchik, D; Kuhn, R M; Baertsch, R

2007-01-01

The University of California, Santa Cruz, Genome Browser Database (GBD) provides integrated sequence and annotation data for a large collection of vertebrate and model organism genomes. Seventeen new assemblies have been added to the database in the past year, for a total coverage of 19 vertebrat...
Recent updates and developments to plant genome size databases

Science.gov (United States)

Garcia, Sònia; Leitch, Ilia J.; Anadon-Rosell, Alba; Canela, Miguel Á.; Gálvez, Francisco; Garnatje, Teresa; Gras, Airy; Hidalgo, Oriane; Johnston, Emmeline; Mas de Xaxars, Gemma; Pellicer, Jaume; Siljak-Yakovlev, Sonja; Vallès, Joan; Vitales, Daniel; Bennett, Michael D.

2014-01-01

Two plant genome size databases have been recently updated and/or extended: the Plant DNA C-values database (http://data.kew.org/cvalues), and GSAD, the Genome Size in Asteraceae database (http://www.asteraceaegenomesize.com). While the first provides information on nuclear DNA contents across land plants and some algal groups, the second is focused on one of the largest and most economically important angiosperm families, Asteraceae. Genome size data have numerous applications: they can be used in comparative studies on genome evolution, or as a tool to appraise the cost of whole-genome sequencing programs. The growing interest in genome size and increasing rate of data accumulation has necessitated the continued update of these databases. Currently, the Plant DNA C-values database (Release 6.0, Dec. 2012) contains data for 8510 species, while GSAD has 1219 species (Release 2.0, June 2013), representing increases of 17 and 51%, respectively, in the number of species with genome size data, compared with previous releases. Here we provide overviews of the most recent releases of each database, and outline new features of GSAD. The latter include (i) a tool to visually compare genome size data between species, (ii) the option to export data and (iii) a webpage containing information about flow cytometry protocols. PMID:24288377
De-anonymizing Genomic Databases Using Phenotypic Traits

Directory of Open Access Journals (Sweden)

Humbert Mathias

2015-06-01

Full Text Available People increasingly have their genomes sequenced and some of them share their genomic data online. They do so for various purposes, including to find relatives and to help advance genomic research. An individual’s genome carries very sensitive, private information such as its owner’s susceptibility to diseases, which could be used for discrimination. Therefore, genomic databases are often anonymized. However, an individual’s genotype is also linked to visible phenotypic traits, such as eye or hair color, which can be used to re-identify users in anonymized public genomic databases, thus raising severe privacy issues. For instance, an adversary can identify a target’s genome using her known phenotypic traits and subsequently infer her susceptibility to Alzheimer’s disease. In this paper, we quantify, based on various phenotypic traits, the extent of this threat in several scenarios by implementing de-anonymization attacks on a genomic database of OpenSNP users sequenced by 23andMe. Our experimental results show that the proportion of correct matches reaches 23% with a supervised approach in a database of 50 participants. Our approach outperforms the baseline by a factor of four, in terms of the proportion of correct matches, in most scenarios. We also evaluate the adversary’s ability to predict individuals’ predisposition to Alzheimer’s disease, and we observe that the inference error can be halved compared to the baseline. We also analyze the effect of the number of known phenotypic traits on the success rate of the attack. As progress is made in genomic research, especially for genotype-phenotype associations, the threat presented in this paper will become more serious.
Benchmarking database performance for genomic data.

Science.gov (United States)

Khushi, Matloob

2015-06-01

Genomic regions represent features such as gene annotations, transcription factor binding sites and epigenetic modifications. Performing various genomic operations such as identifying overlapping/non-overlapping regions or nearest gene annotations is a common research need. The data can be saved in a database system for easy management; however, there is no comprehensive database built-in algorithm at present to identify overlapping regions. Therefore I have developed a novel region-mapping (RegMap) SQL-based algorithm to perform genomic operations and have benchmarked the performance of different databases. Benchmarking identified that PostgreSQL extracts overlapping regions much faster than MySQL. Insertion and data uploads in PostgreSQL were also better, although general searching capability of both databases was almost equivalent. In addition, using the algorithm pair-wise, overlaps of >1000 datasets of transcription factor binding sites and histone marks, collected from previous publications, were reported and it was found that HNF4G significantly co-locates with cohesin subunit STAG1 (SA1). © 2015 Wiley Periodicals, Inc.
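To make the idea of an SQL-based overlap query concrete, here is a minimal sketch using Python's built-in `sqlite3` module. It is not the RegMap algorithm from the record above, and it uses neither PostgreSQL nor MySQL. The table names, column names and the half-open interval convention are assumptions chosen for illustration.

```python
import sqlite3


def find_overlaps(regions_a, regions_b):
    """Return overlapping pairs between two lists of (chrom, start, end) regions.

    Intervals are treated as half-open [start, end); this is an assumption
    of the sketch, not something stated in the abstract.
    """
    conn = sqlite3.connect(":memory:")
    try:
        for name in ("a", "b"):
            conn.execute(
                f"CREATE TABLE {name} (chrom TEXT, start_pos INTEGER, end_pos INTEGER)"
            )
        conn.executemany("INSERT INTO a VALUES (?, ?, ?)", regions_a)
        conn.executemany("INSERT INTO b VALUES (?, ?, ?)", regions_b)
        # Two intervals on the same chromosome overlap when each one
        # starts before the other one ends.
        rows = conn.execute(
            """
            SELECT a.chrom, a.start_pos, a.end_pos, b.start_pos, b.end_pos
            FROM a JOIN b
              ON a.chrom = b.chrom
             AND a.start_pos < b.end_pos
             AND b.start_pos < a.end_pos
            ORDER BY a.chrom, a.start_pos, b.start_pos
            """
        ).fetchall()
    finally:
        conn.close()
    return rows


if __name__ == "__main__":
    peaks = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 100, 200)]
    marks = [("chr1", 150, 250), ("chr1", 200, 300), ("chr2", 300, 400)]
    print(find_overlaps(peaks, marks))
```

A plain join like this compares every pair of regions on a chromosome. Indexing or binning is what makes such queries practical on large tables.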
CyanoBase: the cyanobacteria genome database update 2010

OpenAIRE

Nakao, Mitsuteru; Okamoto, Shinobu; Kohara, Mitsuyo; Fujishiro, Tsunakazu; Fujisawa, Takatomo; Sato, Shusei; Tabata, Satoshi; Kaneko, Takakazu; Nakamura, Yasukazu

2009-01-01

CyanoBase (http://genome.kazusa.or.jp/cyanobase) is the genome database for cyanobacteria, which are model organisms for photosynthesis. The database houses cyanobacteria species information, complete genome sequences, genome-scale experiment data, gene information, gene annotations and mutant information. In this version, we updated these datasets and improved the navigation and the visual display of the data views. In addition, a web service API now enables users to retrieve the data in var...
Genetic analysis of recombinant inbred lines for Sorghum bicolor × Sorghum propinquum.

Science.gov (United States)

Kong, Wenqian; Jin, Huizhe; Franks, Cleve D; Kim, Changsoo; Bandopadhyay, Rajib; Rana, Mukesh K; Auckland, Susan A; Goff, Valorie H; Rainville, Lisa K; Burow, Gloria B; Woodfin, Charles; Burke, John J; Paterson, Andrew H

2013-01-01

We describe a recombinant inbred line (RIL) population of 161 F5 genotypes for the widest euploid cross that can be made to cultivated sorghum (Sorghum bicolor) using conventional techniques, S. bicolor × Sorghum propinquum, that segregates for many traits related to plant architecture, growth and development, reproduction, and life history. The genetic map of the S. bicolor × S. propinquum RILs contains 141 loci on 10 linkage groups collectively spanning 773.1 cM. Although the genetic map has DNA marker density well-suited to quantitative trait loci mapping and samples most of the genome, our previous observation that sorghum pericentromeric heterochromatin is recalcitrant to recombination is highlighted by the finding that the vast majority of recombination in sorghum is concentrated in small regions of euchromatin that are distal to most chromosomes. The advancement of the RIL population in an environment to which the S. bicolor parent was well adapted (indeed bred for) but the S. propinquum parent was not largely eliminated an allele for short-day flowering that confounded many other traits, for example, permitting us to map new quantitative trait loci for flowering that previously eluded detection. Additional recombination that has accrued in the development of this RIL population also may have improved resolution of apices of heterozygote excess, accounting for their greater abundance in the F5 than the F2 generation. The S. bicolor × S. propinquum RIL population offers advantages over early-generation populations that will shed new light on genetic, environmental, and physiological/biochemical factors that regulate plant growth and development.
Database Description - TMBETA-GENOME | LSDB Archive [Life Science Database Archive metadata

Lifescience Database Archive (English)

Full Text Available ENOME is a database for transmembrane β-barrel proteins in complete genomes. For each genome, calculations with machine learning algorithms and statistical methods have been performed and th
CyanoBase: the cyanobacteria genome database update 2010.

Science.gov (United States)

Nakao, Mitsuteru; Okamoto, Shinobu; Kohara, Mitsuyo; Fujishiro, Tsunakazu; Fujisawa, Takatomo; Sato, Shusei; Tabata, Satoshi; Kaneko, Takakazu; Nakamura, Yasukazu

2010-01-01

CyanoBase (http://genome.kazusa.or.jp/cyanobase) is the genome database for cyanobacteria, which are model organisms for photosynthesis. The database houses cyanobacteria species information, complete genome sequences, genome-scale experiment data, gene information, gene annotations and mutant information. In this version, we updated these datasets and improved the navigation and the visual display of the data views. In addition, a web service API now enables users to retrieve the data in various formats with other tools, seamlessly.
MIPS: a database for protein sequences and complete genomes.

Science.gov (United States)

Mewes, H W; Hani, J; Pfeiffer, F; Frishman, D

1998-01-01

The MIPS group [Munich Information Center for Protein Sequences of the German National Center for Environment and Health (GSF)] at the Max-Planck-Institute for Biochemistry, Martinsried near Munich, Germany, is involved in a number of data collection activities, including a comprehensive database of the yeast genome, a database reflecting the progress in sequencing the Arabidopsis thaliana genome, the systematic analysis of other small genomes and the collection of protein sequence data within the framework of the PIR-International Protein Sequence Database (described elsewhere in this volume). Through its WWW server (http://www.mips.biochem.mpg.de) MIPS provides access to a variety of generic databases, including a database of protein families as well as automatically generated data by the systematic application of sequence analysis algorithms. The yeast genome sequence and its related information was also compiled on CD-ROM to provide dynamic interactive access to the 16 chromosomes of the first eukaryotic genome unraveled. PMID:9399795
The UCSC Genome Browser Database: update 2006

DEFF Research Database (Denmark)

Hinrichs, A S; Karolchik, D; Baertsch, R

2006-01-01

The University of California Santa Cruz Genome Browser Database (GBD) contains sequence and annotation data for the genomes of about a dozen vertebrate species and several major model organisms. Genome annotations typically include assembly data, sequence composition, genes and gene predictions, ...
Brassica ASTRA: an integrated database for Brassica genomic research.

Science.gov (United States)

Love, Christopher G; Robinson, Andrew J; Lim, Geraldine A C; Hopkins, Clare J; Batley, Jacqueline; Barker, Gary; Spangenberg, German C; Edwards, David

2005-01-01

Brassica ASTRA is a public database for genomic information on Brassica species. The database incorporates expressed sequences with Swiss-Prot and GenBank comparative sequence annotation as well as secondary Gene Ontology (GO) annotation derived from the comparison with Arabidopsis TAIR GO annotations. Simple sequence repeat molecular markers are identified within resident sequences and mapped onto the closely related Arabidopsis genome sequence. Bacterial artificial chromosome (BAC) end sequences derived from the Multinational Brassica Genome Project are also mapped onto the Arabidopsis genome sequence enabling users to identify candidate Brassica BACs corresponding to syntenic regions of Arabidopsis. This information is maintained in a MySQL database with a web interface providing the primary means of interrogation. The database is accessible at http://hornbill.cspp.latrobe.edu.au.
The UCSC genome browser database: update 2007

DEFF Research Database (Denmark)

Kuhn, R M; Karolchik, D; Zweig, A S

2006-01-01

The University of California, Santa Cruz Genome Browser Database contains, as of September 2006, sequence and annotation data for the genomes of 13 vertebrate and 19 invertebrate species. The Genome Browser displays a wide variety of annotations at all scales from the single nucleotide level up t...
INE: a rice genome database with an integrated map view.

Science.gov (United States)

Sakata, K; Antonio, B A; Mukai, Y; Nagasaki, H; Sakai, Y; Makino, K; Sasaki, T

2000-01-01

The Rice Genome Research Program (RGP) launched a large-scale rice genome sequencing in 1998 aimed at decoding all genetic information in rice. A new genome database called INE (INtegrated rice genome Explorer) has been developed in order to integrate all the genomic information that has been accumulated so far and to correlate these data with the genome sequence. A web interface based on Java applet provides a rapid viewing capability in the database. The first operational version of the database has been completed which includes a genetic map, a physical map using YAC (Yeast Artificial Chromosome) clones and PAC (P1-derived Artificial Chromosome) contigs. These maps are displayed graphically so that the positional relationships among the mapped markers on each chromosome can be easily resolved. INE incorporates the sequences and annotations of the PAC contig. A site on low quality information ensures that all submitted sequence data comply with the standard for accuracy. As a repository of rice genome sequence, INE will also serve as a common database of all sequence data obtained by collaborating members of the International Rice Genome Sequencing Project (IRGSP). The database can be accessed at http://www.dna.affrc.go.jp:82/giot/INE.html or its mirror site at http://www.staff.or.jp/giot/INE.html
The Ensembl genome database project.

Science.gov (United States)

Hubbard, T; Barker, D; Birney, E; Cameron, G; Chen, Y; Clark, L; Cox, T; Cuff, J; Curwen, V; Down, T; Durbin, R; Eyras, E; Gilbert, J; Hammond, M; Huminiecki, L; Kasprzyk, A; Lehvaslaiho, H; Lijnzaad, P; Melsopp, C; Mongin, E; Pettett, R; Pocock, M; Potter, S; Rust, A; Schmidt, E; Searle, S; Slater, G; Smith, J; Spooner, W; Stabenau, A; Stalker, J; Stupka, E; Ureta-Vidal, A; Vastrik, I; Clamp, M

2002-01-01

The Ensembl (http://www.ensembl.org/) database project provides a bioinformatics framework to organise biology around the sequences of large genomes. It is a comprehensive source of stable automatic annotation of the human genome sequence, with confirmed gene predictions that have been integrated with external data sources, and is available as either an interactive web site or as flat files. It is also an open source software engineering project to develop a portable system able to handle very large genomes and associated requirements from sequence analysis to data storage and visualisation. The Ensembl site is one of the leading sources of human genome sequence annotation and provided much of the analysis for publication by the international human genome project of the draft genome. The Ensembl system is being installed around the world in both companies and academic sites on machines ranging from supercomputers to laptops.
gEVE: a genome-based endogenous viral element database provides comprehensive viral protein-coding sequences in mammalian genomes.

Science.gov (United States)

Nakagawa, So; Takahashi, Mahoko Ueda

2016-01-01

In mammals, approximately 10% of genome sequences correspond to endogenous viral elements (EVEs), which are derived from ancient viral infections of germ cells. Although most EVEs have been inactivated, some open reading frames (ORFs) of EVEs obtained functions in the hosts. However, EVE ORFs usually remain unannotated in the genomes, and no databases are available for EVE ORFs. To investigate the function and evolution of EVEs in mammalian genomes, we developed EVE ORF databases for 20 genomes of 19 mammalian species. A total of 736,771 non-overlapping EVE ORFs were identified and archived in a database named gEVE (http://geve.med.u-tokai.ac.jp). The gEVE database provides nucleotide and amino acid sequences, genomic loci and functional annotations of EVE ORFs for all 20 genomes. In analyzing RNA-seq data with the gEVE database, we successfully identified the expressed EVE genes, suggesting that the gEVE database facilitates studies of the genomic analyses of various mammalian species. Database URL: http://geve.med.u-tokai.ac.jp. © The Author(s) 2016. Published by Oxford University Press.
The Ruby UCSC API: accessing the UCSC genome database using Ruby.

Science.gov (United States)

Mishima, Hiroyuki; Aerts, Jan; Katayama, Toshiaki; Bonnal, Raoul J P; Yoshiura, Koh-ichiro

2012-09-21

The University of California, Santa Cruz (UCSC) genome database is among the most used sources of genomic annotation in human and other organisms. The database offers an excellent web-based graphical user interface (the UCSC genome browser) and several means for programmatic queries. A simple application programming interface (API) in a scripting language aimed at the biologist was however not yet available. Here, we present the Ruby UCSC API, a library to access the UCSC genome database using Ruby. The API is designed as a BioRuby plug-in and built on the ActiveRecord 3 framework for the object-relational mapping, making writing SQL statements unnecessary. The current version of the API supports databases of all organisms in the UCSC genome database including human, mammals, vertebrates, deuterostomes, insects, nematodes, and yeast. The API uses the bin index, if available, when querying for genomic intervals. The API also supports genomic sequence queries using locally downloaded *.2bit files that are not stored in the official MySQL database. The API is implemented in pure Ruby and is therefore available in different environments and with different Ruby interpreters (including JRuby). Assisted by the straightforward object-oriented design of Ruby and ActiveRecord, the Ruby UCSC API will facilitate biologists to query the UCSC genome database programmatically. The API is available through the RubyGem system. Source code and documentation are available at https://github.com/misshie/bioruby-ucsc-api/ under the Ruby license. Feedback and help is provided via the website at http://rubyucscapi.userecho.com/.
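The bin index mentioned in this abstract refers to the binning scheme used in UCSC tables to speed up interval queries. A minimal sketch of the commonly described standard scheme follows (five levels, smallest bins 128 kb, covering positions up to 512 Mb). It is written in Python for illustration and is not the Ruby API's own code; the function name `bin_from_range` and the constant names are hypothetical.

```python
# Offsets of the first bin at each level, from the smallest bins (128 kb)
# up to the single top-level bin (512 Mb), in the standard scheme.
BIN_OFFSETS = (512 + 64 + 8 + 1, 64 + 8 + 1, 8 + 1, 1, 0)
FIRST_SHIFT = 17  # 2**17 = 128 kb, the size of the smallest bins
NEXT_SHIFT = 3    # each level up is 8 times larger


def bin_from_range(start, end):
    """Return the smallest bin that fully contains the half-open range [start, end)."""
    if start < 0 or end <= start:
        raise ValueError("expected 0 <= start < end")
    start_bin = start >> FIRST_SHIFT
    end_bin = (end - 1) >> FIRST_SHIFT
    for offset in BIN_OFFSETS:
        if start_bin == end_bin:
            return offset + start_bin
        start_bin >>= NEXT_SHIFT
        end_bin >>= NEXT_SHIFT
    raise ValueError("range extends beyond 512 Mb; the standard scheme does not cover it")


if __name__ == "__main__":
    print(bin_from_range(0, 1000))
    print(bin_from_range(0, 131073))
```

A query for an interval then only needs to look at rows whose bin could contain an overlapping feature, rather than scanning a whole chromosome.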
The Ruby UCSC API: accessing the UCSC genome database using Ruby

Science.gov (United States)

2012-01-01

Background The University of California, Santa Cruz (UCSC) genome database is among the most used sources of genomic annotation in human and other organisms. The database offers an excellent web-based graphical user interface (the UCSC genome browser) and several means for programmatic queries. A simple application programming interface (API) in a scripting language aimed at the biologist was however not yet available. Here, we present the Ruby UCSC API, a library to access the UCSC genome database using Ruby. Results The API is designed as a BioRuby plug-in and built on the ActiveRecord 3 framework for the object-relational mapping, making writing SQL statements unnecessary. The current version of the API supports databases of all organisms in the UCSC genome database including human, mammals, vertebrates, deuterostomes, insects, nematodes, and yeast. The API uses the bin index—if available—when querying for genomic intervals. The API also supports genomic sequence queries using locally downloaded *.2bit files that are not stored in the official MySQL database. The API is implemented in pure Ruby and is therefore available in different environments and with different Ruby interpreters (including JRuby). Conclusions Assisted by the straightforward object-oriented design of Ruby and ActiveRecord, the Ruby UCSC API will facilitate biologists to query the UCSC genome database programmatically. The API is available through the RubyGem system. Source code and documentation are available at https://github.com/misshie/bioruby-ucsc-api/ under the Ruby license. Feedback and help is provided via the website at http://rubyucscapi.userecho.com/. PMID:22994508

The Ruby UCSC API: accessing the UCSC genome database using Ruby

Directory of Open Access Journals (Sweden)

Mishima Hiroyuki

2012-09-01

Full Text Available Abstract Background The University of California, Santa Cruz (UCSC) genome database is among the most used sources of genomic annotation in human and other organisms. The database offers an excellent web-based graphical user interface (the UCSC genome browser) and several means for programmatic queries. A simple application programming interface (API) in a scripting language aimed at the biologist was however not yet available. Here, we present the Ruby UCSC API, a library to access the UCSC genome database using Ruby. Results The API is designed as a BioRuby plug-in and built on the ActiveRecord 3 framework for the object-relational mapping, making writing SQL statements unnecessary. The current version of the API supports databases of all organisms in the UCSC genome database including human, mammals, vertebrates, deuterostomes, insects, nematodes, and yeast. The API uses the bin index, if available, when querying for genomic intervals. The API also supports genomic sequence queries using locally downloaded *.2bit files that are not stored in the official MySQL database. The API is implemented in pure Ruby and is therefore available in different environments and with different Ruby interpreters (including JRuby). Conclusions Assisted by the straightforward object-oriented design of Ruby and ActiveRecord, the Ruby UCSC API will facilitate biologists to query the UCSC genome database programmatically. The API is available through the RubyGem system. Source code and documentation are available at https://github.com/misshie/bioruby-ucsc-api/ under the Ruby license. Feedback and help is provided via the website at http://rubyucscapi.userecho.com/.
KAIKObase: An integrated silkworm genome database and data mining tool

Directory of Open Access Journals (Sweden)

Nagaraju Javaregowda

2009-10-01

Full Text Available Abstract Background The silkworm, Bombyx mori, is one of the most economically important insects in many developing countries owing to its large-scale cultivation for silk production. With the development of genomic and biotechnological tools, B. mori has also become an important bioreactor for production of various recombinant proteins of biomedical interest. In 2004, two genome sequencing projects for B. mori were reported independently by Chinese and Japanese teams; however, the datasets were insufficient for building long genomic scaffolds which are essential for unambiguous annotation of the genome. Now, both the datasets have been merged and assembled through a joint collaboration between the two groups. Description Integration of the two data sets of silkworm whole-genome-shotgun sequencing by the Japanese and Chinese groups together with newly obtained fosmid- and BAC-end sequences produced the best continuity (~3.7 Mb in N50 scaffold size) among the sequenced insect genomes and provided a high degree of nucleotide coverage (88%) of all 28 chromosomes. In addition, a physical map of BAC contigs constructed by fingerprinting BAC clones and a SNP linkage map constructed using BAC-end sequences were available. In parallel, proteomic data from two-dimensional polyacrylamide gel electrophoresis in various tissues and developmental stages were compiled into a silkworm proteome database. Finally, a Bombyx trap database was constructed for documenting insertion positions and expression data of transposon insertion lines. Conclusion For efficient usage of genome information for functional studies, genomic sequences, physical and genetic map information and EST data were compiled into KAIKObase, an integrated silkworm genome database which consists of 4 map viewers, a gene viewer, and sequence, keyword and position search systems to display results and data at the level of nucleotide sequence, gene, scaffold and chromosome. Integration of the
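The N50 scaffold size cited in this record is a standard assembly continuity statistic: the length L such that scaffolds of length L or longer together contain at least half of the assembled bases. A minimal sketch of the usual calculation follows. The function name `n50` and the toy lengths are illustrative and are not data from the silkworm assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold (or contig) lengths."""
    if not lengths:
        raise ValueError("at least one length is required")
    total = sum(lengths)
    running = 0
    # Walk from the longest scaffold down until half of the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


if __name__ == "__main__":
    # Toy example: total is 20, and the two longest scaffolds (6 + 5) reach half.
    print(n50([2, 3, 4, 5, 6]))
```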
Specialized microbial databases for inductive exploration of microbial genome sequences

Directory of Open Access Journals (Sweden)

Cabau Cédric

2005-02-01

Full Text Available Abstract Background The enormous amount of genome sequence data asks for user-oriented databases to manage sequences and annotations. Queries must include search tools permitting function identification through exploration of related objects. Methods The GenoList package for collecting and mining microbial genome databases has been rewritten using MySQL as the database management system. Functions that were not available in MySQL, such as nested subquery, have been implemented. Results Inductive reasoning in the study of genomes starts from "islands of knowledge", centered around genes with some known background. With this concept of "neighborhood" in mind, a modified version of the GenoList structure has been used for organizing sequence data from prokaryotic genomes of particular interest in China. GenoChore (http://bioinfo.hku.hk/genochore.html), a set of 17 specialized end-user-oriented microbial databases (including one instance of Microsporidia, Encephalitozoon cuniculi, a member of Eukarya), has been made publicly available. These databases allow the user to browse genome sequence and annotation data using standard queries. In addition they provide a weekly update of searches against the world-wide protein sequences data libraries, allowing one to monitor annotation updates on genes of interest. Finally, they allow users to search for patterns in DNA or protein sequences, taking into account a clustering of genes into formal operons, as well as providing extra facilities to query sequences using predefined sequence patterns. Conclusion This growing set of specialized microbial databases organizes data created by the first Chinese bacterial genome programs (ThermaList, Thermoanaerobacter tengcongensis, LeptoList, with two different genomes of Leptospira interrogans and SepiList, Staphylococcus epidermidis) associated to related organisms for comparison.
Enhanced annotations and features for comparing thousands of Pseudomonas genomes in the Pseudomonas genome database.

Science.gov (United States)

Winsor, Geoffrey L; Griffiths, Emma J; Lo, Raymond; Dhillon, Bhavjinder K; Shay, Julie A; Brinkman, Fiona S L

2016-01-04

The Pseudomonas Genome Database (http://www.pseudomonas.com) is well known for the application of community-based annotation approaches for producing a high-quality Pseudomonas aeruginosa PAO1 genome annotation, and facilitating whole-genome comparative analyses with other Pseudomonas strains. To aid analysis of potentially thousands of complete and draft genome assemblies, this database and analysis platform was upgraded to integrate curated genome annotations and isolate metadata with enhanced tools for larger scale comparative analysis and visualization. Manually curated gene annotations are supplemented with improved computational analyses that help identify putative drug targets and vaccine candidates or assist with evolutionary studies by identifying orthologs, pathogen-associated genes and genomic islands. The database schema has been updated to integrate isolate metadata that will facilitate more powerful analysis of genomes across datasets in the future. We continue to place an emphasis on providing high-quality updates to gene annotations through regular review of the scientific literature and using community-based approaches including a major new Pseudomonas community initiative for the assignment of high-quality gene ontology terms to genes. As we further expand from thousands of genomes, we plan to provide enhancements that will aid data visualization and analysis arising from whole-genome comparative studies including more pan-genome and population-based approaches. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
MIPS PlantsDB: a database framework for comparative plant genome research.

Science.gov (United States)

Nussbaumer, Thomas; Martis, Mihaela M; Roessner, Stephan K; Pfeifer, Matthias; Bader, Kai C; Sharma, Sapna; Gundlach, Heidrun; Spannagl, Manuel

2013-01-01

The rapidly increasing amount of plant genome (sequence) data enables powerful comparative analyses and integrative approaches and also requires structured and comprehensive information resources. Databases are needed for both model and crop plant organisms and both intuitive search/browse views and comparative genomics tools should communicate the data to researchers and help them interpret it. MIPS PlantsDB (http://mips.helmholtz-muenchen.de/plant/genomes.jsp) was initially described in NAR in 2007 [Spannagl,M., Noubibou,O., Haase,D., Yang,L., Gundlach,H., Hindemitt, T., Klee,K., Haberer,G., Schoof,H. and Mayer,K.F. (2007) MIPSPlantsDB-plant database resource for integrative and comparative plant genome research. Nucleic Acids Res., 35, D834-D840] and was set up from the start to provide data and information resources for individual plant species as well as a framework for integrative and comparative plant genome research. PlantsDB comprises database instances for tomato, Medicago, Arabidopsis, Brachypodium, Sorghum, maize, rice, barley and wheat. Building up on that, state-of-the-art comparative genomics tools such as CrowsNest are integrated to visualize and investigate syntenic relationships between monocot genomes. Results from novel genome analysis strategies targeting the complex and repetitive genomes of triticeae species (wheat and barley) are provided and cross-linked with model species. The MIPS Repeat Element Database (mips-REdat) and Catalog (mips-REcat) as well as tight connections to other databases, e.g. via web services, are further important components of PlantsDB.
The catfish genome database cBARBEL: an informatic platform for genome biology of ictalurid catfish.

Science.gov (United States)

Lu, Jianguo; Peatman, Eric; Yang, Qing; Wang, Shaolin; Hu, Zhiliang; Reecy, James; Kucuktas, Huseyin; Liu, Zhanjiang

2011-01-01

The catfish genome database, cBARBEL (abbreviated from catfish Breeder And Researcher Bioinformatics Entry Location) is an online open-access database for genome biology of ictalurid catfish (Ictalurus spp.). It serves as a comprehensive, integrative platform for all aspects of catfish genetics, genomics and related data resources. cBARBEL provides BLAST-based, fuzzy and specific search functions, visualization of catfish linkage, physical and integrated maps, a catfish EST contig viewer with SNP information overlay, and GBrowse-based organization of catfish genomic data based on sequence similarity with zebrafish chromosomes. Subsections of the database are tightly related, allowing a user with a sequence or search string of interest to navigate seamlessly from one area to another. As catfish genome sequencing proceeds and ongoing quantitative trait loci (QTL) projects bear fruit, cBARBEL will allow rapid data integration and dissemination within the catfish research community and to interested stakeholders. cBARBEL can be accessed at http://catfishgenome.org.
The Saccharomyces Genome Database Variant Viewer.

Science.gov (United States)

Sheppard, Travis K; Hitz, Benjamin C; Engel, Stacia R; Song, Giltae; Balakrishnan, Rama; Binkley, Gail; Costanzo, Maria C; Dalusag, Kyla S; Demeter, Janos; Hellerstedt, Sage T; Karra, Kalpana; Nash, Robert S; Paskov, Kelley M; Skrzypek, Marek S; Weng, Shuai; Wong, Edith D; Cherry, J Michael

2016-01-04

The Saccharomyces Genome Database (SGD; http://www.yeastgenome.org) is the authoritative community resource for the Saccharomyces cerevisiae reference genome sequence and its annotation. In recent years, we have moved toward increased representation of sequence variation and allelic differences within S. cerevisiae. The publication of numerous additional genomes has motivated the creation of new tools for their annotation and analysis. Here we present the Variant Viewer: a dynamic open-source web application for the visualization of genomic and proteomic differences. Multiple sequence alignments have been constructed across high quality genome sequences from 11 different S. cerevisiae strains and stored in the SGD. The alignments and summaries are encoded in JSON and used to create a two-tiered dynamic view of the budding yeast pan-genome, available at http://www.yeastgenome.org/variant-viewer. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Human Ageing Genomic Resources: new and updated databases

Science.gov (United States)

Tacutu, Robi; Thornton, Daniel; Johnson, Emily; Budovsky, Arie; Barardo, Diogo; Craig, Thomas; Diana, Eugene; Lehmann, Gilad; Toren, Dmitri; Wang, Jingwei; Fraifeld, Vadim E

2018-01-01

In spite of a growing body of research and data, human ageing remains a poorly understood process. Over 10 years ago we developed the Human Ageing Genomic Resources (HAGR), a collection of databases and tools for studying the biology and genetics of ageing. Here, we present HAGR’s main functionalities, highlighting new additions and improvements. HAGR consists of six core databases: (i) the GenAge database of ageing-related genes, in turn composed of a dataset of >300 human ageing-related genes and a dataset with >2000 genes associated with ageing or longevity in model organisms; (ii) the AnAge database of animal ageing and longevity, featuring >4000 species; (iii) the GenDR database with >200 genes associated with the life-extending effects of dietary restriction; (iv) the LongevityMap database of human genetic association studies of longevity with >500 entries; (v) the DrugAge database with >400 ageing or longevity-associated drugs or compounds; (vi) the CellAge database with >200 genes associated with cell senescence. All our databases are manually curated by experts and regularly updated to ensure high-quality data. Cross-links across our databases and to external resources help researchers locate and integrate relevant information. HAGR is freely available online (http://genomics.senescence.info/). PMID:29121237
Kazusa Marker DataBase: a database for genomics, genetics, and molecular breeding in plants

Science.gov (United States)

Shirasawa, Kenta; Isobe, Sachiko; Tabata, Satoshi; Hirakawa, Hideki

2014-01-01

In order to provide useful genomic information for agronomical plants, we have established a database, the Kazusa Marker DataBase (http://marker.kazusa.or.jp). This database includes information on DNA markers, e.g., SSR and SNP markers, genetic linkage maps, and physical maps, that were developed at the Kazusa DNA Research Institute. Keyword searches for the markers, sequence data used for marker development, and experimental conditions are also available through this database. Currently, 10 plant species have been targeted: tomato (Solanum lycopersicum), pepper (Capsicum annuum), strawberry (Fragaria × ananassa), radish (Raphanus sativus), Lotus japonicus, soybean (Glycine max), peanut (Arachis hypogaea), red clover (Trifolium pratense), white clover (Trifolium repens), and eucalyptus (Eucalyptus camaldulensis). In addition, the number of plant species registered in this database will be increased as our research progresses. The Kazusa Marker DataBase will be a useful tool for both basic and applied sciences, such as genomics, genetics, and molecular breeding in crops. PMID:25320561
Saccharomyces genome database informs human biology

OpenAIRE

Skrzypek, Marek S; Nash, Robert S; Wong, Edith D; MacPherson, Kevin A; Hellerstedt, Sage T; Engel, Stacia R; Karra, Kalpana; Weng, Shuai; Sheppard, Travis K; Binkley, Gail; Simison, Matt; Miyasato, Stuart R; Cherry, J Michael

2017-01-01

The Saccharomyces Genome Database (SGD; http://www.yeastgenome.org) is an expertly curated database of literature-derived functional information for the model organism budding yeast, Saccharomyces cerevisiae. SGD constantly strives to synergize new types of experimental data and bioinformatics predictions with existing data, and to organize them into a comprehensive and up-to-date information resource. The primary mission of SGD is to facilitate research into the biology of yeast and...
Unlimited Thirst for Genome Sequencing, Data Interpretation, and Database Usage in Genomic Era: The Road towards Fast-Track Crop Plant Improvement

Directory of Open Access Journals (Sweden)

Arun Prabhu Dhanapal

2015-01-01

The number of sequenced crop genomes and associated genomic resources is growing rapidly with the advent of inexpensive next generation sequencing methods. Databases have become an integral part of all aspects of science research, including basic and applied plant and animal sciences. The importance of databases keeps increasing as the volume of datasets from direct and indirect genomics, as well as other omics approaches, has kept expanding in recent years. The databases and associated web portals provide at a minimum a uniform set of tools and automated analysis across a wide range of crop plant genomes. This paper reviews some basic terms and considerations in using crop plant databases in the advancing genomic era. The utilization of databases for variation analysis with other comparative genomics tools, and data interpretation platforms, are well described. The major focus of this review is to provide knowledge on platforms and databases for genome-based investigations of agriculturally important crop plants. The utilization of these databases in applied crop improvement programs is still being achieved widely; otherwise, the end for sequencing is not far away.
GDR (Genome Database for Rosaceae): integrated web resources for Rosaceae genomics and genetics research

Directory of Open Access Journals (Sweden)

Ficklin Stephen

2004-09-01

Background: Peach is being developed as a model organism for Rosaceae, an economically important family that includes fruits and ornamental plants such as apple, pear, strawberry, cherry, almond and rose. The genomics and genetics data of peach can play a significant role in the gene discovery and the genetic understanding of related species. The effective utilization of these peach resources, however, requires the development of an integrated and centralized database with associated analysis tools. Description: The Genome Database for Rosaceae (GDR) is a curated and integrated web-based relational database. GDR contains comprehensive data of the genetically anchored peach physical map, an annotated peach EST database, Rosaceae maps and markers and all publicly available Rosaceae sequences. Annotations of ESTs include contig assembly, putative function, simple sequence repeats, and anchored position to the peach physical map where applicable. Our integrated map viewer provides graphical interface to the genetic, transcriptome and physical mapping information. ESTs, BACs and markers can be queried by various categories and the search result sites are linked to the integrated map viewer or to the WebFPC physical map sites. In addition to browsing and querying the database, users can compare their sequences with the annotated GDR sequences via a dedicated sequence similarity server running either the BLAST or FASTA algorithm. To demonstrate the utility of the integrated and fully annotated database and analysis tools, we describe a case study where we anchored Rosaceae sequences to the peach physical and genetic map by sequence similarity. Conclusions: The GDR has been initiated to meet the major deficiency in Rosaceae genomics and genetics research, namely a centralized web database and bioinformatics tools for data storage, analysis and exchange. GDR can be accessed at http://www.genome.clemson.edu/gdr/.
GDR (Genome Database for Rosaceae): integrated web resources for Rosaceae genomics and genetics research.

Science.gov (United States)

Jung, Sook; Jesudurai, Christopher; Staton, Margaret; Du, Zhidian; Ficklin, Stephen; Cho, Ilhyung; Abbott, Albert; Tomkins, Jeffrey; Main, Dorrie

2004-09-09

Peach is being developed as a model organism for Rosaceae, an economically important family that includes fruits and ornamental plants such as apple, pear, strawberry, cherry, almond and rose. The genomics and genetics data of peach can play a significant role in the gene discovery and the genetic understanding of related species. The effective utilization of these peach resources, however, requires the development of an integrated and centralized database with associated analysis tools. The Genome Database for Rosaceae (GDR) is a curated and integrated web-based relational database. GDR contains comprehensive data of the genetically anchored peach physical map, an annotated peach EST database, Rosaceae maps and markers and all publicly available Rosaceae sequences. Annotations of ESTs include contig assembly, putative function, simple sequence repeats, and anchored position to the peach physical map where applicable. Our integrated map viewer provides graphical interface to the genetic, transcriptome and physical mapping information. ESTs, BACs and markers can be queried by various categories and the search result sites are linked to the integrated map viewer or to the WebFPC physical map sites. In addition to browsing and querying the database, users can compare their sequences with the annotated GDR sequences via a dedicated sequence similarity server running either the BLAST or FASTA algorithm. To demonstrate the utility of the integrated and fully annotated database and analysis tools, we describe a case study where we anchored Rosaceae sequences to the peach physical and genetic map by sequence similarity. The GDR has been initiated to meet the major deficiency in Rosaceae genomics and genetics research, namely a centralized web database and bioinformatics tools for data storage, analysis and exchange. GDR can be accessed at http://www.genome.clemson.edu/gdr/.
Viral Genome DataBase: storing and analyzing genes and proteins from complete viral genomes.

Science.gov (United States)

Hiscock, D; Upton, C

2000-05-01

The Viral Genome DataBase (VGDB) contains detailed information on the genes and predicted protein sequences from 15 completely sequenced genomes of large (>100 kb) viruses (2847 genes). The stored data include DNA sequence, protein sequence, GenBank and user-entered notes, molecular weight (MW), isoelectric point (pI), amino acid content, A + T%, nucleotide frequency, dinucleotide frequency and codon use. The VGDB is a MySQL database with a user-friendly Java GUI. Results of queries can be easily sorted by any of the individual parameters. The software and additional figures and information are available at http://athena.bioc.uvic.ca/genomes/index.html.
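Several of the per-gene parameters listed above are simple functions of the DNA sequence. The following Python sketch is an illustration only, not code from VGDB: the function name `sequence_stats` and the toy sequence are assumptions, and it covers just A + T%, nucleotide frequency and dinucleotide frequency.

```python
from collections import Counter


def sequence_stats(dna: str) -> dict:
    """Return A+T%, nucleotide counts and overlapping dinucleotide counts.

    Hypothetical helper for illustration; VGDB's own implementation is not
    described in the abstract.
    """
    seq = dna.upper()
    nucleotides = Counter(seq)
    # Overlapping dinucleotides: positions (0,1), (1,2), (2,3), ...
    dinucleotides = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    at_percent = (
        100.0 * (nucleotides["A"] + nucleotides["T"]) / len(seq) if seq else 0.0
    )
    return {
        "at_percent": at_percent,
        "nucleotides": nucleotides,
        "dinucleotides": dinucleotides,
    }


# Toy sequence, not taken from any viral genome.
stats = sequence_stats("ATGCGATATT")
print(stats["at_percent"], stats["dinucleotides"]["AT"])
```

Codon use could be tallied the same way by stepping through the coding sequence in non-overlapping windows of three bases.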
Antioxidant Capacity, Cytotoxicity, and Acute Oral Toxicity of Gynura bicolor

Directory of Open Access Journals (Sweden)

Wuen Yew Teoh

2013-01-01

Gynura bicolor (Compositae), which is widely used by the locals as a natural remedy in folk medicine, has limited scientific studies to ensure its efficacy and nontoxicity. The current study reports the total phenolic content, antioxidant capacity, cytotoxicity, and acute oral toxicity of crude methanol and its fractionated extracts (hexane, ethyl acetate, and water) of G. bicolor leaves. Five human colon cancer cell lines (HT-29, HCT-15, SW480, Caco-2, and HCT 116), one human breast adenocarcinoma cell line (MCF7), and one human normal colon cell line (CCD-18Co) were used to evaluate the cytotoxicity of G. bicolor. The present findings clearly demonstrated that the ethyl acetate extract of G. bicolor, with the highest total phenolic content among the extracts, showed the strongest antioxidant activity (DPPH radical scavenging assay and metal chelating assay), possessed cytotoxicity, and induced apoptotic and necrotic cell death, especially towards the HCT 116 and HCT-15 colon cancer cells. The acute oral toxicity study indicated that the methanol extract of G. bicolor has a negligible level of toxicity when administered orally and has been regarded as safe in experimental rats. The findings of the current study clearly established the chemoprevention potential of G. bicolor and thus provide scientific validation of the therapeutic claims of G. bicolor.
Nencki Genomics Database--Ensembl funcgen enhanced with intersections, user data and genome-wide TFBS motifs.

Science.gov (United States)

Krystkowiak, Izabella; Lenart, Jakub; Debski, Konrad; Kuterba, Piotr; Petas, Michal; Kaminska, Bozena; Dabrowski, Michal

2013-01-01

We present the Nencki Genomics Database, which extends the functionality of Ensembl Regulatory Build (funcgen) for the three species: human, mouse and rat. The key enhancements over Ensembl funcgen include the following: (i) a user can add private data, analyze them alongside the public data and manage access rights; (ii) inside the database, we provide efficient procedures for computing intersections between regulatory features and for mapping them to the genes. To Ensembl funcgen-derived data, which include data from ENCODE, we add information on conserved non-coding (putative regulatory) sequences, and on genome-wide occurrence of transcription factor binding site motifs from the current versions of two major motif libraries, namely, Jaspar and Transfac. The intersections and mapping to the genes are pre-computed for the public data, and the result of any procedure run on the data added by the users is stored back into the database, thus incrementally increasing the body of pre-computed data. As the Ensembl funcgen schema for the rat is currently not populated, our database is the first database of regulatory features for this frequently used laboratory animal. The database is accessible without registration using the mysql client: mysql -h database.nencki-genomics.org -u public. Registration is required only to add or access private data. A WSDL webservice provides access to the database from any SOAP client, including the Taverna Workbench with a graphical user interface.
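The core operation described above, intersecting two sets of regulatory features, reduces to an interval-overlap test on shared chromosomes. The Python sketch below illustrates that idea only; it is not the database's stored procedures. The `Feature` type, the `intersect` function and all feature names and coordinates are hypothetical, and the naive pairwise loop favours clarity over the efficiency the abstract mentions.

```python
from typing import List, NamedTuple, Tuple


class Feature(NamedTuple):
    """A genomic feature with 1-based, inclusive coordinates (assumed convention)."""
    name: str
    chrom: str
    start: int
    end: int


def intersect(a: List[Feature], b: List[Feature]) -> List[Tuple[str, str]]:
    """Return (name_a, name_b) for every pair that overlaps on the same chromosome."""
    pairs = []
    for x in a:
        for y in b:
            # Two closed intervals overlap when each starts before the other ends.
            if x.chrom == y.chrom and x.start <= y.end and y.start <= x.end:
                pairs.append((x.name, y.name))
    return pairs


# Made-up features for illustration.
motifs = [
    Feature("motif1", "chr1", 100, 120),
    Feature("motif2", "chr1", 500, 510),
    Feature("motif3", "chr2", 100, 120),
]
regions = [
    Feature("enhancerA", "chr1", 90, 110),
    Feature("enhancerB", "chr2", 300, 400),
]
print(intersect(motifs, regions))
```

Mapping features to genes can reuse the same test by passing gene intervals, optionally extended by a flanking window, as the second list.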
Xylella fastidiosa comparative genomic database is an information resource to explore the annotation, genomic features, and biology of different strains

Directory of Open Access Journals (Sweden)

Alessandro M. Varani

2012-01-01

The Xylella fastidiosa comparative genomic database is a scientific resource with the aim to provide a user-friendly interface for accessing high-quality manually curated genomic annotation and comparative sequence analysis, as well as for identifying and mapping prophage-like elements, a marked feature of Xylella genomes. Here we describe a database and tools for exploring the biology of this important plant pathogen. The hallmarks of this database are the high quality genomic annotation, the functional and comparative genomic analysis and the identification and mapping of prophage-like elements. It is available from the web site http://www.xylella.lncc.br.
MBGD update 2015: microbial genome database for flexible ortholog analysis utilizing a diverse set of genomic data.

Science.gov (United States)

Uchiyama, Ikuo; Mihara, Motohiro; Nishide, Hiroyo; Chiba, Hirokazu

2015-01-01

The microbial genome database for comparative analysis (MBGD) (available at http://mbgd.genome.ad.jp/) is a comprehensive ortholog database for flexible comparative analysis of microbial genomes, where the users are allowed to create an ortholog table among any specified set of organisms. Because of the rapid increase in microbial genome data owing to the next-generation sequencing technology, it becomes increasingly challenging to maintain high-quality orthology relationships while allowing the users to incorporate the latest genomic data available into an analysis. Because many of the recently accumulating genomic data are draft genome sequences for which some complete genome sequences of the same or closely related species are available, MBGD now stores draft genome data and allows the users to incorporate them into a user-specific ortholog database using the MyMBGD functionality. In this function, draft genome data are incorporated into an existing ortholog table created only from the complete genome data in an incremental manner to prevent low-quality draft data from affecting clustering results. In addition, to provide high-quality orthology relationships, the standard ortholog table containing all the representative genomes, which is first created by the rapid classification program DomClust, is now refined using DomRefine, a recently developed program for improving domain-level clustering using multiple sequence alignment information. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
GDR (Genome Database for Rosaceae): integrated web-database for Rosaceae genomics and genetics data.

Science.gov (United States)

Jung, Sook; Staton, Margaret; Lee, Taein; Blenda, Anna; Svancara, Randall; Abbott, Albert; Main, Dorrie

2008-01-01

The Genome Database for Rosaceae (GDR) is a central repository of curated and integrated genetics and genomics data of Rosaceae, an economically important family which includes apple, cherry, peach, pear, raspberry, rose and strawberry. GDR contains annotated databases of all publicly available Rosaceae ESTs, the genetically anchored peach physical map, Rosaceae genetic maps and comprehensively annotated markers and traits. The ESTs are assembled to produce unigene sets of each genus and the entire Rosaceae. Other annotations include putative function, microsatellites, open reading frames, single nucleotide polymorphisms, gene ontology terms and anchored map position where applicable. Most of the published Rosaceae genetic maps can be viewed and compared through CMap, the comparative map viewer. The peach physical map can be viewed using WebFPC/WebChrom, and also through our integrated GDR map viewer, which serves as a portal to the combined genetic, transcriptome and physical mapping information. ESTs, BACs, markers and traits can be queried by various categories and the search result sites are linked to the mapping visualization tools. GDR also provides online analysis tools such as a batch BLAST/FASTA server for the GDR datasets, a sequence assembly server and microsatellite and primer detection tools. GDR is available at http://www.rosaceae.org.
PairWise Neighbours database: overlaps and spacers among prokaryote genomes

Directory of Open Access Journals (Sweden)

Garcia-Vallvé Santiago

2009-06-01

Background: Although prokaryotes live in a variety of habitats and possess different metabolic and genomic complexity, they have several genomic architectural features in common. Overlapping genes are a common feature of prokaryote genomes. The overlaps tend to be short because longer overlaps carry a higher risk of deleterious mutations. The spacers between genes tend to be short too, because of the tendency to reduce non-coding DNA among prokaryotes. However, they must be long enough to maintain essential regulatory signals such as the Shine-Dalgarno (SD) sequence, which is responsible for efficient translation. Description: PairWise Neighbours is an interactive and intuitive database used for retrieving information about the spacers and overlapping genes among bacterial and archaeal genomes. It contains 1,956,294 gene pairs from 678 fully sequenced prokaryote genomes and is freely available at the URL http://genomes.urv.cat/pwneigh. This database provides information about the overlaps and their conservation across species. Furthermore, it allows wide analysis of the intergenic regions, providing useful information such as the location and strength of the SD sequence. Conclusion: There are experiments and bioinformatic analyses that rely on correct annotations of the initiation site. Therefore, a database that studies the overlaps and spacers among prokaryotes appears to be desirable. The PairWise Neighbours database permits reliability analysis of the overlapping structures and the study of SD presence and location among adjacent genes, which may help to check the annotation of the initiation sites.
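The distinction between overlaps and spacers can be made concrete with a small calculation on adjacent gene coordinates. This Python sketch is an assumption-laden illustration, not the PairWise Neighbours pipeline: the function `neighbour_distances`, the gene names and the coordinates are hypothetical, strand is ignored, and genes nested entirely inside another gene are not treated specially.

```python
from typing import List, Tuple

Gene = Tuple[str, int, int]  # (name, start, end), 1-based inclusive, same replicon


def neighbour_distances(genes: List[Gene]) -> List[Tuple[str, str, int]]:
    """Return (left_gene, right_gene, distance) for each adjacent pair.

    distance > 0: spacer length in bp
    distance < 0: overlap length in bp (negated)
    distance == 0: the genes abut with no spacer and no overlap
    """
    ordered = sorted(genes, key=lambda g: g[1])
    result = []
    for (left, _, left_end), (right, right_start, _) in zip(ordered, ordered[1:]):
        result.append((left, right, right_start - left_end - 1))
    return result


# Made-up coordinates for illustration.
genes = [("geneB", 297, 900), ("geneA", 1, 300), ("geneC", 950, 1500)]
for left, right, dist in neighbour_distances(genes):
    kind = "overlap" if dist < 0 else "spacer"
    print(left, right, kind, abs(dist))
```

Here geneA and geneB share four bases (297 to 300), while geneB and geneC are separated by a 49 bp spacer, the kind of region in which an SD sequence would be sought.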

Gramene database: Navigating plant comparative genomics resources

Directory of Open Access Journals (Sweden)

Parul Gupta

2016-11-01

Gramene (http://www.gramene.org) is an online, open source, curated resource for plant comparative genomics and pathway analysis designed to support researchers working in plant genomics, breeding, evolutionary biology, system biology, and metabolic engineering. It exploits phylogenetic relationships to enrich the annotation of genomic data and provides tools to perform powerful comparative analyses across a wide spectrum of plant species. It consists of an integrated portal for querying, visualizing and analyzing data for 44 plant reference genomes, genetic variation data sets for 12 species, expression data for 16 species, curated rice pathways and orthology-based pathway projections for 66 plant species including various crops. Here we briefly describe the functions and uses of the Gramene database.
GenomeRNAi: a database for cell-based RNAi phenotypes.

Science.gov (United States)

Horn, Thomas; Arziman, Zeynep; Berger, Juerg; Boutros, Michael

2007-01-01

RNA interference (RNAi) has emerged as a powerful tool to generate loss-of-function phenotypes in a variety of organisms. Combined with the sequence information of almost completely annotated genomes, RNAi technologies have opened new avenues to conduct systematic genetic screens for every annotated gene in the genome. As increasingly large datasets of RNAi-induced phenotypes become available, an important challenge remains the systematic integration and annotation of functional information. Genome-wide RNAi screens have been performed both in Caenorhabditis elegans and Drosophila for a variety of phenotypes and several RNAi libraries have become available to assess phenotypes for almost every gene in the genome. These screens were performed using different types of assays from visible phenotypes to focused transcriptional readouts and provide a rich data source for functional annotation across different species. The GenomeRNAi database provides access to published RNAi phenotypes obtained from cell-based screens and maps them to their genomic locus, including possible non-specific regions. The database also gives access to sequence information of RNAi probes used in various screens. It can be searched by phenotype, by gene, by RNAi probe or by sequence and is accessible at http://rnai.dkfz.de.
Sorghum bi-color

African Journals Online (AJOL)

sunny

2014-11-12

Nov 12, 2014 ... Biomass materials require reduction and densification for the purpose of handling and space requirements. Guinea corn (Sorghum bi-color) is a major source of biomass material in the tropic regions. The densification process involves some ... a closed-end die, the temperature and the use of binder.
Brassica database (BRAD) version 2.0: integrating and mining Brassicaceae species genomic resources.

Science.gov (United States)

Wang, Xiaobo; Wu, Jian; Liang, Jianli; Cheng, Feng; Wang, Xiaowu

2015-01-01

The Brassica database (BRAD) was built initially to assist users apply Brassica rapa and Arabidopsis thaliana genomic data efficiently to their research. However, many Brassicaceae genomes have been sequenced and released after its construction. These genomes are rich resources for comparative genomics, gene annotation and functional evolutionary studies of Brassica crops. Therefore, we have updated BRAD to version 2.0 (V2.0). In BRAD V2.0, 11 more Brassicaceae genomes have been integrated into the database, namely those of Arabidopsis lyrata, Aethionema arabicum, Brassica oleracea, Brassica napus, Camelina sativa, Capsella rubella, Leavenworthia alabamica, Sisymbrium irio and three extremophiles Schrenkiella parvula, Thellungiella halophila and Thellungiella salsuginea. BRAD V2.0 provides plots of syntenic genomic fragments between pairs of Brassicaceae species, from the level of chromosomes to genomic blocks. The Generic Synteny Browser (GBrowse_syn), a module of the Genome Browser (GBrowse), is used to show syntenic relationships between multiple genomes. Search functions for retrieving syntenic and non-syntenic orthologs, as well as their annotation and sequences are also provided. Furthermore, genome and annotation information have been imported into GBrowse so that all functional elements can be visualized in one frame. We plan to continually update BRAD by integrating more Brassicaceae genomes into the database. Database URL: http://brassicadb.org/brad/. © The Author(s) 2015. Published by Oxford University Press.
An Open Access Database of Genome-wide Association Results

Directory of Open Access Journals (Sweden)

Johnson Andrew D

2009-01-01

Background: The number of genome-wide association studies (GWAS) is growing rapidly, leading to the discovery and replication of many new disease loci. Combining results from multiple GWAS datasets may potentially strengthen previous conclusions and suggest new disease loci, pathways or pleiotropic genes. However, no database or centralized resource currently exists that contains anywhere near the full scope of GWAS results. Methods: We collected available results from 118 GWAS articles into a database of 56,411 significant SNP-phenotype associations and accompanying information, making this database freely available here. In doing so, we met and describe here a number of challenges to creating an open access database of GWAS results. Through preliminary analyses and characterization of available GWAS, we demonstrate the potential to gain new insights by querying a database across GWAS. Results: Using a genomic bin-based density analysis to search for highly associated regions of the genome, positive control loci (e.g., MHC loci) were detected with high sensitivity. Likewise, an analysis of highly repeated SNPs across GWAS identified replicated loci (e.g., APOE, LPL). At the same time we identified novel, highly suggestive loci for a variety of traits that did not meet genome-wide significant thresholds in prior analyses, in some cases with strong support from the primary medical genetics literature (SLC16A7, CSMD1, OAS1), suggesting these genes merit further study. Additional adjustment for linkage disequilibrium within most regions with a high density of GWAS associations did not materially alter our findings. Having a centralized database with standardized gene annotation also allowed us to examine the representation of functional gene categories (gene ontologies) containing one or more associations among top GWAS results. Genes relating to cell adhesion functions were highly over-represented among significant associations (p -14, a finding
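A genomic bin-based density analysis of the kind mentioned in this record can be sketched as counting associations per fixed-width window. The Python below is a minimal illustration under stated assumptions: the 1 Mb bin size, the function name `bin_density` and the coordinates are invented for the example and are not the parameters or data used by the authors.

```python
from collections import Counter
from typing import Iterable, Tuple


def bin_density(
    associations: Iterable[Tuple[str, int]], bin_size: int = 1_000_000
) -> Counter:
    """Count SNP-phenotype associations per (chromosome, bin index).

    bin_size is an assumed default; the abstract does not state the width used.
    """
    counts: Counter = Counter()
    for chrom, position in associations:
        counts[(chrom, position // bin_size)] += 1
    return counts


# Made-up positions for illustration only.
hits = [
    ("chr6", 31_200_000),
    ("chr6", 31_900_000),
    ("chr6", 32_500_000),
    ("chr19", 45_400_000),
]
density = bin_density(hits)
# The densest bins are candidate highly associated regions.
print(density.most_common(1))
```

In practice the counts would need normalising, for example by SNP coverage per bin or by linkage disequilibrium, before bins are compared.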
EuMicroSatdb: A database for microsatellites in the sequenced genomes of eukaryotes

Directory of Open Access Journals (Sweden)

Grover Atul

2007-07-01

Background: Microsatellites have immense utility as molecular markers in different fields like genome characterization and mapping, phylogeny and evolutionary biology. Existing microsatellite databases are of limited utility for experimental and computational biologists with regard to their content and information output. EuMicroSatdb (Eukaryotic MicroSatellite database; http://ipu.ac.in/usbt/EuMicroSatdb.htm) is a web based relational database for easy and efficient positional mining of microsatellites from sequenced eukaryotic genomes. Description: A user friendly web interface has been developed for microsatellite data retrieval using Active Server Pages (ASP). The backend database codes for data extraction and assembly have been written using Perl based scripts and C++. Precise need based microsatellites data retrieval is possible using different input parameters like microsatellite type (simple perfect or compound perfect), repeat unit length (mono- to hexa-nucleotide), repeat number, microsatellite length and chromosomal location in the genome. Furthermore, information about clustering of different microsatellites in the genome can also be retrieved. Finally, to facilitate primer designing for PCR amplification of any desired microsatellite locus, 200 bp upstream and downstream sequences are provided. Conclusion: The database allows easy systematic retrieval of comprehensive information about simple and compound microsatellites, microsatellite clusters and their locus coordinates in 31 sequenced eukaryotic genomes. The information content of the database is useful in different areas of research like gene tagging, genome mapping, population genetics, germplasm characterization and in understanding microsatellite dynamics in eukaryotic genomes.
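The retrieval parameters described in this record (repeat unit length, repeat number, locus coordinates and flanking sequence for primer design) can be illustrated with a regular-expression scan. The Python sketch below is not EuMicroSatdb's Perl/C++ backend: the function `find_microsatellites`, its default thresholds and the toy sequence are assumptions, and it detects only simple perfect repeats, not compound ones.

```python
import re
from typing import Dict, List


def find_microsatellites(
    seq: str, min_repeats: int = 4, flank: int = 200
) -> List[Dict[str, object]]:
    """Find simple perfect microsatellites with mono- to hexa-nucleotide units.

    Coordinates are 0-based, end-exclusive. The lazy {1,6}? quantifier makes the
    regex try the shortest repeat unit first at each position.
    """
    seq = seq.upper()
    pattern = re.compile(r"([ACGT]{1,6}?)\1{%d,}" % (min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq):
        unit = match.group(1)
        start, end = match.start(), match.end()
        hits.append(
            {
                "unit": unit,
                "repeats": (end - start) // len(unit),
                "start": start,
                "end": end,
                # Flanking sequence for primer design, clipped at sequence ends.
                "upstream": seq[max(0, start - flank):start],
                "downstream": seq[end:end + flank],
            }
        )
    return hits


# Toy sequence: a (CA)6 repeat with short flanks.
toy = "TTGC" + "CA" * 6 + "GGAT"
for hit in find_microsatellites(toy):
    print(hit["unit"], hit["repeats"], hit["start"], hit["end"])
```

The 200 bp default for `flank` mirrors the flank length mentioned in the abstract; the minimum repeat number is an arbitrary choice for the example.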
Rhizobial infection in Adesmia bicolor (Fabaceae) roots.

Science.gov (United States)

Bianco, Luciana

2014-09-01

The native legume Adesmia bicolor shows nitrogen fixation efficiency via symbiosis with soil rhizobia. The infection mechanism by means of which rhizobia infect their roots has not been fully elucidated to date. Therefore, the purpose of the present study was to identify the infection mechanism in Adesmia bicolor roots. To this end, inoculated roots were processed following conventional methods as part of our root anatomy study, and the shape and distribution of root nodules were analyzed as well. Neither root hairs nor infection threads were observed in the root system, whereas infection sites, which later formed nodules, were observed in the longitudinal sections. Nodules were found to form between the main root and the lateral roots. It can be concluded that in Adesmia bicolor, a bacterial crack entry infection mechanism prevails and that such mechanism could be an adaptive strategy of this species which is typical of arid environments.
BBGD: an online database for blueberry genomic data

Directory of Open Access Journals (Sweden)

Matthews Benjamin F

2007-01-01

Background: Blueberry is a member of the Ericaceae family, which also includes closely related cranberry and more distantly related rhododendron, azalea, and mountain laurel. Blueberry is a major berry crop in the United States, and one that has great nutritional and economic value. Extreme low temperatures, however, reduce crop yield and cause major losses to US farmers. A better understanding of the genes and biochemical pathways that are up- or down-regulated during cold acclimation is needed to produce blueberry cultivars with enhanced cold hardiness. To that end, the blueberry genomics database (BBGD) was developed. Along with the analysis tools and web-based query interfaces, the database serves both the broader Ericaceae research community and the blueberry research community specifically by making available ESTs and gene expression data in searchable formats and by helping to elucidate the underlying mechanisms of cold acclimation and freeze tolerance in blueberry. Description: BBGD is the world's first database for blueberry genomics. BBGD is both a sequence and gene expression database. It stores both EST and microarray data and allows scientists to correlate expression profiles with gene function. BBGD is a public online database. Presently, the main focus of the database is the identification of genes in blueberry that are significantly induced or suppressed after low temperature exposure. Conclusion: By using the database, researchers have developed EST-based markers for mapping and have identified a number of "candidate" cold tolerance genes that are highly expressed in blueberry flower buds after exposure to low temperatures.
A Ruby API to query the Ensembl database for genomic features.

Science.gov (United States)

Strozzi, Francesco; Aerts, Jan

2011-04-01

The Ensembl database makes genomic features available via its Genome Browser. It is also possible to access the underlying data through a Perl API for advanced querying. We have developed a full-featured Ruby API to the Ensembl databases, providing the same functionality as the Perl interface with additional features. A single Ruby API is used to access different releases of the Ensembl databases and is also able to query multi-species databases. Most functionality of the API is provided using the ActiveRecord pattern. The library depends on introspection to make it release independent. The API is available through the Rubygem system and can be installed with the command gem install ruby-ensembl-api.
PGSB/MIPS PlantsDB Database Framework for the Integration and Analysis of Plant Genome Data.

Science.gov (United States)

Spannagl, Manuel; Nussbaumer, Thomas; Bader, Kai; Gundlach, Heidrun; Mayer, Klaus F X

2017-01-01

Plant Genome and Systems Biology (PGSB), formerly Munich Institute for Protein Sequences (MIPS) PlantsDB, is a database framework for the integration and analysis of plant genome data, developed and maintained for more than a decade now. Major components of that framework are genome databases and analysis resources focusing on individual (reference) genomes providing flexible and intuitive access to data. Another main focus is the integration of genomes from both model and crop plants to form a scaffold for comparative genomics, assisted by specialized tools such as the CrowsNest viewer to explore conserved gene order (synteny). Data exchange and integrated search functionality with/over many plant genome databases is provided within the transPLANT project.
Uniform standards for genome databases in forest and fruit trees

Science.gov (United States)

TreeGenes and tfGDR serve the international forestry and fruit tree genomics research communities, respectively. These databases hold similar sequence data and provide resources for the submission and recovery of this information in order to enable comparative genomics research. Large-scale genotype...
MIPS: a database for protein sequences, homology data and yeast genome information.

Science.gov (United States)

Mewes, H W; Albermann, K; Heumann, K; Liebl, S; Pfeiffer, F

1997-01-01

The MIPS group (Martinsried Institute for Protein Sequences) at the Max-Planck-Institute for Biochemistry, Martinsried near Munich, Germany, collects, processes and distributes protein sequence data within the framework of the tripartite association of the PIR-International Protein Sequence Database. MIPS contributes nearly 50% of the data input to the PIR-International Protein Sequence Database. The database is distributed on CD-ROM together with PATCHX, an exhaustive supplement of unique, unverified protein sequences from external sources compiled by MIPS. Through its WWW server (http://www.mips.biochem.mpg.de/) MIPS permits internet access to sequence databases, homology data and to yeast genome information. (i) Sequence similarity results from the FASTA program are stored in the FASTA database for all proteins from PIR-International and PATCHX. The database is dynamically maintained and permits instant access to FASTA results. (ii) Starting with FASTA database queries, proteins have been classified into families and superfamilies (PROT-FAM). (iii) The HPT (hashed position tree) data structure developed at MIPS is a new approach for rapid sequence and pattern searching. (iv) MIPS provides access to the sequence and annotation of the complete yeast genome, the functional classification of yeast genes (FunCat) and its graphical display, the 'Genome Browser'. A CD-ROM based on the JAVA programming language providing dynamic interactive access to the yeast genome and the related protein sequences has been compiled and is available on request. PMID:9016498
Requirements and standards for organelle genome databases

Energy Technology Data Exchange (ETDEWEB)

Boore, Jeffrey L.

2006-01-09

Mitochondria and plastids (collectively called organelles) descended from prokaryotes that adopted an intracellular, endosymbiotic lifestyle within early eukaryotes. Comparisons of their remnant genomes address a wide variety of biological questions, especially when including the genomes of their prokaryotic relatives and the many genes transferred to the eukaryotic nucleus during the transitions from endosymbiont to organelle. The pace of producing complete organellar genome sequences now makes it unfeasible to do broad comparisons using the primary literature and, even if it were feasible, it is now becoming uncommon for journals to accept detailed descriptions of genome-level features. Unfortunately no database is currently useful for this task, since they have little standardization and are riddled with error. Here I outline what is currently wrong and what must be done to make this data useful to the scientific community.
License - TMBETA-GENOME | LSDB Archive [Life Science Database Archive metadata

Lifescience Database Archive (English)

Full Text Available List Contact us TMBETA-GENOME License License to Use This Database Last updated : 2015/03/09 You may use this database... the license terms regarding the use of this database and the requirements you must follow in using this database.... The license for this database is specified in the Creative Commons Attribu...tion-Share Alike 2.1 Japan . If you use data from this database, please be sure attribute this database as f....1 Japan . The summary of the Creative Commons Attribution-Share Alike 2.1 Japan is found here . With regard to this database
De novo transcriptome assembly of Sorghum bicolor variety Taejin

Directory of Open Access Journals (Sweden)

Yeonhwa Jo

2016-06-01

Sorghum (Sorghum bicolor), also known as great millet, is one of the most popular cultivated grass species in the world. Sorghum is frequently consumed as food for humans and animals and is also used for ethanol production. In this study, we conducted de novo transcriptome assembly for sorghum variety Taejin by next-generation sequencing, obtaining 8.748 GB of raw data. The raw data are available in the NCBI SRA database under accession number SRX1715644. Using the Trinity program, we identified 222,161 transcripts from sorghum variety Taejin. We further predicted coding regions within the assembled transcripts with the TransDecoder program, resulting in a total of 148,531 proteins. We carried out BLASTP against the Swiss-Prot protein sequence database to annotate the functions of the identified proteins. To our knowledge, this is the first transcriptome data set for a sorghum variety derived from Korea, and it can be usefully applied to the generation of genetic markers.
BRAD, the genetics and genomics database for Brassica plants

Directory of Open Access Journals (Sweden)

Li Pingxia

2011-10-01

Background: Brassica species include both vegetable and oilseed crops, which are important in everyday human life. The Brassica species also represent an excellent system for studying numerous aspects of plant biology, specifically the analysis of genome evolution following polyploidy, so they are also very important for scientific research. Now that the genome of Brassica rapa has been assembled, it is time to mine the genome data in depth. Description: BRAD, the Brassica database, is a web-based resource focusing on genome-scale genetic and genomic data for important Brassica crops. BRAD was built based on the first whole genome sequence and on further data analysis of the Brassica A genome species, Brassica rapa (Chiifu-401-42). It provides datasets such as the complete genome sequence of B. rapa, which was de novo assembled from Illumina GA II short reads and from BAC clone sequences, predicted genes and associated annotations, non-coding RNAs, transposable elements (TE), B. rapa genes orthologous to those in A. thaliana, as well as genetic markers and linkage maps. BRAD offers useful searching and data mining tools, including search across annotation datasets, search for syntenic or non-syntenic orthologs, and search of the flanking regions of a given target, as well as BLAST and Gbrowse tools. BRAD allows users to enter almost any kind of information, such as a B. rapa or A. thaliana gene ID, physical position or genetic marker. Conclusion: BRAD, a new database that focuses on the genetics and genomics of the Brassica plants, has been developed. It aims to help scientists and breeders use the genome data of Brassica plants fully and efficiently. BRAD will be continuously updated and can be accessed through http://brassicadb.org.
Establishment of Vespa bicolor in Taiwan (Hymenoptera: Vespidae)

Science.gov (United States)

Sung, I-Hsin; Lu, Sheng-Shan; Chao, Jung-Tai; Yeh, Wen-Chi; Lee, Wei-Jie

2014-01-01

The establishment of a hornet, Vespa bicolor F., in Taiwan was confirmed based on successful field collection of adults of both sexes and two subterranean colonies. Information on nesting habitat, nest measurement, and colony composition of this species is provided in this article. V. bicolor is the ninth hornet species ever recorded from Taiwan. A possible pathway for the introduction of this alien species is also discussed. PMID:25434034
Building a genome database using an object-oriented approach.

Science.gov (United States)

Barbasiewicz, Anna; Liu, Lin; Lang, B Franz; Burger, Gertraud

2002-01-01

GOBASE is a relational database that integrates data associated with mitochondria and chloroplasts. The most important data in GOBASE, i.e., molecular sequences and taxonomic information, are obtained from the public sequence data repository at the National Center for Biotechnology Information (NCBI), and are validated by our experts. Maintaining a curated genomic database comes with a towering labor cost, due to the sheer volume of available genomic sequences and the plethora of annotation errors and omissions in records retrieved from public repositories. Here we describe our approach to increase automation of the database population process, thereby reducing manual intervention. As a first step, we used Unified Modeling Language (UML) to construct a list of potential errors. Each case was evaluated independently, and an expert solution was devised and represented as a diagram. Subsequently, the UML diagrams were used as templates for writing object-oriented automation programs in the Java programming language.
Evaluating the Cassandra NoSQL Database Approach for Genomic Data Persistency

Directory of Open Access Journals (Sweden)

Rodrigo Aniceto

2015-01-01

Rapid advances in high-throughput sequencing techniques have created interesting computational challenges in bioinformatics. One of them refers to management of massive amounts of data generated by automatic sequencers. We need to deal with the persistency of genomic data, particularly storing and analyzing these large-scale processed data. To find an alternative to the frequently considered relational database model becomes a compelling task. Other data models may be more effective when dealing with a very large amount of nonconventional data, especially for writing and retrieving operations. In this paper, we discuss the Cassandra NoSQL database approach for storing genomic data. We perform an analysis of persistency and I/O operations with real data, using the Cassandra database system. We also compare the results obtained with a classical relational database system and another NoSQL database approach, MongoDB.
Evaluating the Cassandra NoSQL Database Approach for Genomic Data Persistency

Science.gov (United States)

Aniceto, Rodrigo; Xavier, Rene; Guimarães, Valeria; Hondo, Fernanda; Holanda, Maristela; Walter, Maria Emilia; Lifschitz, Sérgio

2015-01-01

Rapid advances in high-throughput sequencing techniques have created interesting computational challenges in bioinformatics. One of them refers to management of massive amounts of data generated by automatic sequencers. We need to deal with the persistency of genomic data, particularly storing and analyzing these large-scale processed data. To find an alternative to the frequently considered relational database model becomes a compelling task. Other data models may be more effective when dealing with a very large amount of nonconventional data, especially for writing and retrieving operations. In this paper, we discuss the Cassandra NoSQL database approach for storing genomic data. We perform an analysis of persistency and I/O operations with real data, using the Cassandra database system. We also compare the results obtained with a classical relational database system and another NoSQL database approach, MongoDB. PMID:26558254

Genetic Analysis of Recombinant Inbred Lines for Sorghum bicolor × Sorghum propinquum

OpenAIRE

Kong, Wenqian; Jin, Huizhe; Franks, Cleve D.; Kim, Changsoo; Bandopadhyay, Rajib; Rana, Mukesh K.; Auckland, Susan A.; Goff, Valorie H.; Rainville, Lisa K.; Burow, Gloria B.; Woodfin, Charles; Burke, John J.; Paterson, Andrew H.

2013-01-01

We describe a recombinant inbred line (RIL) population of 161 F5 genotypes for the widest euploid cross that can be made to cultivated sorghum (Sorghum bicolor) using conventional techniques, S. bicolor × Sorghum propinquum, that segregates for many traits related to plant architecture, growth and development, reproduction, and life history. The genetic map of the S. bicolor × S. propinquum RILs contains 141 loci on 10 linkage groups collectively spanning 773.1 cM. Although the genetic map ha...
The Princeton Protein Orthology Database (P-POD): a comparative genomics analysis tool for biologists.

OpenAIRE

Sven Heinicke; Michael S Livstone; Charles Lu; Rose Oughtred; Fan Kang; Samuel V Angiuoli; Owen White; David Botstein; Kara Dolinski

2007-01-01

Many biological databases that provide comparative genomics information and tools are now available on the internet. While certainly quite useful, to our knowledge none of the existing databases combine results from multiple comparative genomics methods with manually curated information from the literature. Here we describe the Princeton Protein Orthology Database (P-POD, http://ortholog.princeton.edu), a user-friendly database system that allows users to find and visualize the phylogenetic r...
Antioxidant potential of Impatiens bicolor Royle and Zizyphus oxyphylla Edgew

International Nuclear Information System (INIS)

Qayum, M.; Kaleem, W.A.; Ahmad, S.

2014-01-01

The present investigation has been carried out to evaluate the antioxidant capacity and phenolic composition of Impatiens bicolor Royle and Zizyphus oxyphylla Edgew. The content of phenolic compounds ranged from 15.77 to 27.61 mg catechin equivalents/g of extract for different parts of Zizyphus oxyphylla Edgew. and was 17.74 mg catechin equivalents/g for Impatiens bicolor Royle extract. The HPLC-ESI-MS/MS analysis of phenolic compounds showed that ferulic acid-hexosides was the only compound detected in I. bicolor, while Z. oxyphylla fruit, stem and leaves exhibited several compounds. Total antioxidant capacity values measured by TEAC assay were 46.32 ± 0.89, 42.56 ± 1.65, 41.34 ± 0.20, and 48.58 ± 0.21 µmol/g of extract, while those measured by FRAP assay were 102.40 ± 0.18, 207.54 ± 7.91, 254.89 ± 4.20, and 233.00 ± 9.07 µmol Fe2+/g, for I. bicolor and Z. oxyphylla fruit, leaves and stem, respectively. TRAP values were 43.26 ± 1.27, 112.23 ± 0.00, 102.83 ± 1.66, and 117.37 ± 3.70 µmol/g of extract for I. bicolor and Z. oxyphylla fruit, leaves and stem, respectively. The results indicate that these two plants may be a potential source of antioxidants. (author)
Supervised Learning for Detection of Duplicates in Genomic Sequence Databases.

Directory of Open Access Journals (Sweden)

Qingyu Chen

First identified as an issue in 1996, duplication in biological databases introduces redundancy and even leads to inconsistency when contradictory information appears. The amount of data makes purely manual de-duplication impractical, and existing automatic systems cannot detect duplicates as precisely as can experts. Supervised learning has the potential to address such problems by building automatic systems that learn from expert curation to detect duplicates precisely and efficiently. While machine learning is a mature approach in other duplicate detection contexts, it has seen only preliminary application in genomic sequence databases. We developed and evaluated a supervised duplicate detection method based on an expert curated dataset of duplicates, containing over one million pairs across five organisms derived from genomic sequence databases. We selected 22 features to represent distinct attributes of the database records, and developed a binary model and a multi-class model. Both models achieve promising performance; under cross-validation, the binary model had over 90% accuracy in each of the five organisms, while the multi-class model maintains high accuracy and is more robust in generalisation. We performed an ablation study to quantify the impact of different sequence record features, finding that features derived from meta-data, sequence identity, and alignment quality impact performance most strongly. The study demonstrates machine learning can be an effective additional tool for de-duplication of genomic sequence databases. All data are available as described in the supplementary material.
Supervised Learning for Detection of Duplicates in Genomic Sequence Databases.

Science.gov (United States)

Chen, Qingyu; Zobel, Justin; Zhang, Xiuzhen; Verspoor, Karin

2016-01-01

First identified as an issue in 1996, duplication in biological databases introduces redundancy and even leads to inconsistency when contradictory information appears. The amount of data makes purely manual de-duplication impractical, and existing automatic systems cannot detect duplicates as precisely as can experts. Supervised learning has the potential to address such problems by building automatic systems that learn from expert curation to detect duplicates precisely and efficiently. While machine learning is a mature approach in other duplicate detection contexts, it has seen only preliminary application in genomic sequence databases. We developed and evaluated a supervised duplicate detection method based on an expert curated dataset of duplicates, containing over one million pairs across five organisms derived from genomic sequence databases. We selected 22 features to represent distinct attributes of the database records, and developed a binary model and a multi-class model. Both models achieve promising performance; under cross-validation, the binary model had over 90% accuracy in each of the five organisms, while the multi-class model maintains high accuracy and is more robust in generalisation. We performed an ablation study to quantify the impact of different sequence record features, finding that features derived from meta-data, sequence identity, and alignment quality impact performance most strongly. The study demonstrates machine learning can be an effective additional tool for de-duplication of genomic sequence databases. All Data are available as described in the supplementary material.
Observations on nematodes from the Indonesian shortfin eel Anguilla bicolor bicolor McClelland in India, including a revalidation of Heliconema ahiri Karve, 1941 (Physalopteridae)

Czech Academy of Sciences Publication Activity Database

Moravec, František; Sheeba, S.; Kumar, A. B.

2013-01-01

Vol. 58, No. 4 (2013), pp. 496-503; ISSN 1230-2821; R&D Projects: GA ČR GBP505/12/G112; Institutional support: RVO:60077344; Keywords: Parasitic nematode * Heliconema * Procamallanus * Spirocamallanus * freshwater eel * Anguilla bicolor bicolor * Kerala * India; Subject RIV: EA - Cell Biology; Impact factor: 0.965, year: 2013
[A study of Boletus bicolor from different areas using Fourier transform infrared spectrometry].

Science.gov (United States)

Zhou, Zai-Jin; Liu, Gang; Ren, Xian-Pei

2010-04-01

It is hard to differentiate wild-growing mushrooms of the same species from different areas by macromorphological features. In this paper, Fourier transform infrared (FTIR) spectroscopy combined with principal component analysis was used to identify 58 samples of Boletus bicolor from five different areas. Based on the fingerprint infrared spectra of the Boletus bicolor samples, principal component analysis was conducted on the 58 spectra in the range of 1350-750 cm⁻¹ using the statistical software SPSS 13.0. According to the results, the first three principal components have a cumulative contribution of 88.87% and so capture most of the information in the samples. The two-dimensional projection plot of the first and second principal components shows satisfactory clustering for the classification and discrimination of Boletus bicolor. All Boletus bicolor samples were divided into five groups with a classification accuracy of 98.3%. The study demonstrated that wild-growing Boletus bicolor from different areas can be distinguished within the species by FTIR spectra combined with principal component analysis.
The Perennial Ryegrass GenomeZipper – Targeted Use of Genome Resources for Comparative Grass Genomics

DEFF Research Database (Denmark)

Pfeiffer, Matthias; Martis, Mihaela; Asp, Torben

2013-01-01

Whole-genome sequences established for model and major crop species constitute a key resource for advanced genomic research. For outbreeding forage and turf grass species like ryegrasses (Lolium spp.), such resources have yet to be developed. Here, we present a model of the perennial ryegrass (Lolium perenne) genome on the basis of conserved synteny to barley (Hordeum vulgare) and the model grass genome Brachypodium (Brachypodium distachyon) as well as rice (Oryza sativa) and sorghum (Sorghum bicolor). A transcriptome-based genetic linkage map of perennial ryegrass served as a scaffold to establish the chromosomal arrangement of syntenic genes from model grass species. This scaffold revealed a high degree of synteny and macrocollinearity and was then utilized to anchor a collection of perennial ryegrass genes in silico to their predicted genome positions. This resulted in the unambiguous...
Use of Genomic Databases for Inquiry-Based Learning about Influenza

Science.gov (United States)

Ledley, Fred; Ndung'u, Eric

2011-01-01

The genome projects of the past decades have created extensive databases of biological information with applications in both research and education. We describe an inquiry-based exercise that uses one such database, the National Center for Biotechnology Information Influenza Virus Resource, to advance learning about influenza. This database…
Using relational databases for improved sequence similarity searching and large-scale genomic analyses.

Science.gov (United States)

Mackey, Aaron J; Pearson, William R

2004-10-01

Relational databases are designed to integrate diverse types of information and manage large sets of search results, greatly simplifying genome-scale analyses. Relational databases are essential for management and analysis of large-scale sequence analyses, and can also be used to improve the statistical significance of similarity searches by focusing on subsets of sequence libraries most likely to contain homologs. This unit describes using relational databases to improve the efficiency of sequence similarity searching and to demonstrate various large-scale genomic analyses of homology-related data. This unit describes the installation and use of a simple protein sequence database, seqdb_demo, which is used as a basis for the other protocols. These include basic use of the database to generate a novel sequence library subset, how to extend and use seqdb_demo for the storage of sequence similarity search results and making use of various kinds of stored search results to address aspects of comparative genomic analysis.
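As a rough illustration of the two ideas in this abstract (selecting a sequence library subset, and storing search results for later analysis), the following Python sketch uses the standard-library `sqlite3` module. The table and column names (`protein`, `hit`, `taxon`, `evalue`) and the helper functions are hypothetical; they are not the actual seqdb_demo schema, which the abstract does not describe.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical two-table schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE protein (
            protein_id INTEGER PRIMARY KEY,
            name       TEXT NOT NULL,
            taxon      TEXT NOT NULL,
            sequence   TEXT NOT NULL
        );
        CREATE TABLE hit (
            query_id  INTEGER NOT NULL REFERENCES protein(protein_id),
            target_id INTEGER NOT NULL REFERENCES protein(protein_id),
            evalue    REAL NOT NULL
        );
    """)
    return conn


def library_subset_fasta(conn, taxon):
    """Return the proteins of one taxon as FASTA text (a library subset)."""
    rows = conn.execute(
        "SELECT protein_id, name, sequence FROM protein "
        "WHERE taxon = ? ORDER BY protein_id", (taxon,))
    return "".join(f">{pid} {name}\n{seq}\n" for pid, name, seq in rows)


def best_hits(conn, max_evalue=1e-3):
    """Return (query_id, target_id, evalue) for each query's best stored hit.

    Only hits at or below max_evalue are reported; ties yield several rows.
    """
    return conn.execute(
        "SELECT h.query_id, h.target_id, h.evalue FROM hit AS h "
        "WHERE h.evalue = (SELECT MIN(evalue) FROM hit "
        "                  WHERE query_id = h.query_id) "
        "AND h.evalue <= ? ORDER BY h.query_id", (max_evalue,)).fetchall()
```

The FASTA text returned by `library_subset_fasta` could be written to a file and used as a smaller search library; searching fewer sequences is what improves the statistical significance of true homologs, as the abstract notes.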
Accessing the SEED genome databases via Web services API: tools for programmers.

Science.gov (United States)

Disz, Terry; Akhter, Sajia; Cuevas, Daniel; Olson, Robert; Overbeek, Ross; Vonstein, Veronika; Stevens, Rick; Edwards, Robert A

2010-06-14

The SEED integrates many publicly available genome sequences into a single resource. The database contains accurate and up-to-date annotations based on the subsystems concept that leverages clustering between genomes and other clues to accurately and efficiently annotate microbial genomes. The backend is used as the foundation for many genome annotation tools, such as the Rapid Annotation using Subsystems Technology (RAST) server for whole genome annotation, the metagenomics RAST server for random community genome annotations, and the annotation clearinghouse for exchanging annotations from different resources. In addition to a web user interface, the SEED also provides a Web services-based API for programmatic access to the data in the SEED, allowing the development of third-party tools and mash-ups. The currently exposed Web services encompass over forty different methods for accessing data related to microbial genome annotations. The Web services provide comprehensive access to the database back end, allowing any programmer access to the most consistent and accurate genome annotations available. The Web services are deployed using a platform-independent, service-oriented approach that allows the user to choose the most suitable programming platform for their application. Example code demonstrates that Web services can be used to access the SEED using common bioinformatics programming languages such as Perl, Python, and Java. We present a novel approach to access the SEED database. Using Web services, a robust API for access to genomics data is provided, without requiring large volume downloads all at once. The API ensures timely access to the most current datasets available, including the new genomes as soon as they come online.
Investigation of mutations in the HBB gene using the 1,000 genomes database.

Science.gov (United States)

Carlice-Dos-Reis, Tânia; Viana, Jaime; Moreira, Fabiano Cordeiro; Cardoso, Greice de Lemos; Guerreiro, João; Santos, Sidney; Ribeiro-Dos-Santos, Ândrea

2017-01-01

Mutations in the HBB gene are responsible for several serious hemoglobinopathies, such as sickle cell anemia and β-thalassemia. Sickle cell anemia is one of the most common monogenic diseases worldwide. Due to its prevalence, diverse strategies have been developed for a better understanding of its molecular mechanisms. In silico analysis has been increasingly used to investigate the genotype-phenotype relationship of many diseases, and the sequences of healthy individuals deposited in the 1,000 Genomes database appear to be an excellent tool for such analysis. The objective of this study is to analyze the variations in the HBB gene in the 1,000 Genomes database, to describe the mutation frequencies in the different population groups, and to investigate the pattern of pathogenicity. The computational tool SNPEFF was used to align the data from 2,504 samples of the 1,000 Genomes database with the HG19 genome reference. The pathogenicity of each amino acid change was investigated using the databases CLINVAR, dbSNP and HbVar and five different predictors. Twenty different mutations were found in 209 healthy individuals. The African group had the highest number of individuals with mutations, and the European group had the lowest number. Thus, it is concluded that approximately 8.3% of phenotypically healthy individuals from the 1,000 Genomes database have some mutation in the HBB gene. The frequency of mutated genes was estimated at 0.042, so that the expected frequency of being homozygous or compound heterozygous for these variants in the next generation is approximately 0.002. In total, 193 subjects had a non-synonymous mutation, of whom 186 (7.4%) had a deleterious mutation. Considering that the 1,000 Genomes database is representative of the world's population, it can be estimated that fourteen out of every 10,000 individuals in the world will have a hemoglobinopathy in the next generation.
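The frequencies quoted in this abstract can be reproduced approximately with a short calculation. The sketch below rests on two assumptions that the abstract does not state: that each carrier has exactly one mutated allele, and that genotype frequencies in the next generation follow Hardy-Weinberg proportions (random mating). Under these assumptions the figure of fourteen per 10,000 is matched when only the 186 carriers of deleterious mutations are counted; this is an inference from the numbers, not a description of the authors' method. The function names are hypothetical.

```python
SAMPLES = 2504      # individuals analysed (from the abstract)
CARRIERS = 209      # individuals with any HBB mutation (from the abstract)
DELETERIOUS = 186   # individuals with a deleterious mutation (from the abstract)


def carrier_fraction(carriers, samples=SAMPLES):
    """Fraction of sampled individuals who carry a mutation."""
    return carriers / samples


def allele_frequency(carriers, samples=SAMPLES):
    """Mutated-allele frequency q among the 2 * samples alleles.

    Assumption: every carrier is heterozygous (one mutated allele each).
    """
    return carriers / (2 * samples)


def expected_affected(carriers, samples=SAMPLES):
    """Hardy-Weinberg expectation q**2 of carrying two mutated alleles.

    Pooling all variants into one allele class means q**2 covers both
    homozygotes and compound heterozygotes.
    """
    return allele_frequency(carriers, samples) ** 2


print(f"carriers:        {carrier_fraction(CARRIERS):.1%}")
print(f"q (all variants): {allele_frequency(CARRIERS):.3f}")
print(f"q^2 (all):        {expected_affected(CARRIERS):.3f}")
print(f"per 10,000 (deleterious): {expected_affected(DELETERIOUS) * 10000:.0f}")
```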
The Genomes OnLine Database (GOLD) v.5: a metadata management system based on a four level (meta)genome project classification

Science.gov (United States)

Reddy, T.B.K.; Thomas, Alex D.; Stamatis, Dimitri; Bertsch, Jon; Isbandi, Michelle; Jansson, Jakob; Mallajosyula, Jyothi; Pagani, Ioanna; Lobos, Elizabeth A.; Kyrpides, Nikos C.

2015-01-01

The Genomes OnLine Database (GOLD; http://www.genomesonline.org) is a comprehensive online resource to catalog and monitor genetic studies worldwide. GOLD provides up-to-date status on complete and ongoing sequencing projects along with a broad array of curated metadata. Here we report version 5 (v.5) of the database. The newly designed database schema and web user interface supports several new features including the implementation of a four level (meta)genome project classification system and a simplified intuitive web interface to access reports and launch search tools. The database currently hosts information for about 19 200 studies, 56 000 Biosamples, 56 000 sequencing projects and 39 400 analysis projects. More than just a catalog of worldwide genome projects, GOLD is a manually curated, quality-controlled metadata warehouse. The problems encountered in integrating disparate and varying quality data into GOLD are briefly highlighted. GOLD fully supports and follows the Genomic Standards Consortium (GSC) Minimum Information standards. PMID:25348402
The Genomes OnLine Database (GOLD) v.5: a metadata management system based on a four level (meta)genome project classification

Energy Technology Data Exchange (ETDEWEB)

Reddy, Tatiparthi B. K. [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States); Thomas, Alex D. [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States); Stamatis, Dimitri [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States); Bertsch, Jon [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States); Isbandi, Michelle [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States); Jansson, Jakob [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States); Mallajosyula, Jyothi [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States); Pagani, Ioanna [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States); Lobos, Elizabeth A. [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States); Kyrpides, Nikos C. [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States); King Abdulaziz Univ., Jeddah (Saudi Arabia)

2014-10-27

The Genomes OnLine Database (GOLD; http://www.genomesonline.org) is a comprehensive online resource to catalog and monitor genetic studies worldwide. GOLD provides up-to-date status on complete and ongoing sequencing projects along with a broad array of curated metadata. Within this paper, we report version 5 (v.5) of the database. The newly designed database schema and web user interface support several new features including the implementation of a four level (meta)genome project classification system and a simplified intuitive web interface to access reports and launch search tools. The database currently hosts information for about 19 200 studies, 56 000 Biosamples, 56 000 sequencing projects and 39 400 analysis projects. More than just a catalog of worldwide genome projects, GOLD is a manually curated, quality-controlled metadata warehouse. The problems encountered in integrating disparate and varying quality data into GOLD are briefly highlighted. Lastly, GOLD fully supports and follows the Genomic Standards Consortium (GSC) Minimum Information standards.
Histological Features of the Gastrointestinal Tract of Wild Indonesian Shortfin Eel, Anguilla bicolor bicolor (McClelland, 1844), Captured in Peninsular Malaysia

Directory of Open Access Journals (Sweden)

Nurrul Shaqinah Nasruddin

2014-01-01

This study was conducted to record the histological features of the gastrointestinal tract of wild Indonesian shortfin eel, Anguilla bicolor bicolor (McClelland, 1844), captured in Peninsular Malaysia. The gastrointestinal tract was segmented into the oesophagus, stomach, and intestine. Then, the oesophagus was divided into five (first to fifth), the stomach into two (cardiac and pyloric), and the intestine into four segments (anterior, intermediate, posterior, and rectum) for histological examinations. The stomach had significantly taller villi and thicker inner circular muscles compared to the intestine and oesophagus. The lamina propria was thickest in stomach, significantly when compared with oesophagus, but not with the intestine. However, the intestine showed significantly thicker outer longitudinal muscle while gastric glands were observed only in the stomach. The histological features were closely associated with the functions of the different segments of the gastrointestinal tract. In conclusion, the histological features of the gastrointestinal tract of A. b. bicolor are consistent with the feeding habit of a carnivorous fish.
Histological features of the gastrointestinal tract of wild Indonesian shortfin eel, Anguilla bicolor bicolor (McClelland, 1844), captured in Peninsular Malaysia.

Science.gov (United States)

Nasruddin, Nurrul Shaqinah; Azmai, Mohammad Noor Amal; Ismail, Ahmad; Saad, Mohd Zamri; Daud, Hassan Mohd; Zulkifli, Syaizwan Zahmir

2014-01-01

This study was conducted to record the histological features of the gastrointestinal tract of wild Indonesian shortfin eel, Anguilla bicolor bicolor (McClelland, 1844), captured in Peninsular Malaysia. The gastrointestinal tract was segmented into the oesophagus, stomach, and intestine. Then, the oesophagus was divided into five (first to fifth), the stomach into two (cardiac and pyloric), and the intestine into four segments (anterior, intermediate, posterior, and rectum) for histological examinations. The stomach had significantly taller villi and thicker inner circular muscles compared to the intestine and oesophagus. The lamina propria was thickest in stomach, significantly when compared with oesophagus, but not with the intestine. However, the intestine showed significantly thicker outer longitudinal muscle while gastric glands were observed only in the stomach. The histological features were closely associated with the functions of the different segments of the gastrointestinal tract. In conclusion, the histological features of the gastrointestinal tract of A. b. bicolor are consistent with the feeding habit of a carnivorous fish.
SoyTEdb: a comprehensive database of transposable elements in the soybean genome

Directory of Open Access Journals (Sweden)

Zhu Liucun

2010-02-01

Background: Transposable elements are the most abundant components of all characterized genomes of higher eukaryotes. It has been documented that these elements not only contribute to the shaping and reshaping of their host genomes, but also play significant roles in regulating gene expression, altering gene function, and creating new genes. Thus, complete identification of transposable elements in sequenced genomes and construction of comprehensive transposable element databases are essential for accurate annotation of genes and other genomic components, for investigation of potential functional interaction between transposable elements and genes, and for study of genome evolution. The recent availability of the soybean genome sequence has provided an unprecedented opportunity for discovery, and structural and functional characterization of transposable elements in this economically important legume crop. Description: Using a combination of structure-based and homology-based approaches, a total of 32,552 retrotransposons (Class I) and 6,029 DNA transposons (Class II) with clear boundaries and insertion sites were structurally annotated and clearly categorized, and a soybean transposable element database, SoyTEdb, was established. These transposable elements have been anchored in and integrated with the soybean physical map and genetic map, and are browsable and visualizable at any scale along the 20 soybean chromosomes, along with predicted genes and other sequence annotations. BLAST search and other infrastructure tools were implemented to facilitate annotation of transposable elements or fragments from soybean and other related legume species. The majority (> 95%) of these elements (particularly a few hundred low-copy-number families) are first described in this study. Conclusion: SoyTEdb provides resources and information related to transposable elements in the soybean genome, representing the most comprehensive and the largest manually
H2DB: a heritability database across multiple species by annotating trait-associated genomic loci.

Science.gov (United States)

Kaminuma, Eli; Fujisawa, Takatomo; Tanizawa, Yasuhiro; Sakamoto, Naoko; Kurata, Nori; Shimizu, Tokurou; Nakamura, Yasukazu

2013-01-01

H2DB (http://tga.nig.ac.jp/h2db/), an annotation database of genetic heritability estimates for humans and other species, has been developed as a knowledge database to connect trait-associated genomic loci. Heritability estimates have been investigated for individual species, particularly in human twin studies and plant/animal breeding studies. However, there appears to be no comprehensive heritability database for both humans and other species. Here, we introduce an annotation database for genetic heritabilities of various species that was annotated by manually curating online public resources in PUBMED abstracts and journal contents. The proposed heritability database contains attribute information for trait descriptions, experimental conditions, trait-associated genomic loci and broad- and narrow-sense heritability specifications. Annotated trait-associated genomic loci, most of which are single-nucleotide polymorphisms derived from genome-wide association studies, may be valuable resources for experimental scientists. In addition, we assigned phenotype ontologies to the annotated traits for the purposes of discussing heritability distributions based on phenotypic classifications.
A Sorghum bicolor expression atlas reveals dynamic genotype-specific expression profiles for vegetative tissues of grain, sweet and bioenergy sorghums

Energy Technology Data Exchange (ETDEWEB)

Shakoor, N; Nair, R; Crasta, O; Morris, G; Feltus, A; Kresovich, S

2014-01-23

Background: Effective improvement in sorghum crop development necessitates a genomics-based approach to identify functional genes and QTLs. The sorghum genome was sequenced in 2009; its comprehensive annotation and the development of functional genomics resources are key to enabling the discovery and deployment of regulatory and metabolic genes and gene networks for crop improvement. Results: This study utilizes the first commercially available whole-transcriptome sorghum microarray (Sorgh-WTa520972F) to identify tissue and genotype-specific expression patterns for all identified Sorghum bicolor exons and UTRs. The genechip contains 1,026,373 probes covering 149,182 exons (27,577 genes) across the Sorghum bicolor nuclear, chloroplast, and mitochondrial genomes. Specific probesets were also included for putative non-coding RNAs that may play a role in gene regulation (e.g., microRNAs), and confirmed functional small RNAs in related species (maize and sugarcane) were also included in our array design. We generated expression data for 78 samples with a combination of four different tissue types (shoot, root, leaf and stem), two dissected stem tissues (pith and rind) and six diverse genotypes, which included six public sorghum lines (R159, Atlas, Fremont, PI152611, AR2400 and PI455230) representing grain, sweet, forage, and high biomass ideotypes. Conclusions: Here we present a summary of the microarray dataset, including analysis of tissue-specific gene expression profiles and associated expression profiles of relevant metabolic pathways. With an aim to enable identification and functional characterization of genes in sorghum, this expression atlas presents a new and valuable resource to the research community.

RICD: A rice indica cDNA database resource for rice functional genomics

Directory of Open Access Journals (Sweden)

Zhang Qifa

2008-11-01

Background: The Oryza sativa L. indica subspecies is the most widely cultivated rice. During the last few years, we have collected over 20,000 putative full-length cDNAs and over 40,000 ESTs isolated from various cDNA libraries of two indica varieties Guangluai 4 and Minghui 63. A database of the rice indica cDNAs was therefore built to provide a comprehensive web data source for searching and retrieving the indica cDNA clones. Results: Rice Indica cDNA Database (RICD) is an online MySQL-PHP driven database with a user-friendly web interface. It allows investigators to query the cDNA clones by keyword, genome position, nucleotide or protein sequence, and putative function. It also provides a range of information, including sequences, protein domain annotations, similarity search results, SNPs and InDels information, and hyperlinks to gene annotation in both The Rice Annotation Project Database (RAP-DB) and The TIGR Rice Genome Annotation Resource, expression atlas in RiceGE and variation report in Gramene of each cDNA. Conclusion: The online rice indica cDNA database provides cDNA resource with comprehensive information to researchers for functional analysis of indica subspecies and for comparative genomics. The RICD database is available through our website http://www.ncgr.ac.cn/ricd.
CTDB: An Integrated Chickpea Transcriptome Database for Functional and Applied Genomics.

Directory of Open Access Journals (Sweden)

Mohit Verma

Chickpea is an important grain legume used as a rich source of protein in human diet. The narrow genetic diversity and limited availability of genomic resources are the major constraints in implementing breeding strategies and biotechnological interventions for genetic enhancement of chickpea. We developed an integrated Chickpea Transcriptome Database (CTDB), which provides the comprehensive web interface for visualization and easy retrieval of transcriptome data in chickpea. The database features many tools for similarity search, functional annotation (putative function, PFAM domain and gene ontology) search and comparative gene expression analysis. The current release of CTDB (v2.0) hosts transcriptome datasets with high quality functional annotation from cultivated (desi and kabuli types) and wild chickpea. A catalog of transcription factor families and their expression profiles in chickpea are available in the database. The gene expression data have been integrated to study the expression profiles of chickpea transcripts in major tissues/organs and various stages of flower development. The utilities, such as similarity search, ortholog identification and comparative gene expression have also been implemented in the database to facilitate comparative genomic studies among different legumes and Arabidopsis. Furthermore, the CTDB represents a resource for the discovery of functional molecular markers (microsatellites and single nucleotide polymorphisms) between different chickpea types. We anticipate that integrated information content of this database will accelerate the functional and applied genomic research for improvement of chickpea. The CTDB web service is freely available at http://nipgr.res.in/ctdb.html.
DRDB: An Online Date Palm Genomic Resource Database

Directory of Open Access Journals (Sweden)

Zilong He

2017-11-01

Background: Date palm (Phoenix dactylifera L.) is a cultivated woody plant with agricultural and economic importance in many countries around the world. With the advantages of next generation sequencing technologies, genome sequences for many date palm cultivars have been released recently. Short sequence repeats (SSRs) and single nucleotide polymorphisms (SNPs) can be identified from these genomic data, and have been proven to be very useful biomarkers in plant genome analysis and breeding. Results: Here, we first improved the date palm genome assembly using 130X of HiSeq data generated in our lab. Then 246,445 SSRs (214,901 SSRs and 31,544 compound SSRs) were annotated in this genome assembly; among the SSRs, mononucleotide SSRs (58.92%) were the most abundant, followed by di- (29.92%), tri- (8.14%), tetra- (2.47%), penta- (0.36%), and hexa-nucleotide SSRs (0.19%). High-quality PCR primer pairs were designed for most (174,497; 70.81%) of the SSRs. We also annotated 6,375,806 SNPs with raw read depth ≥3 in 90% of cultivars. To further reduce false positive SNPs, we only kept the 5,572,650 SNPs (87.40% of the total) supported by at least 20% of cultivars for downstream analyses. High-quality PCR primer pairs were also obtained for 4,177,778 (65.53%) SNPs. We reconstructed the phylogenetic relationships among the 62 cultivars using these variants and found that they can be divided into three clusters, namely North Africa, Egypt – Sudan, and Middle East – South Asian, with Egypt – Sudan being the admixture of North Africa and Middle East – South Asian cultivars; we further confirmed these clusters using principal component analysis. Moreover, 34,346 SSRs and 4,177,778 SNPs with PCR primers were assigned to shared cultivars for cultivar classification and diversity analysis. All these SSRs, SNPs and their classification are available in our database, and can be used for cultivar identification, comparison, and molecular breeding. Conclusion: DRDB is a
(Arachis hypogaea) and Sorghum (Sorghum bicolor)

African Journals Online (AJOL)

ADOWIE PERE

as enzyme activities of Arachis hypogaea and Sorghum bicolor in crude oil contaminated soil. Crude oil ... Treatments without crude oil were ... replicates were made for each treatment. .... dead sections of leaf margins, burning and stunted or.
PSSRdb: a relational database of polymorphic simple sequence repeats extracted from prokaryotic genomes.

Science.gov (United States)

Kumar, Pankaj; Chaitanya, Pasumarthy S; Nagarajaram, Hampapathalu A

2011-01-01

PSSRdb (Polymorphic Simple Sequence Repeats database) (http://www.cdfd.org.in/PSSRdb/) is a relational database of polymorphic simple sequence repeats (PSSRs) extracted from 85 different species of prokaryotes. Simple sequence repeats (SSRs) are tandem repeats of nucleotide motifs 1-6 bp in size and are highly polymorphic. SSR mutations in and around coding regions affect transcription and translation of genes. Such changes underpin phase variations and antigenic variations seen in some bacteria. Although SSR-mediated phase variation and antigenic variation have been well studied in some bacteria, many other prokaryotic species have yet to be investigated for SSR-mediated adaptive and other evolutionary advantages. As a part of our ongoing studies on SSR polymorphism in prokaryotes, we compared the genome sequences of various strains and isolates available for 85 different species of prokaryotes, extracted a number of SSRs showing length variations, and created a relational database called PSSRdb. This database gives useful information such as location of PSSRs in genomes, length variation across genomes, the regions harboring PSSRs, etc. The information provided in this database is very useful for further research and analysis of SSRs in prokaryotes.
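To make the SSR definition in this abstract concrete (tandem repeats of a 1-6 bp motif), here is a minimal Python sketch that scans a DNA string with a regular expression. It is not the PSSRdb pipeline: the function name `find_ssrs`, the single `min_repeats` threshold and the example sequence are assumptions for illustration. Dedicated SSR tools typically apply different minimum repeat counts per motif length.

```python
import re


def find_ssrs(seq, min_repeats=3, max_motif=6):
    """Return (start, motif, repeat_count) for perfect tandem repeats.

    Illustrative sketch only: one threshold for all motif lengths
    (an assumption), upper-case A/C/G/T input, perfect repeats only.
    """
    # Lazy motif capture prefers the shortest repeating unit, e.g. "A"
    # rather than "AA"; the backreference requires further tandem copies.
    pattern = re.compile(r"([ACGT]{1,%d}?)\1{%d,}" % (max_motif, min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq):
        motif = match.group(1)
        repeats = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, repeats))
    return hits


# Hypothetical example sequence: (AC)4 at position 3 and (T)6 at position 11.
example = "GGTACACACACTTTTTTGCA"
print(find_ssrs(example))
```

Running the same function on the corresponding region of two strains and comparing the repeat counts is one simple way to see the kind of length variation that a polymorphic-SSR database records.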
Rapid storage and retrieval of genomic intervals from a relational database system using nested containment lists.

Science.gov (United States)

Wiley, Laura K; Sivley, R Michael; Bush, William S

2013-01-01

Efficient storage and retrieval of genomic annotations based on range intervals is necessary, given the amount of data produced by next-generation sequencing studies. The indexing strategies of relational database systems (such as MySQL) greatly inhibit their use in genomic annotation tasks. This has led to the development of stand-alone applications that are dependent on flat-file libraries. In this work, we introduce MyNCList, an implementation of the NCList data structure within a MySQL database. MyNCList enables the storage, update and rapid retrieval of genomic annotations from the convenience of a relational database system. Range-based annotations of 1 million variants are retrieved in under a minute, making this approach feasible for whole-genome annotation tasks. Database URL: https://github.com/bushlab/mynclist.
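The nested containment list (NCList) idea named in this abstract can be sketched in memory as follows. This is an illustrative Python version, not the MyNCList MySQL implementation; the class and function names, the half-open coordinate convention and the example annotations are assumptions. Intervals fully contained in another interval are stored in that interval's sublist, so within any one list both starts and ends are increasing and a binary search can find the first candidate overlap.

```python
class Node:
    """One interval [start, end) plus the intervals nested inside it."""

    def __init__(self, start, end, label):
        self.start, self.end, self.label = start, end, label
        self.children = []


def build_nclist(intervals):
    """Build nested lists from (start, end, label) tuples."""
    # Sort by start ascending, end descending, so containers come first.
    ordered = sorted(intervals, key=lambda iv: (iv[0], -iv[1]))
    root, stack = [], []
    for start, end, label in ordered:
        node = Node(start, end, label)
        # Pop containers that do not fully contain the new interval.
        while stack and not (stack[-1].start <= start and end <= stack[-1].end):
            stack.pop()
        (stack[-1].children if stack else root).append(node)
        stack.append(node)
    return root


def _first_ending_after(nodes, pos):
    """Binary search: index of the first node whose end is greater than pos."""
    lo, hi = 0, len(nodes)
    while lo < hi:
        mid = (lo + hi) // 2
        if nodes[mid].end > pos:
            hi = mid
        else:
            lo = mid + 1
    return lo


def query(nodes, q_start, q_end):
    """Return labels of all intervals overlapping [q_start, q_end)."""
    hits = []
    i = _first_ending_after(nodes, q_start)
    while i < len(nodes) and nodes[i].start < q_end:
        hits.append(nodes[i].label)
        hits.extend(query(nodes[i].children, q_start, q_end))
        i += 1
    return hits


# Hypothetical annotations on one chromosome.
annotations = [
    (100, 500, "geneA"), (120, 200, "exon1"), (300, 450, "exon2"),
    (150, 151, "snp"), (600, 900, "geneB"),
]
index = build_nclist(annotations)
print(query(index, 140, 160))
print(query(index, 440, 650))
```

Each query visits only the overlapping intervals plus one binary search per list it descends into, which is the property that makes the structure attractive for range-based annotation lookups.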
A SNP-centric database for the investigation of the human genome

Directory of Open Access Journals (Sweden)

Kohane Isaac S

2004-03-01

Background: Single Nucleotide Polymorphisms (SNPs) are an increasingly important tool for genetic and biomedical research. Although current genomic databases contain information on several million SNPs and are growing at a very fast rate, the true value of a SNP in this context is a function of the quality of the annotations that characterize it. Retrieving and analyzing such data for a large number of SNPs often represents a major bottleneck in the design of large-scale association studies. Description: SNPper is a web-based application designed to facilitate the retrieval and use of human SNPs for high-throughput research purposes. It provides a rich local database generated by combining SNP data with the Human Genome sequence and with several other data sources, and offers the user a variety of querying, visualization and data export tools. In this paper we describe the structure and organization of the SNPper database, we review the available data export and visualization options, and we describe how the architecture of SNPper and its specialized data structures support high-volume SNP analysis. Conclusions: The rich annotation database and the powerful data manipulation and presentation facilities it offers make SNPper a very useful online resource for SNP research. Its success proves the great need for integrated and interoperable resources in the field of computational biology, and shows how such systems may play a critical role in supporting the large-scale computational analysis of our genome.
BioQ: tracing experimental origins in public genomic databases using a novel data provenance model.

Science.gov (United States)

Saccone, Scott F; Quan, Jiaxi; Jones, Peter L

2012-04-15

Public genomic databases, which are often used to guide genetic studies of human disease, are now being applied to genomic medicine through in silico integrative genomics. These databases, however, often lack tools for systematically determining the experimental origins of the data. We introduce a new data provenance model that we have implemented in a public web application, BioQ, for assessing the reliability of the data by systematically tracing its experimental origins to the original subjects and biologics. BioQ allows investigators to both visualize data provenance as well as explore individual elements of experimental process flow using precise tools for detailed data exploration and documentation. It includes a number of human genetic variation databases such as the HapMap and 1000 Genomes projects. BioQ is freely available to the public at http://bioq.saclab.net.
Exploring Protein Function Using the Saccharomyces Genome Database.

Science.gov (United States)

Wong, Edith D

2017-01-01

Elucidating the function of individual proteins will help to create a comprehensive picture of cell biology, as well as shed light on human disease mechanisms, possible treatments, and cures. Due to its compact genome, and extensive history of experimentation and annotation, the budding yeast Saccharomyces cerevisiae is an ideal model organism in which to determine protein function. This information can then be leveraged to infer functions of human homologs. Despite the large amount of research and biological data about S. cerevisiae, many proteins' functions remain unknown. Here, we explore ways to use the Saccharomyces Genome Database (SGD; http://www.yeastgenome.org) to predict the function of proteins and gain insight into their roles in various cellular processes.
EchoBASE: an integrated post-genomic database for Escherichia coli.

Science.gov (United States)

Misra, Raju V; Horler, Richard S P; Reindl, Wolfgang; Goryanin, Igor I; Thomas, Gavin H

2005-01-01

EchoBASE (http://www.ecoli-york.org) is a relational database designed to contain and manipulate information from post-genomic experiments using the model bacterium Escherichia coli K-12. Its aim is to collate information from a wide range of sources to provide clues to the functions of the approximately 1500 gene products that have no confirmed cellular function. The database is built on an enhanced annotation of the updated genome sequence of strain MG1655 and the association of experimental data with the E. coli genes and their products. Experiments that can be held within EchoBASE include proteomics studies, microarray data, protein-protein interaction data, structural data and bioinformatics studies. EchoBASE also contains annotated information on 'orphan' enzyme activities from this microbe to aid characterization of the proteins that catalyse these elusive biochemical reactions.
Differential chromosomal organization between Saguinus midas and Saguinus bicolor with accumulation of differences the repetitive sequence DNA.

Science.gov (United States)

Serfaty, Dayane Martins Barbosa; Carvalho, Natália Dayane Moura; Gross, Maria Claudia; Gordo, Marcelo; Schneider, Carlos Henrique

2017-10-01

Saguinus is the largest and most complex genus of the subfamily Callitrichinae, with 23 species distributed from the south of Central America to the north of South America. Saguinus midas has the largest geographical distribution, while Saguinus bicolor has a very restricted one, affected by the population expansion in the state of Amazonas. Considering the phylogenetic proximity of the two species along with evidence on the existence of hybrids between them, as well as cytogenetic studies on Saguinus describing a conserved karyotypic macrostructure, we carried out a physical mapping of repeated DNA sequences in the mitotic chromosomes of both species, since these sequences are less susceptible to evolutionary pressure and possibly perform an important function in speciation. Both species presented 2n = 46 chromosomes; in S. midas, chromosome Y is the smallest. Multiple ribosomal sites occur in both species, but chromosome pairs three and four may be regarded as markers that distinguish the species when subjected to G banding and to mapping of the distribution of the retroelement LINE 1, suggesting that they may serve as cytogenetic markers that can contribute to the identification of first-generation hybrids in the contact zone. Saguinus bicolor also presented differences in the LINE 1 distribution pattern for sex chromosome X in individuals from different urban fragments, probably due to geographical isolation. In this context, cytogenetic analyses reveal a differential genomic organization pattern between the species S. midas and S. bicolor, in addition to indicating that individuals from different urban fragments have been accumulating differences because of the isolation between them.
The need for high-quality whole-genome sequence databases in microbial forensics.

Science.gov (United States)

Sjödin, Andreas; Broman, Tina; Melefors, Öjar; Andersson, Gunnar; Rasmusson, Birgitta; Knutsson, Rickard; Forsman, Mats

2013-09-01

Microbial forensics is an important part of a strengthened capability to respond to biocrime and bioterrorism incidents to aid in the complex task of distinguishing between natural outbreaks and deliberate acts. The goal of a microbial forensic investigation is to identify and criminally prosecute those responsible for a biological attack, and it involves a detailed analysis of the weapon--that is, the pathogen. The recent development of next-generation sequencing (NGS) technologies has greatly increased the resolution that can be achieved in microbial forensic analyses. It is now possible to identify, quickly and in an unbiased manner, previously undetectable genome differences between closely related isolates. This development is particularly relevant for the most deadly bacterial diseases that are caused by bacterial lineages with extremely low levels of genetic diversity. Whole-genome analysis of pathogens is envisaged to be increasingly essential for this purpose. In a microbial forensic context, whole-genome sequence analysis is the ultimate method for strain comparisons as it is informative during identification, characterization, and attribution--all 3 major stages of the investigation--and at all levels of microbial strain identity resolution (ie, it resolves the full spectrum from family to isolate). Given these capabilities, one bottleneck in microbial forensics investigations is the availability of high-quality reference databases of bacterial whole-genome sequences. To be of high quality, databases need to be curated and accurate in terms of sequences, metadata, and genetic diversity coverage. The development of whole-genome sequence databases will be instrumental in successfully tracing pathogens in the future.
VaProS: a database-integration approach for protein/genome information retrieval

KAUST Repository

Gojobori, Takashi; Ikeo, Kazuho; Katayama, Yukie; Kawabata, Takeshi; Kinjo, Akira R.; Kinoshita, Kengo; Kwon, Yeondae; Migita, Ohsuke; Mizutani, Hisashi; Muraoka, Masafumi; Nagata, Koji; Omori, Satoshi; Sugawara, Hideaki; Yamada, Daichi; Yura, Kei

2016-01-01

Life science research now heavily relies on all sorts of databases for genome sequences, transcription, protein three-dimensional (3D) structures, protein–protein interactions, phenotypes and so forth. The knowledge accumulated by all the omics research is so vast that a computer-aided search of data is now a prerequisite for starting a new study. In addition, a combinatory search throughout these databases has a chance to extract new ideas and new hypotheses that can be examined by wet-lab experiments. By virtually integrating the related databases on the Internet, we have built a new web application that facilitates life science researchers for retrieving experts’ knowledge stored in the databases and for building a new hypothesis of the research target. This web application, named VaProS, puts stress on the interconnection between the functional information of genome sequences and protein 3D structures, such as structural effect of the gene mutation. In this manuscript, we present the notion of VaProS, the databases and tools that can be accessed without any knowledge of database locations and data formats, and the power of search exemplified in quest of the molecular mechanisms of lysosomal storage disease. VaProS can be freely accessed at http://p4d-info.nig.ac.jp/vapros/.
VaProS: a database-integration approach for protein/genome information retrieval

KAUST Repository

Gojobori, Takashi

2016-12-24

Life science research now heavily relies on all sorts of databases for genome sequences, transcription, protein three-dimensional (3D) structures, protein–protein interactions, phenotypes and so forth. The knowledge accumulated by all the omics research is so vast that a computer-aided search of data is now a prerequisite for starting a new study. In addition, a combinatory search throughout these databases has a chance to extract new ideas and new hypotheses that can be examined by wet-lab experiments. By virtually integrating the related databases on the Internet, we have built a new web application that facilitates life science researchers for retrieving experts’ knowledge stored in the databases and for building a new hypothesis of the research target. This web application, named VaProS, puts stress on the interconnection between the functional information of genome sequences and protein 3D structures, such as structural effect of the gene mutation. In this manuscript, we present the notion of VaProS, the databases and tools that can be accessed without any knowledge of database locations and data formats, and the power of search exemplified in quest of the molecular mechanisms of lysosomal storage disease. VaProS can be freely accessed at http://p4d-info.nig.ac.jp/vapros/.
A Utility Maximizing and Privacy Preserving Approach for Protecting Kinship in Genomic Databases.

Science.gov (United States)

Kale, Gulce; Ayday, Erman; Tastan, Oznur

2017-09-12

Rapid and low-cost sequencing of genomes has enabled widespread use of genomic data in research studies and personalized customer applications, where genomic data is shared in public databases. Although the identities of the participants are anonymized in these databases, sensitive information about individuals can still be inferred. One such piece of information is kinship. We define two routes by which kinship privacy can leak and propose a technique to protect kinship privacy against these risks while maximizing the utility of shared data. The method involves systematic identification of minimal portions of genomic data to mask as new participants are added to the database. Choosing the proper positions to hide is cast as an optimization problem in which the number of positions to mask is minimized subject to privacy constraints that ensure the familial relationships are not revealed. We evaluate the proposed technique on real genomic data. Results indicate that concurrent sharing of data pertaining to a parent and an offspring results in high risks to kinship privacy, whereas sharing data from more distant relatives together is often safer. We also show that the arrival order of family members has a high impact on the level of privacy risk and on the utility of shared data. Available at: https://github.com/tastanlab/Kinship-Privacy. Contact: erman@cs.bilkent.edu.tr or oznur.tastan@cs.bilkent.edu.tr. Supplementary data are available at Bioinformatics online.
CTDB: An Integrated Chickpea Transcriptome Database for Functional and Applied Genomics

OpenAIRE

Verma, Mohit; Kumar, Vinay; Patel, Ravi K.; Garg, Rohini; Jain, Mukesh

2015-01-01

Chickpea is an important grain legume used as a rich source of protein in human diet. The narrow genetic diversity and limited availability of genomic resources are the major constraints in implementing breeding strategies and biotechnological interventions for genetic enhancement of chickpea. We developed an integrated Chickpea Transcriptome Database (CTDB), which provides the comprehensive web interface for visualization and easy retrieval of transcriptome data in chickpea. The database fea...
GenoMycDB: a database for comparative analysis of mycobacterial genes and genomes.

Science.gov (United States)

Catanho, Marcos; Mascarenhas, Daniel; Degrave, Wim; Miranda, Antonio Basílio de

2006-03-31

Several databases and computational tools have been created with the aim of organizing, integrating and analyzing the wealth of information generated by large-scale sequencing projects of mycobacterial genomes and those of other organisms. However, with very few exceptions, these databases and tools do not allow for massive and/or dynamic comparison of these data. GenoMycDB (http://www.dbbm.fiocruz.br/GenoMycDB) is a relational database built for large-scale comparative analyses of completely sequenced mycobacterial genomes, based on their predicted protein content. Its central structure is composed of the results obtained after pair-wise sequence alignments among all the predicted proteins coded by the genomes of six mycobacteria: Mycobacterium tuberculosis (strains H37Rv and CDC1551), M. bovis AF2122/97, M. avium subsp. paratuberculosis K10, M. leprae TN, and M. smegmatis MC2 155. The database stores the computed similarity parameters of every aligned pair, providing for each protein sequence the predicted subcellular localization, the assigned cluster of orthologous groups, the features of the corresponding gene, and links to several important databases. Tables containing pairs or groups of potential homologs between selected species/strains can be produced dynamically by user-defined criteria, based on one or multiple sequence similarity parameters. In addition, searches can be restricted according to the predicted subcellular localization of the protein, the DNA strand of the corresponding gene and/or the description of the protein. Massive data search and/or retrieval are available, and different ways of exporting the result are offered. GenoMycDB provides an on-line resource for the functional classification of mycobacterial proteins as well as for the analysis of genome structure, organization, and evolution.
Genome-Wide Identification and Analysis of Arabidopsis Sodium Proton Antiporter (NHX) and Human Sodium Proton Exchanger (NHE) Homologs in Sorghum bicolor

Directory of Open Access Journals (Sweden)

P. Hima Kumari

2018-05-01

Na+ transporters play an important role during salt stress and development. The present study is aimed at genome-wide identification and in silico analysis of sodium-proton antiporter (NHX) and sodium-proton exchanger (NHE)-type transporters in Sorghum bicolor and their expression patterns under varied abiotic stress conditions. In Sorghum, seven NHX and nine NHE homologs were identified. An amiloride (a known inhibitor of Na+/H+ exchanger activity) binding motif was noticed in both types of transporters. Chromosome 2 was found to be a hotspot region with five sodium transporters. Phylogenetic analysis inferred six ortholog and three paralog groups. To gain insight into the functional divergence of SbNHX/NHE transporters, real-time gene expression analysis was performed under salt, drought, heat, and cold stresses in embryo, root, stem, and leaf tissues. Expression patterns revealed that both SbNHXs and SbNHEs are responsive to either single or multiple abiotic stresses. The predicted protein–protein interaction networks revealed that only SbNHX7 is involved in the calcineurin B-like proteins (CBL)-CBL interacting protein kinases (CIPK) pathway. The study provides insights into the functional divergence of SbNHX/NHE transporter genes with tissue-specific expression in Sorghum under different abiotic stress conditions.
Importance of databases of nucleic acids for bioinformatic analysis focused to genomics

Science.gov (United States)

Jimenez-Gutierrez, L. R.; Barrios-Hernández, C. J.; Pedraza-Ferreira, G. R.; Vera-Cala, L.; Martinez-Perez, F.

2016-08-01

Recently, bioinformatics has become a new field of science, indispensable in the analysis of the millions of nucleic acid sequences currently deposited in international databases (public or private); these databases contain information on genes, RNA, ORFs, proteins, and intergenic regions, including entire genomes of some species. The analysis of this information requires computer programs, which have been renewed through the use of new mathematical methods and the introduction of artificial intelligence, in addition to the constant creation of supercomputing units built to withstand the heavy workload of sequence analysis. However, innovation is still needed on platforms that allow faster and more effective genomic analyses, with a technological understanding of all biological processes.
Cs phytoremediation by Sorghum bicolor cultivated in soil and in hydroponic system.

Science.gov (United States)

Wang, Xu; Chen, Can; Wang, Jianlong

2017-04-03

Cs accumulation characteristics of Sorghum bicolor were investigated in a hydroponic system (Cs level at 50-1000 μmol/L) and in soil (Cs-spiked concentrations of 100 and 400 mg/kg soil). Two varieties of S. bicolor, Cowley and Nengsi 2#, grown in pot soil during the entire growth period (100 days) did not show significant differences in height, dry weight (DW), or Cs accumulation. S. bicolor showed potential phytoextraction ability for Cs-contaminated soil, with bioaccumulation factor (BCF) and translocation factor (TF) values usually higher than 1 in both the soil system and the hydroponic system. The aerial parts of S. bicolor contributed 86-92% of the total amount of Cs removed from soil. A Cs level in solution of 100 μmol/L gave the highest BCF and TF values for S. bicolor. Cs at low levels tended to transfer to the aerial parts, whereas Cs at high levels decreased the transfer ratio from root to shoot. In soil, the plant grew well when the Cs-spiked level was 100 mg/kg soil, but was inhibited by Cs at 400 mg/kg soil, with Cs content in sorghum reaching 1147 mg/kg (roots), 2473 mg/kg (stems), and 2939 mg/kg (leaves). In the hydroponic system, the average Cs level in sorghum reached 5270 mg/kg (roots) and 4513 mg/kg (aerial parts), without significant damage to its biomass at 30 days after starting Cs treatment. Cs accumulation in sorghum tissues was positively correlated with the metal concentration in the medium.

GEAR: A database of Genomic Elements Associated with drug Resistance

Science.gov (United States)

Wang, Yin-Ying; Chen, Wei-Hua; Xiao, Pei-Pei; Xie, Wen-Bin; Luo, Qibin; Bork, Peer; Zhao, Xing-Ming

2017-01-01

Drug resistance is becoming a serious problem that leads to the failure of standard treatments; it generally develops because of genetic mutations in certain molecules. Here, we present GEAR (A database of Genomic Elements Associated with drug Resistance), which aims to provide comprehensive information about genomic elements (including genes, single-nucleotide polymorphisms and microRNAs) that are responsible for drug resistance. Currently, GEAR contains 1631 associations between 201 human drugs and 758 genes, 106 associations between 29 human drugs and 66 miRNAs, and 44 associations between 17 human drugs and 22 SNPs. These relationships are first extracted from the primary literature with text mining and then manually curated. The drug resistome deposited in GEAR provides insights into the genetic factors underlying drug resistance. In addition, new indications and potential drug combinations can be identified based on the resistome. The GEAR database can be freely accessed through http://gear.comp-sysbio.org. PMID:28294141
ATGC: a database of orthologous genes from closely related prokaryotic genomes and a research platform for microevolution of prokaryotes

Energy Technology Data Exchange (ETDEWEB)

Novichkov, Pavel S.; Ratnere, Igor; Wolf, Yuri I.; Koonin, Eugene V.; Dubchak, Inna

2009-07-23

The database of Alignable Tight Genomic Clusters (ATGCs) consists of closely related genomes of archaea and bacteria, and is a resource for research into prokaryotic microevolution. Construction of a data set with appropriate characteristics is a major hurdle for this type of study. With the current rate of genome sequencing, it is difficult to follow the progress of the field and to determine which of the available genome sets meet the requirements of a given research project, in particular with respect to the minimum and maximum levels of similarity between the included genomes. Additionally, extraction of specific content, such as genomic alignments or families of orthologs, from a selected set of genomes is a complicated and time-consuming process. The database addresses these problems by providing an intuitive and efficient web interface to browse precomputed ATGCs, select appropriate ones and access ATGC-derived data such as multiple alignments of orthologous proteins, matrices of pairwise intergenomic distances based on genome-wide analysis of synonymous and nonsynonymous substitution rates, and others. The ATGC database will be regularly updated following new releases of the NCBI RefSeq. The database is hosted by the Genomics Division at Lawrence Berkeley National Laboratory and is publicly available at http://atgc.lbl.gov.
Biocuration at the Saccharomyces genome database.

Science.gov (United States)

Skrzypek, Marek S; Nash, Robert S

2015-08-01

Saccharomyces Genome Database is an online resource dedicated to managing information about the biology and genetics of the model organism, yeast (Saccharomyces cerevisiae). This information is derived primarily from scientific publications through a process of human curation that involves manual extraction of data and their organization into a comprehensive system of knowledge. This system provides a foundation for further analysis of experimental data coming from research on yeast as well as other organisms. In this review we will demonstrate how biocuration and biocurators add a key component, the biological context, to our understanding of how genes, proteins, genomes and cells function and interact. We will explain the role biocurators play in sifting through the wealth of biological data to incorporate and connect key information. We will also discuss the many ways we assist researchers with their various research needs. We hope to convince the reader that manual curation is vital in converting the flood of data into organized and interconnected knowledge, and that biocurators play an essential role in the integration of scientific information into a coherent model of the cell. © 2015 Wiley Periodicals, Inc.
Construction of an integrated database to support genomic sequence analysis

Energy Technology Data Exchange (ETDEWEB)

Gilbert, W.; Overbeek, R.

1994-11-01

The central goal of this project is to develop an integrated database to support comparative analysis of genomes including DNA sequence data, protein sequence data, gene expression data and metabolism data. In developing the logic-based system GenoBase, a broader integration of available data was achieved due to assistance from collaborators. Current goals are to easily include new forms of data as they become available and to easily navigate through the ensemble of objects described within the database. This report comments on progress made in these areas.
Updates to the Cool Season Food Legume Genome Database: Resources for pea, lentil, faba bean and chickpea genetics, genomics and breeding

Science.gov (United States)

The Cool Season Food Legume Genome database (CSFL, www.coolseasonfoodlegume.org) is an online resource for genomics, genetics, and breeding research for chickpea, lentil, pea, and faba bean. The user-friendly and curated website allows for all publicly available map, marker, trait, gene, transcript, ger...
The MetaCyc database of metabolic pathways and enzymes and the BioCyc collection of pathway/genome databases

Science.gov (United States)

Caspi, Ron; Altman, Tomer; Dale, Joseph M.; Dreher, Kate; Fulcher, Carol A.; Gilham, Fred; Kaipa, Pallavi; Karthikeyan, Athikkattuvalasu S.; Kothari, Anamika; Krummenacker, Markus; Latendresse, Mario; Mueller, Lukas A.; Paley, Suzanne; Popescu, Liviu; Pujar, Anuradha; Shearer, Alexander G.; Zhang, Peifen; Karp, Peter D.

2010-01-01

The MetaCyc database (MetaCyc.org) is a comprehensive and freely accessible resource for metabolic pathways and enzymes from all domains of life. The pathways in MetaCyc are experimentally determined, small-molecule metabolic pathways and are curated from the primary scientific literature. With more than 1400 pathways, MetaCyc is the largest collection of metabolic pathways currently available. Pathway reactions are linked to one or more well-characterized enzymes, and both pathways and enzymes are annotated with reviews, evidence codes, and literature citations. BioCyc (BioCyc.org) is a collection of more than 500 organism-specific Pathway/Genome Databases (PGDBs). Each BioCyc PGDB contains the full genome and predicted metabolic network of one organism. The network, which is predicted by the Pathway Tools software using MetaCyc as a reference, consists of metabolites, enzymes, reactions and metabolic pathways. BioCyc PGDBs also contain additional features, such as predicted operons, transport systems, and pathway hole-fillers. The BioCyc Web site offers several tools for the analysis of the PGDBs, including Omics Viewers that enable visualization of omics datasets on two different genome-scale diagrams and tools for comparative analysis. The BioCyc PGDBs generated by SRI are offered for adoption by any party interested in curation of metabolic, regulatory, and genome-related information about an organism. PMID:19850718
dbEM: A database of epigenetic modifiers curated from cancerous and normal genomes

Science.gov (United States)

Singh Nanda, Jagpreet; Kumar, Rahul; Raghava, Gajendra P. S.

2016-01-01

We have developed a database called dbEM (database of Epigenetic Modifiers) to maintain the genomic information of about 167 epigenetic modifiers/proteins, which are considered potential cancer targets. In dbEM, modifiers are classified on a functional basis and comprise 48 histone methyl transferases, 33 chromatin remodelers and 31 histone demethylases. dbEM maintains genomic information such as mutations, copy number variation and gene expression in thousands of tumor samples, cancer cell lines and healthy samples. This information is obtained from public resources, viz. COSMIC, CCLE and the 1000-genome project. Gene essentiality data retrieved from the COLT database further highlight the importance of various epigenetic proteins for cancer survival. We have also reported the sequence profiles, tertiary structures and post-translational modifications of these epigenetic proteins in cancer. The database also contains information on 54 drug molecules against different epigenetic proteins. A wide range of tools has been integrated in dbEM, e.g. Search, BLAST, Alignment and Profile-based prediction. In our analysis, we found that the epigenetic proteins DNMT3A, HDAC2, KDM6A, and TET2 are highly mutated in a variety of cancers. We are confident that dbEM will be very useful in cancer research, particularly in the field of epigenetic-protein-based cancer therapeutics. This database is publicly available at http://crdd.osdd.net/raghava/dbem.
CpGislandEVO: A Database and Genome Browser for Comparative Evolutionary Genomics of CpG Islands

Directory of Open Access Journals (Sweden)

Guillermo Barturen

2013-01-01

Hypomethylated, CpG-rich DNA segments (CpG islands, CGIs) are epigenome markers involved in key biological processes. Aberrant methylation is implicated in the appearance of several disorders such as cancer, immunodeficiency, or centromere instability. Furthermore, methylation differences at promoter regions between human and chimpanzee strongly associate with genes involved in neurological/psychological disorders and cancers. Therefore, evolutionary comparative analyses of CGIs can provide insights into the functional role of these epigenome markers in both health and disease. Given the lack of specific tools, we developed CpGislandEVO. Briefly, we first compile a database of statistically significant CGIs for the best assembled mammalian genome sequences available to date. Second, by means of a coupled browser front-end, we focus on the CGIs overlapping orthologous genes extracted from OrthoDB, thus ensuring the comparison between CGIs located on truly homologous genome segments. This allows comparing the main compositional features between homologous CGIs. Finally, to facilitate nucleotide comparisons, we lifted genome coordinates between assemblies from different species, which enables the analysis of sequence divergence by direct count of nucleotide substitutions and indels occurring between homologous CGIs. The resulting CpGislandEVO database, linking together CGIs and single-cytosine DNA methylation data from several mammalian species, is freely available at our website.
Cadmium phytoextraction from loam soil in tropical southern China by Sorghum bicolor.

Science.gov (United States)

Wang, Xu; Chen, Can; Wang, Jianlong

2017-06-03

The cadmium (Cd) uptake characteristics of Sorghum bicolor cv. Nengsi 2# and Cowley in acidic sandy loam soil (pH = 6.1) over the entire growth period (100 days) were investigated in outdoor pot experiments in a tropical district of southern China, Hainan Island. The Cd-spiked levels in soil were set at 3 and 15 mg/kg. Correspondingly, the available Cd levels in soil extracted by Mehlich III solution were 2.71 and 9.41 mg/kg, respectively. Over the full growth period (100 days), the two varieties did not show a significant difference in their growth and Cd uptake. Under high Cd stress, plant growth was inhibited, and biomass weight and height decreased by 38.7-51.5% and 27.6-28.5%, respectively. However, S. bicolor showed higher bioaccumulation capability of Cd from soil to plant [bioconcentration factor (BCF)>4] and higher transfer capability of Cd from roots to shoots [translocation factor (TF)>1] under high Cd stress; Cd contents in the roots, stems, and leaves of S. bicolor reached 43.79-46.07, 63.28-70.60, and 63.10-66.06 mg/kg, respectively. S. bicolor exhibited potential phytoextraction capability for low or moderate Cd contamination in acidic sandy loam soil.
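The two indices reported in this abstract are simple concentration ratios, and a short Python sketch may make the thresholds easier to read. It assumes the commonly used definitions (BCF = tissue concentration / soil concentration; TF = shoot concentration / root concentration) and, as a further assumption, uses the spiked soil level of 15 mg/kg as the denominator; the abstract does not state which soil value the authors used. The function names are hypothetical.

```python
def bioconcentration_factor(tissue_mg_per_kg, soil_mg_per_kg):
    """BCF: metal concentration in plant tissue relative to the soil."""
    return tissue_mg_per_kg / soil_mg_per_kg


def translocation_factor(shoot_mg_per_kg, root_mg_per_kg):
    """TF: metal concentration in shoots relative to roots."""
    return shoot_mg_per_kg / root_mg_per_kg


# Lower bounds of the ranges quoted in the abstract (high-Cd treatment).
soil_cd = 15.0   # mg/kg, spiked level (assumed denominator)
root_cd = 43.79  # mg/kg
stem_cd = 63.28  # mg/kg

bcf = bioconcentration_factor(stem_cd, soil_cd)
tf = translocation_factor(stem_cd, root_cd)
print(f"BCF = {bcf:.2f}, TF = {tf:.2f}")
```

Under these assumptions, even the lower-bound stem value is consistent with the reported BCF > 4 and TF > 1.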
PIPEMicroDB: microsatellite database and primer generation tool for pigeonpea genome.

Science.gov (United States)

Sarika; Arora, Vasu; Iquebal, M A; Rai, Anil; Kumar, Dinesh

2013-01-01

Molecular markers play a significant role in crop improvement for desirable characteristics, such as high yield, resistance to disease and others that will benefit the crop in the long term. Pigeonpea (Cajanus cajan L.) is a legume recently sequenced by a global consortium led by ICRISAT (Hyderabad, India) and has been analysed for gene prediction, synteny maps, markers, etc. We present the PIgeonPEa Microsatellite DataBase (PIPEMicroDB) with an automated primer designing tool for the pigeonpea genome, based on chromosome-wise as well as location-wise search of primers. A total of 123,387 Short Tandem Repeats (STRs) were extracted from the pigeonpea genome, available in the public domain, using the MIcroSAtellite tool (MISA). The database is an online relational database based on a 'three-tier architecture' that catalogues information on microsatellites in MySQL, with a user-friendly interface developed using PHP. Search for STRs may be customized by limiting their location on the chromosome as well as the number of markers in that range. This is a novel approach and has not been implemented in any existing marker database. The database has been further extended with Primer3 for primer design of selected markers, with left and right flanking regions of up to 500 bp. This will enable researchers to select markers of choice at a desired interval over the chromosome. Furthermore, one can use individual STRs of a targeted region of the chromosome to narrow down the location of a gene of interest or linked Quantitative Trait Loci (QTLs). Although it is an in silico approach, marker search based on the characteristics and location of STRs is expected to be beneficial for researchers. Database URL: http://cabindb.iasri.res.in/pigeonpea/
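To illustrate what extracting STRs from a sequence involves, here is a minimal Python sketch of a perfect-repeat scan using regular expressions. It is not MISA and does not reproduce its options or output format; the function name `find_ssrs` and the minimum-repeat thresholds passed to it are illustrative assumptions.

```python
import re


def find_ssrs(sequence, min_repeats_by_unit_length):
    """Find perfect tandem repeats.

    min_repeats_by_unit_length maps motif length -> minimum repeat count,
    e.g. {2: 6, 3: 5}. Returns sorted (start, motif, repeat_count) tuples
    with 0-based start positions.
    """
    seq = sequence.upper()
    hits = []
    for unit_len, min_repeats in sorted(min_repeats_by_unit_length.items()):
        # A motif of unit_len bases followed by at least min_repeats - 1 copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_repeats - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are themselves repeats of a shorter unit
            # (e.g. "ATAT" is really the dinucleotide "AT").
            if any(
                motif == motif[:d] * (unit_len // d)
                for d in range(1, unit_len)
                if unit_len % d == 0
            ):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return sorted(hits)


# Toy sequence: (AT)6 and (AGC)5 embedded in short flanks.
example = "GGC" + "AT" * 6 + "CCG" + "AGC" * 5 + "TT"
ssrs = find_ssrs(example, {2: 6, 3: 5})
print(ssrs)
```

A location-restricted search like the one the database offers would then amount to filtering these tuples by start position within a chromosome interval.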
Developing genomic knowledge bases and databases to support clinical management: current perspectives.

Science.gov (United States)

Huser, Vojtech; Sincan, Murat; Cimino, James J

2014-01-01

Personalized medicine, the ability to tailor diagnostic and treatment decisions for individual patients, is seen as the evolution of modern medicine. We characterize here the informatics resources available today or envisioned in the near future that can support clinical interpretation of genomic test results. We assume a clinical sequencing scenario (germline whole-exome sequencing) in which a clinical specialist, such as an endocrinologist, needs to tailor patient management decisions within his or her specialty (targeted findings) but relies on a genetic counselor to interpret off-target incidental findings. We characterize the genomic input data and list various types of knowledge bases that provide genomic knowledge for generating clinical decision support. We highlight the need for patient-level databases with detailed lifelong phenotype content in addition to genotype data and provide a list of recommendations for personalized medicine knowledge bases and databases. We conclude that no single knowledge base can currently support all aspects of personalized recommendations and that consolidation of several current resources into larger, more dynamic and collaborative knowledge bases may offer a future path forward.
The perennial ryegrass GenomeZipper: targeted use of genome resources for comparative grass genomics.

Science.gov (United States)

Pfeifer, Matthias; Martis, Mihaela; Asp, Torben; Mayer, Klaus F X; Lübberstedt, Thomas; Byrne, Stephen; Frei, Ursula; Studer, Bruno

2013-02-01

Whole-genome sequences established for model and major crop species constitute a key resource for advanced genomic research. For outbreeding forage and turf grass species like ryegrasses (Lolium spp.), such resources have yet to be developed. Here, we present a model of the perennial ryegrass (Lolium perenne) genome on the basis of conserved synteny to barley (Hordeum vulgare) and the model grass genome Brachypodium (Brachypodium distachyon) as well as rice (Oryza sativa) and sorghum (Sorghum bicolor). A transcriptome-based genetic linkage map of perennial ryegrass served as a scaffold to establish the chromosomal arrangement of syntenic genes from model grass species. This scaffold revealed a high degree of synteny and macrocollinearity and was then utilized to anchor a collection of perennial ryegrass genes in silico to their predicted genome positions. This resulted in the unambiguous assignment of 3,315 out of 8,876 previously unmapped genes to the respective chromosomes. In total, the GenomeZipper incorporates 4,035 conserved grass gene loci, which were used for the first genome-wide sequence divergence analysis between perennial ryegrass, barley, Brachypodium, rice, and sorghum. The perennial ryegrass GenomeZipper is an ordered, information-rich genome scaffold, facilitating map-based cloning and genome assembly in perennial ryegrass and closely related Poaceae species. It also represents a milestone in describing synteny between perennial ryegrass and fully sequenced model grass genomes, thereby increasing our understanding of genome organization and evolution in the most important temperate forage and turf grass species.
KGCAK: a K-mer based database for genome-wide phylogeny and complexity evaluation.

Science.gov (United States)

Wang, Dapeng; Xu, Jiayue; Yu, Jun

2015-09-16

The K-mer approach, treating genomic sequences as simple characters and counting the relative abundance of each string upon a fixed K, has been extensively applied to phylogeny inference for genome assembly, annotation, and comparison. To meet increasing demands for comparing large genome sequences and to promote the use of the K-mer approach, we develop a versatile database, KGCAK ( http://kgcak.big.ac.cn/KGCAK/ ), containing ~8,000 genomes that include genome sequences of diverse life forms (viruses, prokaryotes, protists, animals, and plants) and cellular organelles of eukaryotic lineages. It builds phylogeny based on genomic elements in an alignment-free fashion and provides in-depth data processing enabling users to compare the complexity of genome sequences based on K-mer distribution. We hope that KGCAK becomes a powerful tool for exploring relationship within and among groups of species in a tree of life based on genomic data.
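The core of the K-mer approach described here, counting the relative abundance of each length-K string and comparing the resulting profiles without alignment, can be sketched in a few lines of Python. The Euclidean distance below is only one possible choice and is an assumption; the abstract does not say which distance measure KGCAK uses, and the function names are hypothetical.

```python
from collections import Counter
from math import sqrt


def kmer_frequencies(sequence, k):
    """Relative abundance of every overlapping k-mer in the sequence."""
    seq = sequence.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    if total == 0:  # sequence shorter than k
        return {}
    return {kmer: n / total for kmer, n in counts.items()}


def euclidean_distance(freq_a, freq_b):
    """Alignment-free distance between two k-mer frequency profiles."""
    keys = set(freq_a) | set(freq_b)
    return sqrt(sum((freq_a.get(key, 0.0) - freq_b.get(key, 0.0)) ** 2 for key in keys))


profile_a = kmer_frequencies("ACGTACGTACGT", 3)
profile_b = kmer_frequencies("ACGTACGTTTTT", 3)
print(euclidean_distance(profile_a, profile_b))
```

A matrix of such pairwise distances over many genomes is the usual input to a distance-based tree-building method.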
Ethanol production from Sorghum bicolor using both separate and ...

African Journals Online (AJOL)

STORAGESEVER

2009-06-17

Jun 17, 2009 ... pre-treatment, enzymatic saccharification, detoxification of inhibitors and fermentation of Sorghum bicolor straw for ethanol production ..... The authors wish to acknowledge financial support from ... Official energy statistics from.
Evaluation of Sorghum bicolor leaf base extract for gastrointestinal ...

African Journals Online (AJOL)

PRECIOUS

2009-11-02

Nov 2, 2009 ... Key words: Sorghum bicolor, gastrointestinal, motility, diarrhoea, jejunum, ileum, fundus. INTRODUCTION ..... the propulsive movement of charcoal meal through the .... A delay in gastric emptying will prevent speedy evacua-.
Physiological and Biochemical Responses of a Medicinal Halophyte Limonium Bicolor (Bag.) Kuntze to Salt-Stress

International Nuclear Information System (INIS)

Wang, L.; Li, W.; Yang, H.; Wu, W.; Ma, L.; Huang, T.; Wang, X.

2016-01-01

Limonium bicolor (Bag.) Kuntze is a perennial herb belonging to the Plumbaginaceae family. It is a typical recretohalophyte as well as a medicinal plant, distributed on saline soils in coastal areas and grasslands. In this paper, the physiological mechanisms by which L. bicolor defends against salt stress and the effects of salinity on medicinal ingredients were investigated. The effects of different NaCl concentrations on the number of salt glands, Na+ content, dry weight and water content in tissues, gas exchange parameters (net CO2 assimilation rate, stomatal conductance, intercellular CO2 concentration and transpiration rate), malondialdehyde content and electrolyte leakage, activities of superoxide dismutase, peroxidase and catalase, and accumulation of secondary metabolites such as total phenolic, total flavonoid, gallic acid and myricetrin in leaves were determined. The results show that 100 and 200 mM NaCl had facilitating effects in L. bicolor, reflected in increases in dry weight, tissue water content, net CO2 assimilation rate, the number of salt glands, activity of superoxide dismutase, and content of gallic acid and myricetrin. The 300 mM NaCl treatment resulted in an obvious decline in gas exchange parameters and significant increases in Na+ levels, malondialdehyde level and electrolyte leakage. It was suggested that the increased salt tolerance of L. bicolor was due to the corresponding resistance mechanisms, involving an increased number of salt glands, enhanced activities of antioxidant enzymes, and an accelerated accumulation of secondary metabolites. Moreover, the results on the effects of salinity on medicinal ingredients in L. bicolor under different salt concentrations could provide a theoretical basis for standardized cultivation techniques for L. bicolor. (author)
JST Thesaurus Headwords and Synonyms: Sorghum bicolor [MeCab user dictionary for science technology term[Archive

Lifescience Database Archive (English)

Full Text Available MeCab user dictionary for science technology term Sorghum bicolor 名詞一般 * * * * モロコ...シモロコシモロコシ Thesaurus2015 200906063836088318 C LS06/LS72 UNKNOWN_2 Sorghum bicolor
The influence of time and severity of Striga infection on the Sorghum bicolor - Striga hermonthica association

NARCIS (Netherlands)

Ast, van A.

2006-01-01

Keywords: Striga hermonthica, Sorghum bicolor, infection time, infection level, tolerance. This thesis presents the results of a study on the interaction between the parasitic weed Striga hermonthica (Del.) Benth. and sorghum (Sorghum bicolor [L.] Moench). The main objective of the study was
MetReS, an Efficient Database for Genomic Applications.

Science.gov (United States)

Vilaplana, Jordi; Alves, Rui; Solsona, Francesc; Mateo, Jordi; Teixidó, Ivan; Pifarré, Marc

2018-02-01

MetReS (Metabolic Reconstruction Server) is a genomic database that is shared between two software applications that address important biological problems. Biblio-MetReS is a data-mining tool that enables the reconstruction of molecular networks based on automated text-mining analysis of published scientific literature. Homol-MetReS allows functional (re)annotation of proteomes, to properly identify both the individual proteins involved in the processes of interest and their function. The main goal of this work was to identify the areas where the performance of the MetReS database could be improved and to test whether this improvement would scale to larger datasets and more complex types of analysis. The study started with a relational database, MySQL, which is the current database server used by the applications. We also tested the performance of an alternative data-handling framework, Apache Hadoop, which is currently used for large-scale data processing. We found that this data-handling framework is likely to greatly improve the efficiency of the MetReS applications as the dataset and the processing needs increase by several orders of magnitude, as is expected to happen in the near future.
MaizeGDB: The Maize Genetics and Genomics Database.

Science.gov (United States)

Harper, Lisa; Gardiner, Jack; Andorf, Carson; Lawrence, Carolyn J

2016-01-01

MaizeGDB is the community database for biological information about the crop plant Zea mays. Genomic, genetic, sequence, gene product, functional characterization, literature reference, and person/organization contact information are among the datatypes stored at MaizeGDB. At the project's website ( http://www.maizegdb.org ) are custom interfaces enabling researchers to browse data and to seek out specific information matching explicit search criteria. In addition, pre-compiled reports are made available for particular types of data and bulletin boards are provided to facilitate communication and coordination among members of the community of maize geneticists.

Systematic discovery of unannotated genes in 11 yeast species using a database of orthologous genomic segments

LENUS (Irish Health Repository)

OhEigeartaigh, Sean S

2011-07-26

Abstract Background In standard BLAST searches, no information other than the sequences of the query and the database entries is considered. However, in situations where two genes from different species have only borderline similarity in a BLAST search, the discovery that the genes are located within a region of conserved gene order (synteny) can provide additional evidence that they are orthologs. Thus, for interpreting borderline search results, it would be useful to know whether the syntenic context of a database hit is similar to that of the query. This principle has often been used in investigations of particular genes or genomic regions, but to our knowledge it has never been implemented systematically. Results We made use of the synteny information contained in the Yeast Gene Order Browser database for 11 yeast species to carry out a systematic search for protein-coding genes that were overlooked in the original annotations of one or more yeast genomes but which are syntenic with their orthologs. Such genes tend to have been overlooked because they are short, highly divergent, or contain introns. The key features of our software - called SearchDOGS - are that the database entries are classified into sets of genomic segments that are already known to be orthologous, and that very weak BLAST hits are retained for further analysis if their genomic location is similar to that of the query. Using SearchDOGS we identified 595 additional protein-coding genes among the 11 yeast species, including two new genes in Saccharomyces cerevisiae. We found additional genes for the mating pheromone a-factor in six species including Kluyveromyces lactis. Conclusions SearchDOGS has proven highly successful for identifying overlooked genes in the yeast genomes. We anticipate that our approach can be adapted for study of further groups of species, such as bacterial genomes. More generally, the concept of doing sequence similarity searches against databases to which external
PReMod: a database of genome-wide mammalian cis-regulatory module predictions.

Science.gov (United States)

Ferretti, Vincent; Poitras, Christian; Bergeron, Dominique; Coulombe, Benoit; Robert, François; Blanchette, Mathieu

2007-01-01

We describe PReMod, a new database of genome-wide cis-regulatory module (CRM) predictions for both the human and the mouse genomes. The prediction algorithm, described previously in Blanchette et al. (2006) Genome Res., 16, 656-668, exploits the fact that many known CRMs are made of clusters of phylogenetically conserved and repeated transcription factor (TF) binding sites. Contrary to other existing databases, PReMod is not restricted to modules located proximal to genes, but in fact mostly contains distal predicted CRMs (pCRMs). Through its web interface, PReMod allows users to (i) identify pCRMs around a gene of interest; (ii) identify pCRMs that have binding sites for a given TF (or a set of TFs) or (iii) download the entire dataset for local analyses. Queries can also be refined by filtering for specific chromosomal regions, for specific regions relative to genes or for the presence of CpG islands. The output includes information about the binding sites predicted within the selected pCRMs, and a graphical display of their distribution within the pCRMs. It also provides a visual depiction of the chromosomal context of the selected pCRMs in terms of neighboring pCRMs and genes, all of which are linked to the UCSC Genome Browser and the NCBI. PReMod: http://genomequebec.mcgill.ca/PReMod.
Database Description - RMOS | LSDB Archive [Life Science Database Archive metadata

Lifescience Database Archive (English)

Full Text Available base Description General information of database Database name RMOS Alternative nam...arch Unit Shoshi Kikuchi E-mail : Database classification Plant databases - Rice Microarray Data and other Gene Expression Database...s Organism Taxonomy Name: Oryza sativa Taxonomy ID: 4530 Database description The Ric...19&lang=en Whole data download - Referenced database Rice Expression Database (RED) Rice full-length cDNA Database... (KOME) Rice Genome Integrated Map Database (INE) Rice Mutant Panel Database (Tos17) Rice Genome Annotation Database
The phytophthora genome initiative database: informatics and analysis for distributed pathogenomic research.

Science.gov (United States)

Waugh, M; Hraber, P; Weller, J; Wu, Y; Chen, G; Inman, J; Kiphart, D; Sobral, B

2000-01-01

The Phytophthora Genome Initiative (PGI) is a distributed collaboration to study the genome and evolution of a particularly destructive group of plant pathogenic oomycetes, with the goal of understanding the mechanisms of infection and resistance. NCGR provides informatics support for the collaboration as well as a centralized data repository. In the pilot phase of the project, several investigators prepared Phytophthora infestans and Phytophthora sojae EST and Phytophthora sojae BAC libraries and sent them to another laboratory for sequencing. Data from sequencing reactions were transferred to NCGR for analysis and curation. An analysis pipeline transforms raw data by performing simple analyses (i.e., vector removal and similarity searching) whose results are stored and can be retrieved by investigators using a web browser. Here we describe the database and access tools, provide an overview of the data therein and outline future plans. This resource has provided a unique opportunity for the distributed, collaborative study of a genus from which relatively little sequence data are available. Results may lead to insight into how better to control these pathogens. The homepage of PGI can be accessed at http://www.ncgr.org/pgi, with database access through the database access hyperlink.
HpBase: A genome database of a sea urchin, Hemicentrotus pulcherrimus.

Science.gov (United States)

Kinjo, Sonoko; Kiyomoto, Masato; Yamamoto, Takashi; Ikeo, Kazuho; Yaguchi, Shunsuke

2018-04-01

To understand the mystery of life, it is important to accumulate genomic information for various organisms because the whole genome encodes the commands for all the genes. Since the genome of Strongylocentrotus purpuratus was sequenced in 2006 as the first sequenced genome in echinoderms, the genomic resources of other North American sea urchins have gradually been accumulated, but no sea urchin genomes are available in other areas, where many scientists have used the local species and reported important results. In this manuscript, we report a draft genome of the sea urchin Hemicentrotus pulcherrimus because this species has a long history as the target of developmental and cell biology in East Asia. The genome of H. pulcherrimus was assembled into 16,251 scaffold sequences with an N50 length of 143 kbp, and approximately 25,000 genes were identified in the genome. The size of the genome and the sequencing coverage were estimated to be approximately 800 Mbp and 100×, respectively. To provide these data and annotation information, we constructed a database, HpBase (http://cell-innovation.nig.ac.jp/Hpul/). In HpBase, gene searches, genome browsing, and BLAST searches are available. In addition, HpBase includes the "recipes" for experiments from each lab using H. pulcherrimus. These recipes will continue to be updated according to the circumstances of individual scientists and can be powerful tools for experimental biologists and for the community. HpBase is a suitable dataset for evolutionary, developmental, and cell biologists to compare H. pulcherrimus genomic information with that of other species and to isolate gene information. © 2018 Japanese Society of Developmental Biologists.
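The N50 length quoted in the HpBase abstract above is a standard assembly contiguity statistic: the length L such that scaffolds of length L or longer together contain at least half of the assembled bases. A minimal sketch of that calculation follows. The function name and the toy scaffold lengths are illustrative assumptions and are not taken from the HpBase assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length of the shortest sequence in the smallest set of
    longest sequences that together cover at least half of the total.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence downwards, accumulating bases.
    for length in sorted(lengths, reverse=True):
        running += length
        # Compare doubled running sum to avoid fractional halves.
        if 2 * running >= total:
            return length


# Hypothetical scaffold lengths (arbitrary units), for illustration only.
toy_scaffolds = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_scaffolds))
```

Because the statistic depends only on the sorted length distribution, two assemblies with the same N50 can still differ greatly in scaffold count, which is why the abstract reports both figures.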
Transgenic sorghum ( Sorghum bicolor L. Moench) developed by ...

African Journals Online (AJOL)

Sorghum (Sorghum bicolor (L.) Moench) is an important food and fodder crop. Fungal diseases such as anthracnose caused by Colletotrichum sublineolum reduce sorghum yields. Genetic transformation can be used to confer tolerance to plant diseases such as anthracnose. The tolerance can be developed by introducing ...
The Perennial Ryegrass GenomeZipper: Targeted Use of Genome Resources for Comparative Grass Genomics

Science.gov (United States)

Pfeifer, Matthias; Martis, Mihaela; Asp, Torben; Mayer, Klaus F.X.; Lübberstedt, Thomas; Byrne, Stephen; Frei, Ursula; Studer, Bruno

2013-01-01

Whole-genome sequences established for model and major crop species constitute a key resource for advanced genomic research. For outbreeding forage and turf grass species like ryegrasses (Lolium spp.), such resources have yet to be developed. Here, we present a model of the perennial ryegrass (Lolium perenne) genome on the basis of conserved synteny to barley (Hordeum vulgare) and the model grass genome Brachypodium (Brachypodium distachyon) as well as rice (Oryza sativa) and sorghum (Sorghum bicolor). A transcriptome-based genetic linkage map of perennial ryegrass served as a scaffold to establish the chromosomal arrangement of syntenic genes from model grass species. This scaffold revealed a high degree of synteny and macrocollinearity and was then utilized to anchor a collection of perennial ryegrass genes in silico to their predicted genome positions. This resulted in the unambiguous assignment of 3,315 out of 8,876 previously unmapped genes to the respective chromosomes. In total, the GenomeZipper incorporates 4,035 conserved grass gene loci, which were used for the first genome-wide sequence divergence analysis between perennial ryegrass, barley, Brachypodium, rice, and sorghum. The perennial ryegrass GenomeZipper is an ordered, information-rich genome scaffold, facilitating map-based cloning and genome assembly in perennial ryegrass and closely related Poaceae species. It also represents a milestone in describing synteny between perennial ryegrass and fully sequenced model grass genomes, thereby increasing our understanding of genome organization and evolution in the most important temperate forage and turf grass species. PMID:23184232
Ethylene and jasmonic acid act as negative modulators during mutualistic symbiosis between Laccaria bicolor and Populus roots.

Science.gov (United States)

Plett, Jonathan M; Khachane, Amit; Ouassou, Malika; Sundberg, Björn; Kohler, Annegret; Martin, Francis

2014-04-01

The plant hormones ethylene, jasmonic acid and salicylic acid have interconnecting roles during the response of plant tissues to mutualistic and pathogenic symbionts. We used morphological studies of transgenic- or hormone-treated Populus roots as well as whole-genome oligoarrays to examine how these hormones affect root colonization by the mutualistic ectomycorrhizal fungus Laccaria bicolor S238N. We found that genes regulated by ethylene, jasmonic acid and salicylic acid were regulated in the late stages of the interaction between L. bicolor and poplar. Both ethylene and jasmonic acid treatments were found to impede fungal colonization of roots, and this effect was correlated to an increase in the expression of certain transcription factors (e.g. ETHYLENE RESPONSE FACTOR1) and a decrease in the expression of genes associated with microbial perception and cell wall modification. Further, we found that ethylene and jasmonic acid showed extensive transcriptional cross-talk, cross-talk that was opposed by salicylic acid signaling. We conclude that ethylene and jasmonic acid pathways are induced late in the colonization of root tissues in order to limit fungal growth within roots. This induction is probably an adaptive response by the plant such that its growth and vigor are not compromised by the fungus. © 2013 The Authors New Phytologist © 2013 New Phytologist Trust.
Update History of This Database - PGDBj Registered plant list, Marker list, QTL list, Plant DB link & Genome analysis methods | LSDB Archive [Life Science Database Archive metadata

Lifescience Database Archive (English)

Full Text Available List Contact us PGDBj Registered plant list, Marker list, QTL list, Plant DB link & Genome analysis methods ...B link & Genome analysis methods English archive site is opened. 2012/08/08 PGDBj... Registered plant list, Marker list, QTL list, Plant DB link & Genome analysis methods is opened. About This...ate History of This Database - PGDBj Registered plant list, Marker list, QTL list, Plant DB link & Genome analysis methods | LSDB Archive ...
TcruziDB, an Integrated Database, and the WWW Information Server for the Trypanosoma cruzi Genome Project

Directory of Open Access Journals (Sweden)

Degrave Wim

1997-01-01

Data analysis, presentation and distribution are of utmost importance to a genome project. A public domain software package, ACeDB, has been chosen as the common basis for parasite genome databases, and a first release of TcruziDB, the Trypanosoma cruzi genome database, is available by ftp from ftp://iris.dbbm.fiocruz.br/pub/genomedb/TcruziDB, as are versions of the software for different operating systems (ftp://iris.dbbm.fiocruz.br/pub/unixsoft/). Moreover, data originating from the project are available from the WWW server at http://www.dbbm.fiocruz.br. It contains biological and parasitological data on CL Brener, its karyotype, all available T. cruzi sequences from Genbank, data on the EST-sequencing project and on available libraries, a T. cruzi codon table and a listing of activities and participating groups in the genome project, as well as meeting reports. T. cruzi discussion lists (tcruzi-l@iris.dbbm.fiocruz.br and tcgenics@iris.dbbm.fiocruz.br) are being maintained for communication and to promote collaboration in the genome project.
CBS Genome Atlas Database: a dynamic storage for bioinformatic results and sequence data

DEFF Research Database (Denmark)

Hallin, Peter Fischer; Ussery, David

2004-01-01

, these results counts to more than 220 pieces of information. The backbone of this solution consists of a program package written in Perl, which enables administrators to synchronize and update the database content. The MySQL database has been connected to the CBS web-server via PHP4, to present a dynamic web...... and frequent addition of new models are factors that require a dynamic database layout. Using basic tools like the GNU Make system, csh, Perl and MySQL, we have created a flexible database environment for storing and maintaining such results for a collection of complete microbial genomes. Currently...... content for users outside the center. This solution is tightly fitted to existing server infrastructure and the solutions proposed here can perhaps serve as a template for other research groups to solve database issues....
MBGD update 2013: the microbial genome database for exploring the diversity of microbial world.

Science.gov (United States)

Uchiyama, Ikuo; Mihara, Motohiro; Nishide, Hiroyo; Chiba, Hirokazu

2013-01-01

The microbial genome database for comparative analysis (MBGD, available at http://mbgd.genome.ad.jp/) is a platform for microbial genome comparison based on orthology analysis. As its unique feature, MBGD allows users to conduct orthology analysis among any specified set of organisms; this flexibility allows MBGD to adapt to a variety of microbial genomic studies. Reflecting the huge diversity of the microbial world, the number of microbial genome projects has now reached several thousand. To efficiently explore the diversity of the entire body of microbial genomic data, MBGD now provides summary pages for pre-calculated ortholog tables among various taxonomic groups. For some closely related taxa, MBGD also provides the conserved synteny information (core genome alignment) pre-calculated using the CoreAligner program. In addition, an efficient incremental updating procedure can create an extended ortholog table by adding genomes to the default ortholog table generated from the representative set of genomes. Combined with the dynamic orthology calculation for any specified set of organisms, MBGD is an efficient and flexible tool for exploring microbial genome diversity.
LDSplitDB: a database for studies of meiotic recombination hotspots in MHC using human genomic data.

Science.gov (United States)

Guo, Jing; Chen, Hao; Yang, Peng; Lee, Yew Ti; Wu, Min; Przytycka, Teresa M; Kwoh, Chee Keong; Zheng, Jie

2018-04-20

Meiotic recombination happens during the process of meiosis when chromosomes inherited from two parents exchange genetic materials to generate chromosomes in the gamete cells. The recombination events tend to occur in narrow genomic regions called recombination hotspots. Its dysregulation could lead to serious human diseases such as birth defects. Although the regulatory mechanism of recombination events is still unclear, DNA sequence polymorphisms have been found to play crucial roles in the regulation of recombination hotspots. To facilitate the studies of the underlying mechanism, we developed a database named LDSplitDB which provides an integrative and interactive data mining and visualization platform for the genome-wide association studies of recombination hotspots. It contains the pre-computed association maps of the major histocompatibility complex (MHC) region in the 1000 Genomes Project and the HapMap Phase III datasets, and a genome-scale study of the European population from the HapMap Phase II dataset. Besides the recombination profiles, related data of genes, SNPs and different types of epigenetic modifications, which could be associated with meiotic recombination, are provided for comprehensive analysis. To meet the computational requirement of the rapidly increasing population genomics data, we prepared a lookup table of 400 haplotypes for recombination rate estimation using the well-known LDhat algorithm which includes all possible two-locus haplotype configurations. To the best of our knowledge, LDSplitDB is the first large-scale database for the association analysis of human recombination hotspots with DNA sequence polymorphisms. It provides valuable resources for the discovery of the mechanism of meiotic recombination hotspots. The information about MHC in this database could help understand the roles of recombination in human immune system. DATABASE URL: http://histone.scse.ntu.edu.sg/LDSplitDB.
Evaluation of relational and NoSQL database architectures to manage genomic annotations.

Science.gov (United States)

Schulz, Wade L; Nelson, Brent G; Felker, Donn K; Durant, Thomas J S; Torres, Richard

2016-12-01

While the adoption of next generation sequencing has rapidly expanded, the informatics infrastructure used to manage the data generated by this technology has not kept pace. Historically, relational databases have provided much of the framework for data storage and retrieval. Newer technologies based on NoSQL architectures may provide significant advantages in storage and query efficiency, thereby reducing the cost of data management. But their relative advantage when applied to biomedical data sets, such as genetic data, has not been characterized. To this end, we compared the storage, indexing, and query efficiency of a common relational database (MySQL), a document-oriented NoSQL database (MongoDB), and a relational database with NoSQL support (PostgreSQL). When used to store genomic annotations from the dbSNP database, we found the NoSQL architectures to outperform traditional, relational models for speed of data storage, indexing, and query retrieval in nearly every operation. These findings strongly support the use of novel database technologies to improve the efficiency of data management within the biological sciences. Copyright © 2016 Elsevier Inc. All rights reserved.
PGG.Population: a database for understanding the genomic diversity and genetic ancestry of human populations.

Science.gov (United States)

Zhang, Chao; Gao, Yang; Liu, Jiaojiao; Xue, Zhe; Lu, Yan; Deng, Lian; Tian, Lei; Feng, Qidi; Xu, Shuhua

2018-01-04

There are a growing number of studies focusing on delineating genetic variations that are associated with complex human traits and diseases due to recent advances in next-generation sequencing technologies. However, identifying and prioritizing disease-associated causal variants relies on understanding the distribution of genetic variations within and among populations. The PGG.Population database documents 7122 genomes representing 356 global populations from 107 countries and provides essential information for researchers to understand human genomic diversity and genetic ancestry. These data and information can facilitate the design of research studies and the interpretation of results of both evolutionary and medical studies involving human populations. The database is carefully maintained and constantly updated when new data are available. We included miscellaneous functions and a user-friendly graphical interface for visualization of genomic diversity, population relationships (genetic affinity), ancestral makeup, footprints of natural selection, and population history etc. Moreover, PGG.Population provides a useful feature for users to analyze data and visualize results in a dynamic style via online illustration. The long-term ambition of the PGG.Population, together with the joint efforts from other researchers who contribute their data to our database, is to create a comprehensive depository of geographic and ethnic variation of human genome, as well as a platform bringing influence on future practitioners of medicine and clinical investigators. PGG.Population is available at https://www.pggpopulation.org. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
Cpf1-Database: web-based genome-wide guide RNA library design for gene knockout screens using CRISPR-Cpf1.

Science.gov (United States)

Park, Jeongbin; Bae, Sangsu

2018-03-15

Following the type II CRISPR-Cas9 system, type V CRISPR-Cpf1 endonucleases have been found to be applicable for genome editing in various organisms in vivo. However, there are as yet no web-based tools capable of optimally selecting guide RNAs (gRNAs) among all possible genome-wide target sites. Here, we present Cpf1-Database, a genome-wide gRNA library design tool for LbCpf1 and AsCpf1, which have DNA recognition sequences of 5'-TTTN-3' at the 5' ends of target sites. Cpf1-Database provides a sophisticated but simple way to design gRNAs for AsCpf1 nucleases on the genome scale. One can easily access the data using a straightforward web interface, and using the powerful collections feature one can easily design gRNAs for thousands of genes in a short time. Free access at http://www.rgenome.net/cpf1-database/. Contact: sangsubae@hanyang.ac.kr.
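The Cpf1-Database abstract above states that LbCpf1 and AsCpf1 recognize a 5'-TTTN-3' sequence at the 5' end of the target site. The sketch below shows how candidate sites of that form could be enumerated on the forward strand of a sequence. It is not the Cpf1-Database implementation: the function name, the default protospacer length of 23 nt, and the demo sequence are assumptions made for illustration, and no scoring or reverse-strand search is attempted.

```python
import re


def find_cpf1_sites(seq, protospacer_len=23):
    """List forward-strand candidates of the form TTTN + protospacer.

    Returns (start, pam, protospacer) tuples, where start is the
    0-based position of the first T of the recognition sequence.
    protospacer_len is an assumed, adjustable parameter.
    """
    seq = seq.upper()
    # A zero-width lookahead lets overlapping candidates all be reported.
    pattern = re.compile(r"(?=(TTT[ACGT])([ACGT]{%d}))" % protospacer_len)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]


# Hypothetical demo sequence: one TTTA followed by 24 nt of ACGT repeats.
demo = "GG" + "TTTA" + "ACGT" * 6
for start, pam, protospacer in find_cpf1_sites(demo, protospacer_len=20):
    print(start, pam, protospacer)
```

A genome-scale tool would also need to scan the reverse complement and rank candidates, for example by predicted off-target sites; those steps are outside what the abstract describes and are omitted here.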
Sorghum [Sorghum bicolor (L.) Moench] Seed Quality as Affected by ...

African Journals Online (AJOL)

... bicolor (L.) Moench seeds subjected to different field cultural management practices. ... Germination and vigour tests indicated that seed selection time did not ... In relation to this, variety E1291 showed better seed vigour, viability and yield ...
The Genomes On Line Database (GOLD) in 2009: status of genomic and metagenomic projects and their associated metadata

Science.gov (United States)

Liolios, Konstantinos; Chen, I-Min A.; Mavromatis, Konstantinos; Tavernarakis, Nektarios; Hugenholtz, Philip; Markowitz, Victor M.; Kyrpides, Nikos C.

2010-01-01

The Genomes On Line Database (GOLD) is a comprehensive resource for centralized monitoring of genome and metagenome projects worldwide. Both complete and ongoing projects, along with their associated metadata, can be accessed in GOLD through precomputed tables and a search page. As of September 2009, GOLD contains information for more than 5800 sequencing projects, of which 1100 have been completed and their sequence data deposited in a public repository. GOLD continues to expand, moving toward the goal of providing the most comprehensive repository of metadata information related to the projects and their organisms/environments in accordance with the Minimum Information about a (Meta)Genome Sequence (MIGS/MIMS) specification. GOLD is available at: http://www.genomesonline.org and has a mirror site at the Institute of Molecular Biology and Biotechnology, Crete, Greece, at: http://gold.imbb.forth.gr/ PMID:19914934
A database of PCR primers for the chloroplast genomes of higher plants

Science.gov (United States)

Heinze, Berthold

2007-01-01

Background Chloroplast genomes evolve slowly and many primers for PCR amplification and analysis of chloroplast sequences can be used across a wide array of genera. In some cases 'universal' primers have been designed for the purpose of working across species boundaries. However, the essential information on these primer sequences is scattered throughout the literature. Results A database is presented here which assembles published primer information for chloroplast DNA. Additional primers were designed to fill gaps where little or no primer information could be found. Amplicons are either the genes themselves (typically useful in studies of sequence variation in higher-order phylogeny) or they are spacers, introns, and intergenic regions (for studies of phylogeographic patterns within and among species). The current list of 'generic' primers consists of more than 700 sequences. Wherever possible, we give the locations of the primers in the thirteen fully sequenced chloroplast genomes (Nicotiana tabacum, Atropa belladonna, Spinacia oleracea, Arabidopsis thaliana, Populus trichocarpa, Oryza sativa, Pinus thunbergii, Marchantia polymorpha, Zea mays, Oenothera elata, Acorus calamus, Eucalyptus globulus, Medicago truncatula). Conclusion The database described here is designed to serve as a resource for researchers who are venturing into the study of poorly described chloroplast genomes, whether for large- or small-scale DNA sequencing projects, to study molecular variation or to investigate chloroplast evolution. PMID:17326828
A database of PCR primers for the chloroplast genomes of higher plants

Directory of Open Access Journals (Sweden)

Heinze Berthold

2007-02-01

Abstract Background Chloroplast genomes evolve slowly and many primers for PCR amplification and analysis of chloroplast sequences can be used across a wide array of genera. In some cases 'universal' primers have been designed for the purpose of working across species boundaries. However, the essential information on these primer sequences is scattered throughout the literature. Results A database is presented here which assembles published primer information for chloroplast DNA. Additional primers were designed to fill gaps where little or no primer information could be found. Amplicons are either the genes themselves (typically useful in studies of sequence variation in higher-order phylogeny) or they are spacers, introns, and intergenic regions (for studies of phylogeographic patterns within and among species). The current list of 'generic' primers consists of more than 700 sequences. Wherever possible, we give the locations of the primers in the thirteen fully sequenced chloroplast genomes (Nicotiana tabacum, Atropa belladonna, Spinacia oleracea, Arabidopsis thaliana, Populus trichocarpa, Oryza sativa, Pinus thunbergii, Marchantia polymorpha, Zea mays, Oenothera elata, Acorus calamus, Eucalyptus globulus, Medicago truncatula). Conclusion The database described here is designed to serve as a resource for researchers who are venturing into the study of poorly described chloroplast genomes, whether for large- or small-scale DNA sequencing projects, to study molecular variation or to investigate chloroplast evolution.

In planta transformation of sorghum (Sorghum bicolor (L.) Moench)

Indian Academy of Sciences (India)

An in planta transformation protocol for sorghum (Sorghum bicolor (L.) Moench) using shoot apical meristem of germinating seedlings is reported in this study. Agrobacterium tumefaciens strain, LBA4404 with pCAMBIA1303 vector and construct pCAMBIA1303TPS1 were individually used for transformation. Since, the ...
A database of phylogenetically atypical genes in archaeal and bacterial genomes, identified using the DarkHorse algorithm

Directory of Open Access Journals (Sweden)

Allen Eric E

2008-10-01

Abstract Background The process of horizontal gene transfer (HGT) is believed to be widespread in Bacteria and Archaea, but little comparative data is available addressing its occurrence in complete microbial genomes. Collection of high-quality, automated HGT prediction data based on phylogenetic evidence has previously been impractical for large numbers of genomes at once, due to prohibitive computational demands. DarkHorse, a recently described statistical method for discovering phylogenetically atypical genes on a genome-wide basis, provides a means to solve this problem through lineage probability index (LPI) ranking scores. LPI scores inversely reflect phylogenetic distance between a test amino acid sequence and its closest available database matches. Proteins with low LPI scores are good horizontal gene transfer candidates; those with high scores are not. Description The DarkHorse algorithm has been applied to 955 microbial genome sequences, and the results organized into a web-searchable relational database, called the DarkHorse HGT Candidate Resource (http://darkhorse.ucsd.edu). Users can select individual genomes or groups of genomes to screen by LPI score, search for protein functions by descriptive annotation or amino acid sequence similarity, or select proteins with unusual G+C composition in their underlying coding sequences. The search engine reports LPI scores for match partners as well as query sequences, providing the opportunity to explore whether potential HGT donor sequences are phylogenetically typical or atypical within their own genomes. This information can be used to predict whether or not sufficient information is available to build a well-supported phylogenetic tree using the potential donor sequence. Conclusion The DarkHorse HGT Candidate database provides a powerful, flexible set of tools for identifying phylogenetically atypical proteins, allowing researchers to explore both individual HGT events in single genomes, and
SolCyc: a database hub at the Sol Genomics Network (SGN) for the manual curation of metabolic networks in Solanum and Nicotiana specific databases

Science.gov (United States)

Foerster, Hartmut; Bombarely, Aureliano; Battey, James N D; Sierro, Nicolas; Ivanov, Nikolai V; Mueller, Lukas A

2018-01-01

Abstract SolCyc is the entry portal to pathway/genome databases (PGDBs) for major species of the Solanaceae family hosted at the Sol Genomics Network. Currently, SolCyc comprises six organism-specific PGDBs for tomato, potato, pepper, petunia, tobacco and one Rubiaceae, coffee. The metabolic networks of those PGDBs have been computationally predicted by the PathoLogic component of the Pathway Tools software using the manually curated multi-domain database MetaCyc (http://www.metacyc.org/) as reference. SolCyc has been recently extended by taxon-specific databases, i.e. the family-specific SolanaCyc database, containing only curated data pertinent to species of the nightshade family, and NicotianaCyc, a genus-specific database that stores all relevant metabolic data of the Nicotiana genus. Through manual curation of the published literature, new metabolic pathways have been created in those databases, which are complemented by the continuously updated, relevant species-specific pathways from MetaCyc. At present, SolanaCyc comprises 199 pathways and 29 superpathways and NicotianaCyc accounts for 72 pathways and 13 superpathways. Curator-maintained, taxon-specific databases such as SolanaCyc and NicotianaCyc are characterized by an enrichment of data specific to these taxa and are free of falsely predicted pathways. Both databases have been used to update recently created Nicotiana-specific databases for Nicotiana tabacum, Nicotiana benthamiana, Nicotiana sylvestris and Nicotiana tomentosiformis by propagating verifiable data into those PGDBs. In addition, in-depth curation of the pathways in N. tabacum has been carried out which resulted in the elimination of 156 pathways from the 569 pathways predicted by Pathway Tools. Together, in-depth curation of the predicted pathway network and the supplementation with curated data from taxon-specific databases has substantially improved the curation status of the species-specific N. tabacum PGDB. The implementation of this
Prediction of social structure and genetic relatedness in colonies of the facultative polygynous stingless bee Melipona bicolor (Hymenoptera, Apidae).

Science.gov (United States)

Dos Reis, Evelyze Pinheiro; de Oliveira Campos, Lucio Antonio; Tavares, Mara Garcia

2011-04-01

Stingless bee colonies typically consist of one single-mated mother queen and her worker offspring. The stingless bee Melipona bicolor (Hymenoptera: Apidae) shows facultative polygyny, which makes this species particularly suitable for testing theoretical expectations concerning social behavior. In this study, we investigated the social structure and genetic relatedness among workers from eight natural and six manipulated colonies of M. bicolor over a period of one year. The populations of M. bicolor contained monogynous and polygynous colonies. The estimated genetic relatedness among workers from monogynous and polygynous colonies was 0.75 ± 0.12 and 0.53 ± 0.16 (mean ± SEM), respectively. Although the parental genotypes had significant effects on genetic relatedness in monogynous and polygynous colonies, polygyny markedly decreased the relatedness among nestmate workers. Our findings also demonstrate that polygyny in M. bicolor may arise from the adoption of related or unrelated queens.
Prediction of social structure and genetic relatedness in colonies of the facultative polygynous stingless bee Melipona bicolor (Hymenoptera, Apidae)

Directory of Open Access Journals (Sweden)

Evelyze Pinheiro dos Reis

2011-01-01

Stingless bee colonies typically consist of one single-mated mother queen and her worker offspring. The stingless bee Melipona bicolor (Hymenoptera: Apidae) shows facultative polygyny, which makes this species particularly suitable for testing theoretical expectations concerning social behavior. In this study, we investigated the social structure and genetic relatedness among workers from eight natural and six manipulated colonies of M. bicolor over a period of one year. The populations of M. bicolor contained monogynous and polygynous colonies. The estimated genetic relatedness among workers from monogynous and polygynous colonies was 0.75 ± 0.12 and 0.53 ± 0.16 (mean ± SEM), respectively. Although the parental genotypes had significant effects on genetic relatedness in monogynous and polygynous colonies, polygyny markedly decreased the relatedness among nestmate workers. Our findings also demonstrate that polygyny in M. bicolor may arise from the adoption of related or unrelated queens.
OryzaGenome: Genome Diversity Database of Wild Oryza Species

KAUST Repository

Ohyanagi, Hajime

2015-11-18

The species in the genus Oryza, encompassing nine genome types and 23 species, are a rich genetic resource and may have applications in deeper genomic analyses aiming to understand the evolution of plant genomes. With the advancement of next-generation sequencing (NGS) technology, a flood of Oryza species reference genomes and genomic variation information has become available in recent years. This genomic information, combined with the comprehensive phenotypic information that we are accumulating in our Oryzabase, can serve as an excellent genotype-phenotype association resource for analyzing rice functional and structural evolution, and the associated diversity of the Oryza genus. Here we integrate our previous and future phenotypic/habitat information and newly determined genotype information into a united repository, named OryzaGenome, providing the variant information with hyperlinks to Oryzabase. The current version of OryzaGenome includes genotype information of 446 O. rufipogon accessions derived by imputation and of 17 accessions derived by imputation-free deep sequencing. Two variant viewers are implemented: SNP Viewer as a conventional genome browser interface and Variant Table as a text-based browser for precise inspection of each variant one by one. Portable VCF (variant call format) file or tab-delimited file download is also available. Following these SNP (single nucleotide polymorphism) data, reference pseudomolecules/scaffolds/contigs and genome-wide variation information for almost all of the closely and distantly related wild Oryza species from the NIG Wild Rice Collection will be available in future releases. All of the resources can be accessed through http://viewer.shigen.info/oryzagenome/.
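The VCF download mentioned in this record is plain tab-delimited text, so variants can be inspected with very little code. The following is a minimal sketch, assuming the standard VCF column order (CHROM, POS, ID, REF, ALT, ...); the function name `parse_vcf_snps` and the sample records are made up for illustration and are not taken from OryzaGenome.

```python
from typing import Iterable, List, Tuple


def parse_vcf_snps(lines: Iterable[str]) -> List[Tuple[str, int, str, str]]:
    """Return (chrom, pos, ref, alt) for every single-nucleotide variant."""
    snps = []
    for line in lines:
        line = line.rstrip("\n")
        # Skip blank lines, "##" meta lines and the "#CHROM" header line.
        if not line or line.startswith("#"):
            continue
        chrom, pos, _id, ref, alt = line.split("\t")[:5]
        # ALT may list several alleles separated by commas.
        for allele in alt.split(","):
            if len(ref) == 1 and len(allele) == 1 and allele in "ACGT":
                snps.append((chrom, int(pos), ref, allele))
    return snps


# Hypothetical records: one SNP, one deletion (skipped), one multi-allelic SNP.
EXAMPLE_VCF = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr01\t1203\t.\tA\tG\t50\tPASS\t.",
    "chr01\t2210\t.\tAT\tA\t47\tPASS\t.",
    "chr02\t88\t.\tC\tT,G\t60\tPASS\t.",
]

if __name__ == "__main__":
    for snp in parse_vcf_snps(EXAMPLE_VCF):
        print(snp)
```

For a real download the same function can be fed an open file handle instead of the in-memory list.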
KLT-type relations for QCD and bicolor amplitudes from color-factor symmetry

Science.gov (United States)

Brown, Robert W.; Naculich, Stephen G.

2018-03-01

Color-factor symmetry is used to derive a KLT-type relation for tree-level QCD amplitudes containing gluons and an arbitrary number of massive or massless quark-antiquark pairs, generalizing the expression for Yang-Mills amplitudes originally postulated by Bern, De Freitas, and Wong. An explicit expression is given for all amplitudes with two or fewer quark-antiquark pairs in terms of the (modified) momentum kernel. We also introduce the bicolor scalar theory, the "zeroth copy" of QCD, containing massless biadjoint scalars and massive bifundamental scalars, generalizing the biadjoint scalar theory of Cachazo, He, and Yuan. We derive KLT-type relations for tree-level amplitudes of biadjoint and bicolor theories using the color-factor symmetry possessed by these theories.
In vitro antimalarial activity of Calophyllum bicolor and hemozoin crystals observed by Transmission Electron Microscope (TEM)

OpenAIRE

Abbas Jamilah

2018-01-01

Objective: To continue our antimalarial drug candidate discovery program on Indonesian medicinal plants, focusing on the stem bark of Calophyllum bicolor. Methods: We prepared bioactive crude extracts from the stem bark of Calophyllum bicolor using hexane, acetone and methanol, and evaluated their antimalarial activity in vitro against the parasite Plasmodium falciparum. Results: The methanol fraction showed the most potent, dose-dependent antimalarial activity in in vitro experiments with ...
Visualizing information across multidimensional post-genomic structured and textual databases.

Science.gov (United States)

Tao, Ying; Friedman, Carol; Lussier, Yves A

2005-04-15

Visualizing relationships among biological information to facilitate understanding is crucial to biological research during the post-genomic era. Although different systems have been developed to view gene-phenotype relationships for specific databases, very few have been designed specifically as a general flexible tool for visualizing multidimensional genotypic and phenotypic information together. Our goal is to develop a method for visualizing multidimensional genotypic and phenotypic information and a model that unifies different biological databases in order to present the integrated knowledge using a uniform interface. We developed a novel, flexible and generalizable visualization tool, called PhenoGenesviewer (PGviewer), which in this paper was used to display gene-phenotype relationships from a human-curated database (OMIM) and from an automatic method using a Natural Language Processing tool called BioMedLEE. Data obtained from multiple databases were first integrated into a uniform structure and then organized by PGviewer. PGviewer provides a flexible query interface that allows dynamic selection and ordering of any desired dimension in the databases. Based on users' queries, results can be visualized using hierarchical expandable trees that present views specified by users according to their research interests. We believe that this method, which allows users to dynamically organize and visualize multiple dimensions, is a potentially powerful and promising tool that should substantially facilitate biological research. PhenoGenesviewer, together with its support and tutorial, is available at http://www.dbmi.columbia.edu/pgviewer/. Contact: Lussier@dbmi.columbia.edu.
The duplicated genes database: identification and functional annotation of co-localised duplicated genes across genomes.

Directory of Open Access Journals (Sweden)

Marion Ouedraogo

BACKGROUND: There has been a surge in studies linking genome structure and gene expression, with special focus on duplicated genes. Although initially duplicated from the same sequence, duplicated genes can diverge strongly over evolution and take on different functions or regulated expression. However, information on the function and expression of duplicated genes remains sparse. Identifying groups of duplicated genes in different genomes and characterizing their expression and function would therefore be of great interest to the research community. The 'Duplicated Genes Database' (DGD) was developed for this purpose. METHODOLOGY: Nine species were included in the DGD. For each species, BLAST analyses were conducted on peptide sequences corresponding to the genes mapped on the same chromosome. Groups of duplicated genes were defined based on these pairwise BLAST comparisons and the genomic location of the genes. For each group, Pearson correlations between gene expression data and semantic similarities between functional GO annotations were also computed when the relevant information was available. CONCLUSIONS: The Duplicated Genes Database provides a list of co-localised and duplicated genes for several species with the available gene co-expression level and semantic similarity value of functional annotation. Adding these data to the groups of duplicated genes provides biological information that can prove useful to gene expression analyses. The Duplicated Genes Database can be freely accessed through the DGD website at http://dgd.genouest.org.
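The methodology in this record (pairwise hits restricted to one chromosome, merged into groups, then scored by expression correlation) can be sketched roughly as follows. This is only an illustration under stated assumptions: the hit list stands in for already-filtered BLAST results, "co-localised" is reduced to "same chromosome", and the gene names, `group_duplicates` and `pearson` are hypothetical helpers, not the DGD pipeline, whose thresholds and distance criteria the abstract does not give.

```python
from math import sqrt
from typing import Dict, List, Sequence, Tuple


def group_duplicates(hits: Sequence[Tuple[str, str]],
                     chrom: Dict[str, str]) -> List[List[str]]:
    """Merge pairwise hits into groups, keeping only same-chromosome pairs."""
    parent: Dict[str, str] = {}

    def find(gene: str) -> str:
        # Union-find lookup with path halving.
        parent.setdefault(gene, gene)
        while parent[gene] != gene:
            parent[gene] = parent[parent[gene]]
            gene = parent[gene]
        return gene

    for a, b in hits:
        if chrom[a] == chrom[b]:  # assumption: co-localised = same chromosome
            parent[find(a)] = find(b)

    groups: Dict[str, List[str]] = {}
    for gene in list(parent):
        groups.setdefault(find(gene), []).append(gene)
    return sorted(sorted(members) for members in groups.values())


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equally long expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


if __name__ == "__main__":
    hits = [("g1", "g2"), ("g2", "g3"), ("g4", "g5"), ("g6", "g7")]
    chrom = {"g1": "1", "g2": "1", "g3": "1", "g4": "2",
             "g5": "3", "g6": "2", "g7": "2"}
    print(group_duplicates(hits, chrom))
    print(pearson([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
```

The pair g4/g5 is dropped because its members sit on different chromosomes, which is the only location rule this sketch applies.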
Flavonoids from the Leaves of Impatiens bicolor

OpenAIRE

TAHIR, Aurangzeb HASAN and Muhammad Nawaz

2005-01-01

Three new flavanone glycosides, naringenin 4'-O-β-D-glucuronopyranoside, naringenin 4'-O-α-L-rhamnopyranoside and naringenin 4'-O-β-D-xylopyranoside, were characterized from the leaves of Impatiens bicolor, together with 6 known glycosides: naringenin 4'-O-β-D-glucopyranoside, kaempferol 7-O-β-D-glucuronopyranoside, quercetin 3-O-β-D-glucopyranoside, kaempferol 5-O-β-D-xylopyranoside, kaempferol 3-O-β-D-galactopyranoside and kaempferol 7-O-β-D-xylopyranoside. The...
Flavonoids from the Leaves of Impatiens bicolor

OpenAIRE

TAHIR, Aurangzeb HASAN and Muhammad Nawaz; HASAN, Aurangzeb

2014-01-01

Three new flavanone glycosides, naringenin 4'-O-β-D-glucuronopyranoside, naringenin 4'-O-α-L-rhamnopyranoside and naringenin 4'-O-β-D-xylopyranoside, were characterized from the leaves of Impatiens bicolor, together with 6 known glycosides: naringenin 4'-O-β-D-glucopyranoside, kaempferol 7-O-β-D-glucuronopyranoside, quercetin 3-O-β-D-glucopyranoside, kaempferol 5-O-β-D-xylopyranoside, kaempferol 3-O-β-D-galactopyranoside and kaempferol 7-O-β-D-xylo...
An Integrated Molecular Database on Indian Insects.

Science.gov (United States)

Pratheepa, Maria; Venkatesan, Thiruvengadam; Gracy, Gandhi; Jalali, Sushil Kumar; Rangheswaran, Rajagopal; Antony, Jomin Cruz; Rai, Anil

2018-01-01

MOlecular Database on Indian Insects (MODII) is an online database linking several databases such as Insect Pest Info, Insect Barcode Information System (IBIn), Insect Whole Genome sequence, Other Genomic Resources of National Bureau of Agricultural Insect Resources (NBAIR), Whole Genome sequencing of Honey bee viruses, Insecticide resistance gene database and Genomic tools. This database was developed with a holistic approach to collecting phenomic and genomic information on agriculturally important insects. This insect resource database is available online for free at http://cib.res.in/.
Molecular Cloning and Expression Analysis of Cu/Zn SOD Gene from Gynura bicolor DC.

Directory of Open Access Journals (Sweden)

Xin Xu

2017-01-01

Superoxide dismutase is an important antioxidant enzyme widely present in eukaryotes, which scavenges reactive oxygen species (ROS) and plays an essential role in the stress tolerance of higher plants. A full-length cDNA encoding Cu/Zn SOD was cloned from leaves of Gynura bicolor DC. by rapid amplification of cDNA ends (RACE). The full-length cDNA of Cu/Zn SOD is 924 bp and has a 681 bp open reading frame encoding 227 amino acids. Bioinformatics analysis revealed that it belongs to the plant SOD superfamily. The Cu/Zn SODs of Helianthus annuus, Mikania micrantha, and Solidago canadensis var. scabra all have 86% similarity to the G. bicolor Cu/Zn SOD. Analysis of the expression of Cu/Zn SOD under different treatments revealed that Cu/Zn SOD is a stress-responsive gene, responding especially to 1-MCP. This indicates that the Cu/Zn SOD gene is likely to be important in resistance to stresses, and the results will be helpful in providing evidence for future research on the underlying molecular mechanism and in choosing proper postharvest treatments for G. bicolor.
Ultrastructure of the bamboo Guadua angustifolia var. bicolor (Poaceae: Bambusoideae), present in Costa Rica

Directory of Open Access Journals (Sweden)

Mayra Montiel

2006-06-01

The anatomical ultrastructure of the leaf blade, the leaf sheath and the culm bract of Guadua angustifolia var. bicolor was analyzed and characterized under a scanning electron microscope. Many similarities were observed with other Guadua species, particularly the presence of high dome stomata, of long cells with sinuous walls and of silica cells. Specific bicolor characteristics include (1) a different stomatal pattern in the adaxial zone of the leaf base (close to the sheath); (2) the abundance of hook-shaped trichomes without papillae; (3) the distinctive golden brown color of the bract that covers the culm (caused by papillar trichomes that cover the adaxial sheath); and (4) the size of the groups of auricular trichomes (formed by 12 trichomes). Rev. Biol. Trop. 54(Suppl. 2): 13-19. Epub 2006 Dec. 01.
Human Ageing Genomic Resources: Integrated databases and tools for the biology and genetics of ageing

Science.gov (United States)

Tacutu, Robi; Craig, Thomas; Budovsky, Arie; Wuttke, Daniel; Lehmann, Gilad; Taranukha, Dmitri; Costa, Joana; Fraifeld, Vadim E.; de Magalhães, João Pedro

2013-01-01

The Human Ageing Genomic Resources (HAGR, http://genomics.senescence.info) is a freely available online collection of research databases and tools for the biology and genetics of ageing. HAGR features now several databases with high-quality manually curated data: (i) GenAge, a database of genes associated with ageing in humans and model organisms; (ii) AnAge, an extensive collection of longevity records and complementary traits for >4000 vertebrate species; and (iii) GenDR, a newly incorporated database, containing both gene mutations that interfere with dietary restriction-mediated lifespan extension and consistent gene expression changes induced by dietary restriction. Since its creation about 10 years ago, major efforts have been undertaken to maintain the quality of data in HAGR, while further continuing to develop, improve and extend it. This article briefly describes the content of HAGR and details the major updates since its previous publications, in terms of both structure and content. The completely redesigned interface, more intuitive and more integrative of HAGR resources, is also presented. Altogether, we hope that through its improvements, the current version of HAGR will continue to provide users with the most comprehensive and accessible resources available today in the field of biogerontology. PMID:23193293
A DATABASE FOR TRACKING TOXICOGENOMIC SAMPLES AND PROCEDURES WITH GENOMIC, PROTEOMIC AND METABONOMIC COMPONENTS

Science.gov (United States)

A Database for Tracking Toxicogenomic Samples and Procedures with Genomic, Proteomic and Metabonomic Components Wenjun Bao1, Jennifer Fostel2, Michael D. Waters2, B. Alex Merrick2, Drew Ekman3, Mitchell Kostich4, Judith Schmid1, David Dix1Office of Research and Developmen...
PATtyFams: Protein families for the microbial genomes in the PATRIC database

Directory of Open Access Journals (Sweden)

James J Davis

2016-02-01

The ability to build accurate protein families is a fundamental operation in bioinformatics that influences comparative analyses, genome annotation and metabolic modeling. For several years we have been maintaining protein families for all microbial genomes in the PATRIC database (Pathosystems Resource Integration Center, patricbrc.org) in order to drive many of the comparative analysis tools that are available through the PATRIC website. However, due to the burgeoning number of genomes, traditional approaches for generating protein families are becoming prohibitive. In this report, we describe a new approach for generating protein families, which we call PATtyFams. This method uses the k-mer-based function assignments available through RAST (Rapid Annotation using Subsystem Technology) to rapidly guide family formation, and then differentiates the function-based groups into families using a Markov Cluster algorithm (MCL). This new approach for generating protein families is rapid, scalable and has properties that are consistent with alignment-based methods.
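The Markov Cluster step named in this record alternates "expansion" (squaring a column-stochastic matrix) with "inflation" (raising entries to a power and renormalising). The sketch below shows that loop in plain Python on a toy similarity matrix. It is a simplified illustration, not the PATtyFams implementation: the inflation value, iteration count, cutoff and the function name `mcl` are assumptions, and the function-based pre-grouping via RAST is not modelled.

```python
from typing import List

Matrix = List[List[float]]


def normalize_columns(m: Matrix) -> Matrix:
    """Scale every column so that it sums to 1 (column-stochastic matrix)."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] if sums[j] else 0.0 for j in range(n)]
            for i in range(n)]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mcl(similarity: Matrix, inflation: float = 2.0,
        iterations: int = 50) -> List[List[int]]:
    """Cluster node indices of a symmetric similarity matrix with basic MCL."""
    n = len(similarity)
    # Add self-loops, then turn similarities into transition probabilities.
    m = normalize_columns(
        [[similarity[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)])
    for _ in range(iterations):
        m = matmul(m, m)                                   # expansion
        m = normalize_columns(
            [[v ** inflation for v in row] for row in m])  # inflation
    # Each row that keeps non-negligible mass defines one cluster of columns.
    clusters = {frozenset(j for j, v in enumerate(row) if v > 1e-6)
                for row in m}
    return sorted(sorted(c) for c in clusters if c)


if __name__ == "__main__":
    # Two groups of three mutually similar proteins with no cross-similarity.
    sim = [[0, 1, 1, 0, 0, 0],
           [1, 0, 1, 0, 0, 0],
           [1, 1, 0, 0, 0, 0],
           [0, 0, 0, 0, 1, 1],
           [0, 0, 0, 1, 0, 1],
           [0, 0, 0, 1, 1, 0]]
    print(mcl([[float(v) for v in row] for row in sim]))
```

Higher inflation values generally yield finer clusters; production tools operate on sparse matrices rather than the dense lists used here.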
Construction of an ortholog database using the semantic web technology for integrative analysis of genomic data.

Science.gov (United States)

Chiba, Hirokazu; Nishide, Hiroyo; Uchiyama, Ikuo

2015-01-01

Recently, various types of biological data, including genomic sequences, have been rapidly accumulating. To discover biological knowledge from such growing heterogeneous data, a flexible framework for data integration is necessary. Ortholog information is a central resource for interlinking corresponding genes among different organisms, and the Semantic Web provides a key technology for the flexible integration of heterogeneous data. We have constructed an ortholog database using the Semantic Web technology, aiming at the integration of numerous genomic data and various types of biological information. To formalize the structure of the ortholog information in the Semantic Web, we have constructed the Ortholog Ontology (OrthO). While the OrthO is a compact ontology for general use, it is designed to be extended to the description of database-specific concepts. On the basis of OrthO, we described the ortholog information from our Microbial Genome Database for Comparative Analysis (MBGD) in the form of Resource Description Framework (RDF) and made it available through the SPARQL endpoint, which accepts arbitrary queries specified by users. In this framework based on the OrthO, the biological data of different organisms can be integrated using the ortholog information as a hub. Besides, the ortholog information from different data sources can be compared with each other using the OrthO as a shared ontology. Here we show some examples demonstrating that the ortholog information described in RDF can be used to link various biological data such as taxonomy information and Gene Ontology. Thus, the ortholog database using the Semantic Web technology can contribute to biological knowledge discovery through integrative data analysis.
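The idea of using ortholog information as a hub between data sets can be illustrated with a toy triple store. The sketch below is purely illustrative: the `ex:` identifiers and predicates are invented placeholders rather than terms from OrthO or MBGD, and a simple in-memory pattern matcher stands in for a SPARQL endpoint.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]

# Hypothetical subject-predicate-object statements in RDF style.
TRIPLES: List[Triple] = [
    ("ex:group1", "ex:hasMember", "ex:geneA_org1"),
    ("ex:group1", "ex:hasMember", "ex:geneA_org2"),
    ("ex:geneA_org1", "ex:inTaxon", "ex:taxon_org1"),
    ("ex:geneA_org2", "ex:inTaxon", "ex:taxon_org2"),
    ("ex:geneA_org2", "ex:hasGoTerm", "ex:GO_example"),
]


def match(triples: List[Triple], s: Optional[str] = None,
          p: Optional[str] = None, o: Optional[str] = None) -> List[Triple]:
    """Return the triples matching a pattern; None acts as a wildcard."""
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]


def go_terms_via_orthologs(triples: List[Triple], gene: str) -> List[str]:
    """Collect GO terms annotated on orthologs of `gene` in other organisms."""
    terms = set()
    for group, _, _ in match(triples, p="ex:hasMember", o=gene):
        for _, _, other in match(triples, s=group, p="ex:hasMember"):
            if other != gene:
                terms.update(t[2] for t in match(triples, s=other,
                                                 p="ex:hasGoTerm"))
    return sorted(terms)


if __name__ == "__main__":
    print(go_terms_via_orthologs(TRIPLES, "ex:geneA_org1"))
```

Here the ortholog group is the only link between the unannotated gene and the annotation on its counterpart, which is the "hub" role the abstract describes.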
Cyclone: java-based querying and computing with Pathway/Genome databases.

Science.gov (United States)

Le Fèvre, François; Smidtas, Serge; Schächter, Vincent

2007-05-15

Cyclone aims at facilitating the use of BioCyc, a collection of Pathway/Genome Databases (PGDBs). Cyclone provides a fully extensible Java Object API to analyze and visualize these data. Cyclone can read and write PGDBs, and can write its own data in the CycloneML format. This format is automatically generated from the BioCyc ontology by Cyclone itself, ensuring continued compatibility. Cyclone objects can also be stored in a relational database CycloneDB. Queries can be written in SQL, and in an intuitive and concise object-oriented query language, Hibernate Query Language (HQL). In addition, Cyclone interfaces easily with Java software including the Eclipse IDE for HQL edition, the Jung API for graph algorithms or Cytoscape for graph visualization. Cyclone is freely available under an open source license at: http://sourceforge.net/projects/nemo-cyclone. For download and installation instructions, tutorials, use cases and examples, see http://nemo-cyclone.sourceforge.net.

BarleyBase—an expression profiling database for plant genomics

Science.gov (United States)

Shen, Lishuang; Gong, Jian; Caldo, Rico A.; Nettleton, Dan; Cook, Dianne; Wise, Roger P.; Dickerson, Julie A.

2005-01-01

BarleyBase (BB) (www.barleybase.org) is an online database for plant microarrays with integrated tools for data visualization and statistical analysis. BB houses raw and normalized expression data from the two publicly available Affymetrix genome arrays, Barley1 and Arabidopsis ATH1 with plans to include the new Affymetrix 61K wheat, maize, soybean and rice arrays, as they become available. BB contains a broad set of query and display options at all data levels, ranging from experiments to individual hybridizations to probe sets down to individual probes. Users can perform cross-experiment queries on probe sets based on observed expression profiles and/or based on known biological information. Probe set queries are integrated with visualization and analysis tools such as the R statistical toolbox, data filters and a large variety of plot types. Controlled vocabularies for gene and plant ontologies, as well as interconnecting links to physical or genetic map and other genomic data in PlantGDB, Gramene and GrainGenes, allow users to perform EST alignments and gene function prediction using Barley1 exemplar sequences, thus, enhancing cross-species comparison. PMID:15608273
Peptide secretion in the cutaneous glands of South American tree frog Phyllomedusa bicolor: an ultrastructural study.

Science.gov (United States)

Lacombe, C; Cifuentes-Diaz, C; Dunia, I; Auber-Thomay, M; Nicolas, P; Amiche, M

2000-09-01

The development of the dermal glands of the arboreal frog Phyllomedusa bicolor was investigated by immunocytochemistry and electron microscopy. The 3 types of glands (mucous, lipid and serous) differed in size and secretory activity. The mucous and serous glands were apparent in the tadpole skin, whereas the lipid glands developed later in ontogenesis. The peptide antibiotics dermaseptins and the D-amino acid-containing peptide opioids dermorphins and deltorphins are abundant in the skin secretions of P. bicolor. Although these peptides differ in their structure and activity, they are derived from precursors that have very similar preproregions. We used an antibody to the common preproregion of preprodermaseptins and preprodeltorphins and immunofluorescence analysis to show that only the serous glands are specifically involved in the biosynthesis and secretion of dermaseptins and deltorphins. Scanning and transmission electron microscopy revealed that the serous glands of P. bicolor have morphological features, especially the secretory granules, which differ from those of the glands in Xenopus laevis skin.
Genome update: the 1000th genome - a cautionary tale

DEFF Research Database (Denmark)

Lagesen, Karin; Ussery, David; Wassenaar, Gertrude Maria

2010-01-01

There are now more than 1000 sequenced prokaryotic genomes deposited in public databases and available for analysis. Currently, although the sequence databases GenBank, DNA Database of Japan and EMBL are synchronized continually, there are slight differences in content at the genomes level for a variety of logistical reasons, including differences in format and loading errors, such as those caused by file transfer protocol interruptions. This means that the 1000th genome will be different in the various databases. Some of the data on the highly accessed web pages are inaccurate, leading to false conclusions, for example about the largest bacterial genome sequenced. Biological diversity is far greater than many have thought. For example, analysis of multiple Escherichia coli genomes has led to an estimate of around 45 000 gene families, more genes than are recognized in the human genome. Moreover...
Genetic Dissection of Bioenergy-Related Traits in Sweet Sorghum (Sorghum bicolor) under Danish Agro-Climatic Conditions

DEFF Research Database (Denmark)

Mocoeur, Anne Raymonde Joelle

Sorghum (Sorghum bicolor (L.) Moench), a C4 grass of African origin, ranks as the 5th most important crop worldwide, feeding over 500 million people in tropical regions, as it withstands a wide range of biotic and abiotic stresses. The small and simple diploid genome of sorghum was selected as the third plant for sequencing in 2009, promoting it as a C4 model plant. Among the very diverse genetic resources available for sorghum, sweet sorghum plants, which amass large quantities of juice-rich and sugar-rich stem, grain and vegetative biomass, have been highlighted as a bioenergy crop, as a single plant can produce food, feed and fuel. Sweet sorghum has gained interest in Europe as a replacement for maize in biogas and bioenergy production, but this versatile crop is sensitive to chilling temperatures and little breeding effort has been made toward its cold acclimation. The state-of-art of using...
Antimicrobial screening of Impatiens bicolor Royle

International Nuclear Information System (INIS)

Nisar, M.; Ali, I.; Qayum, M.; Kaleem, W.A.; Shah, R.M.; Zia-ul-Haq, M.

2010-01-01

Extracts of Impatiens bicolor Royle obtained with n-hexane (A), dichloromethane (B), ethyl acetate (C), n-butanol (D) and water (E), as well as the crude extract (F), were tested in vitro for their antibacterial and antifungal activities. The antibacterial study, performed against 6 bacteria, viz. Escherichia coli, Bacillus subtilis, Shigella flexneri, Staphylococcus aureus, Pseudomonas aeruginosa and Salmonella typhi, indicated that the crude extract and its fractions had no activity against any of these microorganisms. The antifungal activity of these extracts was tested against 6 fungi, viz. Trichophyton longifusus, Candida albicans, Aspergillus flavus, Microsporum canis, Fusarium solani and Candida glabrata. The extracts showed moderate activity against different fungal strains. (author)
Decatropis bicolor (Zucc.) Radlk essential oil induces apoptosis of the MDA-MB-231 breast cancer cell line.

Science.gov (United States)

Estanislao Gómez, C C; Aquino Carreño, A; Pérez Ishiwara, D G; San Martín Martínez, E; Morales López, J; Pérez Hernández, N; Gómez García, M C

2016-08-05

Decatropis bicolor (Zucc.) Radlk is a plant that has been traditionally used for the treatment of breast cancer in some communities of Mexico. The aim of this study was to determine the cytotoxic and apoptotic effect of the essential oil of Decatropis bicolor against the breast cancer cell line MDA-MB-231. The essential oil obtained by hydrodistillation of leaves of Decatropis bicolor was studied for its biological activity against MDA-MB-231 breast cancer cells by MTT assay, hematoxylin-eosin staining, Annexin V-FITC, TUNEL and western blot assays, and for its chemical composition by GC-MS. The results showed a relevant cytotoxic effect of the essential oil towards MDA-MB-231 cells in a dose- and time-dependent manner, with an IC50 of 53.81 ± 1.691 μg/ml, but not in the epithelial mammary cell line MCF10A (207.51 ± 3.26 μg/ml). Morphological examination displayed apoptotic characteristics in the treated cells, such as cell size reduction, membrane blebbing and apoptotic bodies. In addition, the apoptotic rate and DNA fragmentation increased significantly, and western blot analysis revealed that the essential oil induced apoptosis in the MDA-MB-231 cells via the intrinsic pathway through the activation of Bax and caspases 9 and 3. Phytochemical analysis of the Decatropis bicolor essential oil showed the presence of twenty-three compounds. Major components of the oil were 1,5-cyclooctadiene,3-(methyl-2)propenyl (18.38 %), β-terpineol (8.16 %) and 1-(3-methyl-cyclopent-2-enyl)-cyclohexene (6.12 %). This study suggests that the essential oil of Decatropis bicolor has a potential cytotoxic and antitumoral effect against breast cancer cells, with the presence of potential bioactive compounds. Our results contribute to the validation of the anticancer activity of the plant in Mexican traditional medicine.
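The two IC50 values reported in this record can be compared directly as a ratio. The short sketch below does that arithmetic; the ratio itself is not reported in the abstract, and the helper name `ic50_ratio` is hypothetical.

```python
def ic50_ratio(ic50_reference: float, ic50_target: float) -> float:
    """Ratio of the IC50 in a reference cell line to the IC50 in the target line."""
    if ic50_target <= 0:
        raise ValueError("IC50 of the target cell line must be positive")
    return ic50_reference / ic50_target


if __name__ == "__main__":
    # Values from the abstract, in μg/ml: MCF10A (non-tumour) vs MDA-MB-231.
    ratio = ic50_ratio(207.51, 53.81)
    print(f"{ratio:.2f}")  # prints 3.86
```

On these figures, roughly 3.9 times more essential oil was needed to reach the half-maximal effect in MCF10A cells than in MDA-MB-231 cells.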
ATGC database and ATGC-COGs: an updated resource for micro- and macro-evolutionary studies of prokaryotic genomes and protein family annotation.

Science.gov (United States)

Kristensen, David M; Wolf, Yuri I; Koonin, Eugene V

2017-01-04

The Alignable Tight Genomic Clusters (ATGCs) database is a collection of closely related bacterial and archaeal genomes that provides several tools to aid research into evolutionary processes in the microbial world. Each ATGC is a taxonomy-independent cluster of 2 or more completely sequenced genomes that meet the objective criteria of a high degree of local gene order (synteny) and a small number of synonymous substitutions in the protein-coding genes. As such, each ATGC is suited for analysis of microevolutionary variations within a cohesive group of organisms (e.g. species), whereas the entire collection of ATGCs is useful for macroevolutionary studies. The ATGC database includes many forms of pre-computed data, in particular ATGC-COGs (Clusters of Orthologous Genes), multiple sequence alignments, a set of 'index' orthologs representing the most well-conserved members of each ATGC-COG, the phylogenetic tree of the organisms within each ATGC, etc. Although the ATGC database contains several million proteins from thousands of genomes organized into hundreds of clusters (roughly a 4-fold increase since the last version of the ATGC database), it is now built with completely automated methods and will be regularly updated following new releases of the NCBI RefSeq database. The ATGC database is hosted jointly at the University of Iowa at dmk-brain.ecn.uiowa.edu/ATGC/ and the NCBI at ftp.ncbi.nlm.nih.gov/pub/kristensen/ATGC/atgc_home.html. Published by Oxford University Press on behalf of Nucleic Acids Research 2016. This work is written by (a) US Government employee(s) and is in the public domain in the US.
DNA-based and geometric morphometric analysis to validate species designation: a case study of the subterranean rodent Ctenomys bicolor.

Science.gov (United States)

Stolz, J F B; Gonçalves, G L; Leipnitz, L; Freitas, T R O

2013-10-25

The genus Ctenomys (Rodentia: Ctenomyidae) shows several taxonomic inconsistencies. In this study, we used an integrative approach including DNA sequences, karyotypes, and geometric morphometrics to evaluate the taxonomic validity of a nominal species, Ctenomys bicolor, which was described based on only one specimen in 1912 by Miranda Ribeiro, and since then neglected. We sampled near the type locality assigned to this species and collected 10 specimens. A total of 820 base pairs of the cytochrome b gene were sequenced and analyzed together with nine other species and four morphotypes obtained from GenBank. Bayesian analyses showed that C. bicolor is monophyletic and related to the Bolivian-Matogrossense group, a clade that originated about 3 mya. We compared the cranial shape through morphometric geometrics of C. bicolor, including the specimen originally sampled in 1912, with other species representative of the same phylogenetic group (C. boliviensis and C. steinbachi). C. bicolor shows unique skull traits that distinguish it from all other currently known taxa. Our findings confirm that the specimen collected by Miranda Ribeiro is a valid species, and improve the knowledge about Ctenomys in the Amazon region.
Effect of mycorrhiza symbiosis on the NaCl salinity in Sorghum bicolor

African Journals Online (AJOL)

In order to determine the effect of mycorrhizal symbiosis on NaCl salinity tolerance in Sorghum bicolor (aspydfyd cultivar), an experiment with two factors was carried out in the Damghan Islamic Azad University laboratory (Iran) in 2007. The first factor had two levels (mycorrhizal and non-mycorrhizal) and the second factor had six levels of NaCl ...
Analysis of aluminium sensitivity in sorghum (Sorghum bicolor (L.) Moench) genotypes

NARCIS (Netherlands)

Tan, K.

1993-01-01

Twelve genotypes of sorghum (Sorghum bicolor (L.) Moench) differing in Al sensitivity were grown in an acid soil (with additions of lime or MgSO₄) and in nutrient solutions (with or without Al at constant pH) for periods between 14 and 35 days.
Identification of genes differentially expressed in ectomycorrhizal roots during the Pinus pinaster-Laccaria bicolor interaction.

Science.gov (United States)

Flores-Monterroso, Aranzazu; Canales, Javier; de la Torre, Fernando; Ávila, Concepción; Cánovas, Francisco M

2013-06-01

Ectomycorrhizal associations are of major ecological importance in temperate and boreal forests. The development of a functional ectomycorrhiza requires many genetic and biochemical changes. In this study, suppressive subtraction hybridization was used to identify differentially expressed genes in the roots of maritime pine (Pinus pinaster Aiton) inoculated with Laccaria bicolor, a mycorrhizal fungus. A total of 200 unigenes were identified as being differentially regulated in maritime pine roots during the development of mycorrhiza. These unigenes were classified into 10 categories according to the function of their homologues in the GenBank database. Approximately 40 % of the differentially expressed transcripts were genes that coded for unknown proteins in the databases or that had no homology to known genes. A group of these differentially expressed genes was selected to validate the results using quantitative real-time PCR. The transcript levels of the representative genes were compared between the non-inoculated and inoculated plants at 1, 5, 15 and 30 days after inoculation. The observed expression patterns indicate (1) changes in the composition of the cell wall, (2) tight regulation of defence genes during the development of mycorrhiza and (3) changes in carbon and nitrogen metabolism. Ammonium excess or deficiency dramatically affected the stability of ectomycorrhiza and altered gene expression in maritime pine roots.
Effect of heavy metal and EDTA application on plant growth and phyto-extraction potential of sorghum (Sorghum bicolor)

International Nuclear Information System (INIS)

Bacaha, N.; Shamas, R.; Bakht, J.; Rafi, A.; Farhatullah, M.; Gillani, S.

2015-01-01

A pot experiment was conducted to evaluate the phyto-extraction capacity of sorghum for heavy metals. Sorghum bicolor was grown in soil artificially contaminated with different concentrations of lead (300, 350 and 400 mg/kg), chromium (50, 100 and 150 mg/kg) and cadmium (100, 150 and 200 mg/kg). EDTA (5 mM) was applied to the plants as a chelating agent 4 weeks after sowing. Plants were grown for a total of two months, and fresh weight and dry weight of shoot and heavy metal accumulation were analyzed at six and eight weeks after sowing. The results revealed that application of cadmium, chromium, lead and EDTA adversely affected shoot length, fresh weight and dry weight of S. bicolor at both time points. Heavy metal uptake by S. bicolor increased with increasing heavy metal concentration in the soil. Application of 5 mM EDTA enhanced the uptake of heavy metals. (author)
Profiling of Escherichia coli Chromosome database.

Science.gov (United States)

Yamazaki, Yukiko; Niki, Hironori; Kato, Jun-ichi

2008-01-01

The Profiling of Escherichia coli Chromosome (PEC) database (http://www.shigen.nig.ac.jp/ecoli/pec/) is designed to allow E. coli researchers to efficiently access information from functional genomics studies. The database contains two principal types of data: gene essentiality and a large collection of E. coli genetic research resources. The essentiality data are based on data compilation from published single-gene essentiality studies and on cell growth studies of large-deletion mutants. Using the circular and linear viewers for both whole genomes and the minimal genome, users can not only gain an overview of the genome structure but also retrieve information on contigs, gene products, mutants, deletions, and so forth. In particular, genome-wide exhaustive mutants are an essential resource for studying E. coli gene functions. Although the genomic database was constructed independently from the genetic resources database, users may seamlessly access both types of data. In addition to these data, the PEC database also provides a summary of homologous genes of other bacterial genomes and of protein structure information, with a comprehensive interface. The PEC is thus a convenient and useful platform for contemporary E. coli researchers.
MIPS Arabidopsis thaliana Database (MAtDB): an integrated biological knowledge resource for plant genomics

Science.gov (United States)

Schoof, Heiko; Ernst, Rebecca; Nazarov, Vladimir; Pfeifer, Lukas; Mewes, Hans-Werner; Mayer, Klaus F. X.

2004-01-01

Arabidopsis thaliana is the most widely studied model plant. Functional genomics is intensively underway in many laboratories worldwide. Beyond the basic annotation of the primary sequence data, the annotated genetic elements of Arabidopsis must be linked to diverse biological data and higher order information such as metabolic or regulatory pathways. The MIPS Arabidopsis thaliana database MAtDB aims to provide a comprehensive resource for Arabidopsis as a genome model that serves as a primary reference for research in plants and is suitable for transfer of knowledge to other plants, especially crops. The genome sequence as a common backbone serves as a scaffold for the integration of data, while, in a complementary effort, these data are enhanced through the application of state-of-the-art bioinformatics tools. This information is visualized on a genome-wide and a gene-by-gene basis with access both for web users and applications. This report updates the information given in a previous report and provides an outlook on further developments. The MAtDB web interface can be accessed at http://mips.gsf.de/proj/thal/db. PMID:14681437
The Genomes OnLine Database (GOLD) v.4: status of genomic and metagenomic projects and their associated metadata

Science.gov (United States)

Pagani, Ioanna; Liolios, Konstantinos; Jansson, Jakob; Chen, I-Min A.; Smirnova, Tatyana; Nosrat, Bahador; Markowitz, Victor M.; Kyrpides, Nikos C.

2012-01-01

The Genomes OnLine Database (GOLD, http://www.genomesonline.org/) is a comprehensive resource for centralized monitoring of genome and metagenome projects worldwide. Both complete and ongoing projects, along with their associated metadata, can be accessed in GOLD through precomputed tables and a search page. As of September 2011, GOLD, now on version 4.0, contains information for 11 472 sequencing projects, of which 2907 have been completed and their sequence data has been deposited in a public repository. Out of these complete projects, 1918 are finished and 989 are permanent drafts. Moreover, GOLD contains information for 340 metagenome studies associated with 1927 metagenome samples. GOLD continues to expand, moving toward the goal of providing the most comprehensive repository of metadata information related to the projects and their organisms/environments in accordance with the Minimum Information about any (x) Sequence specification and beyond. PMID:22135293
Database Description - RED | LSDB Archive [Life Science Database Archive metadata

Lifescience Database Archive (English)

Full Text Available ase Description General information of database Database name RED Alternative name Rice Expression Database...enome Research Unit Shoshi Kikuchi E-mail : Database classification Plant databases - Rice Database classifi...cation Microarray, Gene Expression Organism Taxonomy Name: Oryza sativa Taxonomy ID: 4530 Database descripti... Article title: Rice Expression Database: the gateway to rice functional genomics...nt Science (2002) Dec 7 (12):563-564 External Links: Original website information Database maintenance site
Download - PGDBj Registered plant list, Marker list, QTL list, Plant DB link & Genome analysis methods | LSDB Archive [Life Science Database Archive metadata

Lifescience Database Archive (English)

Full Text Available List Contact us PGDBj Registered plant list, Marker list, QTL list, Plant DB link & Genome analysis methods ...t_db_link_en.zip (36.3 KB) - 6 Genome analysis methods pgdbj_dna_marker_linkage_map_genome_analysis_methods_... of This Database Site Policy | Contact Us Download - PGDBj Registered plant list, Marker list, QTL list, Plant DB link & Genome analysis methods | LSDB Archive ...
Sequence modelling and an extensible data model for genomic database

Energy Technology Data Exchange (ETDEWEB)

Li, Peter Wei-Der [California Univ., San Francisco, CA (United States); Univ. of California, Berkeley, CA (United States)]

1992-01-01

The Human Genome Project (HGP) plans to sequence the human genome by the beginning of the next century. It will generate DNA sequences of more than 10 billion bases and complex marker sequences (maps) of more than 100 million markers. All of this information will be stored in database management systems (DBMSs). However, existing data models do not have an abstraction mechanism for modelling sequences, and existing DBMSs do not have operations for complex sequences. This work addresses the problem of sequence modelling in the context of the HGP and the more general problem of an extensible object data model that can incorporate the sequence model as well as existing and future data constructs and operators. First, we proposed a general sequence model that is application and implementation independent. This model is used to capture the sequence information found in the HGP at the conceptual level. In addition, abstract and biological sequence operators are defined for manipulating the modelled sequences. Second, we combined many features of semantic and object-oriented data models into an extensible framework, which we called the "Extensible Object Model", to address the need for a modelling framework for incorporating the sequence data model with other types of data constructs and operators. This framework is based on the conceptual separation between constructors and constraints. We then used this modelling framework to integrate the constructs for the conceptual sequence model. The Extensible Object Model is also defined with a graphical representation, which is useful as a tool for database designers. Finally, we defined a query language to support this model and implemented the query processor to demonstrate the feasibility of the extensible framework and the usefulness of the conceptual sequence model.
Sequence modelling and an extensible data model for genomic database

Energy Technology Data Exchange (ETDEWEB)

Li, Peter Wei-Der (California Univ., San Francisco, CA (United States) Lawrence Berkeley Lab., CA (United States))

1992-01-01

The Human Genome Project (HGP) plans to sequence the human genome by the beginning of the next century. It will generate DNA sequences of more than 10 billion bases and complex marker sequences (maps) of more than 100 million markers. All of this information will be stored in database management systems (DBMSs). However, existing data models do not have an abstraction mechanism for modelling sequences, and existing DBMSs do not have operations for complex sequences. This work addresses the problem of sequence modelling in the context of the HGP and the more general problem of an extensible object data model that can incorporate the sequence model as well as existing and future data constructs and operators. First, we proposed a general sequence model that is application and implementation independent. This model is used to capture the sequence information found in the HGP at the conceptual level. In addition, abstract and biological sequence operators are defined for manipulating the modelled sequences. Second, we combined many features of semantic and object-oriented data models into an extensible framework, which we called the "Extensible Object Model", to address the need for a modelling framework for incorporating the sequence data model with other types of data constructs and operators. This framework is based on the conceptual separation between constructors and constraints. We then used this modelling framework to integrate the constructs for the conceptual sequence model. The Extensible Object Model is also defined with a graphical representation, which is useful as a tool for database designers. Finally, we defined a query language to support this model and implemented the query processor to demonstrate the feasibility of the extensible framework and the usefulness of the conceptual sequence model.
Database Description - RMG | LSDB Archive [Life Science Database Archive metadata

Lifescience Database Archive (English)

Full Text Available ase Description General information of database Database name RMG Alternative name ...raki 305-8602, Japan National Institute of Agrobiological Sciences E-mail : Database... classification Nucleotide Sequence Databases Organism Taxonomy Name: Oryza sativa Japonica Group Taxonomy ID: 39947 Database...rnal: Mol Genet Genomics (2002) 268: 434–445 External Links: Original website information Database...available URL of Web services - Need for user registration Not available About This Database Database Descri

Identification of differentially expressed genes in sorghum (Sorghum bicolor) brown midrib mutants

Science.gov (United States)

Sorghum (Sorghum bicolor L.), with a high biomass yield and excellent tolerance to drought and low nutrition, has been recommended as one of the most competitive bioenergy crops. Brown midrib (bmr) mutant sorghum with reduced lignin content showed a high potential for the improvement of bioethanol ...
Database Description - KOME | LSDB Archive [Life Science Database Archive metadata

Lifescience Database Archive (English)

Full Text Available base Description General information of database Database name KOME Alternative nam... Sciences Plant Genome Research Unit Shoshi Kikuchi E-mail : Database classification Plant databases - Rice ...Organism Taxonomy Name: Oryza sativa Taxonomy ID: 4530 Database description Information about approximately ...Hayashizaki Y, Kikuchi S. Journal: PLoS One. 2007 Nov 28; 2(11):e1235. External Links: Original website information Database...OS) Rice mutant panel database (Tos17) A Database of Plant Cis-acting Regulatory
Database Description - GETDB | LSDB Archive [Life Science Database Archive metadata

Lifescience Database Archive (English)

Full Text Available abase Description General information of database Database name GETDB Alternative n...ame Gal4 Enhancer Trap Insertion Database DOI 10.18908/lsdba.nbdc00236-000 Creator Creator Name: Shigeo Haya... Chuo-ku, Kobe 650-0047 Tel: +81-78-306-3185 FAX: +81-78-306-3183 E-mail: Database classification Expression... Invertebrate genome database Organism Taxonomy Name: Drosophila melanogaster Taxonomy ID: 7227 Database des...riginal website information Database maintenance site Drosophila Genetic Resource
SinEx DB: a database for single exon coding sequences in mammalian genomes.

Science.gov (United States)

Jorquera, Roddy; Ortiz, Rodrigo; Ossandon, F; Cárdenas, Juan Pablo; Sepúlveda, Rene; González, Carolina; Holmes, David S

2016-01-01

Eukaryotic genes are typically interrupted by intragenic, noncoding sequences termed introns. However, some genes lack introns in their coding sequence (CDS) and are generally known as 'single exon genes' (SEGs). In this work, a SEG is defined as a nuclear, protein-coding gene that lacks introns in its CDS. Whereas many public databases of eukaryotic multi-exon genes are available, there are only two specialized databases for SEGs. The present work addresses the need for a more extensive and diverse database by creating SinEx DB, a publicly available, searchable database of predicted SEGs from 10 completely sequenced mammalian genomes including human. SinEx DB houses the DNA and protein sequence information of these SEGs and includes their functional predictions (KOG) and the relative distribution of these functions within species. The information is stored in a relational database built with MySQL Server 5.1.33, and the complete dataset of SEG sequences and their functional predictions is available for downloading. SinEx DB can be interrogated by: (i) a browsable phylogenetic schema, (ii) carrying out BLAST searches against the in-house SinEx DB of SEGs and (iii) an advanced search mode in which the database can be searched by key words and any combination of searches by species and predicted functions. SinEx DB provides a rich source of information for advancing our understanding of the evolution and function of SEGs. Database URL: www.sinex.cl. © The Author(s) 2016. Published by Oxford University Press.
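The SEG definition in this abstract (a protein-coding gene with no intron in its CDS) can be illustrated with a minimal Python sketch. It is not the SinEx DB pipeline: the input layout (gene ID plus 1-based inclusive CDS segment coordinates) and the function name `single_exon_genes` are assumptions made for illustration, and the sketch does not check whether a gene is nuclear or handle alternative transcripts.

```python
from collections import defaultdict

def single_exon_genes(cds_records):
    """Return sorted gene IDs whose CDS is one uninterrupted interval.

    cds_records: iterable of (gene_id, start, end) CDS segments,
    1-based inclusive coordinates (hypothetical input layout).
    """
    segments = defaultdict(list)
    for gene_id, start, end in cds_records:
        segments[gene_id].append((start, end))

    result = []
    for gene_id, segs in segments.items():
        segs.sort()
        # An intron inside the CDS appears as a gap between consecutive segments.
        has_gap = any(
            next_start > prev_end + 1
            for (_, prev_end), (next_start, _) in zip(segs, segs[1:])
        )
        if not has_gap:
            result.append(gene_id)
    return sorted(result)

records = [
    ("g1", 100, 400),                  # one CDS segment: single exon
    ("g2", 10, 50), ("g2", 120, 300),  # gap between segments: has an intron
    ("g3", 5, 20), ("g3", 21, 60),     # adjacent segments: no intron in the CDS
]
print(single_exon_genes(records))
```

Treating directly adjacent segments as one interval is a design choice of this sketch; an annotation-driven pipeline might instead count exon features per transcript.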
Genomic dissection of anthracnose resistant response in sorghum [Sorghum bicolor (L.)

Science.gov (United States)

The goal of this project is to use genomics-based approaches to identify anthracnose resistance loci from diverse sorghum germplasm as part of an effort to understand the disease resistance mechanism of at least one of these genes. This information will provide plant breeders with a tool kit that can be used to maxi...
Application of Database Program in Selecting Sorghum (Sorghum bicolor L.) Mutant Lines

International Nuclear Information System (INIS)

H, Soeranto

2000-01-01

The computer database programs MSTAT and Paradox have been applied in mutation breeding, especially in the process of selecting mutant lines of sorghum. In MSTAT, mutant lines can be selected by activating the SELECTION function and then entering mathematical formulas for the selection criterion. An alternative is to apply the desired selection intensity to the results of the SORT subprogram. Including the selected mutant lines in the BRSERIES program makes their progenies easier to trace in subsequent generations. In Paradox, an application program for selecting mutant lines can be built by combining the Table, Form and Report facilities. Selecting mutant lines with a defined selection criterion can easily be done by filtering data. As a relational database, Paradox makes the application program for selecting mutant lines and tracking progeny easier, more efficient and interactive.
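The two selection routes described in this abstract (a formula-based criterion, and a selection intensity applied to sorted results) can be sketched in Python. This is only an illustration of the idea, not MSTAT or Paradox syntax; the record fields (`height_cm`, `yield_g`), the thresholds and the function names are hypothetical.

```python
import math

def select_by_criterion(lines, criterion):
    """Keep the mutant lines for which the criterion function returns True."""
    return [line for line in lines if criterion(line)]

def select_by_intensity(lines, trait, intensity):
    """Keep the top fraction `intensity` of lines, ranked by `trait` (descending)."""
    ranked = sorted(lines, key=lambda line: line[trait], reverse=True)
    n_keep = max(1, math.ceil(len(ranked) * intensity))
    return ranked[:n_keep]

# Hypothetical field records for four mutant lines.
mutants = [
    {"id": "M1", "height_cm": 150, "yield_g": 60},
    {"id": "M2", "height_cm": 120, "yield_g": 75},
    {"id": "M3", "height_cm": 110, "yield_g": 55},
    {"id": "M4", "height_cm": 130, "yield_g": 80},
]

# Criterion route: short plants with high yield (assumed thresholds).
short_and_productive = select_by_criterion(
    mutants, lambda m: m["height_cm"] < 140 and m["yield_g"] >= 70
)
print([m["id"] for m in short_and_productive])

# Intensity route: keep the best 25% by yield.
print([m["id"] for m in select_by_intensity(mutants, "yield_g", 0.25)])
```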
PineElm_SSRdb: a microsatellite marker database identified from genomic, chloroplast, mitochondrial and EST sequences of pineapple (Ananas comosus (L.) Merrill).

Science.gov (United States)

Chaudhary, Sakshi; Mishra, Bharat Kumar; Vivek, Thiruvettai; Magadum, Santoshkumar; Yasin, Jeshima Khan

2016-01-01

Simple sequence repeats (SSRs), or microsatellites, are resourceful molecular genetic markers. There are only a few reports of SSR identification and development in pineapple. The complete genome sequence of pineapple available in the public domain can be used to develop numerous novel SSRs. Therefore, an attempt was made to identify SSRs from genomic, chloroplast, mitochondrial and EST sequences of pineapple, which will help in deciphering the genetic makeup of its germplasm resources. A total of 359511 SSRs were identified in pineapple (356385 from the genome sequence, 45 from the chloroplast sequence, 249 from the mitochondrial sequence and 2832 from EST sequences). The list of EST-SSR markers and their details are available in the database. PineElm_SSRdb is an open source database available for non-commercial academic purposes at http://app.bioelm.com/ with a mapping tool which can develop circular maps of a selected marker set. This database will be of immense use to breeders, researchers and graduates working on Ananas spp. and to others working on cross-species transferability of markers, investigating diversity, mapping and DNA fingerprinting.
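As an illustration of what in silico SSR identification involves, the following Python sketch scans a DNA string for perfect mono- to hexanucleotide repeats with a regular expression. The abstract does not state which tool or thresholds were used for pineapple, so the minimum repeat counts in `MIN_REPEATS` and the function name `find_ssrs` are assumptions; imperfect and compound repeats are not handled.

```python
import re

# Minimum number of motif copies per motif length (assumed thresholds).
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}

def find_ssrs(sequence):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    seq = sequence.upper()
    hits = []
    for motif_len, min_rep in MIN_REPEATS.items():
        # ([ACGT]{k}) captures one motif; \1{n,} requires at least n further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are themselves repeats of a shorter unit,
            # e.g. "AA" reported as a dinucleotide instead of mononucleotide "A".
            if any(
                motif == motif[:k] * (motif_len // k)
                for k in range(1, motif_len)
                if motif_len % k == 0
            ):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return sorted(hits)

demo = "GG" + "AT" * 7 + "CC" + "A" * 12 + "GT"
print(find_ssrs(demo))
```

Start positions are 0-based offsets into the input string. Counting hits per sequence class (genome, chloroplast, mitochondrion, EST) would then give totals of the kind reported above.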
The MAR databases: development and implementation of databases specific for marine metagenomics.

Science.gov (United States)

Klemetsen, Terje; Raknes, Inge A; Fu, Juan; Agafonov, Alexander; Balasundaram, Sudhagar V; Tartari, Giacomo; Robertsen, Espen; Willassen, Nils P

2018-01-04

We introduce the marine databases MarRef, MarDB and MarCat (https://mmp.sfb.uit.no/databases/), which are publicly available resources that promote marine research and innovation. These data resources, which have been implemented in the Marine Metagenomics Portal (MMP) (https://mmp.sfb.uit.no/), are collections of richly annotated and manually curated contextual (metadata) and sequence databases representing three tiers of accuracy. While MarRef is a database for completely sequenced marine prokaryotic genomes, which represent a marine prokaryote reference genome database, MarDB includes all incompletely sequenced prokaryotic genomes regardless of level of completeness. The last database, MarCat, represents a gene (protein) catalog of uncultivable (and cultivable) marine genes and proteins derived from marine metagenomics samples. The first versions of MarRef and MarDB contain 612 and 3726 records, respectively. Each record is built up of 106 metadata fields including attributes for sampling, sequencing, assembly and annotation in addition to the organism and taxonomic information. Currently, MarCat contains 1227 records with 55 metadata fields. Ontologies and controlled vocabularies are used in the contextual databases to enhance consistency. The user-friendly web interface lets visitors browse, filter and search in the contextual databases and perform BLAST searches against the corresponding sequence databases. All contextual and sequence databases are freely accessible and downloadable from https://s1.sfb.uit.no/public/mar/. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
Legume and Lotus japonicus Databases

DEFF Research Database (Denmark)

Hirakawa, Hideki; Mun, Terry; Sato, Shusei

2014-01-01

Since the genome sequence of Lotus japonicus, a model plant of family Fabaceae, was determined in 2008 (Sato et al. 2008), the genomes of other members of the Fabaceae family, soybean (Glycine max) (Schmutz et al. 2010) and Medicago truncatula (Young et al. 2011), have been sequenced. In this section, we introduce representative, publicly accessible online resources related to plant materials, integrated databases containing legume genome information, and databases for genome sequence and derived marker information of legume species including L. japonicus...
Phyllomedusa bicolor skin secretion and the Kambô ritual

OpenAIRE

den Brave, Paul S; Bruins, Eugéne; Bronkhorst, Maarten W G A

2014-01-01

The ritual of Kambô or Sapo is a type of voluntary envenomation. During this purification ritual a shaman healer, from various South American countries, deliberately burns the right shoulder with a glowing stick from a fireplace. Excretions of Phyllomedusa bicolor (or Giant Leaf Frog, Kambô or Sapo) are then applied to these fresh wounds. This ritual is used as a means of purification of the body, supposedly brings luck to hunters, increases stamina and enhances physical and sexual strength. ...
Genome cluster database. A sequence family analysis platform for Arabidopsis and rice.

Science.gov (United States)

Horan, Kevin; Lauricha, Josh; Bailey-Serres, Julia; Raikhel, Natasha; Girke, Thomas

2005-05-01

The genome-wide protein sequences from Arabidopsis (Arabidopsis thaliana) and rice (Oryza sativa) spp. japonica were clustered into families using sequence similarity and domain-based clustering. The two fundamentally different methods resulted in separate cluster sets with complementary properties to compensate the limitations for accurate family analysis. Functional names for the identified families were assigned with an efficient computational approach that uses the description of the most common molecular function gene ontology node within each cluster. Subsequently, multiple alignments and phylogenetic trees were calculated for the assembled families. All clustering results and their underlying sequences were organized in the Web-accessible Genome Cluster Database (http://bioinfo.ucr.edu/projects/GCD) with rich interactive and user-friendly sequence family mining tools to facilitate the analysis of any given family of interest for the plant science community. An automated clustering pipeline ensures current information for future updates in the annotations of the two genomes and clustering improvements. The analysis allowed the first systematic identification of family and singlet proteins present in both organisms as well as those restricted to one of them. In addition, the established Web resources for mining these data provide a road map for future studies of the composition and structure of protein families between the two species.
Radioinduced variation in genetic improvement of sorghum (Sorghum bicolor (L.) Moench)

International Nuclear Information System (INIS)

Gutierrez del Rio, E.

1984-01-01

A genetic variability study among 25 varieties of sorghum (Sorghum bicolor (L.) Moench) is presented. The populations are irradiated with 0, 10, 20, 30, 40, 50 and 60 krad of cobalt-60 and studied as far as the M5 generation. Individual selection is done taking into consideration agronomic characteristics such as precocity, type, size and height of the plant. (M.A.C.)
KoVariome: Korean National Standard Reference Variome database of whole genomes with comprehensive SNV, indel, CNV, and SV analyses.

Science.gov (United States)

Kim, Jungeun; Weber, Jessica A; Jho, Sungwoong; Jang, Jinho; Jun, JeHoon; Cho, Yun Sung; Kim, Hak-Min; Kim, Hyunho; Kim, Yumi; Chung, OkSung; Kim, Chang Geun; Lee, HyeJin; Kim, Byung Chul; Han, Kyudong; Koh, InSong; Chae, Kyun Shik; Lee, Semin; Edwards, Jeremy S; Bhak, Jong

2018-04-04

High-coverage whole-genome sequencing data of a single ethnicity can provide a useful catalogue of population-specific genetic variations, and provides a critical resource that can be used to more accurately identify pathogenic genetic variants. We report a comprehensive analysis of the Korean population, and present the Korean National Standard Reference Variome (KoVariome). As a part of the Korean Personal Genome Project (KPGP), we constructed the KoVariome database using 5.5 terabases of whole genome sequence data from 50 healthy Korean individuals in order to characterize the benign ethnicity-relevant genetic variation present in the Korean population. In total, KoVariome includes 12.7M single-nucleotide variants (SNVs), 1.7M short insertions and deletions (indels), 4K structural variations (SVs), and 3.6K copy number variations (CNVs). Among them, 2.4M (19%) SNVs and 0.4M (24%) indels were identified as novel. We also discovered selective enrichment of 3.8M SNVs and 0.5M indels in Korean individuals, which were used to filter out 1,271 coding-SNVs not originally removed from the 1,000 Genomes Project when prioritizing disease-causing variants. KoVariome health records were used to identify novel disease-causing variants in the Korean population, demonstrating the value of high-quality ethnic variation databases for the accurate interpretation of individual genomes and the precise characterization of genetic variations.
SNPpy--database management for SNP data from genome wide association studies.

Directory of Open Access Journals (Sweden)

Faheem Mitha

Full Text Available BACKGROUND: We describe SNPpy, a hybrid script database system using the Python SQLAlchemy library coupled with the PostgreSQL database to manage genotype data from Genome-Wide Association Studies (GWAS). This system makes it possible to merge study data with HapMap data and merge across studies for meta-analyses, including data filtering based on the values of phenotype and Single-Nucleotide Polymorphism (SNP) data. SNPpy and its dependencies are open source software. RESULTS: The current version of SNPpy offers utility functions to import genotype and annotation data from two commercial platforms. We use these to import data from two GWAS studies and the HapMap Project. We then export these individual datasets to standard data format files that can be imported into statistical software for downstream analyses. CONCLUSIONS: By leveraging the power of relational databases, SNPpy offers integrated management and manipulation of genotype and phenotype data from GWAS studies. The analysis of these studies requires merging across GWAS datasets as well as patient and marker selection. To this end, SNPpy enables the user to filter the data and output the results as standardized GWAS file formats. It does low-level and flexible data validation, including validation of patient data. SNPpy is a practical and extensible solution for investigators who seek to deploy central management of their GWAS data.
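The kind of relational merge and filter described in this abstract can be sketched with Python's built-in sqlite3 module. This is not SNPpy code and does not use SQLAlchemy or PostgreSQL; SQLite is swapped in so the example is self-contained, and the table names, column names and sample rows are all hypothetical.

```python
import sqlite3

def build_demo_db():
    """Create an in-memory database holding subjects from two studies."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE subject (subject_id TEXT PRIMARY KEY, study TEXT, phenotype TEXT);
        CREATE TABLE genotype (subject_id TEXT, snp_id TEXT, alleles TEXT);
        """
    )
    conn.executemany(
        "INSERT INTO subject VALUES (?, ?, ?)",
        [("A1", "studyA", "case"), ("A2", "studyA", "control"), ("B1", "studyB", "case")],
    )
    conn.executemany(
        "INSERT INTO genotype VALUES (?, ?, ?)",
        [("A1", "rs1", "AG"), ("A1", "rs2", "CC"), ("A2", "rs1", "AA"),
         ("B1", "rs1", "GG"), ("B1", "rs3", "TT")],
    )
    return conn

def genotypes_for(conn, phenotype, snp_ids):
    """Merge across studies, filtering by phenotype value and a SNP selection."""
    placeholders = ",".join("?" for _ in snp_ids)
    query = f"""
        SELECT s.study, s.subject_id, g.snp_id, g.alleles
        FROM subject AS s
        JOIN genotype AS g ON g.subject_id = s.subject_id
        WHERE s.phenotype = ? AND g.snp_id IN ({placeholders})
        ORDER BY s.study, s.subject_id, g.snp_id
    """
    return conn.execute(query, [phenotype, *snp_ids]).fetchall()

conn = build_demo_db()
for row in genotypes_for(conn, "case", ["rs1", "rs2"]):
    print(row)
```

Keeping subjects and genotype calls in separate tables is what lets a single join serve both the cross-study merge and the patient and marker selection.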
The syndrome of inappropriate antidiuretic hormone secretion after giant leaf frog (Phyllomedusa bicolor) venom exposure.

Science.gov (United States)

Leban, Vid; Kozelj, Gordana; Brvar, Miran

2016-09-15

In Europe body purification and natural balance restoring rituals are becoming increasingly popular, but an introduction of Amazonian shamanic rituals in urban Europe can result in unexpected adverse events. A 44-year-old woman attended a Kambô or Sapo ritual in Slovenia where dried skin secretion from a giant leaf frog (Phyllomedusa bicolor) was applied to five freshly burned wounds at her shoulder. Afterwards, she drank 6 litres of water and gradually developed nausea and vomiting, confusion, lethargy, muscle weakness, spasms and cramps, seizure, decreased consciousness level and short-term memory loss. The initial laboratory tests showed profound plasma hypoosmolality (251 mOsm/kg) proportional to hyponatremia (116 mmol/L) combined with inappropriately elevated urine osmolality (523 mOsm/kg) and high urine sodium concentration (87 mmol/L) indicating a syndrome of inappropriate antidiuretic hormone secretion. The patient was treated with 0.9% sodium chloride and a restriction of water intake. Plasma osmolality and hyponatremia improved one day after venom exposure, but the symptoms disappeared as late as the third day. In patients presenting with neurological symptoms and a line of small body burns Phyllomedusa bicolor venom exposure should be suspected. Acute symptomatic hyponatremia after Phyllomedusa bicolor venom exposure is the result of inappropriate antidiuretic hormone secretion that can be exacerbated by excessive water intake. Copyright © 2016 Elsevier Ltd. All rights reserved.
Genomic treasure troves: complete genome sequencing of herbarium and insect museum specimens.

Science.gov (United States)

Staats, Martijn; Erkens, Roy H J; van de Vossenberg, Bart; Wieringa, Jan J; Kraaijeveld, Ken; Stielow, Benjamin; Geml, József; Richardson, James E; Bakker, Freek T

2013-01-01

Unlocking the vast genomic diversity stored in natural history collections would create unprecedented opportunities for genome-scale evolutionary, phylogenetic, domestication and population genomic studies. Many researchers have been discouraged from using historical specimens in molecular studies because of both generally limited success of DNA extraction and the challenges associated with PCR-amplifying highly degraded DNA. In today's next-generation sequencing (NGS) world, opportunities and prospects for historical DNA have changed dramatically, as most NGS methods are actually designed for taking short fragmented DNA molecules as templates. Here we show that using a standard multiplex and paired-end Illumina sequencing approach, genome-scale sequence data can be generated reliably from dry-preserved plant, fungal and insect specimens collected up to 115 years ago, and with minimal destructive sampling. Using a reference-based assembly approach, we were able to produce the entire nuclear genome of a 43-year-old Arabidopsis thaliana (Brassicaceae) herbarium specimen with high and uniform sequence coverage. Nuclear genome sequences of three fungal specimens of 22-82 years of age (Agaricus bisporus, Laccaria bicolor, Pleurotus ostreatus) were generated with 81.4-97.9% exome coverage. Complete organellar genome sequences were assembled for all specimens. Using de novo assembly we retrieved between 16.2-71.0% of coding sequence regions, and hence remain somewhat cautious about prospects for de novo genome assembly from historical specimens. Non-target sequence contaminations were observed in 2 of our insect museum specimens. We anticipate that future museum genomics projects will perhaps not generate entire genome sequences in all cases (our specimens contained relatively small and low-complexity genomes), but at least generating vital comparative genomic data for testing (phylo)genetic, demographic and genetic hypotheses, that become increasingly more horizontal
Scent gland constituents of the Middle American burrowing python, Loxocemus bicolor (Serpentes: Loxocemidae).

Science.gov (United States)

Schulze, Thies; Weldon, Paul J; Schulz, Stefan

2017-07-14

Analysis by gas chromatography/mass spectrometry of the scent gland secretions of male and female Middle American burrowing pythons (Loxocemus bicolor) revealed the presence of over 300 components including cholesterol, fatty acids, glyceryl monoalkyl ethers, and alcohols. The fatty acids, over 100 of which were identified, constitute most of the compounds in the secretions and show the greatest structural diversity. They include saturated and unsaturated, unbranched and mono-, di-, and trimethyl-branched compounds ranging in carbon-chain length from 13 to 24. The glyceryl monoethers possess saturated or unsaturated, straight or methyl-branched alkyl chains ranging in carbon-chain length from 13 to 24. Alcohols, which have not previously been reported from the scent glands, possess straight, chiefly saturated carbon chains ranging in length from 13 to 24. Sex or individual differences in secretion composition were not observed. Compounds in the scent gland secretions of L. bicolor may deter offending arthropods, such as ants.
Analysis of disease-associated objects at the Rat Genome Database

Science.gov (United States)

Wang, Shur-Jen; Laulederkind, Stanley J. F.; Hayman, G. T.; Smith, Jennifer R.; Petri, Victoria; Lowry, Timothy F.; Nigam, Rajni; Dwinell, Melinda R.; Worthey, Elizabeth A.; Munzenmaier, Diane H.; Shimoyama, Mary; Jacob, Howard J.

2013-01-01

The Rat Genome Database (RGD) is the premier resource for genetic, genomic and phenotype data for the laboratory rat, Rattus norvegicus. In addition to organizing biological data from rats, the RGD team focuses on manual curation of gene–disease associations for rat, human and mouse. In this work, we have analyzed disease-associated strains, quantitative trait loci (QTL) and genes from rats. These disease objects form the basis for seven disease portals. Among disease portals, the cardiovascular disease and obesity/metabolic syndrome portals have the highest number of rat strains and QTL. These two portals share 398 rat QTL, and these shared QTL are highly concentrated on rat chromosomes 1 and 2. For disease-associated genes, we performed gene ontology (GO) enrichment analysis across portals using RatMine enrichment widgets. Fifteen GO terms, five from each GO aspect, were selected to profile enrichment patterns of each portal. Of the selected biological process (BP) terms, ‘regulation of programmed cell death’ was the top enriched term across all disease portals except in the obesity/metabolic syndrome portal where ‘lipid metabolic process’ was the most enriched term. ‘Cytosol’ and ‘nucleus’ were common cellular component (CC) annotations for disease genes, but only the cancer portal genes were highly enriched with ‘nucleus’ annotations. Similar enrichment patterns were observed in a parallel analysis using the DAVID functional annotation tool. The relationship between the preselected 15 GO terms and disease terms was examined reciprocally by retrieving rat genes annotated with these preselected terms. The individual GO term–annotated gene list showed enrichment in physiologically related diseases. For example, the ‘regulation of blood pressure’ genes were enriched with cardiovascular disease annotations, and the ‘lipid metabolic process’ genes with obesity annotations. Furthermore, we were able to enhance enrichment of neurological
Dicty_cDB: SSL552 [Dicty_cDB

Lifescience Database Archive (English)

28 0.36 3 BZ330174 |BZ330174.1 hv90e08.b1 WGS-SbicolorF (JM107 adapted methyl filtered) Sorghum bicolor gen...S-SbicolorF (JM107 adapted methyl filtered) Sorghum bicolor genomic clone hv90e08 5', DNA sequence. 36 0.63 ...2 BZ335743 |BZ335743.1 hz26f02.g1 WGS-SbicolorF (JM107 adapted methyl filtered) Sorghum bicolor genomic clon...e hz26f02 5', DNA sequence. 36 0.68 2 BZ628823 |BZ628823.1 ih62d12.g1 WGS-SbicolorF (DH5a methyl filtered) S...06.1 ic39h07.b1 WGS-SbicolorF (JM107 adapted methyl filtered) Sorghum bicolor genomic clone ic39h07 5', DNA
Identification and characterization of two dermorphins from skin extracts of the Amazonian frog Phyllomedusa bicolor.

Science.gov (United States)

Mignogna, G; Severini, C; Simmaco, M; Negri, L; Erspamer, G F; Kreil, G; Barra, D

1992-05-11

Skin extracts of South American hylid frogs of the subfamily Phyllomedusinae contain dermorphins and deltorphins, opioid heptapeptides highly selective for either mu or delta receptors. In all these peptides, a D-amino acid is present in the second position. The structure of the precursors for Ala-deltorphins was recently deduced from cloned cDNAs derived from skin of Phyllomedusa bicolor (Richter et al. (1990) Proc. Natl. Acad. Sci. USA 87, 4836-4839). From the amino acid sequence of these precursors, the existence of three peptides related to dermorphin could be predicted. From methanol extracts of skin of Ph. bicolor we have isolated two of these peptides, [Lys7]dermorphin-OH and [Trp4,Asn7]dermorphin-OH. The biological activity of these new dermorphins and their amidated counterparts is presented.

Genomics Portals: integrative web-platform for mining genomics data.

Science.gov (United States)

Shinde, Kaustubh; Phatak, Mukta; Johannes, Freudenberg M; Chen, Jing; Li, Qian; Vineet, Joshi K; Hu, Zhen; Ghosh, Krishnendu; Meller, Jaroslaw; Medvedovic, Mario

2010-01-13

A large amount of experimental data generated by modern high-throughput technologies is available through various public repositories. Our knowledge about molecular interaction networks, functional biological pathways and transcriptional regulatory modules is rapidly expanding, and is being organized in lists of functionally related genes. Jointly, these two sources of information hold a tremendous potential for gaining new insights into functioning of living systems. Genomics Portals platform integrates access to an extensive knowledge base and a large database of human, mouse, and rat genomics data with basic analytical visualization tools. It provides the context for analyzing and interpreting new experimental data and the tool for effective mining of a large number of publicly available genomics datasets stored in the back-end databases. The uniqueness of this platform lies in the volume and the diversity of genomics data that can be accessed and analyzed (gene expression, ChIP-chip, ChIP-seq, epigenomics, computationally predicted binding sites, etc), and the integration with an extensive knowledge base that can be used in such analysis. The integrated access to primary genomics data, functional knowledge and analytical tools makes Genomics Portals platform a unique tool for interpreting results of new genomics experiments and for mining the vast amount of data stored in the Genomics Portals backend databases. Genomics Portals can be accessed and used freely at http://GenomicsPortals.org.
Marker list - PGDBj Registered plant list, Marker list, QTL list, Plant DB link & Genome analysis methods | LSDB Archive [Life Science Database Archive metadata

Lifescience Database Archive (English)

MIPS plant genome information resources.

Science.gov (United States)

Spannagl, Manuel; Haberer, Georg; Ernst, Rebecca; Schoof, Heiko; Mayer, Klaus F X

2007-01-01

The Munich Institute for Protein Sequences (MIPS) has been involved in maintaining plant genome databases since the Arabidopsis thaliana genome project. Genome databases and analysis resources have focused on individual genomes and aim to provide flexible and maintainable data sets for model plant genomes as a backbone against which experimental data, for example from high-throughput functional genomics, can be organized and evaluated. In addition, model genomes also form a scaffold for comparative genomics, and much can be learned from genome-wide evolutionary studies.
Casual finding of one chick of Tiaris bicolor (Aves: Thraupidae) in the digestive tract of a Boa constrictor

Directory of Open Access Journals (Sweden)

Cristina Sainz-Borgo

2018-05-01

The Boa constrictor (Red-Tailed Boa) is a species with a very diverse diet, ranging from small prey such as birds, squirrels, fish and lizards to larger prey such as deer. The aim of this note was to report the casual finding of one chick of Tiaris bicolor (Black-faced Grassquit) in the digestive tract of a B. constrictor that was accidentally and seriously injured by a lawnmower, causing its death. One of its injuries exposed the T. bicolor chick eaten by the snake. The observation was made in a garden of a residential area located in the southeast of Caracas (Venezuela).
Utilization of Indonesian eel (Anguilla bicolor) waste as flour in the fish processing industry in Palabuhanratu, Sukabumi Regency

Directory of Open Access Journals (Sweden)

RA Hangesti Emi Widyasari

2014-05-01

This research aims to analyze the nutritive value of head flour, liver flour and bone flour produced as by-products of Indonesian eel (Anguilla bicolor) processing. Eel waste flour was made by a thermal process using a drum dryer at the fish flour mill of PT. Carmelitha Lestari in Bogor; proximate analysis for chemical tests was performed at the Integrated Chemical Laboratory, IPB; and direct observation was conducted at PT Jawa Suisan Indah, Palabuhanratu, Sukabumi district, from October 2012 to April 2013. Proximate analysis showed that head flour, liver flour, and bone flour contained protein 61.78%, 53.92%, and 41.01%; fat 15.55%, 27.28%, and 13.07%; carbohydrate 11.48%, 14.96%, and 8.13%; water 5.44%, 8.48%, and 3.01%; ash 12.95%, 3.62%, and 37.49%; and crude fiber 1.33%, 0.04%, and 1.11%, respectively. Keywords: Anguilla bicolor, bone flour, head flour, liver flour, nutritive value
Comparing genomes: databases and computational tools for comparative analysis of prokaryotic genomes - DOI: 10.3395/reciis.v1i2.Sup.105en

Directory of Open Access Journals (Sweden)

Marcos Catanho

2007-12-01

Since the 1990s, the complete genetic code of more than 600 living organisms has been deciphered, such as bacteria, yeasts, protozoan parasites, invertebrates and vertebrates, including Homo sapiens, and plants. More than 2,000 other genome projects representing medical, commercial, environmental and industrial interests, or comprising model organisms important for the development of scientific research, are currently in progress. The achievement of complete genome sequences of numerous species, combined with the tremendous progress in computation that occurred in the last few decades, allowed the use of new holistic approaches in the study of genome structure, organization and evolution, as well as in the field of gene prediction and functional classification. Numerous public or proprietary databases and computational tools have been created attempting to optimize the access to this information through the web. In this review, we present the main resources available through the web for comparative analysis of prokaryotic genomes. We concentrated on the group of mycobacteria that contains important human and animal pathogens. The birth of Bioinformatics and Computational Biology and the contributions of these disciplines to the scientific development of this field are also discussed.
Overexpression of SbMyb60 in Sorghum bicolor impacts both primary and secondary metabolism

Science.gov (United States)

Few transcription factors have been identified in C4 grasses that either positively or negatively regulate monolignol biosynthesis. Previously, overexpression of SbMyb60 in sorghum (Sorghum bicolor (L.) Moench) was shown to induce monolignol synthesis, which led to elevated lignin deposition and al...
MIPS Arabidopsis thaliana Database (MAtDB): an integrated biological knowledge resource based on the first complete plant genome

Science.gov (United States)

Schoof, Heiko; Zaccaria, Paolo; Gundlach, Heidrun; Lemcke, Kai; Rudd, Stephen; Kolesov, Grigory; Arnold, Roland; Mewes, H. W.; Mayer, Klaus F. X.

2002-01-01

Arabidopsis thaliana is the first plant for which the complete genome has been sequenced and published. Annotation of complex eukaryotic genomes requires more than the assignment of genetic elements to the sequence. Besides completing the list of genes, we need to discover their cellular roles, their regulation and their interactions in order to understand the workings of the whole plant. The MIPS Arabidopsis thaliana Database (MAtDB; http://mips.gsf.de/proj/thal/db) started out as a repository for genome sequence data in the European Scientists Sequencing Arabidopsis (ESSA) project and the Arabidopsis Genome Initiative. Our aim is to transform MAtDB into an integrated biological knowledge resource by integrating diverse data, tools, query and visualization capabilities and by creating a comprehensive resource for Arabidopsis as a reference model for other species, including crop plants. PMID:11752263
Genome Sequence Databases (Overview): Sequencing and Assembly

Energy Technology Data Exchange (ETDEWEB)

Lapidus, Alla L.

2009-01-01

From the date its role in heredity was discovered, DNA has been generating interest among scientists from different fields of knowledge: physicists have studied the three-dimensional structure of the DNA molecule, biologists have tried to decode the secrets of life hidden within these long molecules, and technologists have invented and improved methods of DNA analysis. The analysis of the nucleotide sequence of DNA occupies a special place among the methods developed. Thanks to the variety of sequencing technologies available, the process of decoding the sequence of genomic DNA (or whole genome sequencing) has become robust and inexpensive. Meanwhile, the assembly of whole genome sequences remains a challenging task. In addition to the need to assemble millions of DNA fragments of different lengths (from 35 bp (Solexa) to 800 bp (Sanger)), great interest in the analysis of microbial communities (metagenomes) of different complexities raises new problems and pushes some new requirements for sequence assembly tools to the forefront. The genome assembly process can be divided into two steps: draft assembly and assembly improvement (finishing). Despite the fact that automatically performed assembly (or draft assembly) is capable of covering up to 98% of the genome, in most cases it still contains incorrectly assembled reads. The error rate of the consensus sequence produced at this stage is about 1/2000 bp. A finished genome represents a genome assembly of much higher accuracy (with no gaps or incorrectly assembled areas) and quality (~1 error/10,000 bp), validated through a number of computer and laboratory experiments.
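The two consensus error rates quoted in this abstract (about 1 error per 2,000 bp for a draft, about 1 per 10,000 bp for a finished genome) can be put on the familiar Phred scale, Q = -10 log10(p), and turned into an expected error count. The following is a minimal sketch only: the function names and the 5 Mb genome size are illustrative assumptions, not values from the record.

```python
import math


def phred_quality(bp_per_error: float) -> float:
    """Phred-scaled quality for a rate of one error per `bp_per_error` bases.

    Q = -10 * log10(p) with p = 1 / bp_per_error, i.e. Q = 10 * log10(bp_per_error).
    """
    return 10 * math.log10(bp_per_error)


def expected_errors(genome_size_bp: int, bp_per_error: float) -> float:
    """Expected number of consensus errors in a genome of the given size."""
    return genome_size_bp / bp_per_error


# Error rates taken from the abstract; the genome size is a hypothetical example.
DRAFT_BP_PER_ERROR = 2_000
FINISHED_BP_PER_ERROR = 10_000
EXAMPLE_GENOME_BP = 5_000_000  # assumed 5 Mb bacterial-sized genome

for label, rate in (("draft", DRAFT_BP_PER_ERROR), ("finished", FINISHED_BP_PER_ERROR)):
    q = phred_quality(rate)
    n = expected_errors(EXAMPLE_GENOME_BP, rate)
    print(f"{label}: Q{q:.1f}, about {n:.0f} expected errors in {EXAMPLE_GENOME_BP} bp")
```

Under these assumptions the draft rate corresponds to roughly Q33 and the finished rate to Q40, so finishing reduces the expected number of consensus errors five-fold for any genome size.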
The Importance of Biological Databases in Biological Discovery.

Science.gov (United States)

Baxevanis, Andreas D; Bateman, Alex

2015-06-19

Biological databases play a central role in bioinformatics. They offer scientists the opportunity to access a wide variety of biologically relevant data, including the genomic sequences of an increasingly broad range of organisms. This unit provides a brief overview of major sequence databases and portals, such as GenBank, the UCSC Genome Browser, and Ensembl. Model organism databases, including WormBase, The Arabidopsis Information Resource (TAIR), and those made available through the Mouse Genome Informatics (MGI) resource, are also covered. Non-sequence-centric databases, such as Online Mendelian Inheritance in Man (OMIM), the Protein Data Bank (PDB), MetaCyc, and the Kyoto Encyclopedia of Genes and Genomes (KEGG), are also discussed. Copyright © 2015 John Wiley & Sons, Inc.
Genomics Portals: integrative web-platform for mining genomics data

Directory of Open Access Journals (Sweden)

Ghosh Krishnendu

2010-01-01

Background: A large amount of experimental data generated by modern high-throughput technologies is available through various public repositories. Our knowledge about molecular interaction networks, functional biological pathways and transcriptional regulatory modules is rapidly expanding, and is being organized in lists of functionally related genes. Jointly, these two sources of information hold a tremendous potential for gaining new insights into functioning of living systems. Results: Genomics Portals platform integrates access to an extensive knowledge base and a large database of human, mouse, and rat genomics data with basic analytical visualization tools. It provides the context for analyzing and interpreting new experimental data and the tool for effective mining of a large number of publicly available genomics datasets stored in the back-end databases. The uniqueness of this platform lies in the volume and the diversity of genomics data that can be accessed and analyzed (gene expression, ChIP-chip, ChIP-seq, epigenomics, computationally predicted binding sites, etc.), and the integration with an extensive knowledge base that can be used in such analysis. Conclusion: The integrated access to primary genomics data, functional knowledge and analytical tools makes Genomics Portals platform a unique tool for interpreting results of new genomics experiments and for mining the vast amount of data stored in the Genomics Portals backend databases. Genomics Portals can be accessed and used freely at http://GenomicsPortals.org.
The Mouse Genome Database (MGD): facilitating mouse as a model for human biology and disease.

Science.gov (United States)

Eppig, Janan T; Blake, Judith A; Bult, Carol J; Kadin, James A; Richardson, Joel E

2015-01-01

The Mouse Genome Database (MGD, http://www.informatics.jax.org) serves the international biomedical research community as the central resource for integrated genomic, genetic and biological data on the laboratory mouse. To facilitate use of mouse as a model in translational studies, MGD maintains a core of high-quality curated data and integrates experimentally and computationally generated data sets. MGD maintains a unified catalog of genes and genome features, including functional RNAs, QTL and phenotypic loci. MGD curates and provides functional and phenotype annotations for mouse genes using the Gene Ontology and Mammalian Phenotype Ontology. MGD integrates phenotype data and associates mouse genotypes to human diseases, providing critical mouse-human relationships and access to repositories holding mouse models. MGD is the authoritative source of nomenclature for genes, genome features, alleles and strains following guidelines of the International Committee on Standardized Genetic Nomenclature for Mice. A new addition to MGD, the Human-Mouse: Disease Connection, allows users to explore gene-phenotype-disease relationships between human and mouse. MGD has also updated search paradigms for phenotypic allele attributes, incorporated incidental mutation data, added a module for display and exploration of genes and microRNA interactions and adopted the JBrowse genome browser. MGD resources are freely available to the scientific community. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
dBBQs: dataBase of Bacterial Quality scores.

Science.gov (United States)

Wanchai, Visanu; Patumcharoenpol, Preecha; Nookaew, Intawat; Ussery, David

2017-12-28

It is well known that genome sequencing technologies are becoming significantly cheaper and faster. As a result, the exponential growth of sequencing data in public databases allows us to explore ever-growing collections of genome sequences. However, it is less well known that the majority of sequenced genomes available in public databases are not complete but are drafts of varying quality. We have calculated quality scores for around 100,000 bacterial genomes from all major genome repositories and put them in a fast and easy-to-use database. Prokaryotic genomic data from all sources were collected and combined to make a non-redundant set of bacterial genomes. The genome quality score for each was calculated from four different measurements: assembly quality, number of rRNA genes, number of tRNA genes, and the occurrence of conserved functional domains. The dataBase of Bacterial Quality scores (dBBQs) was designed to store and retrieve quality scores. It offers fast searching and download features, and the results can be used for further analysis. In addition, the search results are shown in an interactive JavaScript chart framework using DC.js. The analysis of quality scores across major public genome databases finds that around 68% of the genomes are of acceptable quality for many uses. dBBQs (available at http://arc-gem.uams.edu/dbbqs ) provides genome quality scores for all available prokaryotic genome sequences with a user-friendly Web interface. These scores can be used as cut-offs to get a high-quality set of genomes for testing bioinformatics tools or improving the analysis. Moreover, all data from the four measurements that were combined to make the quality score for each genome can potentially be used for further analysis. dBBQs will be updated regularly and is free to use for non-commercial purposes.
Expression Pattern of the Alpha-Kafirin Promoter Coupled with a Signal Peptide from Sorghum bicolor L. Moench

Directory of Open Access Journals (Sweden)

Norazlina Ahmad

2012-01-01

Regulatory sequences with endosperm specificity are essential for foreign gene expression in the desired tissue for both grain quality improvement and molecular pharming. In this study, promoters of seed storage α-kafirin genes coupled with signal sequence (ss) were isolated from Sorghum bicolor L. Moench genomic DNA by PCR. The α-kafirin promoter (α-kaf) contains endosperm specificity-determining motifs, prolamin-box, the O2-box 1, CATC, and TATA boxes required for α-kafirin gene expression in sorghum seeds. The constructs pMB-Ubi-gfp and pMB-kaf-gfp were microprojectile bombarded into various sorghum and sweet corn explants. GFP expression was detected on all explants using the Ubi promoter but only in seeds for the α-kaf promoter. This shows that the α-kaf promoter isolated was functional and demonstrated seed-specific GFP expression. The constructs pMB-Ubi-ss-gfp and pMB-kaf-ss-gfp were also bombarded into the same explants. Detection of GFP expression showed that the signal peptide (SP)::GFP fusion can assemble and fold properly, preserving the fluorescent properties of GFP.
Transitioning from phosphate mining to agriculture: Responses to urea and slow release fertilizers for Sorghum bicolor.

Science.gov (United States)

Ruthrof, Katinka X; Steel, Emma; Misra, Sunil; McComb, Jen; O'Hara, Graham; Hardy, Giles E St J; Howieson, John

2018-06-01

Globally, land-use transition from mining to agriculture is becoming increasingly attractive and necessary for many reasons. However, low levels of necessary plant nutrients, and high levels of heavy metals, can hamper plant growth, affecting yield, and potentially, food safety. In post-phosphate mining substrates, for example, nitrogen (N) is a key limiting nutrient, and, although legumes are planted prior to cereals, N supplementation is still necessary. We undertook two field trials on Christmas Island, Australia, to determine whether Sorghum bicolor could be grown successfully in a post-phosphate mining substrate. The first trial investigated N (urea) demand (amount of N required for adequate crop growth) for S. bicolor, and whether N addition could reduce the naturally occurring cadmium (Cd) concentrations in the crop. The second trial examined whether slow release nitrogen fertilizers (SRF) could replace urea to increase biomass and reduce Cd concentrations. Our first trial demonstrated that S. bicolor has a high N demand, with the highest biomass being recorded in the 160 kg/ha urea treatment. However, plants treated with 80, 120 and 160 kg/ha were not significantly different from one another. After 7 weeks of growth, leaf Cd concentrations were significantly lower for all urea treatments compared with the control plants. However, after 23 weeks, seed Cd concentrations did not differ across treatments. Our second trial demonstrated that the application of SRF (Macracote® and Sulsync®) and 160 kg/ha urea significantly increased biomass above the control plants. There was, however, no treatment response in terms of Cd or N concentrations in the seed at final harvest. Thus, we have shown that N is currently critical for S. bicolor, even following legume cropping, and that high biomass and a significant reduction in Cd can be attained with appropriate levels of urea. Our work has important implications for cereal growth and food safety in post-mining agriculture
ANISEED 2017: extending the integrated ascidian database to the exploration and evolutionary comparison of genome-scale datasets.

Science.gov (United States)

Brozovic, Matija; Dantec, Christelle; Dardaillon, Justine; Dauga, Delphine; Faure, Emmanuel; Gineste, Mathieu; Louis, Alexandra; Naville, Magali; Nitta, Kazuhiro R; Piette, Jacques; Reeves, Wendy; Scornavacca, Céline; Simion, Paul; Vincentelli, Renaud; Bellec, Maelle; Aicha, Sameh Ben; Fagotto, Marie; Guéroult-Bellone, Marion; Haeussler, Maximilian; Jacox, Edwin; Lowe, Elijah K; Mendez, Mickael; Roberge, Alexis; Stolfi, Alberto; Yokomori, Rui; Brown, C Titus; Cambillau, Christian; Christiaen, Lionel; Delsuc, Frédéric; Douzery, Emmanuel; Dumollard, Rémi; Kusakabe, Takehiro; Nakai, Kenta; Nishida, Hiroki; Satou, Yutaka; Swalla, Billie; Veeman, Michael; Volff, Jean-Nicolas; Lemaire, Patrick

2018-01-04

ANISEED (www.aniseed.cnrs.fr) is the main model organism database for tunicates, the sister-group of vertebrates. This release gives access to annotated genomes, gene expression patterns, and anatomical descriptions for nine ascidian species. It provides increased integration with external molecular and taxonomy databases, better support for epigenomics datasets, in particular RNA-seq, ChIP-seq and SELEX-seq, and features novel interactive interfaces for existing and novel datatypes. In particular, the cross-species navigation and comparison is enhanced through a novel taxonomy section describing each represented species and through the implementation of interactive phylogenetic gene trees for 60% of tunicate genes. The gene expression section displays the results of RNA-seq experiments for the three major model species of solitary ascidians. Gene expression is controlled by the binding of transcription factors to cis-regulatory sequences. A high-resolution description of the DNA-binding specificity for 131 Ciona robusta (formerly C. intestinalis type A) transcription factors by SELEX-seq is provided and used to map candidate binding sites across the Ciona robusta and Phallusia mammillata genomes. Finally, use of a WashU Epigenome browser enhances genome navigation, while a Genomicus server was set up to explore microsynteny relationships within tunicates and with vertebrates, Amphioxus, echinoderms and hemichordates. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
KONAGAbase: a genomic and transcriptomic database for the diamondback moth, Plutella xylostella.

Science.gov (United States)

Jouraku, Akiya; Yamamoto, Kimiko; Kuwazaki, Seigo; Urio, Masahiro; Suetsugu, Yoshitaka; Narukawa, Junko; Miyamoto, Kazuhisa; Kurita, Kanako; Kanamori, Hiroyuki; Katayose, Yuichi; Matsumoto, Takashi; Noda, Hiroaki

2013-07-09

The diamondback moth (DBM), Plutella xylostella, is one of the most harmful insect pests for crucifer crops worldwide. DBM has rapidly evolved high resistance to most conventional insecticides such as pyrethroids, organophosphates, fipronil, spinosad, Bacillus thuringiensis, and diamides. Therefore, it is important to develop genomic and transcriptomic DBM resources for analysis of genes related to insecticide resistance, both to clarify the mechanism of resistance of DBM and to facilitate the development of insecticides with a novel mode of action for more effective and environmentally less harmful insecticide rotation. To contribute to this goal, we developed KONAGAbase, a genomic and transcriptomic database for DBM (KONAGA is the Japanese word for DBM). KONAGAbase provides (1) transcriptomic sequences of 37,340 ESTs/mRNAs and 147,370 RNA-seq contigs which were clustered and assembled into 84,570 unigenes (30,695 contigs, 50,548 pseudo singletons, and 3,327 singletons); and (2) genomic sequences of 88,530 WGS contigs with 246,244 degenerate contigs and 106,455 singletons from which 6,310 de novo identified repeat sequences and 34,890 predicted gene-coding sequences were extracted. The unigenes and predicted gene-coding sequences were clustered and 32,800 representative sequences were extracted as a comprehensive putative gene set. These sequences were annotated with BLAST descriptions, Gene Ontology (GO) terms, and Pfam descriptions, respectively. KONAGAbase contains rich graphical user interface (GUI)-based web interfaces for easy and efficient searching, browsing, and downloading sequences and annotation data. Five useful search interfaces consisting of BLAST search, keyword search, BLAST result-based search, GO tree-based search, and genome browser are provided. KONAGAbase is publicly available from our website (http://dbm.dna.affrc.go.jp/px/) through standard web browsers. KONAGAbase provides DBM comprehensive transcriptomic and draft genomic sequences with
Induced maturation of eel Anguilla bicolor using different hormone combination

Directory of Open Access Journals (Sweden)

Agus Oman Sudrajat

2015-10-01

Artificial reproduction of the eel Anguilla bicolor is not yet well established because of an insufficient number of broodstock. In this research, gonad maturation of Indonesian eel was induced hormonally with a combination of pregnant mare serum gonadotropin (PMSG), human chorionic gonadotropin (HCG), antidopamin and recombinant growth hormone (rGH). The research consisted of five treatments: control (NaCl 0.9%); PMSG 20 IU/kg; PMSG 20 IU/kg + antidopamin 10 ppm/kg; PMSG 20 IU/kg + antidopamin 10 ppm/kg + rGH 10 μg/kg; and PMSG 20 IU/kg + HCG 10 IU/kg. Each treatment contained 10 fish. Hormones were administered by intramuscular injection, five times at intervals of seven days. Observations on gonadal development were then performed after injection for 21 days. The results showed that the hormone treatments produced a pregnancy level of 100%, whereas the control produced 0%. The best treatment was PMSG 20 IU/kg + antidopamin 10 ppm/kg + rGH 10 μg/kg, as judged by the more mature phase of the gametes: spermatocytes in male fish and perinucleolar-phase oocytes in female fish. Eels with a body weight of 120.4 to 207.8 g and a body length of 40.9 to 43.1 cm were male, the eel with a body weight of 274.8 g and a body length of 47 cm was in the intersexual phase, and the eel with a body weight of 323.4 g and a body length of 53 cm was female. Keywords: Anguilla bicolor, antidopamin, hormones, PMSG, rGH, HCG
The Eukaryotic Pathogen Databases: a functional genomic resource integrating data from human and veterinary parasites.

Science.gov (United States)

Harb, Omar S; Roos, David S

2015-01-01

Over the past 20 years, advances in high-throughput biological techniques and the availability of computational resources including fast Internet access have resulted in an explosion of large genome-scale data sets ("big data"). While such data are readily available for download and personal use and analysis from a variety of repositories, such analysis often requires computational skills that are seldom available. As a result, a number of databases have emerged to provide scientists with online tools enabling the interrogation of data without the need for sophisticated computational skills beyond basic knowledge of Internet browser use. This chapter focuses on the Eukaryotic Pathogen Databases (EuPathDB: http://eupathdb.org) Bioinformatic Resource Center (BRC) and illustrates some of the available tools and methods.
Transfer of the cytochrome P450-dependent dhurrin pathway from Sorghum bicolor into Nicotiana tabacum chloroplasts for light-driven synthesis

DEFF Research Database (Denmark)

Gnanasekaran, Thiyagarajan; Karcher, Daniel; Nielsen, Agnieszka Janina Zygadlo

2016-01-01

. For this purpose, we stably engineered the dhurrin pathway from Sorghum bicolor into the chloroplasts of Nicotiana tabacum (tobacco). Dhurrin is a cyanogenic glucoside and its synthesis from the amino acid tyrosine is catalysed by two membrane-bound cytochrome P450 enzymes (CYP79A1 and CYP71E1) and a soluble...... glucosyltransferase (UGT85B1), and is dependent on electron transfer from a P450 oxidoreductase. The entire pathway was introduced into the chloroplast by integrating CYP79A1, CYP71E1, and UGT85B1 into a neutral site of the N. tabacum chloroplast genome. The two P450s and the UGT85B1 were functional when expressed...... compared to 6% in sorghum. The results obtained pave the way for plant P450s involved in the synthesis of economically important compounds to be engineered into the thylakoid membrane of chloroplasts, and demonstrate that their full catalytic cycle can be driven directly by photosynthesis-derived electrons....

BISQUE: locus- and variant-specific conversion of genomic, transcriptomic and proteomic database identifiers.

Science.gov (United States)

Meyer, Michael J; Geske, Philip; Yu, Haiyuan

2016-05-15

Biological sequence databases are integral to efforts to characterize and understand biological molecules and share biological data. However, when analyzing these data, scientists are often left holding disparate biological currency: molecular identifiers from different databases. For downstream applications that require converting the identifiers themselves, there are many resources available, but analyzing associated loci and variants can be cumbersome if data are not given in a form amenable to particular analyses. Here we present BISQUE, a web server and customizable command-line tool for converting molecular identifiers and their contained loci and variants between different database conventions. BISQUE uses a graph traversal algorithm to generalize the conversion process for residues in the human genome, genes, transcripts and proteins, allowing for conversion across classes of molecules and in all directions through an intuitive web interface and a URL-based web service. BISQUE is freely available via the web using any major web browser (http://bisque.yulab.org/). Source code is available in a public GitHub repository (https://github.com/hyulab/BISQUE).
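The graph traversal idea in the abstract above can be illustrated with a minimal sketch. This is not BISQUE's actual implementation: the node names (`gene`, `transcript`, `protein`), the `MAPPINGS` tables and every identifier in them are hypothetical, chosen only to show how a breadth-first search over identifier types can chain mapping tables together.

```python
from collections import deque

# Hypothetical mapping tables (not real database content):
# (source_type, target_type) -> {source_id: [target_ids]}
MAPPINGS = {
    ("gene", "transcript"): {"G1": ["T1", "T2"]},
    ("transcript", "protein"): {"T1": ["P1"], "T2": ["P2"]},
}

def build_graph(mappings):
    """Build a directed graph whose nodes are identifier types."""
    graph = {}
    for src, dst in mappings:
        graph.setdefault(src, []).append(dst)
        graph.setdefault(dst, [])
    return graph

def find_path(graph, start, goal):
    """Breadth-first search for the shortest chain of identifier types."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in graph.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

def convert(identifier, start, goal, mappings=MAPPINGS):
    """Follow the path found by find_path, applying one mapping table per edge."""
    path = find_path(build_graph(mappings), start, goal)
    if path is None:
        return []
    current = [identifier]
    for src, dst in zip(path, path[1:]):
        table = mappings[(src, dst)]
        converted = []
        for ident in current:
            for target in table.get(ident, []):
                if target not in converted:
                    converted.append(target)
        current = converted
    return current

print(convert("G1", "gene", "protein"))
```

The sketch is directed: converting in the reverse direction would require adding the inverted tables (for example a `("protein", "transcript")` entry) as extra edges.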
Full Data of Yeast Interacting Proteins Database (Original Version) - Yeast Interacting Proteins Database | LSDB Archive [Life Science Database Archive metadata]

Lifescience Database Archive (English)

Full Text Available List Contact us Yeast Interacting Proteins Database Full Data of Yeast Interacting Proteins Database (Origin...al Version) Data detail Data name Full Data of Yeast Interacting Proteins Database (Original Version) DOI 10....18908/lsdba.nbdc00742-004 Description of data contents The entire data in the Yeast Interacting Proteins Database...eir interactions are required. Several sources including YPD (Yeast Proteome Database, Costanzo, M. C., Hoga...ematic name in the SGD (Saccharomyces Genome Database; http://www.yeastgenome.org /). Bait gene name The gen
ProFITS of maize: a database of protein families involved in the transduction of signalling in the maize genome

Directory of Open Access Journals (Sweden)

Zhang Zhenhai

2010-10-01

Full Text Available Abstract Background Maize (Zea mays ssp. mays L.) is an important model for basic and applied plant research. In 2009, sequencing of the B73 maize genome made a great step forward using a clone-by-clone strategy; however, functional annotation and gene classification of the maize genome are still limited. Thus, well-annotated datasets and an informative database will be important for further research discoveries. Signal transduction is a fundamental biological process in living cells, and many protein families participate in this process in sensing, amplifying and responding to various extracellular or internal stimuli. Therefore, it is a good starting point to integrate information on the maize functional genes involved in signal transduction. Results Here we introduce a comprehensive database, 'ProFITS' (Protein Families Involved in the Transduction of Signalling), which endeavours to identify and classify protein kinases/phosphatases, transcription factors and ubiquitin-proteasome-system related genes in the B73 maize genome. Users can explore gene models, corresponding transcripts and FLcDNAs using the three abovementioned protein hierarchical categories, and visualize them using an AJAX-based genome browser (JBrowse) or the Generic Genome Browser (GBrowse). Functional annotations such as GO annotation, protein signatures, and protein best-hits in the Arabidopsis and rice genomes are provided. In addition, pre-calculated transcription factor binding sites of each gene are generated and mutant information is incorporated into ProFITS. In short, ProFITS provides a user-friendly web interface for studies of the signal transduction process in maize. Conclusion ProFITS, which utilizes both the B73 maize genome and full-length cDNA (FLcDNA) datasets, provides users with a comprehensive platform of maize annotation with specific focus on the categorization of families involved in the signal transduction process. ProFITS is designed as a user-friendly web interface and it is
ODG: Omics database generator - a tool for generating, querying, and analyzing multi-omics comparative databases to facilitate biological understanding.

Science.gov (United States)

Guhlin, Joseph; Silverstein, Kevin A T; Zhou, Peng; Tiffin, Peter; Young, Nevin D

2017-08-10

Rapid generation of omics data in recent years has resulted in vast amounts of disconnected datasets without systemic integration and knowledge building, while individual groups have made customized, annotated datasets available on the web with few ways to link them to in-lab datasets. With so many research groups generating their own data, the ability to relate it to the larger genomic and comparative genomic context is becoming increasingly crucial to make full use of the data. The Omics Database Generator (ODG) allows users to create customized databases that combine published genomics data with experimental data and can be queried using a flexible graph database. When provided with omics and experimental data, ODG will create a comparative, multi-dimensional graph database. ODG can import definitions and annotations from other sources such as InterProScan, the Gene Ontology, ENZYME, UniPathway, and others. This annotation data can be especially useful for studying new or understudied species for which transcripts have only been predicted, and can rapidly give additional layers of annotation to predicted genes. In better studied species, ODG can perform syntenic annotation translations or rapidly identify characteristics of a set of genes or nucleotide locations, such as hits from an association study. ODG provides a web-based user interface for configuring the data import and for querying the database. Queries can also be run from the command line, and the database can be queried directly through programming language hooks available for most languages. ODG supports most common genomic formats as well as a generic, easy-to-use tab-separated value format for user-provided annotations. ODG is a user-friendly database generation and query tool that adapts to the supplied data to produce a comparative genomic database or multi-layered annotation database. ODG provides rapid comparative genomic annotation and is therefore particularly useful for non-model or
OryzaGenome: Genome Diversity Database of Wild Oryza Species

KAUST Repository

Ohyanagi, Hajime; Ebata, Toshinobu; Huang, Xuehui; Gong, Hao; Fujita, Masahiro; Mochizuki, Takako; Toyoda, Atsushi; Fujiyama, Asao; Kaminuma, Eli; Nakamura, Yasukazu; Feng, Qi; Wang, Zi Xuan; Han, Bin; Kurata, Nori

2015-01-01

. Portable VCF (variant call format) file or tab-delimited file download is also available. Following these SNP (single nucleotide polymorphism) data, reference pseudomolecules/scaffolds/contigs and genome-wide variation information for almost all
LC-MS/MS-based proteome profiling in Daphnia pulex and Daphnia longicephala: the Daphnia pulex genome database as a key for high throughput proteomics in Daphnia

Directory of Open Access Journals (Sweden)

Mayr Tobias

2009-04-01

Full Text Available Abstract Background Daphniids, commonly known as waterfleas, serve as important model systems for ecology, evolution and the environmental sciences. The sequencing and annotation of the Daphnia pulex genome both open future avenues of research on this model organism. As proteomics is not only essential to our understanding of cell function but is also a powerful validation tool for predicted genes in genome annotation projects, a first proteomic dataset is presented in this article. Results A comprehensive set of 701,274 peptide tandem-mass-spectra, derived from Daphnia pulex, was generated, which led to the identification of 531 proteins. To measure the impact of the Daphnia pulex filtered models database on mass spectrometry based Daphnia protein identification, this result was compared with results obtained with the Swiss-Prot and the Drosophila melanogaster databases. To further validate the utility of the Daphnia pulex database for research on other Daphnia species, an additional 407,778 peptide tandem-mass-spectra, obtained from Daphnia longicephala, were generated and evaluated, leading to the identification of 317 proteins. Conclusion Peptides identified in our approach provide the first experimental evidence for the translation of a broad variety of predicted coding regions within the Daphnia genome. Furthermore, it could be demonstrated that identification of Daphnia longicephala proteins using the Daphnia pulex protein database is feasible but shows a slightly reduced identification rate. Data provided in this article clearly demonstrate that the Daphnia genome database is the key for mass spectrometry based high throughput proteomics in Daphnia.
Detecting non-orthology in the COGs database and other approaches grouping orthologs using genome-specific best hits.

Science.gov (United States)

Dessimoz, Christophe; Boeckmann, Brigitte; Roth, Alexander C J; Gonnet, Gaston H

2006-01-01

Correct orthology assignment is a critical prerequisite of numerous comparative genomics procedures, such as function prediction, construction of phylogenetic species trees and genome rearrangement analysis. We present an algorithm for the detection of non-orthologs that arise by mistake in current orthology classification methods based on genome-specific best hits, such as the COGs database. The algorithm works with pairwise distance estimates, rather than computationally expensive and error-prone tree-building methods. The accuracy of the algorithm is evaluated through verification of the distribution of predicted cases, case-by-case phylogenetic analysis and comparisons with predictions from other projects using independent methods. Our results show that a very significant fraction of the COG groups include non-orthologs: using conservative parameters, the algorithm detects non-orthology in a third of all COG groups. Consequently, sequence analysis sensitive to correct orthology assignments will greatly benefit from these findings.
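For readers unfamiliar with the "genome-specific best hits" approach that the abstract above scrutinizes, the following is a minimal sketch of reciprocal best-hit pairing between two genomes. It is not the authors' non-orthology detection algorithm, whose details are not given in this record; the function names, protein names and scores are all hypothetical.

```python
def best_hits(scores):
    """For each query protein, keep the highest-scoring subject.

    scores: {(query, subject): similarity_score}
    """
    best = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}

def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Pairs (a, b) where b is a's best hit in genome B and a is b's best hit in genome A."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)

# Hypothetical similarity scores between proteins of genome A and genome B.
a_vs_b = {("a1", "b1"): 90, ("a1", "b2"): 50, ("a2", "b1"): 80}
b_vs_a = {("b1", "a1"): 88, ("b2", "a1"): 50}

print(reciprocal_best_hits(a_vs_b, b_vs_a))
```

Here `a2` also points to `b1` as its best hit, but the relation is not reciprocal, so only the pair `("a1", "b1")` is kept. Best-hit pairs of this kind can still join paralogs when the true ortholog has been lost, which is the class of error the abstract addresses.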
Using FlyBase, a Database of Drosophila Genes and Genomes.

Science.gov (United States)

Marygold, Steven J; Crosby, Madeline A; Goodman, Joshua L

2016-01-01

For nearly 25 years, FlyBase (flybase.org) has provided a freely available online database of biological information about Drosophila species, focusing on the model organism D. melanogaster. The need for a centralized, integrated view of Drosophila research has never been greater as advances in genomic, proteomic, and high-throughput technologies add to the quantity and diversity of available data and resources. FlyBase has taken several approaches to respond to these changes in the research landscape. Novel report pages have been generated for new reagent types and physical interaction data; Drosophila models of human disease are now represented and showcased in dedicated Human Disease Model Reports; other integrated reports have been established that bring together related genes, datasets, or reagents; Gene Reports have been revised to improve access to new data types and to highlight functional data; links to external sites have been organized and expanded; and new tools have been developed to display and interrogate all these data, including improved batch processing and bulk file availability. In addition, several new community initiatives have served to enhance interactions between researchers and FlyBase, resulting in direct user contributions and improved feedback. This chapter provides an overview of the data content, organization, and available tools within FlyBase, focusing on recent improvements. We hope it serves as a guide for our diverse user base, enabling efficient and effective exploration of the database and thereby accelerating research discoveries.
Assembly, Annotation, and Analysis of Multiple Mycorrhizal Fungal Genomes

Energy Technology Data Exchange (ETDEWEB)

Initiative Consortium, Mycorrhizal Genomics; Kuo, Alan; Grigoriev, Igor; Kohler, Annegret; Martin, Francis

2013-03-08

Mycorrhizal fungi play critical roles in host plant health, soil community structure and chemistry, and carbon and nutrient cycling, all areas of intense interest to the US Dept. of Energy (DOE) Joint Genome Institute (JGI). To this end we are building on our earlier sequencing of the Laccaria bicolor genome by partnering with INRA-Nancy and the mycorrhizal research community in the MGI to sequence and analyze dozens of mycorrhizal genomes of all Basidiomycota and Ascomycota orders and multiple ecological types (ericoid, orchid, and ectomycorrhizal). JGI has developed and deployed high-throughput sequencing techniques, and Assembly, RNASeq, and Annotation Pipelines. In 2012 alone we sequenced, assembled, and annotated 12 draft or improved genomes of mycorrhizae, and predicted ~232831 genes and ~15011 multigene families. All of this data is publicly available on JGI MycoCosm (http://jgi.doe.gov/fungi/), which provides access to both the genome data and tools with which to analyze the data. Preliminary comparisons of the current total of 14 public mycorrhizal genomes suggest that 1) short secreted proteins potentially involved in symbiosis are more enriched in some orders than in others amongst the mycorrhizal Agaricomycetes, 2) there are wide ranges of numbers of genes involved in certain functional categories, such as signal transduction and post-translational modification, and 3) novel gene families are specific to some ecological types.
dBBQs: dataBase of Bacterial Quality scores

OpenAIRE

Wanchai, Visanu; Patumcharoenpol, Preecha; Nookaew, Intawat; Ussery, David

2017-01-01

Background: It is well known that genome sequencing technologies are becoming significantly cheaper and faster. As a result, the exponential growth of sequencing data in public databases allows us to explore ever larger collections of genome sequences. However, it is less well known that the majority of sequenced genomes available in public databases are not complete, but are drafts of varying quality. We have calculated quality scores for around 100,000 bacterial genomes from al...
Genome-wide screen for universal individual identification SNPs based on the HapMap and 1000 Genomes databases.

Science.gov (United States)

Huang, Erwen; Liu, Changhui; Zheng, Jingjing; Han, Xiaolong; Du, Weian; Huang, Yuanjian; Li, Chengshi; Wang, Xiaoguang; Tong, Dayue; Ou, Xueling; Sun, Hongyu; Zeng, Zhaoshu; Liu, Chao

2018-04-03

Differences in SNP selection and in the populations studied have left SNP panels for individual identification with few SNPs in common, compromising their universal applicability. To screen for universal SNPs, we performed genome-wide SNP mining in multiple populations based on the HapMap and 1000 Genomes databases. SNPs with high minor allele frequencies (MAF) in 37 populations were selected. As the MAF threshold increased from ≥0.35 to ≥0.43, the number of selected SNPs decreased from 2769 to 0. A total of 117 SNPs with MAF ≥0.39 have no linkage disequilibrium with each other in every population. For 116 of the 117 SNPs, cumulative match probability (CMP) ranged from 2.01 × 10^-48 to 1.93 × 10^-50 and cumulative exclusion probability (CEP) ranged from 0.9999999996653 to 0.9999999999945. In 134 tested Han samples, 110 of the 117 SNPs retained a high MAF and conformed to Hardy-Weinberg equilibrium, with CMP = 4.70 × 10^-47 and CEP = 0.999999999862. When the same number of autosomal SNPs as in the HID-Ion AmpliSeq Identity Panel was analyzed, i.e. 90 randomly chosen from the 110 SNPs, our panel yielded preferable CMP and CEP. Taken together, the 110-SNP panel is advantageous for forensic testing, and this study provides plenty of highly informative SNPs for compiling final universal panels.
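The relationship between MAF and cumulative match probability reported above can be sketched with the standard textbook calculation for a biallelic SNP. This is an illustration under stated assumptions, not the authors' pipeline: it assumes Hardy-Weinberg equilibrium and independent loci, the selection rule (minimum MAF across populations at or above the threshold) is one plausible reading of the abstract, and the SNP names and frequencies are hypothetical.

```python
import math

def match_probability(maf):
    """Probability that two unrelated individuals share a genotype at one
    biallelic SNP under Hardy-Weinberg equilibrium: p^4 + (2pq)^2 + q^4."""
    p, q = maf, 1.0 - maf
    return p**4 + (2 * p * q) ** 2 + q**4

def cumulative_match_probability(mafs):
    """Product of per-SNP match probabilities, assuming independent loci."""
    return math.prod(match_probability(m) for m in mafs)

def select_snps(maf_by_population, threshold):
    """Keep SNPs whose MAF is at or above the threshold in every population
    (assumed selection rule)."""
    return [snp for snp, mafs in maf_by_population.items() if min(mafs) >= threshold]

# Hypothetical SNPs with MAFs in two populations.
example = {"snpA": [0.40, 0.45], "snpB": [0.30, 0.48]}
print(select_snps(example, 0.39))
print(match_probability(0.5))
print(cumulative_match_probability([0.5] * 116))
```

At the theoretical optimum of MAF = 0.5 each SNP contributes a match probability of 0.375, so 116 such independent SNPs give a CMP on the order of 10^-50, the same order of magnitude as the range quoted in the abstract.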
MIPS: analysis and annotation of proteins from whole genomes.

Science.gov (United States)

Mewes, H W; Amid, C; Arnold, R; Frishman, D; Güldener, U; Mannhaupt, G; Münsterkötter, M; Pagel, P; Strack, N; Stümpflen, V; Warfsmann, J; Ruepp, A

2004-01-01

The Munich Information Center for Protein Sequences (MIPS-GSF), Neuherberg, Germany, provides protein sequence-related information based on whole-genome analysis. The main focus of the work is directed toward the systematic organization of sequence-related attributes as gathered by a variety of algorithms, primary information from experimental data together with information compiled from the scientific literature. MIPS maintains automatically generated and manually annotated genome-specific databases, develops systematic classification schemes for the functional annotation of protein sequences and provides tools for the comprehensive analysis of protein sequences. This report updates the information on the yeast genome (CYGD), the Neurospora crassa genome (MNCDB), the database of complete cDNAs (German Human Genome Project, NGFN), the database of mammalian protein-protein interactions (MPPI), the database of FASTA homologies (SIMAP), and the interface for the fast retrieval of protein-associated information (QUIPOS). The Arabidopsis thaliana database, the rice database, the plant EST databases (MATDB, MOsDB, SPUTNIK), as well as the databases for the comprehensive set of genomes (PEDANT genomes) are described elsewhere in the 2003 and 2004 NAR database issues, respectively. All databases described, and the detailed descriptions of our projects can be accessed through the MIPS web server (http://mips.gsf.de).
Thoroughbred Horse Single Nucleotide Polymorphism and Expression Database: HSDB

Directory of Open Access Journals (Sweden)

Joon-Ho Lee

2014-09-01

Full Text Available Genetics is important for breeding and selection of horses, but there is a lack of well-established horse-related browsers or databases. In order to better understand horses, more variants and other integrated information are needed. Thus, we constructed a horse genomic variants database including expression and other information. The Horse Single Nucleotide Polymorphism and Expression Database (HSDB) (http://snugenome2.snu.ac.kr/HSDB) provides the number of unexplored genomic variants still remaining to be identified in the horse genome, including rare variants, by using population genome sequences of eighteen horses and RNA-seq of four horses. The identified single nucleotide polymorphisms (SNPs) were confirmed by comparing them with SNP chip data and variants of RNA-seq, which showed a concordance level of 99.02% and 96.6%, respectively. Moreover, the database provides the genomic variants with their corresponding transcriptional profiles from the same individuals to help understand the functional aspects of these variants. The database will contribute to genetic improvement and breeding strategies of Thoroughbreds.
Expanded microbial genome coverage and improved protein family annotation in the COG database.

Science.gov (United States)

Galperin, Michael Y; Makarova, Kira S; Wolf, Yuri I; Koonin, Eugene V

2015-01-01

Microbial genome sequencing projects produce numerous sequences of deduced proteins, only a small fraction of which have been or will ever be studied experimentally. This leaves sequence analysis as the only feasible way to annotate these proteins and assign to them tentative functions. The Clusters of Orthologous Groups of proteins (COGs) database (http://www.ncbi.nlm.nih.gov/COG/), first created in 1997, has been a popular tool for functional annotation. Its success was largely based on (i) its reliance on complete microbial genomes, which allowed reliable assignment of orthologs and paralogs for most genes; (ii) orthology-based approach, which used the function(s) of the characterized member(s) of the protein family (COG) to assign function(s) to the entire set of carefully identified orthologs and describe the range of potential functions when there were more than one; and (iii) careful manual curation of the annotation of the COGs, aimed at detailed prediction of the biological function(s) for each COG while avoiding annotation errors and overprediction. Here we present an update of the COGs, the first since 2003, and a comprehensive revision of the COG annotations and expansion of the genome coverage to include representative complete genomes from all bacterial and archaeal lineages down to the genus level. This re-analysis of the COGs shows that the original COG assignments had an error rate below 0.5% and allows an assessment of the progress in functional genomics in the past 12 years. During this time, functions of many previously uncharacterized COGs have been elucidated and tentative functional assignments of many COGs have been validated, either by targeted experiments or through the use of high-throughput methods. A particularly important development is the assignment of functions to several widespread, conserved proteins many of which turned out to participate in translation, in particular rRNA maturation and tRNA modification. The new version of the
Toxic hepatitis caused by the excretions of the Phyllomedusa bicolor frog - a case report

OpenAIRE

Pogorzelska, Joanna; Łapiński, Tadeusz W.

2017-01-01

The Kambô ritual consists of various types of skin scarification and subsequent application of Phyllomedusa bicolor secretion to the fresh wounds. In Europe, the ritual of Kambô is becoming more popular, but its use can lead to serious multiple organ damage, sometimes life-threatening. Our manuscript shows a patient with toxic liver damage probably associated with the Kambô ritual.
pico-PLAZA, a genome database of microbial photosynthetic eukaryotes.

Science.gov (United States)

Vandepoele, Klaas; Van Bel, Michiel; Richard, Guilhem; Van Landeghem, Sofie; Verhelst, Bram; Moreau, Hervé; Van de Peer, Yves; Grimsley, Nigel; Piganeau, Gwenael

2013-08-01

With the advent of next generation genome sequencing, the number of sequenced algal genomes and transcriptomes is rapidly growing. Although a few genome portals exist to browse individual genome sequences, exploring complete genome information from multiple species for the analysis of user-defined sequences or gene lists remains a major challenge. pico-PLAZA is a web-based resource (http://bioinformatics.psb.ugent.be/pico-plaza/) for algal genomics that combines different data types with intuitive tools to explore genomic diversity, perform integrative evolutionary sequence analysis and study gene functions. Apart from homologous gene families, multiple sequence alignments, phylogenetic trees, Gene Ontology, InterPro and text-mining functional annotations, different interactive viewers are available to study genome organization using gene collinearity and synteny information. Different search functions, documentation pages, export functions and an extensive glossary are available to guide non-expert scientists. To illustrate the versatility of the platform, different case studies are presented demonstrating how pico-PLAZA can be used to functionally characterize large-scale EST/RNA-Seq data sets and to perform environmental genomics. Functional enrichment analysis of 16 Phaeodactylum tricornutum transcriptome libraries offers a molecular view on diatom adaptation to different environments of ecological relevance. Furthermore, we show how complementary genomic data sources can easily be combined to identify marker genes to study the diversity and distribution of algal species, for example in metagenomes, or to quantify intraspecific diversity from environmental strains.
Genomic Prediction from Whole Genome Sequence in Livestock: The 1000 Bull Genomes Project

DEFF Research Database (Denmark)

Hayes, Benjamin J; MacLeod, Iona M; Daetwyler, Hans D

Advantages of using whole genome sequence data to predict genomic estimated breeding values (GEBV) include better persistence of accuracy of GEBV across generations and more accurate GEBV across breeds. The 1000 Bull Genomes Project provides a database of whole genome sequenced key ancestor bulls....... In a dairy data set, predictions using BayesRC and imputed sequence data from 1000 Bull Genomes were 2% more accurate than with 800k data. We could demonstrate the method identified causal mutations in some cases. Further improvements will come from more accurate imputation of sequence variant genotypes...
Dicty_cDB: SSM365 [Dicty_cDB

Lifescience Database Archive (English)

Full Text Available WGS-SbicolorF (JM107 adapted methyl filtered) Sorghum bicolor genomic clone hv90e08 5', DNA sequence. 36 1.2... 2 BZ330175 |BZ330175.1 hv90e08.g1 WGS-SbicolorF (JM107 adapted methyl filtered) ...rF (JM107 adapted methyl filtered) Sorghum bicolor genomic clone hz26f02 5', DNA sequence. 36 1.4 2 BZ628823... |BZ628823.1 ih62d12.g1 WGS-SbicolorF (DH5a methyl filtered) Sorghum bicolor geno...mic clone ih62d12 5', DNA sequence. 36 1.7 2 BZ340606 |BZ340606.1 ic39h07.b1 WGS-SbicolorF (JM107 adapted methyl filter
The development of large-scale de-identified biomedical databases in the age of genomics-principles and challenges.

Science.gov (United States)

Dankar, Fida K; Ptitsyn, Andrey; Dankar, Samar K

2018-04-10

Contemporary biomedical databases include a wide range of information types from various observational and instrumental sources. Among the most important features that unite biomedical databases across the field are high volume of information and high potential to cause damage through data corruption, loss of performance, and loss of patient privacy. Thus, issues of data governance and privacy protection are essential for the construction of data depositories for biomedical research and healthcare. In this paper, we discuss various challenges of data governance in the context of population genome projects. The various challenges along with best practices and current research efforts are discussed through the steps of data collection, storage, sharing, analysis, and knowledge dissemination.
Flavonoids Isolated From the Flowers of Limonium bicolor and their In vitro Antitumor Evaluation.

Science.gov (United States)

Chen, Jian; Teng, Jiehui; Ma, Li; Tong, Haiying; Ren, Bingru; Wang, Linshan; Li, Weilin

2017-01-01

Limonium bicolor, a halophytic species that can grow in saline or saline-alkali soil, is well known as a traditional Chinese medicine. Recently it has attracted much attention for its use in the treatment of cancer. The present study was performed to evaluate this species from the phytochemical standpoint and to examine the possible relationship between its antitumor activity and its natural products. The chemical constituents from the flowers of L. bicolor were investigated through bioassay-guided fractionation and isolation. All the individual compounds were characterized by spectroscopic analysis and their potential antitumor activity was tested against three different human tumor cell lines by MTT assays. The EtOAc extract proved to be the most potent fraction, and further fractionation led to the isolation of 15 natural flavonoids, which were characterized as luteolin (1), acacetin (2), quercetin (3), isorhamnetin (4), kaempferol (5), eriodictyol (6), kaempferol-3-O-α-L-rhamnoside (7), kaempferol-3-O-β-D-glucoside (8), quercetin-3-O-α-L-rhamnoside (9), quercetin-3-O-β-D-glucoside (10), quercetin-3-O-β-D-galactoside (11), myricetin-3-O-α-L-rhamnoside (12), kaempferol-3-O-(6″-O-galloyl)-β-D-glucoside (13), hesperidin (14) and rutin (15). The biotesting results demonstrated that both compounds 1 and 3 showed good cytotoxicity against human colon cancer cells (LOVO). Compound 5 exhibited relatively greater growth inhibition against both human breast cancer cells (MCF-7) and osteosarcoma cell lines (U2-OS) at a concentration of 100 μg/mL. On the basis of these findings, the flavonoids were deduced to be potentially responsible for the antitumor activity of L. bicolor. The preliminary structure-activity relationship analysis suggests that the 3-O-glycosylation moiety in natural flavonoids was not essential for the antiproliferative activity on LOVO and U2-OS cells. The phytochemical investigation of Limonium bicolor led to the isolation of 15 flavonoids. The biotesting of the

Dietary administration of Gynura bicolor (Roxb. & Willd.) DC water extract enhances immune response and survival rate against Vibrio alginolyticus and white spot syndrome virus in white shrimp Litopenaeus vannamei.

Science.gov (United States)

Wu, Chih-Chung; Chang, Yueh-Ping; Wang, Jyh-Jye; Liu, Chun-Hung; Wong, Saou-Lien; Jiang, Chii-Ming; Hsieh, Shu-Ling

2015-01-01

Gynura bicolor (Roxb. & Willd.) DC., a perennial plant belonging to the Asteraceae family, originates from tropical Asia. The total hemocyte count (THC), phenoloxidase (PO) activity, respiratory bursts (RBs), superoxide dismutase (SOD) activity, and lysozyme activity were examined after white shrimp Litopenaeus vannamei had been fed diets containing the water extract of G. bicolor at 0 (control), 0.5, 1.0, and 2.0 g (kg diet)^-1 for 7-28 days. The results indicated that these parameters increased with the amount of extract and with time. THCs of the shrimp fed the G. bicolor diets at 1.0 and 2.0 g (kg diet)^-1 were significantly higher than those of shrimp fed the control diet for 14-28 days. For the shrimp fed the G. bicolor diets at 0.5, 1.0, and 2.0 g (kg diet)^-1, the PO, RBs, and lysozyme activities reached their highest levels after 7 days, whereas SOD activity reached its highest level after 14 days. In a separate experiment, white shrimp L. vannamei fed the diets containing the G. bicolor extract for 28 days were challenged with Vibrio alginolyticus at 3 × 10^6 cfu shrimp^-1 and white spot syndrome virus (WSSV) at 1 × 10^3 copies shrimp^-1. The survival rate of the shrimp fed the G. bicolor diets was significantly higher than that of the shrimp fed the control diet at 48-144 h post-challenge with V. alginolyticus and WSSV. For the shrimp fed the G. bicolor diets at 0.5, 1 and 2 g (kg diet)^-1 under challenges of V. alginolyticus and WSSV, their LPS- and β-1,3-glucan-binding protein (LGBP) and peroxinectin (PE) mRNA expressions were significantly higher than those of the challenged control shrimp at 12-96 and 24-144 h post-challenge, respectively. We concluded that dietary administration of a G. bicolor extract could enhance innate immunity within 28 days, as evidenced by the increases in immune parameters (PO, RBs, and lysozyme) and antioxidant enzyme (SOD) activities of shrimp against V. alginolyticus and WSSV
Novel storage technologies for raw and clarified syrup biomass feedstocks from sweet sorghum (Sorghum bicolor L. Moench)

Science.gov (United States)

Attention is currently focused on developing sustainable supply chains of sugar feedstocks for new, flexible biorefineries. Fundamental processing needs identified by industry for the large-scale manufacture of biofuels and bioproducts from sweet sorghum (Sorghum bicolor L. Moench) include stabiliz...
Materials as regard about ecology and spreading of lycodine striatum bicolor nik in Tajikistan

International Nuclear Information System (INIS)

Sattorov, T.S.; Khidirov, Kh.; Mukhammadkulov, M.

2003-01-01

This article presents new scientific information on the biology, ecology and distribution of Lycodine striatum bicolor within the territory of Tajikistan. The findings reported here on the distribution of this snake are new. This rare snake was discovered for the first time in the northern part of Tajikistan. The new information will enrich our knowledge of the reptile fauna of Tajikistan.
Toxic hepatitis caused by the excretions of the Phyllomedusa bicolor frog - a case report.

Science.gov (United States)

Pogorzelska, Joanna; Łapiński, Tadeusz W

2017-03-01

The Kambô ritual consists of various types of skin scarification and subsequent application of Phyllomedusa bicolor secretion to the fresh wounds. In Europe, the ritual of Kambô is becoming more popular, but its use can lead to serious multiple organ damage, sometimes life-threatening. Our manuscript shows a patient with toxic liver damage probably associated with the Kambô ritual.
MAKER2: an annotation pipeline and genome-database management tool for second-generation genome projects.

Science.gov (United States)

Holt, Carson; Yandell, Mark

2011-12-22

Second-generation sequencing technologies are precipitating major shifts with regards to what kinds of genomes are being sequenced and how they are annotated. While the first generation of genome projects focused on well-studied model organisms, many of today's projects involve exotic organisms whose genomes are largely terra incognita. This complicates their annotation, because unlike first-generation projects, there are no pre-existing 'gold-standard' gene-models with which to train gene-finders. Improvements in genome assembly and the wide availability of mRNA-seq data are also creating opportunities to update and re-annotate previously published genome annotations. Today's genome projects are thus in need of new genome annotation tools that can meet the challenges and opportunities presented by second-generation sequencing technologies. We present MAKER2, a genome annotation and data management tool designed for second-generation genome projects. MAKER2 is a multi-threaded, parallelized application that can process second-generation datasets of virtually any size. We show that MAKER2 can produce accurate annotations for novel genomes where training-data are limited, of low quality or even non-existent. MAKER2 also provides an easy means to use mRNA-seq data to improve annotation quality; and it can use these data to update legacy annotations, significantly improving their quality. We also show that MAKER2 can evaluate the quality of genome annotations, and identify and prioritize problematic annotations for manual review. MAKER2 is the first annotation engine specifically designed for second-generation genome projects. MAKER2 scales to datasets of any size, requires little in the way of training data, and can use mRNA-seq data to improve annotation quality. It can also update and manage legacy genome annotation datasets.
'Brasileirinha': cultivar de abóbora (Cucurbita moschata) de frutos bicolores com valor ornamental e aptidão para consumo verde 'Brasileirinha': an ornamental bicolor squash (Cucurbita moschata) cultivar for immature fruit consumption

Directory of Open Access Journals (Sweden)

Leonardo S Boiteux

2007-03-01

'Brasileirinha' is a squash (Cucurbita moschata) cultivar with bicolor fruits that was developed to offer a differentiated product owing to the ornamental appearance and nutritional composition of its fruits. The cultivar was selected in the F7 generation and was obtained through conventional crosses between an accession with bicolor fruits, probably due to the presence of an allele of the B gene, and the cultivar Mocinha (with immature fruits of uniform green color). The distinctive trait of Brasileirinha is the production of fruits whose rind shows a striking bicolor pattern (yellow in the proximal region and green in the distal region of the fruit). The flesh is greenish-yellow in fruits harvested immature, and an orange color intensifies as the fruit ripens. Beta-carotene and lutein are the main carotenoids in fruits for immature consumption. In fully mature fruits (intense orange flesh), beta-carotene and alpha-carotene (vitamin A precursors) accumulate to around 243 mg g-1. Brasileirinha has shown good field resistance to different races of powdery mildew (Podosphaera xanthii). The cultivar is preferentially recommended for consumption as green summer squash (at the immature fruit stage) and for ornamental purposes (fruits at all stages). One option is the use of young fruits in preserves. Brasileirinha is recommended for planting in all traditional squash-producing regions of the country. The production system for this cultivar has been the same as that adopted for other types of squash. 'Brasileirinha' is a squash (Cucurbita moschata) cultivar developed by Embrapa Vegetable Crops; its appealing ornamental appearance and the carotenoid composition of its fruits might provide raw material for the development of value-added products targeting new market niches. This cultivar is an F7
Overexpression of Laccaria bicolor aquaporin JQ585595 alters root water transport properties in ectomycorrhizal white spruce (Picea glauca) seedlings.

Science.gov (United States)

Xu, Hao; Kemppainen, Minna; El Kayal, Walid; Lee, Seong Hee; Pardo, Alejandro G; Cooke, Janice E K; Zwiazek, Janusz J

2015-01-01

The contribution of hyphae to water transport in ectomycorrhizal (ECM) white spruce (Picea glauca) seedlings was examined by altering expression of a major water-transporting aquaporin in Laccaria bicolor. Picea glauca was inoculated with wild-type (WT), mock transgenic or L. bicolor aquaporin JQ585595-overexpressing (OE) strains and exposed to root temperatures ranging from 5 to 20°C to examine the root water transport properties, physiological responses and plasma membrane intrinsic protein (PIP) expression in colonized plants. Mycorrhization increased shoot water potential, transpiration, net photosynthetic rates, root hydraulic conductivity and root cortical cell hydraulic conductivity in seedlings. At 20°C, OE plants had higher root hydraulic conductivity compared with WT plants and the increases were accompanied by higher expression of P. glauca PIP GQ03401_M18.1 in roots. In contrast to WT L. bicolor, the effects of OE fungi on root and root cortical cell hydraulic conductivities were abolished at 10 and 5°C in the absence of major changes in the examined transcript levels of P. glauca root PIPs. The results provide evidence for the importance of fungal aquaporins in root water transport of mycorrhizal plants. They also demonstrate links between hyphal water transport, root aquaporin expression and root water transport in ECM plants. © 2014 The Authors. New Phytologist © 2014 New Phytologist Trust.
The master two-dimensional gel database of human AMA cell proteins: towards linking protein and genome sequence and mapping information (update 1991)

DEFF Research Database (Denmark)

Celis, J E; Leffers, H; Rasmussen, H H

1991-01-01

autoantigens" and "cDNAs". For convenience we have included an alphabetical list of all known proteins recorded in this database. In the long run, the main goal of this database is to link protein and DNA sequencing and mapping information (Human Genome Program) and to provide an integrated picture......The master two-dimensional gel database of human AMA cells currently lists 3801 cellular and secreted proteins, of which 371 cellular polypeptides (306 IEF; 65 NEPHGE) were added to the master images during the last 10 months. These include: (i) very basic and acidic proteins that do not focus...
Genomic Testing

Science.gov (United States)

... this database. Top of Page Evaluation of Genomic Applications in Practice and Prevention (EGAPP™) In 2004, the Centers for Disease Control and Prevention launched the EGAPP initiative to establish and test a ... and other applications of genomic technology that are in transition from ...
Data on the genome-wide identification of CNL R-genes in Setaria italica (L.) P. Beauv.

Science.gov (United States)

Andersen, Ethan J; Nepal, Madhav P

2017-08-01

We report data associated with the identification of 242 disease resistance genes (R-genes) in the genome of Setaria italica as presented in "Genetic diversity of disease resistance genes in foxtail millet (Setaria italica L.)" (Andersen and Nepal, 2017) [1]. Our data describe the structure and evolution of the Coiled-coil, Nucleotide-binding site, Leucine-rich repeat (CNL) R-genes in foxtail millet. The CNL genes were identified through rigorous extraction and analysis of recently available plant genome sequences using cutting-edge analytical software. Data visualization includes gene structure diagrams, chromosomal syntenic maps, a chromosomal density plot, and a maximum-likelihood phylogenetic tree comparing Sorghum bicolor, Panicum virgatum, Setaria italica, and Arabidopsis thaliana. Compilation of InterProScan annotations, Gene Ontology (GO) annotations, and Basic Local Alignment Search Tool (BLAST) results for the 242 R-genes identified in the foxtail millet genome are also included in tabular format.
A New Single Nucleotide Polymorphism Database for Rainbow Trout Generated Through Whole Genome Resequencing

Directory of Open Access Journals (Sweden)

Guangtu Gao

2018-04-01

heterozygosity within each population. We also provide functional annotation based on the genome position of each SNP and evaluate the use of clonal lines for filtering of PSVs and MSVs. These SNPs form a new database, which provides an important resource for a new high density SNP array design and for other SNP genotyping platforms used for genetic and genomics studies of this iconic salmonid fish species.
Can’t take the heat: Temperature-enhanced toxicity in the mayfly Isonychia bicolor exposed to the neonicotinoid insecticide imidacloprid

International Nuclear Information System (INIS)

Camp, A.A.; Buchwalter, D.B.

2016-01-01

Highlights: • Temperature has a strong modulating influence on toxicity in aquatic insects. • Increasing temperature decreased the time to onset of imidacloprid toxicity. • Increasing temperature increased the uptake rates of imidacloprid in different taxa. • Sublethal behavioral effects of contaminants are important to assess in toxicology. - Abstract: Neonicotinoid insecticide usage has increased globally in recent decades. Neonicotinoids, such as imidacloprid, are potent insect neurotoxicants that may pose a threat to non-target aquatic organisms, such as aquatic insects. In nature, insects typically live in thermally fluctuating conditions, which may significantly alter both contaminant exposures and effects. Here we investigate the relationship between temperature and time-to-effect for imidacloprid toxicity with the aquatic insect Isonychia bicolor, a lotic mayfly. Additionally, we examined the mechanisms driving temperature-enhanced toxicity including metabolic rate, imidacloprid uptake rate, and tissue bioconcentration. Experiments included acute toxicity tests utilizing sublethal endpoints and mortality, as well as respirometry and radiotracer assays with [14C] imidacloprid. Further, we conducted additional uptake experiments with a suite of aquatic invertebrates (including I. bicolor, Neocloeon triangulifer, Maccaffertium modestum, Pteronarcys proteus, Acroneuria carolinensis, and Pleuroceridae sp.) to confirm and contextualize our findings from initial experiments. The 96 h EC50 (immobility) for I. bicolor at 15 °C was 5.81 μg/L, which was approximately 3.2-fold lower than concentrations associated with 50% mortality. Assays examining the impact of temperature were conducted at 15, 18, 21, and 24 °C and demonstrated that time-to-effect for sublethal impairment and immobility was significantly decreased with increasing temperature. Uptake experiments with [14C] imidacloprid revealed that initial uptake rates were significantly increased with
Can’t take the heat: Temperature-enhanced toxicity in the mayfly Isonychia bicolor exposed to the neonicotinoid insecticide imidacloprid

Energy Technology Data Exchange (ETDEWEB)

Camp, A.A., E-mail: aacamp@ncsu.edu; Buchwalter, D.B., E-mail: dbbuchwa@ncsu.edu

2016-09-15

Highlights: • Temperature has a strong modulating influence on toxicity in aquatic insects. • Increasing temperature decreased the time to onset of imidacloprid toxicity. • Increasing temperature increased the uptake rates of imidacloprid in different taxa. • Sublethal behavioral effects of contaminants are important to assess in toxicology. - Abstract: Neonicotinoid insecticide usage has increased globally in recent decades. Neonicotinoids, such as imidacloprid, are potent insect neurotoxicants that may pose a threat to non-target aquatic organisms, such as aquatic insects. In nature, insects typically live in thermally fluctuating conditions, which may significantly alter both contaminant exposures and effects. Here we investigate the relationship between temperature and time-to-effect for imidacloprid toxicity with the aquatic insect Isonychia bicolor, a lotic mayfly. Additionally, we examined the mechanisms driving temperature-enhanced toxicity including metabolic rate, imidacloprid uptake rate, and tissue bioconcentration. Experiments included acute toxicity tests utilizing sublethal endpoints and mortality, as well as respirometry and radiotracer assays with [14C] imidacloprid. Further, we conducted additional uptake experiments with a suite of aquatic invertebrates (including I. bicolor, Neocloeon triangulifer, Maccaffertium modestum, Pteronarcys proteus, Acroneuria carolinensis, and Pleuroceridae sp.) to confirm and contextualize our findings from initial experiments. The 96 h EC50 (immobility) for I. bicolor at 15 °C was 5.81 μg/L, which was approximately 3.2-fold lower than concentrations associated with 50% mortality. Assays examining the impact of temperature were conducted at 15, 18, 21, and 24 °C and demonstrated that time-to-effect for sublethal impairment and immobility was significantly decreased with increasing temperature. Uptake experiments with [14C] imidacloprid revealed that initial uptake rates were significantly
Visualization for genomics: the Microbial Genome Viewer.

Science.gov (United States)

Kerkhoven, Robert; van Enckevort, Frank H J; Boekhorst, Jos; Molenaar, Douwe; Siezen, Roland J

2004-07-22

A Web-based visualization tool, the Microbial Genome Viewer, is presented that allows the user to combine complex genomic data in a highly interactive way. This Web tool enables the interactive generation of chromosome wheels and linear genome maps from genome annotation data stored in a MySQL database. The generated images are in scalable vector graphics (SVG) format, which is suitable for creating high-quality scalable images and dynamic Web representations. Gene-related data such as transcriptome and time-course microarray experiments can be superimposed on the maps for visual inspection. The Microbial Genome Viewer 1.0 is freely available at http://www.cmbi.kun.nl/MGV
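The idea of turning annotation records into an SVG genome map can be illustrated with a minimal sketch. It is not the Microbial Genome Viewer's implementation: the function name `linear_map_svg`, the tuple layout `(name, start, end, strand)`, and the in-memory gene list (standing in for rows that the tool would read from its MySQL database) are all assumptions made for illustration, and only a linear map is drawn, not a chromosome wheel.

```python
from xml.sax.saxutils import escape


def linear_map_svg(genes, genome_length, width=800, height=60):
    """Render genes as rectangles along a horizontal axis and return SVG text.

    `genes` is a list of hypothetical (name, start, end, strand) tuples with
    1-based inclusive coordinates; this layout is an assumption, not the
    schema used by the Microbial Genome Viewer.
    """
    scale = width / genome_length  # pixels per base pair
    mid = height / 2
    parts = [
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{height}">',
        f'<line x1="0" y1="{mid}" x2="{width}" y2="{mid}" stroke="black"/>',
    ]
    for name, start, end, strand in genes:
        x = (start - 1) * scale
        w = (end - start + 1) * scale
        # Forward-strand genes sit above the axis, reverse-strand genes below.
        y = mid - 15 if strand == "+" else mid + 5
        parts.append(
            f'<rect x="{x:.1f}" y="{y:.1f}" width="{w:.1f}" height="10" fill="steelblue">'
            f"<title>{escape(name)}</title></rect>"
        )
    parts.append("</svg>")
    return "\n".join(parts)


# Made-up example records on a 1,000 bp toy genome.
genes = [("geneA", 1, 300, "+"), ("geneB", 400, 900, "-")]
svg = linear_map_svg(genes, genome_length=1000)
```

Because the output is plain SVG text, it can be written to a file and opened in a browser, and it scales without loss of quality, which is the property the abstract highlights.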
G-InforBIO: integrated system for microbial genomics

Directory of Open Access Journals (Sweden)

Abe Takashi

2006-08-01

Background: Genome databases contain diverse kinds of information, including gene annotations and nucleotide and amino acid sequences. It is not easy to integrate such information for genomic study. There are few tools for integrated analyses of genomic data; therefore, we developed software that enables users to handle, manipulate, and analyze genome data with a variety of sequence analysis programs. Results: The G-InforBIO system is a novel tool for genome data management and sequence analysis. The system can import genome data, including annotations and sequences, encoded as eXtensible Markup Language documents or as formatted text documents, such as the flat files from DNA Data Bank of Japan and GenBank. The genome database is constructed automatically after importing, and the database can be exported as documents formatted with eXtensible Markup Language or tab-delimited text. Users can retrieve data from the database by keyword searches, edit annotation data of genes, and process data with G-InforBIO. In addition, information in the G-InforBIO database can be analyzed seamlessly with nine different software programs, including programs for clustering and homology analyses. Conclusion: The G-InforBIO system simplifies genome analyses by integrating several available software programs to allow efficient handling and manipulation of genome data. G-InforBIO is freely available from the download site.
Phyllomedusa bicolor skin secretion and the Kambô ritual.

Science.gov (United States)

den Brave, Paul S; Bruins, Eugéne; Bronkhorst, Maarten W G A

2014-01-01

The ritual of Kambô or Sapo is a type of voluntary envenomation. During this purification ritual a shaman healer, from various South American countries, deliberately burns the right shoulder with a glowing stick from a fireplace. Excretions of Phyllomedusa bicolor (or Giant Leaf Frog, Kambô or Sapo) are then applied to these fresh wounds. This ritual is used as a means of purification of the body, supposedly brings luck to hunters, increases stamina and enhances physical and sexual strength. All the peripheral and most of the central effects of the secretion can be ascribed to the exceptionally high content of active peptides, easily absorbed through burned skin. This article describes the ritual and the bio-active peptides from the secretion.
Rapid detection of structural variation in a human genome using nanochannel-based genome mapping technology

DEFF Research Database (Denmark)

Cao, Hongzhi; Hastie, Alex R.; Cao, Dandan

2014-01-01

mutations; however, none of the current detection methods are comprehensive, and currently available methodologies are incapable of providing sufficient resolution and unambiguous information across complex regions in the human genome. To address these challenges, we applied a high-throughput, cost......-effective genome mapping technology to comprehensively discover genome-wide SVs and characterize complex regions of the YH genome using long single molecules (>150 kb) in a global fashion. RESULTS: Utilizing nanochannel-based genome mapping technology, we obtained 708 insertions/deletions and 17 inversions larger...... fosmid data. Of the remaining 270 SVs, 260 are insertions and 213 overlap known SVs in the Database of Genomic Variants. Overall, 609 out of 666 (90%) variants were supported by experimental orthogonal methods or historical evidence in public databases. At the same time, genome mapping also provides...
Field performance of Quercus bicolor established as repeatedly air-root-pruned container and bareroot planting stock

Science.gov (United States)

J.W. "Jerry" Van Sambeek; Larry D. Godsey; William D. Walter; Harold E. Garrett; John P. Dwyer

2016-01-01

Benefits of repeated air-root-pruning of seedlings when stepping up to progressively larger containers include excellent lateral root distribution immediately below the root collar and an exceptionally fibrous root ball. To evaluate long-term field performance of repeatedly air-root-pruned container stock, three plantings of swamp white oak (Quercus bicolor...
The Human Gene Mutation Database: building a comprehensive mutation repository for clinical and molecular genetics, diagnostic testing and personalized genomic medicine.

Science.gov (United States)

Stenson, Peter D; Mort, Matthew; Ball, Edward V; Shaw, Katy; Phillips, Andrew; Cooper, David N

2014-01-01

The Human Gene Mutation Database (HGMD®) is a comprehensive collection of germline mutations in nuclear genes that underlie, or are associated with, human inherited disease. By June 2013, the database contained over 141,000 different lesions detected in over 5,700 different genes, with new mutation entries currently accumulating at a rate exceeding 10,000 per annum. HGMD was originally established in 1996 for the scientific study of mutational mechanisms in human genes. However, it has since acquired a much broader utility as a central unified disease-oriented mutation repository utilized by human molecular geneticists, genome scientists, molecular biologists, clinicians and genetic counsellors as well as by those specializing in biopharmaceuticals, bioinformatics and personalized genomics. The public version of HGMD (http://www.hgmd.org) is freely available to registered users from academic institutions/non-profit organizations whilst the subscription version (HGMD Professional) is available to academic, clinical and commercial users under license via BIOBASE GmbH.
Improving Microbial Genome Annotations in an Integrated Database Context

Science.gov (United States)

Chen, I-Min A.; Markowitz, Victor M.; Chu, Ken; Anderson, Iain; Mavromatis, Konstantinos; Kyrpides, Nikos C.; Ivanova, Natalia N.

2013-01-01

Effective comparative analysis of microbial genomes requires a consistent and complete view of biological data. Consistency regards the biological coherence of annotations, while completeness regards the extent and coverage of functional characterization for genomes. We have developed tools that allow scientists to assess and improve the consistency and completeness of microbial genome annotations in the context of the Integrated Microbial Genomes (IMG) family of systems. All publicly available microbial genomes are characterized in IMG using different functional annotation and pathway resources, thus providing a comprehensive framework for identifying and resolving annotation discrepancies. A rule based system for predicting phenotypes in IMG provides a powerful mechanism for validating functional annotations, whereby the phenotypic traits of an organism are inferred based on the presence of certain metabolic reactions and pathways and compared to experimentally observed phenotypes. The IMG family of systems are available at http://img.jgi.doe.gov/. PMID:23424620
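The rule-based validation described in this abstract (infer phenotypes from annotated pathways, then compare them with observed phenotypes) can be sketched as follows. This is only an illustration of the general idea: the rule table, pathway names, and function names are hypothetical and are not taken from IMG.

```python
# Hypothetical rules: a phenotype is predicted when every listed pathway
# is present in the genome's annotation. These are NOT IMG's actual rules.
RULES = {
    "nitrogen fixation": {"nitrogenase complex"},
    "lactose utilization": {"lactose transport", "beta-galactosidase"},
}


def predict_phenotypes(annotated_pathways, rules=RULES):
    """Return the phenotypes whose required pathways are all annotated."""
    present = set(annotated_pathways)
    return {name for name, required in rules.items() if required <= present}


def compare(predicted, observed):
    """Split phenotypes into agreements and the two kinds of discrepancy."""
    return {
        "confirmed": predicted & observed,
        "predicted_only": predicted - observed,  # annotation may be over-called
        "observed_only": observed - predicted,   # annotation may be incomplete
    }


# Made-up genome annotation and made-up experimental observations.
pathways = {"lactose transport", "beta-galactosidase", "glycolysis"}
observed = {"lactose utilization", "nitrogen fixation"}

predicted = predict_phenotypes(pathways)
report = compare(predicted, observed)
```

In this toy case the "observed_only" entry flags nitrogen fixation as a phenotype that the annotation does not explain, which is the kind of discrepancy a curator would then review.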

Improving microbial genome annotations in an integrated database context.

Directory of Open Access Journals (Sweden)

I-Min A Chen

Effective comparative analysis of microbial genomes requires a consistent and complete view of biological data. Consistency regards the biological coherence of annotations, while completeness regards the extent and coverage of functional characterization for genomes. We have developed tools that allow scientists to assess and improve the consistency and completeness of microbial genome annotations in the context of the Integrated Microbial Genomes (IMG) family of systems. All publicly available microbial genomes are characterized in IMG using different functional annotation and pathway resources, thus providing a comprehensive framework for identifying and resolving annotation discrepancies. A rule based system for predicting phenotypes in IMG provides a powerful mechanism for validating functional annotations, whereby the phenotypic traits of an organism are inferred based on the presence of certain metabolic reactions and pathways and compared to experimentally observed phenotypes. The IMG family of systems are available at http://img.jgi.doe.gov/.
Database Description - RGP physicalmap | LSDB Archive [Life Science Database Archive metadata]

Lifescience Database Archive (English)

classification Plant databases - Rice Database classification Sequence Physical map Organism Taxonomy Name: ...inobe Journal: Nature Genetics (1994) 8: 365-372. External Links: Article title: Physical Mapping of Rice Ch...rnal: DNA Research (1997) 4(2): 133-140. External Links: Article title: Physical Mapping of Rice Chromosomes... T Sasaki Journal: Genome Research (1996) 6(10): 935-942. External Links: Article title: Physical mapping of
IMG: the integrated microbial genomes database and comparative analysis system

Science.gov (United States)

Markowitz, Victor M.; Chen, I-Min A.; Palaniappan, Krishna; Chu, Ken; Szeto, Ernest; Grechkin, Yuri; Ratner, Anna; Jacob, Biju; Huang, Jinghua; Williams, Peter; Huntemann, Marcel; Anderson, Iain; Mavromatis, Konstantinos; Ivanova, Natalia N.; Kyrpides, Nikos C.

2012-01-01

The Integrated Microbial Genomes (IMG) system serves as a community resource for comparative analysis of publicly available genomes in a comprehensive integrated context. IMG integrates publicly available draft and complete genomes from all three domains of life with a large number of plasmids and viruses. IMG provides tools and viewers for analyzing and reviewing the annotations of genes and genomes in a comparative context. IMG's data content and analytical capabilities have been continuously extended through regular updates since its first release in March 2005. IMG is available at http://img.jgi.doe.gov. Companion IMG systems provide support for expert review of genome annotations (IMG/ER: http://img.jgi.doe.gov/er), teaching courses and training in microbial genome analysis (IMG/EDU: http://img.jgi.doe.gov/edu) and analysis of genomes related to the Human Microbiome Project (IMG/HMP: http://www.hmpdacc-resources.org/img_hmp). PMID:22194640
GC-MS analysis, evaluation of phytochemicals, anti-oxidant, thrombolytic and anti-inflammatory activities of Exacum bicolor

Directory of Open Access Journals (Sweden)

Appaji Mahesh Ashwini

2015-12-01

The aim of the present study was to perform GC-MS analysis and phytochemical screening of a methanol extract of Exacum bicolor leaves and to evaluate its anti-oxidant, thrombolytic and anti-inflammatory activities. FTIR analysis confirmed the presence of alcohol, phenols, alkanes, aromatic compounds, aldehyde and ethers. GC-MS analysis revealed the presence of eight phyto-constituents. The total phenol, flavonoid and alkaloid contents were 18.0 ± 0.2 mg GAE/g, 13.1 ± 0.4 mg QE/g and 108.0 ± 1.2 mg AE/g, respectively. The DPPH assay showed potent anti-oxidant activity, with an IC50 of 8.8 µg/mL. Significant thrombolytic activity was demonstrated by the clot lysis method (45.1 ± 0.8%). The methanol extract showed significant membrane stabilization on human red blood cells, with an IC50 value of 37.4 µg/mL. Total phenolic content correlated significantly (R2 > 0.98) with anti-oxidant and anti-inflammatory activity. These results indicate that E. bicolor could be a promising anti-oxidant, thrombolytic and anti-inflammatory agent.
phiGENOME: an integrative navigation throughout bacteriophage genomes.

Science.gov (United States)

Stano, Matej; Klucar, Lubos

2011-11-01

phiGENOME is a web-based genome browser generating dynamic and interactive graphical representation of phage genomes stored in the phiSITE, database of gene regulation in bacteriophages. phiGENOME is an integral part of the phiSITE web portal (http://www.phisite.org/phigenome) and it was optimised for visualisation of phage genomes with the emphasis on the gene regulatory elements. phiGENOME consists of three components: (i) genome map viewer built using Adobe Flash technology, providing dynamic and interactive graphical display of phage genomes; (ii) sequence browser based on precisely formatted HTML tags, providing detailed exploration of genome features on the sequence level and (iii) regulation illustrator, based on Scalable Vector Graphics (SVG) and designed for graphical representation of gene regulations. Bringing 542 complete genome sequences accompanied with their rich annotations and references, makes phiGENOME a unique information resource in the field of phage genomics. Copyright © 2011 Elsevier Inc. All rights reserved.
Lightweight genome viewer: portable software for browsing genomics data in its chromosomal context.

Science.gov (United States)

Faith, Jeremiah J; Olson, Andrew J; Gardner, Timothy S; Sachidanandam, Ravi

2007-09-18

Lightweight genome viewer (lwgv) is a web-based tool for visualization of sequence annotations in their chromosomal context. It performs most of the functions of larger genome browsers, while relying on standard flat-file formats and bypassing the database needs of most visualization tools. Visualization as an aid to discovery requires display of novel data in conjunction with static annotations in their chromosomal context. With database-based systems, displaying dynamic results requires temporary tables that need to be tracked for removal. lwgv simplifies the visualization of user-generated results on a local computer. The dynamic results of these analyses are written to transient files, which can import static content from a more permanent file. lwgv is currently used in many different applications, from whole genome browsers to single-gene RNAi design visualization, demonstrating its applicability in a large variety of contexts and scales. lwgv provides a lightweight alternative to large genome browsers for visualizing biological annotations and dynamic analyses in their chromosomal context. It is particularly suited for applications ranging from short sequences to medium-sized genomes when the creation and maintenance of a large software and database infrastructure is not necessary or desired.
Field damage of sorghum (Sorghum bicolor) with reduced lignin levels by naturally occurring insect pests and pathogens

Science.gov (United States)

Mutant lines of sorghum with low levels of lignin are potentially useful for bioenergy production, but may have problems with insects or disease. Field grown normal and low lignin bmr6 and bmr12 sorghum (Sorghum bicolor) were examined for insect and disease damage in the field, and insect damage in ...
Toxic hepatitis caused by the excretions of the Phyllomedusa bicolor frog – a case report

Science.gov (United States)

Pogorzelska, Joanna

2017-01-01

The Kambô ritual consists of various types of skin scarification and subsequent application of Phyllomedusa bicolor secretion to the fresh wounds. In Europe, the ritual of Kambô is becoming more popular, but its use can lead to serious multiple organ damage, sometimes life-threatening. Our manuscript shows a patient with toxic liver damage probably associated with the Kambô ritual. PMID:28856288
Data on the genome-wide identification of CNL R-genes in Setaria italica (L.) P. Beauv.

Directory of Open Access Journals (Sweden)

Ethan J. Andersen

2017-08-01

We report data associated with the identification of 242 disease resistance genes (R-genes) in the genome of Setaria italica as presented in “Genetic diversity of disease resistance genes in foxtail millet (Setaria italica L.)” (Andersen and Nepal, 2017) [1]. Our data describe the structure and evolution of the Coiled-coil, Nucleotide-binding site, Leucine-rich repeat (CNL) R-genes in foxtail millet. The CNL genes were identified through rigorous extraction and analysis of recently available plant genome sequences using cutting-edge analytical software. Data visualization includes gene structure diagrams, chromosomal syntenic maps, a chromosomal density plot, and a maximum-likelihood phylogenetic tree comparing Sorghum bicolor, Panicum virgatum, Setaria italica, and Arabidopsis thaliana. Compilation of InterProScan annotations, Gene Ontology (GO) annotations, and Basic Local Alignment Search Tool (BLAST) results for the 242 R-genes identified in the foxtail millet genome are also included in tabular format.
A case of leucism in Phyllomedusa bicolor (Anura: Hylidae) larvae bred in captivity

OpenAIRE

Adán Villagarcia Olmeño; Enrique Cuevas

2014-01-01

This article reports the first case of leucism in Phyllomedusa bicolor (Boddaert, 1772). The leucistic larva described resulted from a captive breeding management program and was photographed 97 days after hatching, showing uniform white coloration except for the golden highlights on the flanks and the dark tone of the eyes.
PGSB/MIPS Plant Genome Information Resources and Concepts for the Analysis of Complex Grass Genomes.

Science.gov (United States)

Spannagl, Manuel; Bader, Kai; Pfeifer, Matthias; Nussbaumer, Thomas; Mayer, Klaus F X

2016-01-01

PGSB (Plant Genome and Systems Biology; formerly MIPS-Munich Institute for Protein Sequences) has been involved in developing, implementing and maintaining plant genome databases for more than a decade. Genome databases and analysis resources have focused on individual genomes and aim to provide flexible and maintainable datasets for model plant genomes as a backbone against which experimental data, e.g., from high-throughput functional genomics, can be organized and analyzed. In addition, genomes from both model and crop plants form a scaffold for comparative genomics, assisted by specialized tools such as the CrowsNest viewer to explore conserved gene order (synteny) between related species on macro- and micro-levels. The genomes of many economically important Triticeae plants such as wheat, barley, and rye present a great challenge for sequence assembly and bioinformatic analysis due to their enormous complexity and large genome size. Novel concepts and strategies have been developed to deal with these difficulties and have been applied to the genomes of wheat, barley, rye, and other cereals. These include the GenomeZipper concept, reference-guided exome assembly, and "chromosome genomics" based on flow-cytometry-sorted chromosomes.
Using SQL Databases for Sequence Similarity Searching and Analysis.

Science.gov (United States)

Pearson, William R; Mackey, Aaron J

2017-09-13

Relational databases can integrate diverse types of information and manage large sets of similarity search results, greatly simplifying genome-scale analyses. By focusing on taxonomic subsets of sequences, relational databases can reduce the size and redundancy of sequence libraries and improve the statistical significance of homologs. In addition, by loading similarity search results into a relational database, it becomes possible to explore and summarize the relationships between all of the proteins in an organism and those in other biological kingdoms. This unit describes how to use relational databases to improve the efficiency of sequence similarity searching and demonstrates various large-scale genomic analyses of homology-related data. It also describes the installation and use of a simple protein sequence database, seqdb_demo, which is used as a basis for the other protocols. The unit also introduces search_demo, a database that stores sequence similarity search results. The search_demo database is then used to explore the evolutionary relationships between E. coli proteins and proteins in other organisms in a large-scale comparative genomic analysis. © 2017 by John Wiley & Sons, Inc.
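The idea of loading similarity search results into a relational database and summarizing them by taxonomic group can be sketched as follows, using Python's standard-library sqlite3 module. This is a minimal illustration only: the table and column names (protein, search_hit, kingdom, evalue), the accessions, and every row are hypothetical and are not the actual schema or contents of seqdb_demo or search_demo.

```python
import sqlite3


def build_demo(conn):
    """Create a tiny, hypothetical schema and load made-up example rows."""
    conn.executescript("""
        CREATE TABLE protein (
            acc      TEXT PRIMARY KEY,   -- hypothetical accession
            organism TEXT NOT NULL,
            kingdom  TEXT NOT NULL
        );
        CREATE TABLE search_hit (
            query_acc   TEXT NOT NULL,   -- e.g. an E. coli query protein
            subject_acc TEXT NOT NULL REFERENCES protein(acc),
            evalue      REAL NOT NULL
        );
    """)
    conn.executemany("INSERT INTO protein VALUES (?, ?, ?)", [
        ("S1", "Homo sapiens", "Eukaryota"),
        ("S2", "Saccharomyces cerevisiae", "Eukaryota"),
        ("S3", "Methanocaldococcus jannaschii", "Archaea"),
        ("S4", "Bacillus subtilis", "Bacteria"),
    ])
    conn.executemany("INSERT INTO search_hit VALUES (?, ?, ?)", [
        ("Q1", "S1", 1e-30),
        ("Q1", "S3", 1e-10),
        ("Q2", "S2", 1e-8),
        ("Q2", "S4", 1e-50),
        ("Q3", "S1", 0.5),     # not significant at the default threshold
        ("Q3", "S4", 1e-20),
    ])


def queries_with_homologs(conn, max_evalue=1e-6):
    """Count distinct query proteins with a significant hit in each kingdom."""
    rows = conn.execute("""
        SELECT p.kingdom, COUNT(DISTINCT h.query_acc)
        FROM search_hit AS h
        JOIN protein AS p ON p.acc = h.subject_acc
        WHERE h.evalue <= ?
        GROUP BY p.kingdom
        ORDER BY p.kingdom
    """, (max_evalue,)).fetchall()
    return dict(rows)


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    build_demo(connection)
    print(queries_with_homologs(connection))
```

Keeping hits and taxonomy in separate tables means the significance threshold or the taxonomic grouping can be changed in the query without rerunning the search itself.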
Genomic characterization of large heterochromatic gaps in the human genome assembly.

Directory of Open Access Journals (Sweden)

Nicolas Altemose

2014-05-01

The largest gaps in the human genome assembly correspond to multi-megabase heterochromatic regions composed primarily of two related families of tandem repeats, Human Satellites 2 and 3 (HSat2,3). The abundance of repetitive DNA in these regions challenges standard mapping and assembly algorithms, and as a result, the sequence composition and potential biological functions of these regions remain largely unexplored. Furthermore, existing genomic tools designed to predict consensus-based descriptions of repeat families cannot be readily applied to complex satellite repeats such as HSat2,3, which lack a consistent repeat unit reference sequence. Here we present an alignment-free method to characterize complex satellites using whole-genome shotgun read datasets. Utilizing this approach, we classify HSat2,3 sequences into fourteen subfamilies and predict their chromosomal distributions, resulting in a comprehensive satellite reference database to further enable genomic studies of heterochromatic regions. We also identify 1.3 Mb of non-repetitive sequence interspersed with HSat2,3 across 17 unmapped assembly scaffolds, including eight annotated gene predictions. Finally, we apply our satellite reference database to high-throughput sequence data from 396 males to estimate array size variation of the predominant HSat3 array on the Y chromosome, confirming that satellite array sizes can vary between individuals over an order of magnitude (7 to 98 Mb) and further demonstrating that array sizes are distributed differently within distinct Y haplogroups. In summary, we present a novel framework for generating initial reference databases for unassembled genomic regions enriched with complex satellite DNA, and we further demonstrate the utility of these reference databases for studying patterns of sequence variation within human populations.
FunCoup 3.0: database of genome-wide functional coupling networks.

Science.gov (United States)

Schmitt, Thomas; Ogris, Christoph; Sonnhammer, Erik L L

2014-01-01

We present an update of the FunCoup database (http://FunCoup.sbc.su.se) of functional couplings, or functional associations, between genes and gene products. Identifying these functional couplings is an important step in the understanding of higher level mechanisms performed by complex cellular processes. FunCoup distinguishes between four classes of couplings: participation in the same signaling cascade, participation in the same metabolic process, co-membership in a protein complex and physical interaction. For each of these four classes, several types of experimental and statistical evidence are combined by Bayesian integration to predict genome-wide functional coupling networks. The FunCoup framework has been completely re-implemented to allow for more frequent future updates. It contains many improvements, such as a regularization procedure to automatically downweight redundant evidence and a novel method to incorporate phylogenetic profile similarity. Several datasets have been updated and new data have been added in FunCoup 3.0. Furthermore, we have developed a new Web site, which provides powerful tools to explore the predicted networks and to retrieve detailed information about the data underlying each prediction.
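To make the notion of Bayesian integration of several evidence types concrete, here is a minimal naive-Bayes-style sketch in Python. It assumes each evidence type has already been reduced to a log-likelihood ratio (LLR) for "coupled" versus "not coupled", sums those LLRs with optional weights, and converts the total to a posterior probability. The function names, the weights standing in for the down-weighting of redundant evidence, and the prior are all assumptions made for illustration; this is not FunCoup's actual procedure.

```python
import math


def integrate_evidence(llrs, weights=None):
    """Sum per-evidence log-likelihood ratios (natural log), optionally weighted.

    A weight below 1.0 is a hypothetical way to down-weight an evidence type
    that is partly redundant with another one.
    """
    if weights is None:
        weights = [1.0] * len(llrs)
    if len(weights) != len(llrs):
        raise ValueError("llrs and weights must have the same length")
    return sum(w * llr for w, llr in zip(weights, llrs))


def posterior_probability(total_llr, prior):
    """Convert a summed LLR and a prior probability into a posterior probability."""
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must be strictly between 0 and 1")
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * math.exp(total_llr)
    return posterior_odds / (1.0 + posterior_odds)


if __name__ == "__main__":
    # Two hypothetical evidence types; the second is treated as half redundant.
    total = integrate_evidence([math.log(2.0), math.log(3.0)], weights=[1.0, 0.5])
    print(posterior_probability(total, prior=0.01))
```

Summing LLRs corresponds to multiplying likelihood ratios, which is exact only if the evidence types are conditionally independent; the weights are one simple way to soften that assumption.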
ChickVD: a sequence variation database for the chicken genome

DEFF Research Database (Denmark)

Wang, Jing; He, Ximiao; Ruan, Jue

2005-01-01

Working in parallel with the efforts to sequence the chicken (Gallus gallus) genome, the Beijing Genomics Institute led an international team of scientists from China, USA, UK, Sweden, The Netherlands and Germany to map extensive DNA sequence variation throughout the chicken genome by sampling DN...... on quantitative trait loci using data from collaborating institutions and public resources. Our data can be queried by search engine and homology-based BLAST searches. ChickVD is publicly accessible at http://chicken.genomics.org.cn. Publication date: 2005-Jan-1...
A case of leucism in Phyllomedusa bicolor (Anura: Hylidae) larvae bred in captivity

Directory of Open Access Journals (Sweden)

Adán Villagarcia Olmeño

2014-12-01

This article reports the first case of leucism in Phyllomedusa bicolor (Boddaert, 1772). The leucistic larva described resulted from a captive breeding management program and was photographed 97 days after hatching, showing uniform white coloration except for the golden highlights on the flanks and the dark tone of the eyes.
A case of leucism in Phyllomedusa bicolor (Anura: Hylidae) larvae bred in captivity

Directory of Open Access Journals (Sweden)

Adan Villagarcía Olmeño

2015-01-01

This article reports the first case of leucism in Phyllomedusa bicolor (Boddaert, 1772). The leucistic larva described resulted from a captive breeding management program and was photographed 97 days after hatching, showing uniform white coloration except for the golden highlights on the flanks and the dark tone of the eyes.
Preliminary characterization of mitochondrial genome of Melipona scutellaris, a Brazilian stingless bee.

Science.gov (United States)

Silverio, Manuella Souza; Rodovalho, Vinícius de Rezende; Bonetti, Ana Maria; de Oliveira, Guilherme Corrêa; Cuadros-Orellana, Sara; Ueira-Vieira, Carlos; Rodrigues dos Santos, Anderson

2014-01-01

Bees produce economically important products and play a pollinator role fundamental to ecosystems. Traditionally, studies of the genus Melipona have focused mostly on behavior, social organization, and ecology. Only recently has the evolutionary history of this genus been assessed using molecular markers, including mitochondrial genes. Even though these studies have shed light on the evolutionary history of the Melipona genus, a more accurate picture may emerge when full nuclear and mitochondrial genomes of Melipona species become available. Here we present the assembly, annotation, and characterization of a draft mitochondrial genome of the Brazilian stingless bee Melipona scutellaris using Melipona bicolor as a reference organism. Using Illumina MiSeq data, we annotated all protein-coding genes, as well as the genes for the two ribosomal subunits (16S and 12S) and the transfer RNA genes. Using the COI sequence as a DNA barcode, we found that M. cramptoni is the closest species to M. scutellaris.
Lightweight genome viewer: portable software for browsing genomics data in its chromosomal context

Directory of Open Access Journals (Sweden)

Gardner Timothy S

2007-09-01

Background. Lightweight genome viewer (lwgv) is a web-based tool for visualization of sequence annotations in their chromosomal context. It performs most of the functions of larger genome browsers, while relying on standard flat-file formats and bypassing the database needs of most visualization tools. Visualization as an aid to discovery requires display of novel data in conjunction with static annotations in their chromosomal context. With database-based systems, displaying dynamic results requires temporary tables that need to be tracked for removal. Results. lwgv simplifies the visualization of user-generated results on a local computer. The dynamic results of these analyses are written to transient files, which can import static content from a more permanent file. lwgv is currently used in many different applications, from whole genome browsers to single-gene RNAi design visualization, demonstrating its applicability in a large variety of contexts and scales. Conclusion. lwgv provides a lightweight alternative to large genome browsers for visualizing biological annotations and dynamic analyses in their chromosomal context. It is particularly suited for applications ranging from short sequences to medium-sized genomes when the creation and maintenance of a large software and database infrastructure is not necessary or desired.
A Bayesian method for identifying missing enzymes in predicted metabolic pathway databases

Directory of Open Access Journals (Sweden)

Karp Peter D

2004-06-01

Background. The PathoLogic program constructs Pathway/Genome databases by using a genome's annotation to predict the set of metabolic pathways present in an organism. PathoLogic determines the set of reactions composing those pathways from the enzymes annotated in the organism's genome. Most annotation efforts fail to assign function to 40–60% of sequences. In addition, large numbers of sequences may have non-specific annotations (e.g., thiolase family protein). Pathway holes occur when a genome appears to lack the enzymes needed to catalyze reactions in a pathway. If a protein has not been assigned a specific function during the annotation process, any reaction catalyzed by that protein will appear as a missing enzyme or pathway hole in a Pathway/Genome database. Results. We have developed a method that efficiently combines homology and pathway-based evidence to identify candidates for filling pathway holes in Pathway/Genome databases. Our program not only identifies potential candidate sequences for pathway holes, but combines data from multiple, heterogeneous sources to assess the likelihood that a candidate has the required function. Our algorithm emulates the manual sequence annotation process, considering not only evidence from homology searches, but also considering evidence from genomic context (i.e., is the gene part of an operon?) and functional context (e.g., are there functionally-related genes nearby in the genome?) to determine the posterior belief that a candidate has the required function. The method can be applied across an entire metabolic pathway network and is generally applicable to any pathway database. The program uses a set of sequences encoding the required activity in other genomes to identify candidate proteins in the genome of interest, and then evaluates each candidate by using a simple Bayes classifier to determine the probability that the candidate has the desired function. We achieved 71% precision at a

[Bipolaris bicolor (Mitra) Shoemaker: Species associated with foliar spot in pupunha palm (Bactris gasipaes Kunth) in Brazil.].

Science.gov (United States)

Rodríguez-Morejón, K; Kimati, H; Fancelli, M I

1998-03-01

A species of the hyphomycetes group belonging to the genus Bipolaris Shoemaker, identified as Bipolaris bicolor (Mitra) Shoemaker, is recorded for the first time on pupunha palm (Bactris gasipaes Kunth) in Brazil. It is compared with closely related species reported as pathogens causing foliar spot in the family Arecaceae. Its morphological and cultural characteristics are described.
Construction of an Ostrea edulis database from genomic and expressed sequence tags (ESTs) obtained from Bonamia ostreae infected haemocytes: Development of an immune-enriched oligo-microarray.

Science.gov (United States)

Pardo, Belén G; Álvarez-Dios, José Antonio; Cao, Asunción; Ramilo, Andrea; Gómez-Tato, Antonio; Planas, Josep V; Villalba, Antonio; Martínez, Paulino

2016-12-01

The flat oyster, Ostrea edulis, is one of the main farmed oysters, not only in Europe but also in the United States and Canada. Bonamiosis due to the parasite Bonamia ostreae has been associated with high mortality episodes in this species. This parasite is an intracellular protozoan that infects haemocytes, the main cells involved in oyster defence. Due to the economic and ecological importance of flat oyster, genomic data are badly needed for genetic improvement of the species, but they are still very scarce. The objective of this study is to develop a sequence database, OedulisDB, with new genomic and transcriptomic resources, providing new data and convenient tools to improve our knowledge of the oyster's immune mechanisms. Transcriptomic and genomic sequences were obtained using 454 pyrosequencing and compiled into an O. edulis database, OedulisDB, consisting of two sets of 10,318 and 7159 unique sequences that represent the oyster's genome (WG) and de novo haemocyte transcriptome (HT), respectively. The flat oyster transcriptome was obtained from two strains (naïve and tolerant) challenged with B. ostreae, and from their corresponding non-challenged controls. Approximately 78.5% of 5619 HT unique sequences were successfully annotated by Blast search using public databases. A total of 984 sequences were identified as being related to immune response and several key immune genes were identified for the first time in flat oyster. Additionally, transcriptome information was used to design and validate the first oligo-microarray in flat oyster enriched with immune sequences from haemocytes. Our transcriptomic and genomic sequencing and subsequent annotation have greatly increased the scarce resources available for this economically important species and have enabled us to develop an OedulisDB database and accompanying tools for gene expression analysis. This study represents the first attempt to characterize in depth the O. edulis haemocyte transcriptome in
CyanoClust: comparative genome resources of cyanobacteria and plastids.

Science.gov (United States)

Sasaki, Naobumi V; Sato, Naoki

2010-01-01

Cyanobacteria, which perform oxygen-evolving photosynthesis as do chloroplasts of plants and algae, are one of the best-studied prokaryotic phyla and one from which many representative genomes have been sequenced. Lack of a suitable comparative genomic database has been a problem in cyanobacterial genomics because many proteins involved in physiological functions such as photosynthesis and nitrogen fixation are not catalogued in commonly used databases, such as Clusters of Orthologous Proteins (COG). CyanoClust is a database of homolog groups in cyanobacteria and plastids that are produced by the program Gclust. We have developed a web-server system for the protein homology database featuring cyanobacteria and plastids. Database URL: http://cyanoclust.c.u-tokyo.ac.jp/.
NeisseriaBase: a specialised Neisseria genomic resource and analysis platform.

Science.gov (United States)

Zheng, Wenning; Mutha, Naresh V R; Heydari, Hamed; Dutta, Avirup; Siow, Cheuk Chuen; Jakubovics, Nicholas S; Wee, Wei Yee; Tan, Shi Yang; Ang, Mia Yang; Wong, Guat Jah; Choo, Siew Woh

2016-01-01

Background. The gram-negative Neisseria is associated with two of the most potent human epidemic diseases: meningococcal meningitis and gonorrhoea. In both cases, disease is caused by bacteria colonizing human mucosal membrane surfaces. Overall, the genus shows great diversity and genetic variation mainly due to its ability to acquire and incorporate genetic material from a diverse range of sources through horizontal gene transfer. Although a number of databases exist for the Neisseria genomes, they are mostly focused on the pathogenic species. In this study we present the freely available NeisseriaBase, a database dedicated to the genus Neisseria encompassing the complete and draft genomes of 15 pathogenic and commensal Neisseria species. Methods. The genomic data were retrieved from National Center for Biotechnology Information (NCBI) and annotated using the RAST server, then stored in the MySQL database. The protein-coding genes were further analyzed to obtain information such as GC content (%), predicted hydrophobicity and molecular weight (Da) using in-house Perl scripts. The web application was developed following the secure four-tier web application architecture: (1) client workstation, (2) web server, (3) application server, and (4) database server. The web interface was constructed using PHP, JavaScript, jQuery, AJAX and CSS, utilizing the model-view-controller (MVC) framework. The in-house bioinformatics tools implemented in NeisseriaBase were developed using Python, Perl, BioPerl and R languages. Results. Currently, NeisseriaBase houses 603,500 Coding Sequences (CDSs), 16,071 RNAs and 13,119 tRNA genes from 227 Neisseria genomes. The database is equipped with interactive web interfaces. Incorporation of the JBrowse genome browser in the database enables fast and smooth browsing of Neisseria genomes. NeisseriaBase includes the standard BLAST program to facilitate homology searching, and for Virulence Factor
NeisseriaBase: a specialised Neisseria genomic resource and analysis platform

Directory of Open Access Journals (Sweden)

Wenning Zheng

2016-03-01

Background. The gram-negative Neisseria is associated with two of the most potent human epidemic diseases: meningococcal meningitis and gonorrhoea. In both cases, disease is caused by bacteria colonizing human mucosal membrane surfaces. Overall, the genus shows great diversity and genetic variation mainly due to its ability to acquire and incorporate genetic material from a diverse range of sources through horizontal gene transfer. Although a number of databases exist for the Neisseria genomes, they are mostly focused on the pathogenic species. In this study we present the freely available NeisseriaBase, a database dedicated to the genus Neisseria encompassing the complete and draft genomes of 15 pathogenic and commensal Neisseria species. Methods. The genomic data were retrieved from National Center for Biotechnology Information (NCBI) and annotated using the RAST server, then stored in the MySQL database. The protein-coding genes were further analyzed to obtain information such as GC content (%), predicted hydrophobicity and molecular weight (Da) using in-house Perl scripts. The web application was developed following the secure four-tier web application architecture: (1) client workstation, (2) web server, (3) application server, and (4) database server. The web interface was constructed using PHP, JavaScript, jQuery, AJAX and CSS, utilizing the model-view-controller (MVC) framework. The in-house bioinformatics tools implemented in NeisseriaBase were developed using Python, Perl, BioPerl and R languages. Results. Currently, NeisseriaBase houses 603,500 Coding Sequences (CDSs), 16,071 RNAs and 13,119 tRNA genes from 227 Neisseria genomes. The database is equipped with interactive web interfaces. Incorporation of the JBrowse genome browser in the database enables fast and smooth browsing of Neisseria genomes. NeisseriaBase includes the standard BLAST program to facilitate homology searching, and for Virulence
Tandemly arranged chalcone synthase A genes contribute to the spatially regulated expression of siRNA and the natural bicolor floral phenotype in Petunia hybrida.

Science.gov (United States)

Morita, Yasumasa; Saito, Ryoko; Ban, Yusuke; Tanikawa, Natsu; Kuchitsu, Kazuyuki; Ando, Toshio; Yoshikawa, Manabu; Habu, Yoshiki; Ozeki, Yoshihiro; Nakayama, Masayoshi

2012-06-01

The natural bicolor floral traits of the horticultural petunia (Petunia hybrida) cultivars Picotee and Star are caused by the spatial repression of the chalcone synthase A (CHS-A) gene, which encodes an anthocyanin biosynthetic enzyme. Here we show that Picotee and Star petunias carry the same short interfering RNA (siRNA)-producing locus, consisting of two intact CHS-A copies, PhCHS-A1 and PhCHS-A2, in a tandem head-to-tail orientation. The precursor CHS mRNAs are transcribed from the two CHS-A copies throughout the bicolored petals, but the mature CHS mRNAs are not found in the white tissues. An analysis of small RNAs revealed the accumulation of siRNAs of 21 nucleotides that originated from the exon 2 region of both CHS-A copies. This accumulation is closely correlated with the disappearance of the CHS mRNAs, indicating that the bicolor floral phenotype is caused by the spatially regulated post-transcriptional silencing of both CHS-A genes. Linkage between the tandemly arranged CHS-A allele and the bicolor floral trait indicates that the CHS-A allele is a necessary factor to confer the trait. We suppose that the spatially regulated production of siRNAs in Picotee and Star flowers is triggered by another putative regulatory locus, and that the silencing mechanism in this case may be different from other known mechanisms of post-transcriptional gene silencing in plants. A sequence analysis of wild Petunia species indicated that these tandem CHS-A genes originated from Petunia integrifolia and/or Petunia inflata, the parental species of P. hybrida, as a result of a chromosomal rearrangement rather than a gene duplication event. © 2012 The Authors. The Plant Journal © 2012 Blackwell Publishing Ltd.
gb4gv: a genome browser for geminivirus

Directory of Open Access Journals (Sweden)

Eric S. Ho

2017-04-01

Background. Geminiviruses (family Geminiviridae) are prevalent plant viruses that imperil agriculture globally, causing serious damage to the livelihood of farmers, particularly in developing countries. The virus evolves rapidly, which is attributed to its single-stranded genome, resulting in worldwide circulation of diverse and viable genomes. Genomics is a prominent approach taken by researchers in elucidating the infectious mechanism of the virus. Currently, the NCBI Viral Genome website is a popular repository of viral genomes that conveniently provides researchers a centralized data source of genomic information. However, unlike the genomes of living organisms, viral genomes most often maintain peculiar characteristics that fit into no single genome architecture. Imposing a unified annotation scheme on the myriad of viral genomes may downplay their hallmark features. For example, the virion of begomoviruses prevailing in America encapsulates two similar-sized circular DNA components, and both are required for systemic infection of plants. However, the bipartite components are kept separately in NCBI as individual genomes with no explicit association linking them. Thus, our goal is to build a comprehensive Geminivirus genomics database, namely gb4gv, that not only preserves genomic characteristics of the virus, but also supplements biologically relevant annotations that help to interrogate this virus, for example, the targeted host, putative iterons, siRNA targets, etc. Methods. We have employed manual and automatic methods to curate 508 genomes from four major genera of Geminiviridae, and 161 associated satellites obtained from NCBI RefSeq and PubMed databases. Results. These data are available for free access without registration from our website. Besides genomic content, our website provides visualization capability inherited from UCSC Genome Browser. Discussion. With the genomic information readily accessible, we hope that our database
REDIdb: the RNA editing database.

Science.gov (United States)

Picardi, Ernesto; Regina, Teresa Maria Rosaria; Brennicke, Axel; Quagliariello, Carla

2007-01-01

The RNA Editing Database (REDIdb) is an interactive, web-based database created and designed with the aim to allocate RNA editing events such as substitutions, insertions and deletions occurring in a wide range of organisms. The database contains both fully and partially sequenced DNA molecules for which editing information is available either by experimental inspection (in vitro) or by computational detection (in silico). Each record of REDIdb is organized in a specific flat-file containing a description of the main characteristics of the entry, a feature table with the editing events and related details and a sequence zone with both the genomic sequence and the corresponding edited transcript. REDIdb is a relational database in which the browsing and identification of editing sites has been simplified by means of two facilities to either graphically display genomic or cDNA sequences or to show the corresponding alignment. In both cases, all editing sites are highlighted in colour and their relative positions are detailed by mousing over. New editing positions can be directly submitted to REDIdb after a user-specific registration to obtain authorized secure access. This first version of REDIdb database stores 9964 editing events and can be freely queried at http://biologia.unical.it/py_script/search.html.
First record of Triaenodes bicolor (Curtis, 1834) (Insecta: Trichoptera) from the Ecoregion Hellenic Western Balkans

OpenAIRE

Ibrahimi, Halil; Kuçi, Ruzhdi; Bilalli, Astrit; Gashi, Ermira

2017-01-01

We collected adult caddisfly specimens with entomological nets and ultraviolet light traps monthly from May to November 2012 in Brezne Lake situated in Dragash Municipality. During this investigation we found the Leptocerid species Triaenodes bicolor for the first time in Kosovo; it is also the first record for Ecoregion 6, Hellenic Western Balkans. Additionally, this is the first record of the genus Triaenodes from Kosovo. In total seven males and three females of this species were found. Tr...
Techno-politics of genomic nationalism: tracing genomics and its use in drug regulation in Japan and Taiwan.

Science.gov (United States)

Kuo, Wen-Hua

2011-10-01

This paper compares the development of genomics as a form of state project in Japan and Taiwan. Broadening the concepts of genomic sovereignty and bionationalism, I argue that the establishment and use of genomic databases vary according to techno-political context. While both Japan and Taiwan hold population-based databases to be necessary for scientific advance and competitiveness, they differ in how they have attempted to transform the information produced by databases into regulatory schemes for drug approval. The effectiveness of Taiwan's biobank is severely limited by the IRB reviewing process. By contrast, while updating its regulations for drug approval, Japan is using pharmacogenomics to deal with matters relating to ethnic identity. By analysing genomic initiatives in the political context that nurtures them, this paper seeks to capture how global science and local societies interact and offers insight into the assessment of state-sponsored science in East Asia as they become transnational. Copyright © 2011 Elsevier Ltd. All rights reserved.
A Guide to the PLAZA 3.0 Plant Comparative Genomic Database.

Science.gov (United States)

Vandepoele, Klaas

2017-01-01

PLAZA 3.0 is an online resource for comparative genomics and offers a versatile platform to study gene functions and gene families or to analyze genome organization and evolution in the green plant lineage. Starting from genome sequence information for over 35 plant species, precomputed comparative genomic data sets cover homologous gene families, multiple sequence alignments, phylogenetic trees, and genomic colinearity information within and between species. Complementary functional data sets, a Workbench, and interactive visualization tools are available through a user-friendly web interface, making PLAZA an excellent starting point to translate sequence or omics data sets into biological knowledge. PLAZA is available at http://bioinformatics.psb.ugent.be/plaza/ .
SIGMA: A System for Integrative Genomic Microarray Analysis of Cancer Genomes

Directory of Open Access Journals (Sweden)

Davies Jonathan J

2006-12-01

Background. The prevalence of high resolution profiling of genomes has created a need for the integrative analysis of information generated from multiple methodologies and platforms. Although the majority of data in the public domain are gene expression profiles, and expression analysis software is available, the increase of array CGH studies has enabled integration of high throughput genomic and gene expression datasets. However, tools for direct mining and analysis of array CGH data are limited. Hence, there is a great need for analytical and display software tailored to cross platform integrative analysis of cancer genomes. Results. We have created a user-friendly Java application to facilitate sophisticated visualization and analysis such as cross-tumor and cross-platform comparisons. To demonstrate the utility of this software, we assembled array CGH data representing Affymetrix SNP chip, Stanford cDNA arrays and whole genome tiling path array platforms for cross comparison. This cancer genome database contains 267 profiles from commonly used cancer cell lines representing 14 different tissue types. Conclusion. In this study we have developed an application for the visualization and analysis of data from high resolution array CGH platforms that can be adapted for analysis of multiple types of high throughput genomic datasets. Furthermore, we invite researchers using array CGH technology to deposit both their raw and processed data, as this will be a continually expanding database of cancer genomes. This publicly available resource, the System for Integrative Genomic Microarray Analysis (SIGMA) of cancer genomes, can be accessed at http://sigma.bccrc.ca.
Virus Database and Online Inquiry System Based on Natural Vectors.

Science.gov (United States)

Dong, Rui; Zheng, Hui; Tian, Kun; Yau, Shek-Chung; Mao, Weiguang; Yu, Wenping; Yin, Changchuan; Yu, Chenglong; He, Rong Lucy; Yang, Jie; Yau, Stephen St

2017-01-01

We construct a virus database called VirusDB (http://yaulab.math.tsinghua.edu.cn/VirusDB/) and an online inquiry system to serve people who are interested in viral classification and prediction. The database stores all viral genomes, their corresponding natural vectors, and the classification information of the single/multiple-segmented viral reference sequences downloaded from the National Center for Biotechnology Information. The online inquiry system computes natural vectors and their distances for submitted genomes, provides an online interface for accessing and using the database for viral classification and prediction, and runs back-end processes for automatic and manual updating of database content to synchronize with GenBank. Genome data submitted in FASTA format are processed, and the prediction results, with the 5 closest neighbors and their classifications, are returned by email. Considering the one-to-one correspondence between sequence and natural vector, time efficiency, and high accuracy, the natural vector is a significant advance compared with alignment methods, which makes VirusDB a useful database for further research.
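The natural-vector idea in this abstract can be sketched in a few lines. The abstract does not give the formula, so the block below assumes the commonly published 12-dimensional form: for each nucleotide, its count, its mean position, and its normalized second central moment. The function names, the ordering of the vector components, and the use of Euclidean distance are illustrative assumptions, not VirusDB's actual code.

```python
import math

def natural_vector(seq):
    """12-dimensional natural vector: counts, mean positions, and normalized
    second central moments for A, C, G, T (assumed definition, see lead-in)."""
    seq = seq.upper()
    n = len(seq)
    counts, means, moments = [], [], []
    for base in "ACGT":
        positions = [i + 1 for i, ch in enumerate(seq) if ch == base]  # 1-based
        n_k = len(positions)
        mu_k = sum(positions) / n_k if n_k else 0.0
        # Second central moment, scaled by n_k * n as in the assumed definition.
        d2_k = sum((p - mu_k) ** 2 for p in positions) / (n_k * n) if n_k else 0.0
        counts.append(float(n_k))
        means.append(mu_k)
        moments.append(d2_k)
    return counts + means + moments

def distance(u, v):
    """Euclidean distance between two natural vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def nearest_neighbors(query, references, k=5):
    """Names of the k reference sequences closest to the query.
    `references` is a hypothetical dict mapping name -> sequence."""
    qv = natural_vector(query)
    ranked = sorted(references,
                    key=lambda name: distance(qv, natural_vector(references[name])))
    return ranked[:k]
```

Because each sequence is reduced to a fixed-length vector, comparing a query against a large reference set needs no alignment, which is the time-efficiency argument the abstract makes.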
Genomic prediction applied to high-biomass sorghum for bioenergy production.

Science.gov (United States)

de Oliveira, Amanda Avelar; Pastina, Maria Marta; de Souza, Vander Filipe; da Costa Parrella, Rafael Augusto; Noda, Roberto Willians; Simeone, Maria Lúcia Ferreira; Schaffert, Robert Eugene; de Magalhães, Jurandir Vieira; Damasceno, Cynthia Maria Borges; Margarido, Gabriel Rodrigues Alves

2018-01-01

The increasing cost of energy and finite oil and gas reserves have created a need to develop alternative fuels from renewable sources. Due to its abiotic stress tolerance and annual cultivation, high-biomass sorghum ( Sorghum bicolor L. Moench) shows potential as a bioenergy crop. Genomic selection is a useful tool for accelerating genetic gains and could restructure plant breeding programs by enabling early selection and reducing breeding cycle duration. This work aimed at predicting breeding values via genomic selection models for 200 sorghum genotypes comprising landrace accessions and breeding lines from biomass and saccharine groups. These genotypes were divided into two sub-panels, according to breeding purpose. We evaluated the following phenotypic biomass traits: days to flowering, plant height, fresh and dry matter yield, and fiber, cellulose, hemicellulose, and lignin proportions. Genotyping by sequencing yielded more than 258,000 single-nucleotide polymorphism markers, which revealed population structure between subpanels. We then fitted and compared genomic selection models BayesA, BayesB, BayesCπ, BayesLasso, Bayes Ridge Regression and random regression best linear unbiased predictor. The resulting predictive abilities varied little between the different models, but substantially between traits. Different scenarios of prediction showed the potential of using genomic selection results between sub-panels and years, although the genotype by environment interaction negatively affected accuracies. Functional enrichment analyses performed with the marker-predicted effects suggested several interesting associations, with potential for revealing biological processes relevant to the studied quantitative traits. This work shows that genomic selection can be successfully applied in biomass sorghum breeding programs.
A geographically-diverse collection of 418 human gut microbiome pathway genome databases

KAUST Repository

Hahn, Aria S.

2017-04-11

Advances in high-throughput sequencing are reshaping how we perceive microbial communities inhabiting the human body, with implications for therapeutic interventions. Several large-scale datasets derived from hundreds of human microbiome samples sourced from multiple studies are now publicly available. However, idiosyncratic data processing methods between studies introduce systematic differences that confound comparative analyses. To overcome these challenges, we developed GutCyc, a compendium of environmental pathway genome databases (ePGDBs) constructed from 418 assembled human microbiome datasets using MetaPathways, enabling reproducible functional metagenomic annotation. We also generated metabolic network reconstructions for each metagenome using the Pathway Tools software, empowering researchers and clinicians interested in visualizing and interpreting metabolic pathways encoded by the human gut microbiome. For the first time, GutCyc provides consistent annotations and metabolic pathway predictions, making possible comparative community analyses between health and disease states in inflammatory bowel disease, Crohn’s disease, and type 2 diabetes. GutCyc data products are searchable online, or may be downloaded and explored locally using MetaPathways and Pathway Tools.
Databases and web tools for cancer genomics study.

Science.gov (United States)

Yang, Yadong; Dong, Xunong; Xie, Bingbing; Ding, Nan; Chen, Juan; Li, Yongjun; Zhang, Qian; Qu, Hongzhu; Fang, Xiangdong

2015-02-01

Publicly-accessible resources have promoted the advance of scientific discovery. The era of genomics and big data has brought the need for collaboration and data sharing in order to make effective use of this new knowledge. Here, we describe the web resources for cancer genomics research and rate them on the basis of the diversity of cancer types, sample size, omics data comprehensiveness, and user experience. The resources reviewed include data repositories and analysis tools, and we hope this introduction will promote awareness and facilitate the use of these resources in the cancer research community.
Global Metabolic Reconstruction and Metabolic Gene Evolution in the Cattle Genome

Science.gov (United States)

Kim, Woonsu; Park, Hyesun; Seo, Seongwon

2016-01-01

The sequence of the cattle genome provided a valuable opportunity to systematically link genetic and metabolic traits of cattle. The objectives of this study were 1) to reconstruct genome-scale cattle-specific metabolic pathways based on the most recent and updated cattle genome build and 2) to identify duplicated metabolic genes in the cattle genome for better understanding of metabolic adaptations in cattle. A bioinformatic pipeline for amalgamating an organism's genomic annotations from multiple sources was updated. Using this pipeline, an amalgamated cattle genome database based on UMD_3.1 was created. The amalgamated cattle genome database is composed of a total of 33,292 genes: 19,123 consensus genes between the NCBI and Ensembl databases, 8,410 and 5,493 genes found only in NCBI or Ensembl, respectively, and 266 genes from NCBI scaffolds. A metabolic reconstruction of the cattle genome and a cattle pathway genome database (PGDB) were also developed using Pathway Tools, followed by intensive manual curation. The manual curation filled or revised 68 pathway holes, deleted 36 metabolic pathways, and added 23 metabolic pathways. Consequently, the curated cattle PGDB contains 304 metabolic pathways, 2,460 reactions including 2,371 enzymatic reactions, and 4,012 enzymes. Furthermore, this study identified eight duplicated genes in 12 metabolic pathways in the cattle genome compared to human and mouse. Some of these duplicated genes are related to specific hormone biosynthesis and detoxification. The updated genome-scale metabolic reconstruction is a useful tool for understanding biology and metabolic characteristics in cattle. There have been significant improvements in the quality of cattle genome annotations and the MetaCyc database. The duplicated metabolic genes in the cattle genome compared to human and mouse imply evolutionary changes in the cattle genome and provide useful information for further research on understanding metabolic adaptations of cattle.
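The gene counts reported for the amalgamated database can be checked with simple set arithmetic. The sketch below is only an illustration: it assumes genes from the two sources can be matched by a shared identifier, whereas the abstract does not say how the actual pipeline matches them. The function and category names are hypothetical.

```python
def amalgamate(ncbi_ids, ensembl_ids, scaffold_ids=()):
    """Split gene identifiers into consensus and source-specific groups.
    Matching by shared ID is an assumption made for illustration."""
    ncbi, ensembl = set(ncbi_ids), set(ensembl_ids)
    return {
        "consensus": ncbi & ensembl,
        "ncbi_only": ncbi - ensembl,
        "ensembl_only": ensembl - ncbi,
        "ncbi_scaffold": set(scaffold_ids),
    }

# Category sizes as reported in the abstract.
reported = {
    "consensus": 19123,
    "ncbi_only": 8410,
    "ensembl_only": 5493,
    "ncbi_scaffold": 266,
}
total = sum(reported.values())  # should equal the stated 33,292 genes
```

The four reported categories sum to the stated total, so they appear to be mutually exclusive.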
Amphidinolide P from the Brazilian octocoral Stragulum bicolor

Directory of Open Access Journals (Sweden)

Thiciana S. Sousa

Full Text Available. Abstract: Dinoflagellates are an important source of unique bioactive secondary metabolites. Symbiotic species, commonly named zooxanthellae, transfer most of their photosynthetically fixed carbon to their host. The mutualistic relationship provides the organic metabolites used for energy production, but there are very few reports of the role of the dinoflagellates in the production of secondary metabolites in the symbiotic association. Corals and other related cnidarians are the most well-known animals containing symbiotic dinoflagellates. In the present paper we describe the isolation of amphidinolide P (1) from the octocoral Stragulum bicolor and its prey, the nudibranch Marionia limceana, collected off the coasts of Fortaleza (Ceará, Brazil). The coral extracts also contained the 3-O-methyl derivative (2) of amphidinolide P, together with minor compounds still under investigation. Amphidinolides have so far been reported only in laboratory cultures of Amphidinium sp.; thus compounds 1 and 2 represent the first identification of these polyketides in invertebrates. The finding proves the possibility of isolating amphidinolides from a natural symbiosis, enabling further biological and biotechnological studies.
Volatile compounds from beneficial or pathogenic bacteria differentially regulate root exudation, transcription of iron transporters, and defense signaling pathways in Sorghum bicolor.

Science.gov (United States)

Hernández-Calderón, Erasto; Aviles-Garcia, Maria Elizabeth; Castulo-Rubio, Diana Yazmín; Macías-Rodríguez, Lourdes; Ramírez, Vicente Montejano; Santoyo, Gustavo; López-Bucio, José; Valencia-Cantero, Eduardo

2018-02-01

Our results show that Sorghum bicolor is able to recognize bacteria through their volatile compounds and respond differentially to beneficial bacteria or pathogens by eliciting nutritional or defense adaptive traits. Plants establish beneficial, harmful, or neutral relationships with bacteria. Plant growth promoting rhizobacteria (PGPR) emit volatile compounds (VCs), which may act as molecular cues influencing plant development, nutrition, and/or defense. In this study, we compared the effects of VCs produced by bacteria with different lifestyles, including Arthrobacter agilis UMCV2, Bacillus methylotrophicus M4-96, Sinorhizobium meliloti 1021, the plant pathogen Pseudomonas aeruginosa PAO1, and the commensal rhizobacterium Bacillus sp. L2-64, on S. bicolor. We show that VCs from all tested bacteria, except Bacillus sp. L2-64, increased biomass and chlorophyll content and improved root architecture; notably, A. agilis induced the release of attractant molecules, whereas P. aeruginosa activated the exudation of growth inhibitory compounds by roots. An analysis of the expression of the iron transporters SbIRT1, SbIRT2, SbYS1, and SbYS2 and of the plant defense pathway genes COI1 and PR-1 indicated that beneficial, pathogenic, and commensal bacteria could up-regulate iron transporters, whereas only beneficial and pathogenic species could induce a defense response. These results show how S. bicolor could recognize bacteria through their volatile profiles and highlight that PGPR or pathogens can elicit nutritional or defensive traits in plants.
A genome browser database for rice (Oryza sativa) and Chinese ...

African Journals Online (AJOL)

STORAGESEVER

2009-10-19

Oct 19, 2009 ... sativa) and Chinese cabbage (Brassica rapa) genomes. The genome ... tant staple food for a large part of the world's human population. .... some banding region for selection and the overview panel shows the location of ...

BGDB: a database of bivalent genes.

Science.gov (United States)

Li, Qingyan; Lian, Shuabin; Dai, Zhiming; Xiang, Qian; Dai, Xianhua

2013-01-01

A bivalent gene is a gene marked with both H3K4me3 and H3K27me3 epigenetic modifications in the same region, and is proposed to play a pivotal role related to pluripotency in embryonic stem (ES) cells. Identifying these bivalent genes and understanding their functions are important for further research on lineage specification and embryo development. So far, a large amount of genome-wide histone modification data has been generated in mouse and human ES cells. These valuable data make it possible to identify bivalent genes, but no comprehensive data repositories or analysis tools are currently available for bivalent genes. In this work, we develop BGDB, the database of bivalent genes. The database contains 6897 bivalent genes in human and mouse ES cells, which were manually collected from the scientific literature. Each entry contains curated information, including genomic context, sequences, gene ontology and other relevant information. The web services of the BGDB database were implemented with PHP + MySQL + JavaScript, and provide diverse query functions. Database URL: http://dailab.sysu.edu.cn/bgdb/
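The definition of a bivalent gene given above (both marks in the same region) maps onto an interval-overlap test. The sketch below only illustrates that definition. BGDB itself was compiled manually from the literature, so this is not how the database was built. The promoter windows, peak lists, and function names are hypothetical, and intervals are assumed to be half-open (chrom, start, end) tuples.

```python
def overlaps(a, b):
    """True if two half-open intervals (chrom, start, end) share a base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]

def bivalent_genes(promoters, h3k4me3_peaks, h3k27me3_peaks):
    """Genes whose promoter window overlaps at least one peak of each mark.
    `promoters` is a hypothetical dict mapping gene name -> interval."""
    result = []
    for gene, region in promoters.items():
        has_k4 = any(overlaps(region, peak) for peak in h3k4me3_peaks)
        has_k27 = any(overlaps(region, peak) for peak in h3k27me3_peaks)
        if has_k4 and has_k27:
            result.append(gene)
    return sorted(result)

# Toy data, invented for illustration only.
promoters = {"geneA": ("chr1", 100, 200), "geneB": ("chr1", 500, 600)}
k4_peaks = [("chr1", 150, 250), ("chr1", 550, 650)]
k27_peaks = [("chr1", 120, 180)]
candidates = bivalent_genes(promoters, k4_peaks, k27_peaks)
```

In the toy data only geneA carries both marks; geneB has an H3K4me3 peak but no H3K27me3 peak and is therefore not reported.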
LCGbase: A Comprehensive Database for Lineage-Based Co-regulated Genes.

Science.gov (United States)

Wang, Dapeng; Zhang, Yubin; Fan, Zhonghua; Liu, Guiming; Yu, Jun

2012-01-01

Animal genes of different lineages, such as vertebrates and arthropods, are well-organized and blended into dynamic chromosomal structures that represent a primary regulatory mechanism for body development and cellular differentiation. The majority of genes in a genome are actually clustered, and these clusters are evolutionarily stable to different extents and biologically meaningful when evaluated among genomes within and across lineages. Until now, many questions concerning gene organization, such as what is the minimal number of genes in a cluster and what is the driving force leading to gene co-regulation, remain to be addressed. Here, we provide a user-friendly database, LCGbase (a comprehensive database for lineage-based co-regulated genes), hosting information on evolutionary dynamics of gene clustering and ordering within animal kingdoms in two different lineages: vertebrates and arthropods. The database is built on a web-based Linux-Apache-MySQL-PHP framework with an effective interactive user-inquiry service. Compared to other gene annotation databases with similar purposes, our database has three clear advantages. First, our database is inclusive, including all high-quality genome assemblies of vertebrates and representative arthropod species. Second, it is human-centric, since we map all gene clusters from other genomes in an order of lineage-ranks (such as primates, mammals, warm-blooded, and reptiles) onto the human genome and start the database from well-defined gene pairs (a minimal cluster where the two adjacent genes are oriented as co-directional, convergent, or divergent pairs) to large gene clusters. Furthermore, users can search for any adjacent genes and their detailed annotations. Third, the database provides flexible parameter definitions, such as the distance of transcription start sites between two adjacent genes, which is extendable to genes flanking the cluster across species. We also provide useful tools for sequence alignment, gene
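The three gene-pair orientations named in this record (co-directional, convergent, divergent) follow directly from the strands of two adjacent genes. The sketch below shows one way to classify them. The tuple layout and function names are hypothetical, and it does not reproduce LCGbase's own parameters such as the distance between transcription start sites.

```python
def classify_pair(upstream_strand, downstream_strand):
    """Orientation of two adjacent genes, given in left-to-right order."""
    if upstream_strand not in "+-" or downstream_strand not in "+-" \
            or len(upstream_strand) != 1 or len(downstream_strand) != 1:
        raise ValueError("strand must be '+' or '-'")
    if upstream_strand == downstream_strand:
        return "co-directional"   # -> -> or <- <-
    if upstream_strand == "+":
        return "convergent"       # -> <-
    return "divergent"            # <- ->

def adjacent_pairs(genes):
    """Classify every pair of neighbouring genes on the same chromosome.
    `genes` is a list of (name, chrom, start, strand) tuples."""
    ordered = sorted(genes, key=lambda g: (g[1], g[2]))
    pairs = []
    for left, right in zip(ordered, ordered[1:]):
        if left[1] == right[1]:  # skip pairs that span two chromosomes
            pairs.append((left[0], right[0], classify_pair(left[3], right[3])))
    return pairs

# Toy annotation, invented for illustration only.
genes = [
    ("a", "chr1", 100, "+"),
    ("b", "chr1", 500, "-"),
    ("c", "chr1", 900, "+"),
    ("d", "chr1", 1300, "+"),
    ("e", "chr2", 50, "-"),
]
pairs = adjacent_pairs(genes)
```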
TabSQL: a MySQL tool to facilitate mapping user data to public databases.

Science.gov (United States)

Xia, Xiao-Qin; McClelland, Michael; Wang, Yipeng

2010-06-23

With advances in high-throughput genomics and proteomics, it is challenging for biologists to deal with large data files and to map their data to annotations in public databases. We developed TabSQL, a MySQL-based application tool, for viewing, filtering and querying data files with large numbers of rows. TabSQL provides functions for downloading and installing table files from public databases including the Gene Ontology database (GO), the Ensembl databases, and genome databases from the UCSC genome bioinformatics site. Any other database that provides tab-delimited flat files can also be imported. The downloaded gene annotation tables can be queried together with users' data in TabSQL using either a graphic interface or command line. TabSQL allows queries across the user's data and public databases without programming. It is a convenient tool for biologists to annotate and enrich their data.
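The core operation described here, loading tab-delimited files into SQL tables and querying user data together with public annotations, can be sketched as follows. This is not TabSQL: it swaps in Python's built-in `sqlite3` module in place of MySQL so that the example is self-contained, and the table names, column names, and file contents are invented for illustration.

```python
import csv
import io
import sqlite3

def load_tsv(conn, table, tsv_text):
    """Create `table` from tab-delimited text whose first row is the header."""
    rows = list(csv.reader(io.StringIO(tsv_text), delimiter="\t"))
    header, data = rows[0], rows[1:]
    columns = ", ".join(f'"{name}" TEXT' for name in header)
    conn.execute(f'CREATE TABLE "{table}" ({columns})')
    placeholders = ", ".join("?" for _ in header)
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', data)

# Invented example data: a user result file and a downloaded annotation table.
user_tsv = "gene_id\tfold_change\nG1\t2.5\nG2\t0.4\n"
annotation_tsv = "gene_id\tgo_term\nG1\tGO:0006355\nG3\tGO:0008150\n"

conn = sqlite3.connect(":memory:")
load_tsv(conn, "user_data", user_tsv)
load_tsv(conn, "annotation", annotation_tsv)

# Keep every user row; attach an annotation where one exists.
rows = conn.execute(
    "SELECT u.gene_id, u.fold_change, a.go_term "
    "FROM user_data u LEFT JOIN annotation a ON u.gene_id = a.gene_id "
    "ORDER BY u.gene_id"
).fetchall()
```

A LEFT JOIN is used so that user rows without a matching annotation (G2 here) are retained with an empty annotation rather than silently dropped.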
MASiVEdb: the Sirevirus Plant Retrotransposon Database

Directory of Open Access Journals (Sweden)

Bousios Alexandros

2012-04-01

Full Text Available. Abstract. Background: Sireviruses are an ancient genus of the Copia superfamily of LTR retrotransposons, and the only one that has exclusively proliferated within plant genomes. Based on experimental data and phylogenetic analyses, Sireviruses have successfully infiltrated many branches of the plant kingdom, extensively colonizing the genomes of grass species. Notably, it was recently shown that they have been a major force in the make-up and evolution of the maize genome, where they currently occupy ~21% of the nuclear content and ~90% of the Copia population. It is highly likely, therefore, that their life dynamics have been fundamental in the genome composition and organization of a plethora of plant hosts. To assist studies into their impact on plant genome evolution and also facilitate accurate identification and annotation of transposable elements in sequencing projects, we developed MASiVEdb (Mapping and Analysis of SireVirus Elements Database), a collective and systematic resource of Sireviruses in plants. Description: Taking advantage of the increasing availability of plant genomic sequences, and using an updated version of MASiVE, an algorithm specifically designed to identify Sireviruses based on their highly conserved genome structure, we populated MASiVEdb (http://bat.infspire.org/databases/masivedb/) with data on 16,243 intact Sireviruses (total length >158Mb) discovered in 11 fully-sequenced plant genomes. MASiVEdb is unlike any other transposable element database, providing a multitude of highly curated and detailed information on a specific genus across its hosts, such as complete set of coordinates, insertion age, and an analytical breakdown of the structure and gene complement of each element. All data are readily available through basic and advanced query interfaces, batch retrieval, and downloadable files. A purpose-built system is also offered for detecting and visualizing similarity between user sequences and Sireviruses, as
Autism genetic database (AGD): a comprehensive database including autism susceptibility gene-CNVs integrated with known noncoding RNAs and fragile sites

Directory of Open Access Journals (Sweden)

Talebizadeh Zohreh

2009-09-01

Full Text Available. Abstract. Background: Autism is a highly heritable, complex neurodevelopmental disorder, and identifying its genetic basis has been challenging. To date, numerous susceptibility genes and chromosomal abnormalities have been reported in association with autism, but most discoveries either fail to be replicated or account for a small effect. Thus, in most cases the underlying causative genetic mechanisms are not fully understood. In the present work, the Autism Genetic Database (AGD) was developed as a literature-driven, web-based, and easy to access database designed with the aim of creating a comprehensive repository for all the currently reported genes and genomic copy number variations (CNVs) associated with autism in order to further facilitate the assessment of these autism susceptibility genetic factors. Description: AGD is a relational database that organizes data resulting from exhaustive literature searches for reported susceptibility genes and CNVs associated with autism. Furthermore, genomic information about human fragile sites and noncoding RNAs was also downloaded and parsed from miRBase, snoRNA-LBME-db, piRNABank, and the MIT/ICBP siRNA database. A web client genome browser enables viewing of the features while a web client query tool provides access to more specific information for the features. When applicable, links to external databases including GenBank, PubMed, miRBase, snoRNA-LBME-db, piRNABank, and the MIT siRNA database are provided. Conclusion: AGD comprises a comprehensive list of susceptibility genes and copy number variations reported to date in association with autism, as well as all known human noncoding RNA genes and fragile sites. Such a unique and inclusive autism genetic database will facilitate the evaluation of autism susceptibility factors in relation to known human noncoding RNAs and fragile sites, impacting on human diseases. As a result, this new autism database offers a valuable tool for the research
Toward genome-enabled mycology.

Science.gov (United States)

Hibbett, David S; Stajich, Jason E; Spatafora, Joseph W

2013-01-01

Genome-enabled mycology is a rapidly expanding field that is characterized by the pervasive use of genome-scale data and associated computational tools in all aspects of fungal biology. Genome-enabled mycology is integrative and often requires teams of researchers with diverse skills in organismal mycology, bioinformatics and molecular biology. This issue of Mycologia presents the first complete fungal genomes in the history of the journal, reflecting the ongoing transformation of mycology into a genome-enabled science. Here, we consider the prospects for genome-enabled mycology and the technical and social challenges that will need to be overcome to grow the database of complete fungal genomes and enable all fungal biologists to make use of the new data.
SoyDB: a knowledge database of soybean transcription factors

Directory of Open Access Journals (Sweden)

Valliyodan Babu

2010-01-01

Full Text Available. Abstract. Background: Transcription factors play the crucial role of regulating gene expression and influence almost all biological processes. Systematically identifying and annotating transcription factors can greatly aid further understanding of their functions and mechanisms. In this article, we present SoyDB, a user-friendly database containing comprehensive knowledge of soybean transcription factors. Description: The soybean genome was recently sequenced by the Department of Energy-Joint Genome Institute (DOE-JGI) and is publicly available. Mining of this sequence identified 5,671 soybean genes as putative transcription factors. These genes were comprehensively annotated as an aid to the soybean research community. We developed SoyDB, a knowledge database for all the transcription factors in the soybean genome. The database contains protein sequences, predicted tertiary structures, putative DNA binding sites, domains, homologous templates in the Protein Data Bank (PDB), protein family classifications, multiple sequence alignments, consensus protein sequence motifs, web logo of each family, and web links to the soybean transcription factor database PlantTFDB, known EST sequences, and other general protein databases including Swiss-Prot, Gene Ontology, KEGG, EMBL, TAIR, InterPro, SMART, PROSITE, NCBI, and Pfam. The database can be accessed via an interactive and convenient web server, which supports full-text search, PSI-BLAST sequence search, database browsing by protein family, and automatic classification of a new protein sequence into one of 64 annotated transcription factor families by hidden Markov models. Conclusions: A comprehensive soybean transcription factor database was constructed and made publicly accessible at http://casp.rnet.missouri.edu/soydb/.
A database and API for variation, dense genotyping and resequencing data

Directory of Open Access Journals (Sweden)

Flicek Paul

2010-05-01

Full Text Available. Abstract. Background: Advances in sequencing and genotyping technologies are leading to the widespread availability of multi-species variation data, dense genotype data and large-scale resequencing projects. The 1000 Genomes Project and similar efforts in other species are challenging the methods previously used for storage and manipulation of such data, necessitating the redesign of existing genome-wide bioinformatics resources. Results: Ensembl has created a database and software library to support data storage, analysis and access to the existing and emerging variation data from large mammalian and vertebrate genomes. These tools scale to thousands of individual genome sequences and are integrated into the Ensembl infrastructure for genome annotation and visualisation. The database and software system is easily expanded to integrate both public and non-public data sources in the context of an Ensembl software installation and is already being used outside of the Ensembl project in a number of database and application environments. Conclusions: Ensembl's powerful, flexible and open source infrastructure for the management of variation, genotyping and resequencing data is freely available at http://www.ensembl.org.
Final Technical Report on the Genome Sequence DataBase (GSDB): DE-FG03 95 ER 62062 September 1997-September 1999

Energy Technology Data Exchange (ETDEWEB)

Harger, Carol A.

1999-10-28

Since September 1997 NCGR has produced two web-based tools for researchers to use to access and analyze data in the Genome Sequence DataBase (GSDB). These tools are: Sequence Viewer, a nucleotide sequence and annotation visualization tool, and MAR-Finder, a tool that predicts, based upon statistical inferences, the location of matrix attachment regions (MARs) within a nucleotide sequence. [The annual report for June 1996 to August 1997 is included as an attachment to this final report.]
CyanoClust: comparative genome resources of cyanobacteria and plastids

OpenAIRE

Sasaki, Naobumi V.; Sato, Naoki

2010-01-01

Cyanobacteria, which perform oxygen-evolving photosynthesis as do chloroplasts of plants and algae, are one of the best-studied prokaryotic phyla and one from which many representative genomes have been sequenced. Lack of a suitable comparative genomic database has been a problem in cyanobacterial genomics because many proteins involved in physiological functions such as photosynthesis and nitrogen fixation are not catalogued in commonly used databases, such as Clusters of Orthologous Protein...
Efficient Identification of Causal Mutations through Sequencing of Bulked F2 from Two Allelic Bloomless Mutants of Sorghum bicolor

Directory of Open Access Journals (Sweden)

Yinping Jiao

2018-01-01

Full Text Available. The sorghum (Sorghum bicolor Moench, L.) plant accumulates copious layers of epi-cuticular wax (EW) on its aerial surfaces, to a greater extent than most other crops. EW provides a vapor barrier that reduces water loss, and is therefore considered to be a major determinant of sorghum's drought tolerance. However, little is known about the genes responsible for wax accumulation in sorghum. We isolated two allelic mutants, bloomless40-1 (bm40-1) and bm40-2, from a mutant library constructed from ethyl methane sulfonate (EMS) treated seeds of an inbred, BTx623. Both mutants were nearly devoid of the EW layer. Each bm mutant was crossed to the un-mutated BTx623 to generate F2 populations that segregated for the bm phenotype. Genomic DNA from 20 bm F2 plants from each population was bulked for whole genome sequencing. A single gene, Sobic.001G228100, encoding a GDSL-like lipase/acylhydrolase, had unique homozygous mutations in each bulked F2 population. Mutant bm40-1 harbored a missense mutation in the gene, whereas bm40-2 had a splice donor site mutation. Our findings thus provide strong evidence that mutation in this GDSL-like lipase gene causes the bm phenotype, and further demonstrate that this approach of sequencing two independent allelic mutant populations is an efficient method for identifying causal mutations. Combined with allelic mutants, MutMap provides a powerful method to identify all causal genes for the large collection of bm mutants in sorghum, which will provide insight into how sorghum plants accumulate such abundant EW on their aerial surfaces. This knowledge may facilitate the development of tools for engineering drought-tolerant crops with reduced water loss.
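The filtering logic described in this abstract, looking for a gene that carries a homozygous mutation in both independent mutant bulks, can be sketched as a set intersection. The record layout, read counts, thresholds, and the placeholder gene names geneX and geneY are all assumptions for illustration; the abstract does not describe the authors' actual variant-calling pipeline or cut-offs.

```python
def homozygous_genes(variants, min_alt_fraction=0.95, min_depth=5):
    """Genes with at least one variant that looks homozygous in a bulk.
    Each variant is a dict with 'gene', 'ref_reads', 'alt_reads' (assumed layout)."""
    genes = set()
    for v in variants:
        depth = v["ref_reads"] + v["alt_reads"]
        # max(..., 1) guards against division by zero at uncovered sites.
        if depth >= max(min_depth, 1) and v["alt_reads"] / depth >= min_alt_fraction:
            genes.add(v["gene"])
    return genes

def shared_candidates(bulk1, bulk2, **thresholds):
    """Genes hit by a homozygous variant in both bulks."""
    return sorted(homozygous_genes(bulk1, **thresholds)
                  & homozygous_genes(bulk2, **thresholds))

# Invented read counts; only the first gene ID comes from the abstract.
bulk_bm40_1 = [
    {"gene": "Sobic.001G228100", "ref_reads": 0, "alt_reads": 20},
    {"gene": "geneX", "ref_reads": 10, "alt_reads": 10},  # segregating, not fixed
]
bulk_bm40_2 = [
    {"gene": "Sobic.001G228100", "ref_reads": 0, "alt_reads": 18},
    {"gene": "geneY", "ref_reads": 0, "alt_reads": 15},   # fixed in one bulk only
]
candidates = shared_candidates(bulk_bm40_1, bulk_bm40_2)
```

Requiring a hit in both bulks is what makes the allelic design efficient: background EMS mutations fixed by chance in one bulk (geneY here) are unlikely to fall in the same gene in the other.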
Bioremediation of soil contaminated by waste motor oil in 55000 and 65000 and phytoremediation by Sorghum bicolor inoculated with Burkholderia cepacia and Penicillium chrysogenum

Directory of Open Access Journals (Sweden)

Sánchez-Yáñez Juan Manuel

2015-11-01

Full Text Available. Soil spills of waste motor oil (WMO) at high concentration cause a loss of soil fertility. Conventional remediation solves this but is expensive and polluting; an ecological alternative is bioremediation (BR) by biostimulation followed by phytoremediation (PY) with Sorghum bicolor inoculated with Burkholderia cepacia and Penicillium chrysogenum, plant growth promoting microorganisms (PGPM), to bring WMO below the maximum concentration of 4400 ppm/kg soil allowed by NOM-138 SEMARNAT/SS-2003. The objectives of this research were a) bioremediation of soil contaminated by high WMO concentrations by biostimulation with a mineral solution and Vicia sativa as green manure (GM), and subsequently b) phytoremediation by S. bicolor with B. cepacia and P. chrysogenum to reduce the remaining WMO to a concentration below the maximum allowed by NOM-138 SEMARNAT/SS-2003. The results showed that biostimulation with the mineral solution and V. sativa reduced WMO from 55000 to 33400 ppm, and from 65000 to 24300 ppm. Subsequent PY by S. bicolor with B. cepacia and P. chrysogenum decreased WMO from 33400 ppm to 210 ppm, and from 24300 ppm to 360 ppm, whereas in the soil used as negative control WMO did not change by natural attenuation. This suggests that integrating BR and PY is an ecological alternative to chemical techniques that are expensive and cause environmental pollution.
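The concentrations reported above can be turned into percent reductions per stage with a short calculation. Every input number below comes from the abstract; only the function and variable names are introduced here for illustration.

```python
def percent_reduction(initial_ppm, final_ppm):
    """Percentage of the initial concentration removed."""
    return 100.0 * (initial_ppm - final_ppm) / initial_ppm

LIMIT_PPM = 4400  # maximum stated in the abstract for NOM-138 SEMARNAT/SS-2003

# (initial, after biostimulation, after phytoremediation), in ppm, per treatment.
treatments = {
    "55000 ppm soil": (55000, 33400, 210),
    "65000 ppm soil": (65000, 24300, 360),
}

summary = {}
for name, (start, after_br, after_py) in treatments.items():
    summary[name] = {
        "biostimulation_pct": round(percent_reduction(start, after_br), 1),
        "phytoremediation_pct": round(percent_reduction(after_br, after_py), 1),
        "overall_pct": round(percent_reduction(start, after_py), 1),
        "below_limit": after_py < LIMIT_PPM,
    }
```

On these figures, biostimulation alone leaves both soils well above the 4400 ppm limit, and it is the phytoremediation stage that brings them below it.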
Preliminary Characterization of Mitochondrial Genome of Melipona scutellaris, a Brazilian Stingless Bee

Directory of Open Access Journals (Sweden)

Manuella Souza Silverio

2014-01-01

Full Text Available. Bees are producers of economically relevant products and have a pollinator role that is fundamental to ecosystems. Traditionally, studies of the genus Melipona have mostly addressed behavioral, social organization, and ecological aspects. Only recently has the evolutionary history of this genus been assessed using molecular markers, including mitochondrial genes. Even though these studies have shed light on the evolutionary history of the Melipona genus, a more accurate picture may emerge when full nuclear and mitochondrial genomes of Melipona species become available. Here we present the assembly, annotation, and characterization of a draft mitochondrial genome of the Brazilian stingless bee Melipona scutellaris using Melipona bicolor as a reference organism. Using Illumina MiSeq data, we achieved the annotation of all protein coding genes, as well as the genes for the two ribosomal subunits (16S and 12S) and the transfer RNA genes. Using the COI sequence as a DNA barcode, we found that M. cramptoni is the closest species to M. scutellaris.
Genome-wide data-mining of candidate human splice translational efficiency polymorphisms (STEPs) and an online database.
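Finding the closest species from a barcode sequence can be illustrated with an uncorrected pairwise distance (p-distance) over aligned sequences. The abstract does not state which distance measure or search tool the authors used, so this is a swapped-in, simplified approach; the sequences and species labels below are toy placeholders, not real COI data.

```python
def p_distance(seq_a, seq_b):
    """Fraction of differing sites between two aligned sequences,
    ignoring positions where either sequence has a gap or ambiguity code."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "ACGT" and b in "ACGT":
            compared += 1
            differing += a != b
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared

def closest_species(query, references):
    """Name of the reference with the smallest p-distance to the query.
    `references` is a hypothetical dict mapping species label -> aligned sequence."""
    return min(references, key=lambda name: p_distance(query, references[name]))

# Toy aligned fragments, invented for illustration only.
query = "ACGTACGT"
references = {"species_1": "ACGTACGA", "species_2": "TCGTACGA"}
best = closest_species(query, references)
```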

Directory of Open Access Journals (Sweden)

Christopher A Raistrick

2010-10-01

Full Text Available. Variation in pre-mRNA splicing is common and in some cases caused by genetic variants in intronic splicing motifs. Recent studies into the insulin gene (INS) discovered a polymorphism in a 5' non-coding intron that influences the likelihood of intron retention in the final mRNA, extending the 5' untranslated region and maintaining protein quality. Retention was also associated with increased insulin levels, suggesting that such variants, termed splice translational efficiency polymorphisms (STEPs), may relate to disease phenotypes through differential protein expression. We set out to explore the prevalence of STEPs in the human genome and validate this new category of protein quantitative trait loci (pQTL) using publicly available data. Gene transcript and variant data were collected and mined for candidate STEPs in motif regions. Sequences from transcripts containing potential STEPs were analysed for evidence of splice site recognition and an effect in expressed sequence tags (ESTs). 16 publicly released genome-wide association data sets of common diseases were searched for association to candidate polymorphisms with HapMap frequency data. Our study found 3324 candidate STEPs lying in motif sequences of 5' non-coding introns and further mining revealed 170 with transcript evidence of intron retention. 21 potential STEPs had EST evidence of intron retention or exon extension, as well as population frequency data for comparison. Results suggest that the insulin STEP was not a unique example and that many STEPs may occur genome-wide with potentially causal effects in complex disease. An online database of STEPs is freely accessible at http://dbstep.genes.org.uk/.
MIPS: curated databases and comprehensive secondary data resources in 2010.

Science.gov (United States)

Mewes, H Werner; Ruepp, Andreas; Theis, Fabian; Rattei, Thomas; Walter, Mathias; Frishman, Dmitrij; Suhre, Karsten; Spannagl, Manuel; Mayer, Klaus F X; Stümpflen, Volker; Antonov, Alexey

2011-01-01

The Munich Information Center for Protein Sequences (MIPS at the Helmholtz Center for Environmental Health, Neuherberg, Germany) has many years of experience in providing annotated collections of biological data. Selected data sets of high relevance, such as model genomes, are subjected to careful manual curation, while the bulk of high-throughput data is annotated by automatic means. High-quality reference resources developed in the past and still actively maintained include Saccharomyces cerevisiae, Neurospora crassa and Arabidopsis thaliana genome databases as well as several protein interaction data sets (MPACT, MPPI and CORUM). More recent projects are PhenomiR, the database on microRNA-related phenotypes, and MIPS PlantsDB for integrative and comparative plant genome research. The interlinked resources SIMAP and PEDANT provide homology relationships as well as up-to-date and consistent annotation for 38,000,000 protein sequences. PPLIPS and CCancer are versatile tools for proteomics and functional genomics interfacing to a database of compilations from gene lists extracted from literature. A novel literature-mining tool, EXCERBT, gives access to structured information on classified relations between genes, proteins, phenotypes and diseases extracted from Medline abstracts by semantic analysis. All databases described here, as well as the detailed descriptions of our projects can be accessed through the MIPS WWW server (http://mips.helmholtz-muenchen.de).
Mining biological databases for candidate disease genes

Science.gov (United States)

Braun, Terry A.; Scheetz, Todd; Webster, Gregg L.; Casavant, Thomas L.

2001-07-01

The publicly funded effort to determine the complete nucleotide sequence of the human genome, the Human Genome Project (HGP), has currently produced more than 93% of the 3 billion nucleotides of the human genome in a preliminary 'draft' format. In addition, several valuable sources of information have been developed as direct and indirect results of the HGP. These include the sequencing of model organisms (rat, mouse, fly, and others), gene discovery projects (ESTs and full-length), and new technologies such as expression analysis and resources (micro-arrays or gene chips). These resources are invaluable for the researchers identifying the functional genes of the genome that transcribe and translate into the transcriptome and proteome, both of which potentially contain orders of magnitude more complexity than the genome itself. Preliminary analyses of this data identified approximately 30,000 - 40,000 human 'genes.' However, the bulk of the effort still remains -- to identify the functional and structural elements contained within the transcriptome and proteome, and to associate function in the transcriptome and proteome to genes. A fortuitous consequence of the HGP is the existence of hundreds of databases containing biological information that may contain relevant data pertaining to the identification of disease-causing genes. The task of mining these databases for information on candidate genes is a commercial application of enormous potential. We are developing a system to acquire and mine data from specific databases to aid our efforts to identify disease genes. A high-speed cluster of Linux workstations is used to analyze sequence and perform distributed sequence alignments as part of our data mining and processing. This system has been used to mine GeneMap99 sequences within specific genomic intervals to identify potential candidate disease genes associated with Bardet-Biedl Syndrome (BBS).
Enhancement of photosynthesis in Sorghum bicolor by ultraviolet radiation

International Nuclear Information System (INIS)

Johnson, G.A.; Day, T.A.

2002-01-01

We assessed the influence of ultraviolet radiation (UV) on net photosynthetic CO2 assimilation rate (Pn) in Sorghum bicolor, with particular attention to examining whether UV can enhance Pn via direct absorption of UV and absorption of UV-induced blue fluorescence by photosynthetic pigments. A polychromatic UV response spectrum of leaves was constructed by measuring Pn under different UV supplements using filters that had sharp transmission cut-offs from 280 to 382 nm, against a background of non-saturating visible light. When the abaxial surface was irradiated, Pn averaged 4.6% higher with the UV supplement that cut-off UV at 311 nm, compared to lower and higher UV wavelength supplements. This former supplement differed from higher wavelength supplements by primarily providing more UV between 320 and 350 nm. To assess the possibility of direct absorption of UV by photosynthetic pigments, we measured the absorbance of extracted chlorophylls. Chlorophyll a had absorbance peaks at 340 and 389 nm that were 49 and 72% of that at the Soret peak. Chlorophyll b had absorbance peaks at 315 and 346 nm that were both 35% of that at the Soret peak. Since the epidermis transmits some UV, the strong UV absorbance of chlorophyll implies a potential role for irradiance beyond the bounds of the conventionally defined photosynthetically active radiation waveband (400–700 nm). To assess the role of absorption of UV-induced blue fluorescence, we measured the UV-induced fluorescence excitation and emission spectra of leaves. Abaxial excitation peaked at 328 nm, while emission peaked at 446 nm. In this analysis, we used our abaxial fluorescence excitation spectrum and the UV photosynthetic inhibition spectrum of Caldwell et al. (1986) to weight the UV irradiance with each cut-off filter, thereby estimating the potential contribution of UV-induced blue fluorescence to photosynthesis and the inhibitory effects of UV irradiance on photosynthesis, respectively. With a non
In vitro germination and growth of Cattleya bicolor Lindley (Orchidaceae)

OpenAIRE

Suzuki, Rogério Mamoru; Almeida, Vanessa de; Pescador, Rosete; Ferreira, Wagner de Melo

2010-01-01

In vitro germination of orchid seeds has been used since the beginning of the last century. Nevertheless, the available knowledge about the nutritional composition of culture media that favor the in vitro germination and growth of orchids is still quite scarce. Given the threat of extinction of Cattleya bicolor and the scarcity of knowledge about the in vitro germination and growth of this species, this work aimed to evaluate the influence of...
Artemita bicolor Kertész, new synonym of Artemita podexargenteus Enderlein (Diptera, Stratiomyidae), with notes on male and female terminalia

Directory of Open Access Journals (Sweden)

Alexandre Ururahy-Rodrigues

2004-06-01

The Stratiomyidae genus Artemita Walker, 1854 is represented in the Neotropical Region by 14 species, 6 of which occur in Brazil. Despite the important revisions by KERTÉSZ (1914) and JAMES (1971), knowledge of morphological variation within the group is rudimentary, mainly with respect to the terminalia. In this work, Artemita bicolor Kertész, 1914 is proposed as a junior synonym of Artemita podexargenteus Enderlein, 1914 and the latter is redescribed based on terminalia morphology.
Genomic Enzymology: Web Tools for Leveraging Protein Family Sequence-Function Space and Genome Context to Discover Novel Functions.

Science.gov (United States)

Gerlt, John A

2017-08-22

The exponentially increasing number of protein and nucleic acid sequences provides opportunities to discover novel enzymes, metabolic pathways, and metabolites/natural products, thereby adding to our knowledge of biochemistry and biology. The challenge has evolved from generating sequence information to mining the databases to integrating and leveraging the available information, i.e., the availability of "genomic enzymology" web tools. Web tools that allow identification of biosynthetic gene clusters are widely used by the natural products/synthetic biology community, thereby facilitating the discovery of novel natural products and the enzymes responsible for their biosynthesis. However, many novel enzymes with interesting mechanisms participate in uncharacterized small-molecule metabolic pathways; their discovery and functional characterization also can be accomplished by leveraging information in protein and nucleic acid databases. This Perspective focuses on two genomic enzymology web tools that assist the discovery of novel metabolic pathways: (1) Enzyme Function Initiative-Enzyme Similarity Tool (EFI-EST) for generating sequence similarity networks to visualize and analyze sequence-function space in protein families and (2) Enzyme Function Initiative-Genome Neighborhood Tool (EFI-GNT) for generating genome neighborhood networks to visualize and analyze the genome context in microbial and fungal genomes. Both tools have been adapted to other applications to facilitate target selection for enzyme discovery and functional characterization. As the natural products community has demonstrated, the enzymology community needs to embrace the essential role of web tools that allow the protein and genome sequence databases to be leveraged for novel insights into enzymological problems.

Fungal genome resources at NCBI

Science.gov (United States)

Robbertse, B.; Tatusova, T.

2011-01-01

The National Center for Biotechnology Information (NCBI) is well known for the nucleotide sequence archive GenBank and the sequence analysis tool BLAST. However, NCBI integrates many types of biomolecular data from a variety of sources and makes it available to the scientific community as interactive web resources as well as organized releases of bulk data. These tools are available to explore and compare fungal genomes. Searching all databases with Fungi [organism] at http://www.ncbi.nlm.nih.gov/ is the quickest way to find resources of interest with fungal entries. Some tools, though, are resource-specific and can be indirectly accessed from a particular database in the Entrez system. These include graphical viewers and comparative analysis tools such as TaxPlot, TaxMap and UniGene DDD (found via UniGene Homepage). Gene and BioProject pages also serve as portals to external data such as community annotation websites, BioGrid and UniProt. There are many different ways of accessing genomic data at NCBI. Depending on the focus and goal of research projects or the level of interest, a user would select a particular route for accessing genomic databases and resources. This review article describes methods of accessing fungal genome data and provides examples that illustrate the use of analysis tools. PMID:22737589
Halotolerant/alkalophilic bacteria associated with the cyanobacterium Arthrospira platensis (Nordstedt) Gomont that promote early growth in Sorghum bicolor (L.) Moench

Directory of Open Access Journals (Sweden)

Gómez G. Liliana Cecilia

2012-04-01

Arthrospira platensis associated bacteria (APAB), identified through molecular biology as Bacillus okhensis, Indibacter alkaliphilus and Halomonas sp., also produce indole-3-acetic acid (IAA). These bacteria were used in early plant growth promotion tests on Sorghum bicolor; this bioassay was considered indirect evidence that APAB may also naturally stimulate A. platensis growth. I. alkaliphilus and B. okhensis enhanced early germination of S. bicolor seeds, with better results than those achieved by Azospirillum brasilense, a bacterium used as a reference because it is a common plant growth-promoting rhizobacterium. The three APAB produced significant differences (P≤0.05) in morphoagronomic parameters; I. alkaliphilus and B. okhensis gave the best results in elongation stimulation and in root and foliage dry weight. This evidence suggests that these bacteria are plant growth promoters, and testing with axenic A. platensis cultures and their associated bacteria is recommended to understand the true interaction between them.
GenomePeek—an online tool for prokaryotic genome and metagenome analysis

Directory of Open Access Journals (Sweden)

Katelyn McNair

2015-06-01

As more and more prokaryotic sequencing takes place, a method to quickly and accurately analyze this data is needed. Previous tools are mainly designed for metagenomic analysis and have limitations, such as long runtimes and significant false positive error rates. The online tool GenomePeek (edwards.sdsu.edu/GenomePeek) was developed to analyze both single genome and metagenome sequencing files, quickly and with low error rates. GenomePeek uses a sequence assembly approach where reads to a set of conserved genes are extracted, assembled and then aligned against the highly specific reference database. GenomePeek was found to be faster than traditional approaches while still keeping error rates low, as well as offering unique data visualization options.
MicroScope: a platform for microbial genome annotation and comparative genomics.

Science.gov (United States)

Vallenet, D; Engelen, S; Mornico, D; Cruveiller, S; Fleury, L; Lajus, A; Rouy, Z; Roche, D; Salvignol, G; Scarpelli, C; Médigue, C

2009-01-01

The initial outcome of genome sequencing is the creation of long text strings written in a four letter alphabet. The role of in silico sequence analysis is to assist biologists in the act of associating biological knowledge with these sequences, allowing investigators to make inferences and predictions that can be tested experimentally. A wide variety of software is available to the scientific community, and can be used to identify genomic objects, before predicting their biological functions. However, only a limited number of biologically interesting features can be revealed from an isolated sequence. Comparative genomics tools, on the other hand, by bringing together the information contained in numerous genomes simultaneously, allow annotators to make inferences based on the idea that evolution and natural selection are central to the definition of all biological processes. We have developed the MicroScope platform in order to offer a web-based framework for the systematic and efficient revision of microbial genome annotation and comparative analysis (http://www.genoscope.cns.fr/agc/microscope). Starting with the description of the flow chart of the annotation processes implemented in the MicroScope pipeline, and the development of traditional and novel microbial annotation and comparative analysis tools, this article emphasizes the essential role of expert annotation as a complement of automatic annotation. Several examples illustrate the use of implemented tools for the review and curation of annotations of both new and publicly available microbial genomes within MicroScope's rich integrated genome framework. The platform is used as a viewer in order to browse updated annotation information of available microbial genomes (more than 440 organisms to date), and in the context of new annotation projects (117 bacterial genomes). The human expertise gathered in the MicroScope database (about 280,000 independent annotations) contributes to improve the quality of
Global repeat discovery and estimation of genomic copy number in a large, complex genome using a high-throughput 454 sequence survey

Directory of Open Access Journals (Sweden)

Varala Kranthi

2007-05-01

Background: Extensive computational and database tools are available to mine genomic and genetic databases for model organisms, but little genomic data is available for many species of ecological or agricultural significance, especially those with large genomes. Genome surveys using conventional sequencing techniques are powerful, particularly for detecting sequences present in many copies per genome. However, these methods are time-consuming and have potential drawbacks. High-throughput 454 sequencing provides an alternative method by which much information can be gained quickly and cheaply from high-coverage surveys of genomic DNA. Results: We sequenced 78 million base-pairs of randomly sheared soybean DNA which passed our quality criteria. Computational analysis of the survey sequences provided global information on the abundant repetitive sequences in soybean. The sequence was used to determine the copy number across regions of large genomic clones or contigs and discover higher-order structures within satellite repeats. We have created an annotated, online database of sequences present in multiple copies in the soybean genome. The low bias of pyrosequencing against repeat sequences is demonstrated by the overall composition of the survey data, which matches well with past estimates of repetitive DNA content obtained by DNA re-association kinetics (Cot) analysis. Conclusion: This approach provides a potential aid to conventional or shotgun genome assembly, by allowing rapid assessment of copy number in any clone or clone-end sequence. In addition, we show that partial sequencing can provide access to partial protein-coding sequences.
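The idea of estimating copy number from a random sequence survey can be sketched as a depth ratio: the depth observed over a region, divided by the depth expected for a single-copy sequence. The Python below is a minimal illustration under that assumption, not the authors' method; the function names are hypothetical, and the soybean genome size and the aligned-base count in the example are assumed values (only the 78 million survey bases come from the abstract).

```python
def survey_coverage(total_survey_bases, genome_size_bases):
    """Expected depth of a single-copy sequence in a random survey."""
    if genome_size_bases <= 0:
        raise ValueError("genome size must be positive")
    return total_survey_bases / genome_size_bases

def estimate_copy_number(bases_aligned_to_region, region_length, coverage):
    """Observed depth over a region divided by the single-copy expectation."""
    if region_length <= 0 or coverage <= 0:
        raise ValueError("region length and coverage must be positive")
    observed_depth = bases_aligned_to_region / region_length
    return observed_depth / coverage

if __name__ == "__main__":
    survey_bases = 78_000_000      # from the abstract
    genome_size = 1_100_000_000    # assumed soybean genome size, not stated in the abstract
    coverage = survey_coverage(survey_bases, genome_size)
    # Hypothetical region: 5,000 bp of a clone with 35,000 survey bases aligned to it.
    print(estimate_copy_number(35_000, 5_000, coverage))
```

At this low survey coverage a single-copy region collects very few reads, so a ratio of this kind is informative mainly for sequences present in many copies.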
Final Technical Report on the Genome Sequence DataBase (GSDB): DE-FG03 95 ER 62062 September 1997-September 1999; FINAL

International Nuclear Information System (INIS)

Harger, Carol A.

1999-01-01

Since September 1997 NCGR has produced two web-based tools for researchers to use to access and analyze data in the Genome Sequence DataBase (GSDB). These tools are: Sequence Viewer, a nucleotide sequence and annotation visualization tool, and MAR-Finder, a tool that predicts, based upon statistical inferences, the location of matrix attachment regions (MARs) within a nucleotide sequence. [The annual report for June 1996 to August 1997 is included as an attachment to this final report.]
Database Resources of the BIG Data Center in 2018.

Science.gov (United States)

2018-01-04

The BIG Data Center at Beijing Institute of Genomics (BIG) of the Chinese Academy of Sciences provides freely open access to a suite of database resources in support of worldwide research activities in both academia and industry. With the vast amounts of omics data generated at ever-greater scales and rates, the BIG Data Center is continually expanding, updating and enriching its core database resources through big-data integration and value-added curation, including BioCode (a repository archiving bioinformatics tool codes), BioProject (a biological project library), BioSample (a biological sample library), Genome Sequence Archive (GSA, a data repository for archiving raw sequence reads), Genome Warehouse (GWH, a centralized resource housing genome-scale data), Genome Variation Map (GVM, a public repository of genome variations), Gene Expression Nebulas (GEN, a database of gene expression profiles based on RNA-Seq data), Methylation Bank (MethBank, an integrated databank of DNA methylomes), and Science Wikis (a series of biological knowledge wikis for community annotations). In addition, three featured web services are provided, viz., BIG Search (search as a service; a scalable inter-domain text search engine), BIG SSO (single sign-on as a service; a user access control system to gain access to multiple independent systems with a single ID and password) and Gsub (submission as a service; a unified submission service for all relevant resources). All of these resources are publicly accessible through the home page of the BIG Data Center at http://bigd.big.ac.cn. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
The Effect of Salicylic Acid and Gibberellin on Seed Reserve Utilization, Germination and Enzyme Activity of Sorghum (Sorghum bicolor L.) Seeds Under Drought Stress

Directory of Open Access Journals (Sweden)

Roghayyeh Sheykhbaglou

2014-03-01

Seed priming methods have been used to increase germination characteristics under stress conditions. The aim of the study was to determine the effect of salicylic acid and gibberellin on seed reserve utilization, germination and enzyme activity of sorghum (Sorghum bicolor L.) seeds under drought stress. A factorial experiment was carried out in a completely randomized design with three replications. The first factor was the seed treatment (unprimed, salicylic acid and gibberellin) and the second factor was drought stress (0, -4, -8 and -12 bar). The results indicated a significant treatment × drought interaction for germination percentage, germination index, weight of utilized (mobilized) seed, seed reserve utilization efficiency, seedling dry weight and seed reserve depletion percentage. Thus priming improved the studied traits in sorghum (Sorghum bicolor L.) seeds under drought stress. Priming also improved enzyme activity compared to the unprimed seeds.
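The design described here (3 seed treatments × 4 drought levels × 3 replications in a completely randomized design) implies 36 experimental units. A minimal Python sketch of how such a layout could be enumerated and randomized follows; the factor levels are taken from the abstract, while the function name and the fixed random seed are assumptions for illustration.

```python
import itertools
import random

# Factor levels as stated in the abstract.
SEED_TREATMENTS = ["unprimed", "salicylic acid", "gibberellin"]
DROUGHT_LEVELS_BAR = [0, -4, -8, -12]
REPLICATIONS = 3

def completely_randomized_layout(seed=0):
    """Return all (treatment, drought, replicate) units in a random run order."""
    units = [
        (treatment, drought, rep)
        for treatment, drought in itertools.product(SEED_TREATMENTS, DROUGHT_LEVELS_BAR)
        for rep in range(1, REPLICATIONS + 1)
    ]
    rng = random.Random(seed)  # assumed fixed seed so the layout is reproducible
    rng.shuffle(units)
    return units

if __name__ == "__main__":
    layout = completely_randomized_layout()
    print(len(layout))  # 3 * 4 * 3 units
    for unit in layout[:5]:
        print(unit)
```

In a completely randomized design every unit is assigned a position at random with no blocking, which is why the whole list is shuffled at once.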
HEpD: a database describing epigenetic differences between Thoroughbred and Jeju horses.

Science.gov (United States)

Gim, Jeong-An; Lee, Sugi; Kim, Dae-Soo; Jeong, Kwang-Seuk; Hong, Chang Pyo; Bae, Jin-Han; Moon, Jae-Woo; Choi, Yong-Seok; Cho, Byung-Wook; Cho, Hwan-Gue; Bhak, Jong; Kim, Heui-Soo

2015-04-10

With the advent of next-generation sequencing technology, genome-wide maps of DNA methylation are now available. The Thoroughbred horse is bred for racing, while the Jeju horse is a traditional Korean horse bred for racing or food. The methylation profiles of equine organs may provide genomic clues underlying their athletic traits. We have developed a database to elucidate genome-wide DNA methylation patterns of the cerebrum, lung, heart, and skeletal muscle from Thoroughbred and Jeju horses. Using MeDIP-Seq, our database provides information regarding significantly enriched methylated regions beyond a threshold, methylation density of a specific region, and differentially methylated regions (DMRs) for tissues from two equine breeds. It provided methylation patterns at 784 gene regions in the equine genome. This database can potentially help researchers identify DMRs in the tissues of these horse species and investigate the differences between the Thoroughbred and Jeju horse breeds. Copyright © 2015 Elsevier B.V. All rights reserved.
The Candidate Cancer Gene Database: a database of cancer driver genes from forward genetic screens in mice.

Science.gov (United States)

Abbott, Kenneth L; Nyre, Erik T; Abrahante, Juan; Ho, Yen-Yi; Isaksson Vogel, Rachel; Starr, Timothy K

2015-01-01

Identification of cancer driver gene mutations is crucial for advancing cancer therapeutics. Due to the overwhelming number of passenger mutations in the human tumor genome, it is difficult to pinpoint causative driver genes. Using transposon mutagenesis in mice many laboratories have conducted forward genetic screens and identified thousands of candidate driver genes that are highly relevant to human cancer. Unfortunately, this information is difficult to access and utilize because it is scattered across multiple publications using different mouse genome builds and strength metrics. To improve access to these findings and facilitate meta-analyses, we developed the Candidate Cancer Gene Database (CCGD, http://ccgd-starrlab.oit.umn.edu/). The CCGD is a manually curated database containing a unified description of all identified candidate driver genes and the genomic location of transposon common insertion sites (CISs) from all currently published transposon-based screens. To demonstrate relevance to human cancer, we performed a modified gene set enrichment analysis using KEGG pathways and show that human cancer pathways are highly enriched in the database. We also used hierarchical clustering to identify pathways enriched in blood cancers compared to solid cancers. The CCGD is a novel resource available to scientists interested in the identification of genetic drivers of cancer. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
Genome-derived vaccines.

Science.gov (United States)

De Groot, Anne S; Rappuoli, Rino

2004-02-01

Vaccine research entered a new era when the complete genome of a pathogenic bacterium was published in 1995. Since then, more than 97 bacterial pathogens have been sequenced and at least 110 additional projects are now in progress. Genome sequencing has also dramatically accelerated: high-throughput facilities can draft the sequence of an entire microbe (two to four megabases) in 1 to 2 days. Vaccine developers are using microarrays, immunoinformatics, proteomics and high-throughput immunology assays to reduce the truly unmanageable volume of information available in genome databases to a manageable size. Vaccines composed of novel antigens discovered from genome mining are already in clinical trials. Within 5 years we can expect to see a novel class of vaccines composed of genome-predicted, assembled and engineered T- and B-cell epitopes. This article addresses the convergence of three forces--microbial genome sequencing, computational immunology and new vaccine technologies--that are shifting genome mining for vaccines onto the forefront of immunology research.
Medicago truncatula transporter database: a comprehensive database resource for M. truncatula transporters

Directory of Open Access Journals (Sweden)

Miao Zhenyan

2012-02-01

Background: Medicago truncatula has been chosen as a model species for genomic studies. It is closely related to an important legume, alfalfa. Transporters are a large group of membrane-spanning proteins. They deliver essential nutrients, eject waste products, and assist the cell in sensing environmental conditions by forming a complex system of pumps and channels. Although studies have effectively characterized individual M. truncatula transporters in several databases, until now there has been no available systematic database that includes all transporters in M. truncatula. Description: The M. truncatula transporter database (MTDB) contains comprehensive information on the transporters in M. truncatula. Based on the TransportTP method, we have presented a novel prediction pipeline. A total of 3,665 putative transporters have been annotated based on International Medicago Genome Annotated Group (IMGAG) V3.5 V3 and the M. truncatula Gene Index (MTGI) V10.0 releases and assigned to 162 families according to the transporter classification system. These families were further classified into seven types according to their transport mode and energy coupling mechanism. Extensive annotations referring to each protein were generated, including basic protein function, expressed sequence tag (EST) mapping, genome locus, three-dimensional template prediction, transmembrane segment, and domain annotation. A chromosome distribution map and text-based Basic Local Alignment Search Tools were also created. In addition, we have provided a way to explore the expression of putative M. truncatula transporter genes under stress treatments. Conclusions: In summary, the MTDB enables the exploration and comparative analysis of putative transporters in M. truncatula. A user-friendly web interface and regular updates make MTDB valuable to researchers in related fields. The MTDB is freely available now to all users at http://bioinformatics.cau.edu.cn/MtTransporter/.
VerSeDa: vertebrate secretome database.

Science.gov (United States)

Cortazar, Ana R; Oguiza, José A; Aransay, Ana M; Lavín, José L

2017-01-01

Based on the current tools, de novo secretome (full set of proteins secreted by an organism) prediction is a time consuming bioinformatic task that requires a multifactorial analysis in order to obtain reliable in silico predictions. Hence, to accelerate this process and offer researchers a reliable repository where secretome information can be obtained for vertebrates and model organisms, we have developed VerSeDa (Vertebrate Secretome Database). This freely available database stores information about proteins that are predicted to be secreted through the classical and non-classical mechanisms, for the wide range of vertebrate species deposited at the NCBI, UCSC and ENSEMBL sites. To our knowledge, VerSeDa is the only state-of-the-art database designed to store secretome data from multiple vertebrate genomes, thus, saving an important amount of time spent in the prediction of protein features that can be retrieved from this repository directly. VerSeDa is freely available at http://genomics.cicbiogune.es/VerSeDa/index.php. © The Author(s) 2017. Published by Oxford University Press.
DPTEdb, an integrative database of transposable elements in dioecious plants.

Science.gov (United States)

Li, Shu-Fen; Zhang, Guo-Jun; Zhang, Xue-Jin; Yuan, Jin-Hong; Deng, Chuan-Liang; Gu, Lian-Feng; Gao, Wu-Jun

2016-01-01

Dioecious plants usually harbor 'young' sex chromosomes, providing an opportunity to study the early stages of sex chromosome evolution. Transposable elements (TEs) are mobile DNA elements frequently found in plants and are suggested to play important roles in plant sex chromosome evolution. The genomes of several dioecious plants have been sequenced, offering an opportunity to annotate and mine the TE data. However, comprehensive and unified annotation of TEs in these dioecious plants is still lacking. In this study, we constructed a dioecious plant transposable element database (DPTEdb). DPTEdb is a specific, comprehensive and unified relational database and web interface. We used a combination of de novo, structure-based and homology-based approaches to identify TEs from the genome assemblies of previously published data, as well as our own. The database currently integrates eight dioecious plant species and a total of 31 340 TEs along with classification information. DPTEdb provides user-friendly web interfaces to browse, search and download the TE sequences in the database. Users can also use tools, including BLAST, GetORF, HMMER, Cut sequence and JBrowse, to analyze TE data. Given the role of TEs in plant sex chromosome evolution, the database will contribute to the investigation of TEs in structural, functional and evolutionary dynamics of the genome of dioecious plants. In addition, the database will supplement the research of sex diversification and sex chromosome evolution of dioecious plants. Database URL: http://genedenovoweb.ticp.net:81/DPTEdb/index.php. © The Author(s) 2016. Published by Oxford University Press.
SpirPep: an in silico digestion-based platform to assist bioactive peptides discovery from a genome-wide database.

Science.gov (United States)

Anekthanakul, Krittima; Hongsthong, Apiradee; Senachak, Jittisak; Ruengjitchatchawalya, Marasri

2018-04-20

Bioactive peptides, including peptides derived from biological sources with different biological activities, are protein fragments that influence the functions or conditions of organisms, in particular humans and animals. Conventional methods of identifying bioactive peptides are time-consuming and costly. To quicken the process, several bioinformatics tools have recently been used to facilitate screening of potential peptides prior to their activity assessment in vitro and/or in vivo. In this study, we developed an efficient computational method, SpirPep, which offers many advantages over the currently available tools. The SpirPep web application tool is a one-stop analysis and visualization facility to assist bioactive peptide discovery. The tool is equipped with 15 customized enzymes and 1-3 miscleavage options, which allows in silico digestion of protein sequences encoded by protein-coding genes at single, multiple, or genome-wide scale, and then directly classifies the peptides by bioactivity using an in-house database that contains bioactive peptides collected from 13 public databases. With this tool, the resulting peptides are categorized by each selected enzyme, and shown in a tabular format where the peptide sequences can be tracked back to their original proteins. The developed tool and webpages are coded in PHP and HTML with CSS/JavaScript. Moreover, the tool allows protein-peptide alignment visualization by Generic Genome Browser (GBrowse) to display the region and details of the proteins and peptides within each parameter, while considering digestion design for the desirable bioactivity. SpirPep is efficient; it takes less than 20 min to digest 3000 proteins (751,860 amino acids) with 15 enzymes and three miscleavages for each enzyme, and only a few seconds for single enzyme digestion. The tool also identified more bioactive peptides than the benchmarked tool; an example of validated pentapeptide (FLPIL) from LC-MS/MS was demonstrated. The
Ebolavirus Database: Gene and Protein Information Resource for Ebolaviruses

Directory of Open Access Journals (Sweden)

Rayapadi G. Swetha

2016-01-01

Ebola Virus Disease (EVD) is a life-threatening haemorrhagic fever in humans. Even though there are many reports on EVD, the protein precursor functions and virulence factors of ebolaviruses remain poorly understood. Comparative analyses of Ebolavirus genomes will help in the identification of these important features. This prompted us to develop the Ebolavirus Database (EDB), and we have provided links to various tools that will help researchers locate important regions in both the genomes and proteomes of Ebolavirus. The genomic analyses of ebolaviruses will provide important clues for locating the essential and core functional genes. The aim of EDB is to act as an integrated resource for ebolaviruses, and we strongly believe that the database will be a useful tool for clinicians, microbiologists, health care workers, and bioscience researchers.
Genomes to Proteomes

Energy Technology Data Exchange (ETDEWEB)

Panisko, Ellen A. [Pacific Northwest National Lab. (PNNL), Richland, WA (United States); Grigoriev, Igor [USDOE Joint Genome Inst., Walnut Creek, CA (United States); Daly, Don S. [Pacific Northwest National Lab. (PNNL), Richland, WA (United States); Webb-Robertson, Bobbie-Jo [Pacific Northwest National Lab. (PNNL), Richland, WA (United States); Baker, Scott E. [Pacific Northwest National Lab. (PNNL), Richland, WA (United States)

2009-03-01

Biologists are awash with genomic sequence data. In large part, this is due to the rapid acceleration in the generation of DNA sequence that occurred as public and private research institutes raced to sequence the human genome. In parallel with the large human genome effort, mostly smaller genomes of other important model organisms were sequenced. Projects following on these initial efforts have made use of technological advances and the DNA sequencing infrastructure that was built for the human and other organism genome projects. As a result, the genome sequences of many organisms are available in high quality draft form. While in many ways this is good news, there are limitations to the biological insights that can be gleaned from DNA sequences alone; genome sequences offer only a bird's eye view of the biological processes endemic to an organism or community. Fortunately, the genome sequences now being produced at such a high rate can serve as the foundation for other global experimental platforms such as proteomics. Proteomic methods offer a snapshot of the proteins present at a point in time for a given biological sample. Current global proteomics methods combine enzymatic digestion, separations, mass spectrometry and database searching for peptide identification. One key aspect of proteomics is the prediction of peptide sequences from mass spectrometry data. Global proteomic analysis uses computational matching of experimental mass spectra with predicted spectra based on databases of gene models that are often generated computationally. Thus, the quality of gene models predicted from a genome sequence is crucial in the generation of high quality peptide identifications. Once peptides are identified they can be assigned to their parent protein. Proteins identified as expressed in a given experiment are most useful when compared to other expressed proteins in a larger biological context or biochemical pathway. In this chapter we will discuss the automatic
User Guidelines for the Brassica Database: BRAD.

Science.gov (United States)

Wang, Xiaobo; Cheng, Feng; Wang, Xiaowu

2016-01-01

The genome sequence of Brassica rapa was first released in 2011. Since then, further Brassica genomes have been sequenced or are undergoing sequencing. It is therefore necessary to develop tools that help users to mine information from genomic data efficiently. This will greatly aid scientific exploration and breeding application, especially for those with low levels of bioinformatic training. Therefore, the Brassica database (BRAD) was built to collect, integrate, illustrate, and visualize Brassica genomic datasets. BRAD provides useful searching and data mining tools, and facilitates the search of gene annotation datasets, syntenic or non-syntenic orthologs, and flanking regions of functional genomic elements. It also includes genome-analysis tools such as BLAST and GBrowse. One of the important aims of BRAD is to build a bridge between Brassica crop genomes and the genome of the model species Arabidopsis thaliana, thus transferring the bulk of A. thaliana gene study information for use with newly sequenced Brassica crops.
The genome portal of the Department of Energy Joint Genome Institute: 2014 updates

Energy Technology Data Exchange (ETDEWEB)

Nordberg, Henrik [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States); Cantor, Michael [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States); Dusheyko, Serge [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States); Hua, Susan [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States); Poliakov, Alexander [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States); Shabalov, Igor [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States); Smirnova, Tatyana [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States); Grigoriev, Igor V. [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States); Dubchak, Inna [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States)

2013-11-12

The U.S. Department of Energy (DOE) Joint Genome Institute (JGI), a national user facility, serves the diverse scientific community by providing integrated high-throughput sequencing and computational analysis to enable system-based scientific approaches in support of DOE missions related to clean energy generation and environmental characterization. The JGI Genome Portal (http://genome.jgi.doe.gov) provides unified access to all JGI genomic databases and analytical tools. The JGI maintains extensive data management systems and specialized analytical capabilities to manage and interpret complex genomic data. A user can search, download and explore multiple data sets available for all DOE JGI sequencing projects including their status, assemblies and annotations of sequenced genomes. In this paper, we describe major updates of the Genome Portal in the past 2 years with a specific emphasis on efficient handling of the rapidly growing amount of diverse genomic data accumulated in JGI.
PCBs and DDE in Tree Swallow (Tachycineta bicolor) Eggs and Nestlings from an Estuarine PCB Superfund Site, New Bedford Harbor, MA, U.S.A.

Science.gov (United States)

While breeding tree swallows (Tachycineta bicolor) have been used as biomonitors for freshwater sites, we report the first use of this species to assess the transfer of breeding ground contaminants from an estuarine system. Eggs and nestlings were collected from nest boxes locat...

Identification and profiling of salinity stress-responsive proteins in Sorghum bicolor seedlings

DEFF Research Database (Denmark)

Ngara, Rudo; Ndimba, Roya; Borch-Jensen, Jonas

2012-01-01

Sorghum bicolor, a drought tolerant cereal crop, is not only an important food source in the semi arid/arid regions but also a potential model for studying and gaining a better understanding of the molecular mechanisms of drought and salt stress tolerance in cereals. In this study, seeds of a sweet...... sorghum variety, MN1618, were planted and grown on solid MS growth medium with or without 100mM NaCl. Heat shock protein expression immunoblotting assays demonstrated that this salt treatment induced stress within natural physiological parameters for our experimental material. 2D PAGE in combination...... with MS/MS proteomics techniques were used to separate, visualise and identify salinity stress responsive proteins in young sorghum leaves. Out of 281 Coomassie stainable spots, 118 showed statistically significant responses (p...
GC-MS analysis, evaluation of phytochemicals, anti-oxidant, thrombolytic and anti-inflammatory activities of Exacum bicolor

OpenAIRE

Appaji Mahesh Ashwini; Latha Puttarudrappa; Belagumba Vijaykumar Ravi; Mala Majumdar

2015-01-01

The aim of the present study was to investigate the GC-MS analysis, phytochemical screening, anti-oxidant, thrombolytic and anti-inflammatory activities of methanol extract of leaves of Exacum bicolor. FTIR analysis confirmed the presence of alcohol, phenols, alkanes, aromatic compounds, aldehyde and ethers. GC-MS analysis revealed the presence of eight phyto-constituents. The total phenol, flavonoid and alkaloid contents were 18.0 ± 0.2 mg GAE/g, 13.1 ± 0.4 mg QE/g and 108.0 ± 1.2 mg AE/g re...
ProOpDB: Prokaryotic Operon DataBase.

Science.gov (United States)

Taboada, Blanca; Ciria, Ricardo; Martinez-Guerrero, Cristian E; Merino, Enrique

2012-01-01

The Prokaryotic Operon DataBase (ProOpDB, http://operons.ibt.unam.mx/OperonPredictor) constitutes one of the most precise and complete repositories of operon predictions now available. Using our novel and highly accurate operon identification algorithm, we have predicted the operon structures of more than 1200 prokaryotic genomes. ProOpDB offers diverse alternatives by which a set of operon predictions can be retrieved including: (i) organism name, (ii) metabolic pathways, as defined by the KEGG database, (iii) gene orthology, as defined by the COG database, (iv) conserved protein domains, as defined by the Pfam database, (v) reference gene and (vi) reference operon, among others. In order to limit the operon output to non-redundant organisms, ProOpDB offers an efficient method to select the most representative organisms based on a precompiled phylogenetic distances matrix. In addition, the ProOpDB operon predictions are used directly as the input data of our Gene Context Tool to visualize their genomic context and retrieve the sequence of their corresponding 5' regulatory regions, as well as the nucleotide or amino acid sequences of their genes.
The bovine QTL viewer: a web accessible database of bovine Quantitative Trait Loci

Directory of Open Access Journals (Sweden)

Xavier Suresh R

2006-06-01

Background: Many important agricultural traits such as weight gain, milk fat content and intramuscular fat (marbling) in cattle are quantitative traits. Most of the information on these traits has not previously been integrated into a genomic context. Without such integration, application of these data to agricultural enterprises will remain slow and inefficient. Our goal was to populate a genomic database with data mined from the bovine quantitative trait literature and to make these data available in a genomic context to researchers via a user-friendly query interface. Description: The QTL (Quantitative Trait Locus) data and related information for bovine QTL are gathered from published work and from existing databases. An integrated database schema was designed and the database (MySQL) populated with the gathered data. The bovine QTL Viewer was developed for the integration of QTL data available for cattle. The tool consists of an integrated database of bovine QTL and the QTL viewer to display QTL and their chromosomal position. Conclusion: We present a web accessible, integrated database of bovine (dairy and beef cattle) QTL for use by animal geneticists. The viewer and database are of general applicability to any livestock species for which there are public QTL data. The viewer can be accessed at http://bovineqtl.tamu.edu.
Occurrence, morphology and ultrastructure of the Dufour gland in Melipona bicolor Lepeletier (Hymenoptera, Meliponini)

Directory of Open Access Journals (Sweden)

Fábio Camargo Abdalla

2004-03-01

The occurrence, morphology and ultrastructure of the Dufour gland in Melipona bicolor Lepeletier, 1836 are presented. The Dufour gland is not present in workers. In virgin queens the gland cells show characteristics of low activity, which are described in the text. In physogastric queens the gland epithelium is higher and the cells are more active than in virgin queens, showing numerous invaginations of the basal plasma membrane impregnated with an electron-dense material, more frequent apical invaginations, and accumulation of substances in the subcuticular space that will later be released into the gland lumen. Therefore, the data show that the Dufour gland is more developed in physogastric than in virgin queens, indicating a possible involvement of the Dufour gland in the reproduction of this species.
Mapping data - KOME | LSDB Archive [Life Science Database Archive metadata]

Lifescience Database Archive (English)

...tional Rice Genome Sequencing Project (IRGSP) Data file File name: kome_mapping_data.zip File URL: ftp://ftp.biosciencedbc.jp/archiv...(Transcriptional Unit) ...
Heterotic trait locus (HTL) mapping identifies intra-locus interactions that underlie reproductive hybrid vigor in Sorghum bicolor.

Science.gov (United States)

Ben-Israel, Imri; Kilian, Benjamin; Nida, Habte; Fridman, Eyal

2012-01-01

Identifying intra-locus interactions underlying heterotic variation among whole-genome hybrids is a key to understanding mechanisms of heterosis and exploiting it for crop and livestock improvement. In this study, we present the development and first use of the heterotic trait locus (HTL) mapping approach to associate specific intra-locus interactions with an overdominant heterotic mode of inheritance in a diallel population using Sorghum bicolor as the model. This method combines the advantages of ample genetic diversity and the possibility of studying non-additive inheritance. Furthermore, this design enables dissecting the latter to identify specific intra-locus interactions. We identified three HTLs (3.5% of loci tested) with synergistic intra-locus effects on overdominant grain yield heterosis in 2 years of field trials. These loci account for 19.0% of the heterotic variation, including a significant interaction found between two of them. Moreover, analysis of one of these loci (hDPW4.1) in a consecutive F2 population confirmed a significant 21% increase in grain yield of heterozygous vs. homozygous plants in this locus. Notably, two of the three HTLs for grain yield are in synteny with previously reported overdominant quantitative trait loci for grain yield in maize. A mechanism for the reproductive heterosis found in this study is suggested, in which grain yield increase is achieved by releasing the compensatory tradeoffs between biomass and reproductive output, and between seed number and weight. These results highlight the power of analyzing a diverse set of inbreds and their hybrids for unraveling hitherto unknown allelic interactions mediating heterosis.
MIPS: analysis and annotation of proteins from whole genomes in 2005.

Science.gov (United States)

Mewes, H W; Frishman, D; Mayer, K F X; Münsterkötter, M; Noubibou, O; Pagel, P; Rattei, T; Oesterheld, M; Ruepp, A; Stümpflen, V

2006-01-01

The Munich Information Center for Protein Sequences (MIPS at the GSF), Neuherberg, Germany, provides resources related to genome information. Manually curated databases for several reference organisms are maintained. Several of these databases are described elsewhere in this and other recent NAR database issues. In a complementary effort, a comprehensive set of >400 genomes automatically annotated with the PEDANT system is maintained. The main goal of our current work on creating and maintaining genome databases is to extend gene-centered information to information on interactions within a generic comprehensive framework. We have concentrated our efforts along three lines: (i) the development of suitable comprehensive data structures and database technology, communication and query tools to include a wide range of different types of information, enabling the representation of complex information such as functional modules or networks (Genome Research Environment System); (ii) the development of databases covering computable information such as the basic evolutionary relations among all genes, namely SIMAP, the sequence similarity matrix, and the CABiNet network analysis framework; and (iii) the compilation and manual annotation of information related to interactions such as protein-protein interactions or other types of relations (e.g. MPCDB, MPPI, CYGD). All databases described and the detailed descriptions of our projects can be accessed through the MIPS WWW server (http://mips.gsf.de).
RatMap—rat genome tools and data

Science.gov (United States)

Petersen, Greta; Johnson, Per; Andersson, Lars; Klinga-Levan, Karin; Gómez-Fabre, Pedro M.; Ståhl, Fredrik

2005-01-01

The rat genome database RatMap (http://ratmap.org or http://ratmap.gen.gu.se) has been one of the main resources for rat genome information since 1994. The database is maintained by CMB–Genetics at Göteborg University in Sweden and provides information on rat genes, polymorphic rat DNA-markers and rat quantitative trait loci (QTLs), all curated at RatMap. The database is under the supervision of the Rat Gene and Nomenclature Committee (RGNC); thus much attention is paid to rat gene nomenclature. RatMap presents information on rat idiograms, karyotypes and provides a unified presentation of the rat genome sequence and integrated rat linkage maps. A set of tools is also available to facilitate the identification and characterization of rat QTLs, as well as the estimation of exon/intron number and sizes in individual rat genes. Furthermore, comparative gene maps of rat in regard to mouse and human are provided. PMID:15608244
Detection of genomic rearrangements in cucumber using genomecmp software

Science.gov (United States)

Kulawik, Maciej; Pawełkowicz, Magdalena Ewa; Wojcieszek, Michał; Pląder, Wojciech; Nowak, Robert M.

2017-08-01

Comparative genomics is a rapidly evolving science, driven by the increasing amount of genome sequence information available in databases. A simple comparison of general genome features such as genome size, number of genes, and chromosome number provides an entry point into comparative genomic analysis. Here we present the utility of the new tool genomecmp for finding rearrangements across the compared sequences, and its applications in plant comparative genomics.
Determination of the phenolic content and antioxidant potential of crude extracts and isolated compounds from leaves of Cordia multispicata and Tournefortia bicolor.

Science.gov (United States)

Correia Da Silva, Thiago B; Souza, Vivian Karoline T; Da Silva, Ana Paula F; Lyra Lemos, Rosangela P; Conserva, Lucia M

2010-01-01

In this work, the total phenolic content and antioxidant activity of extracts and four flavonoids isolated from leaves of two Boraginaceae species (Cordia multispicata Cham. and Tournefortia bicolor Sw.) were evaluated using Folin-Ciocalteu reagent, DPPH free radical scavenging and inhibition of peroxidation of linoleic acid by the FTC method. For comparison, ascorbic acid, alpha-tocopherol and BHT were used. In general, extracts from T. bicolor (68.8 ± 0.001 to > 1000 mg/g) showed higher phenolic content than C. multispicata (66.1 ± 0.009 to 231 ± 0.07 mg/g), and also scavenged radicals (IC50 12.8 ± 2.5 to 437 ± 3.5 mg/L) and inhibited lipid peroxide formation (IC50 51.2 ± 2.29 to 89 ± 0.59 mg/L). For these extracts a good correlation between the phenolic content and antioxidant activity was observed, suggesting that T. bicolor is richer in phenolic compounds and that it could serve as a new source of natural antioxidants or nutraceuticals with potential applications. Chromatographic procedures monitored by antioxidant assays afforded seven compounds, which were identified by spectral analyses (IR, MS and 1D and 2D NMR) and comparison with reported data as being trans-phytol (1), taraxerol (2), 3,7,4'-trimethoxyflavone (3), 5,3'-dihydroxy-3,7,4'-trimethoxyflavone (4), quercetin (5), tiliroside (6), and rutin (7). Compounds (4-7) were also evaluated and were effective in DPPH quenching (IC50 7.7 ± 3.6 to 79.3 ± 3.4 mg/L) and in inhibition of lipid peroxidation (IC50 80.1 ± 0.98 to 88.7 ± 3.62 mg/L). This is the first report on the total phenolic content, radical-scavenging and antioxidant activities of these species.
"DNA Origami Traffic Lights" with a Split Aptamer Sensor for a Bicolor Fluorescence Readout.

Science.gov (United States)

Walter, Heidi-Kristin; Bauer, Jens; Steinmeyer, Jeannine; Kuzuya, Akinori; Niemeyer, Christof M; Wagenknecht, Hans-Achim

2017-04-12

A split aptamer for adenosine triphosphate (ATP) was embedded as a recognition unit into two levers of a nanomechanical DNA origami construct by extension and modification of selected staple strands. An additional optical module in the stem of the split aptamer comprised two different cyanine-styryl dyes that underwent an energy transfer from green (donor) to red (acceptor) emission if two ATP molecules were bound as target molecule to the recognition module and thereby brought the dyes in close proximity. As a result, the ATP as a target triggered the DNA origami shape transition and yielded a fluorescence color change from green to red as readout. Conventional atomic force microscopy (AFM) images confirmed the topology change from the open form of the DNA origami in the absence of ATP into the closed form in the presence of the target molecule. The obtained closed/open ratios in the absence and presence of target molecules tracked well with the fluorescence color ratios and thereby validated the bicolor fluorescence readout. The correct positioning of the split aptamer as the functional unit farthest away from the fulcrum of the DNA origami was crucial for the aptasensing by fluorescence readout. The fluorescence color change allowed additionally to follow the topology change of the DNA origami aptasensor in real time in solution. The concepts of fluorescence energy transfer for bicolor readout in a split aptamer in solution, and AFM on surfaces, were successfully combined in a single DNA origami construct to obtain a bimodal readout. These results are important for future custom DNA devices for chemical-biological and bioanalytical purposes because they are not only working as simple aptamers but are also visible by AFM on the single-molecule level.
Respiratory cancer database: An open access database of respiratory cancer gene and miRNA.

Science.gov (United States)

Choubey, Jyotsna; Choudhari, Jyoti Kant; Patel, Ashish; Verma, Mukesh Kumar

2017-01-01

Respiratory cancer database (RespCanDB) is a genomic and proteomic database of cancers of the respiratory organs. It also includes information on medicinal plants used for the treatment of various respiratory cancers, with the structures of their active constituents, as well as pharmacological and chemical information on drugs associated with various respiratory cancers. Data in RespCanDB have been manually collected from published research articles and from other databases. Data have been integrated using MySQL, a relational database management system. MySQL manages all data in the back-end and provides commands to retrieve and store the data in the database. The web interface of the database has been built in ASP. RespCanDB is expected to contribute to the scientific community's understanding of respiratory cancer biology as well as to the development of new ways of diagnosing and treating respiratory cancer. Currently, the database contains oncogenomic information on lung cancer, laryngeal cancer, and nasopharyngeal cancer. Data for other cancers, such as oral and tracheal cancers, will be added in the near future. The URL of RespCanDB is http://ridb.subdic-bioinformatics-nitrr.in/.
Neocosmocercella fisherae n. sp. (Nematoda: Cosmocercidae), a parasite of the large intestine of Phyllomedusa bicolor (Boddaert) (Anura: Phyllomedusidae) from the Brazilian Amazon.

Science.gov (United States)

Dos Santos, Ana Nunes; de Oliveira Rodrigues, Allan Rodrigo; Dos Santos Rocha, Fábio José; Dos Santos, Jeannie Nascimento; González, Cynthya Elizabeth; de Vasconcelos Melo, Francisco Tiago

2018-03-01

Neocosmocercella fisherae n. sp. is the first nematode species found parasitising Phyllomedusa bicolor from the Brazilian Amazon Region. The new species has a triangular oral opening, with bi-lobed lips, and is distinguished from N. bakeri (triangular oral opening with simple lips), and from N. paraguayensis (hexagonal oral opening with bi-lobed lips). Additionally, the new species has ciliated cephalic papillae, which are absent in the other species of the genus. The reduced uterine sac and the presence of a single egg in the uterus in females are the main morphological characters that differentiate the new species from its congeners N. bakeri (8-10 eggs) and N. paraguayensis (10 eggs, based on the allotype). Additionally, the new species differs from the other two species of the genus by morphometric characters such as the size of spicules and gubernaculum in males and the vagina in females. Until now, phyllomedusid anurans are the only known hosts for the nematodes of this genus. The present work describes the third species of the genus and the first species of nematode parasitising P. bicolor.
RPAN: rice pan-genome browser for ∼3000 rice genomes.

Science.gov (United States)

Sun, Chen; Hu, Zhiqiang; Zheng, Tianqing; Lu, Kuangchen; Zhao, Yue; Wang, Wensheng; Shi, Jianxin; Wang, Chunchao; Lu, Jinyuan; Zhang, Dabing; Li, Zhikang; Wei, Chaochun

2017-01-25

A pan-genome is the union of the gene sets of all the individuals of a clade or a species and it provides a new dimension of genome complexity with the presence/absence variations (PAVs) of genes among these genomes. With the progress of sequencing technologies, pan-genome study is becoming affordable for eukaryotes with large-sized genomes. The Asian cultivated rice, Oryza sativa L., is one of the major food sources for the world and a model organism in plant biology. Recently, the 3000 Rice Genome Project (3K RGP) sequenced more than 3000 rice genomes with a mean sequencing depth of 14.3×, which provided a tremendous resource for rice research. In this paper, we present a genome browser, Rice Pan-genome Browser (RPAN), as a tool to search and visualize the rice pan-genome derived from 3K RGP. RPAN contains a database of the basic information of 3010 rice accessions, including genomic sequences, gene annotations, PAV information and gene expression data of the rice pan-genome. At least 12 000 novel genes absent in the reference genome were included. RPAN also provides multiple search and visualization functions. RPAN can be a rich resource for rice biology and rice breeding. It is available at http://cgm.sjtu.edu.cn/3kricedb/ or http://www.rmbreeding.cn/pan3k. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
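The definition in this record (a pan-genome as the union of per-individual gene sets, with presence/absence variation across genomes) can be made concrete with a short Python sketch. It is an illustration only, not RPAN's code: the function name `pan_genome`, the accession names, and the gene IDs are hypothetical.

```python
def pan_genome(gene_sets):
    """Build a pan-genome summary from {accession: iterable of gene IDs}.

    Returns (genes, core, pav) where
      genes: sorted list of every gene seen in any accession (the pan-genome),
      core:  set of genes present in every accession,
      pav:   {accession: [1 or 0 per gene in `genes`]}, the PAV matrix rows.
    """
    sets = {acc: set(ids) for acc, ids in gene_sets.items()}
    # Pan-genome: union of all gene sets.
    pan = set().union(*sets.values())
    # Core genome: intersection; guard against an empty input.
    core = set.intersection(*sets.values()) if sets else set()
    genes = sorted(pan)
    # One presence/absence row per accession, columns ordered as `genes`.
    pav = {acc: [int(g in s) for g in genes] for acc, s in sets.items()}
    return genes, core, pav


# Hypothetical toy input: two accessions sharing one gene.
example = {"acc1": {"g1", "g2"}, "acc2": {"g2", "g3"}}
genes, core, pav = pan_genome(example)
print(genes)  # column order of the PAV matrix
print(pav)
```

Genes absent from a chosen reference accession (the record mentions at least 12 000 such genes) correspond to the columns where that accession's row holds 0.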
Mouse Genome Informatics (MGI)

Data.gov (United States)

U.S. Department of Health & Human Services — MGI is the international database resource for the laboratory mouse, providing integrated genetic, genomic, and biological data to facilitate the study of human...
A Web-Based Comparative Genomics Tutorial for Investigating Microbial Genomes

Directory of Open Access Journals (Sweden)

Michael Strong

2009-12-01

As the number of completely sequenced microbial genomes continues to rise at an impressive rate, it is important to prepare students with the skills necessary to investigate microorganisms at the genomic level. As a part of the core curriculum for first-year graduate students in the biological sciences, we have implemented a web-based tutorial to introduce students to the fields of comparative and functional genomics. The tutorial focuses on recent computational methods for identifying functionally linked genes and proteins on a genome-wide scale and was used to introduce students to the Rosetta Stone, Phylogenetic Profile, conserved Gene Neighbor, and Operon computational methods. Students learned to use a number of publicly available web servers and databases to identify functionally linked genes in the Escherichia coli genome, with emphasis on genome organization and operon structure. The overall effectiveness of the tutorial was assessed based on student evaluations and homework assignments. The tutorial is available to other educators at http://www.doe-mbi.ucla.edu/~strong/m253.php.
Dicty_cDB: SSL385 [Dicty_cDB

Lifescience Database Archive (English)

Full Text Available C37A2, complete sequence. 38 0.049 3 BZ347525 |BZ347525.1 hm84h07.b1 WGS-SbicolorF (JM107 adapted methyl filter...-SbicolorF (JM107 adapted methyl filtered) Sorghum bicolor genomic clone ho57b09 ..._Ba0020C03 5', genomic survey sequence. 42 0.33 2 BZ366117 |BZ366117.1 ic94g05.g1 WGS-SbicolorF (JM107 adapted methyl filter...in ordered pieces. 44 0.88 1 BZ626058 |BZ626058.1 ih42h02.b1 WGS-SbicolorF (DH5a methyl filter...07.g1 WGS-SbicolorF (DH5a methyl filtered) Sorghum bicolor genomic clone ii21a07, DNA sequence. 38 3.0 2 dna
1.15 - Structural Chemogenomics Databases to Navigate Protein–Ligand Interaction Space

NARCIS (Netherlands)

Kanev, G.K.; Kooistra, A.J.; de Esch, I.J.P.; de Graaf, C.

2017-01-01

Structural chemogenomics databases allow the integration and exploration of heterogeneous genomic, structural, chemical, and pharmacological data in order to extract useful information that is applicable for the discovery of new protein targets and biologically active molecules. Integrated databases

Chemical constituents of the ethyl acetate extracts of the stem bark and fruits of Dichrostachys cinerea and the roots of Parkia bicolor

Directory of Open Access Journals (Sweden)

J. Fotie

2004-06-01

Full Text Available The antibacterial activities of ethyl acetate, methanol and aqueous extracts of the stem bark of Dichrostachys cinerea and the roots of Parkia bicolor have been evaluated. Investigation of the ethyl acetate extracts led to a series of known compounds, many of which are reported here for the first time from both species.
SoyFN: a knowledge database of soybean functional networks.

Science.gov (United States)

Xu, Yungang; Guo, Maozu; Liu, Xiaoyan; Wang, Chunyu; Liu, Yang

2014-01-01

Many databases for soybean genomic analysis have been built and made publicly available, but few of them contain knowledge specifically targeting the omics-level gene-gene, gene-microRNA (miRNA) and miRNA-miRNA interactions. Here, we present SoyFN, a knowledge database of soybean functional gene networks and miRNA functional networks. SoyFN provides user-friendly interfaces to retrieve, visualize, analyze and download the functional networks of soybean genes and miRNAs. In addition, it incorporates much information about KEGG pathways, gene ontology annotations and 3'-UTR sequences as well as many useful tools including SoySearch, ID mapping, Genome Browser, eFP Browser and promoter motif scan. SoyFN is a schema-free database that can be accessed as a Web service from any modern programming language using a simple Hypertext Transfer Protocol call. The Web site is implemented in Java, JavaScript, PHP, HTML and Apache, with all major browsers supported. We anticipate that this database will be useful for members of research communities both in soybean experimental science and bioinformatics. Database URL: http://nclab.hit.edu.cn/SoyFN.
From Genome Sequence to Taxonomy - A Skeptic’s View

DEFF Research Database (Denmark)

Özen, Asli Ismihan; Vesth, Tammi Camilla; Ussery, David

2012-01-01

The relative ease of sequencing bacterial genomes has resulted in thousands of sequenced bacterial genomes available in the public databases. This same technology now allows for using the entire genome sequence as an identifier for an organism. There are many methods available which attempt to us...
Investigating core genetic-and-epigenetic cell cycle networks for stemness and carcinogenic mechanisms, and cancer drug design using big database mining and genome-wide next-generation sequencing data.

Science.gov (United States)

Li, Cheng-Wei; Chen, Bor-Sen

2016-10-01

Recent studies have demonstrated that cell cycle plays a central role in development and carcinogenesis. Thus, the use of big databases and genome-wide high-throughput data to unravel the genetic and epigenetic mechanisms underlying cell cycle progression in stem cells and cancer cells is a matter of considerable interest. Real genetic-and-epigenetic cell cycle networks (GECNs) of embryonic stem cells (ESCs) and HeLa cancer cells were constructed by applying system modeling, system identification, and big database mining to genome-wide next-generation sequencing data. Real GECNs were then reduced to core GECNs of HeLa cells and ESCs by applying principal genome-wide network projection. In this study, we investigated potential carcinogenic and stemness mechanisms for systems cancer drug design by identifying common core and specific GECNs between HeLa cells and ESCs. Integrating drug database information with the specific GECNs of HeLa cells could lead to identification of multiple drugs for cervical cancer treatment with minimal side-effects on the genes in the common core. We found that dysregulation of miR-29C, miR-34A, miR-98, and miR-215; and methylation of ANKRD1, ARID5B, CDCA2, PIF1, STAMBPL1, TROAP, ZNF165, and HIST1H2AJ in HeLa cells could result in cell proliferation and anti-apoptosis through NFκB, TGF-β, and PI3K pathways. We also identified 3 drugs, methotrexate, quercetin, and mimosine, which repressed the activated cell cycle genes, ARID5B, STK17B, and CCL2, in HeLa cells with minimal side-effects.
The optimal dosage of 60Co gamma irradiation for obtaining salt gland mutants of exo-recretohalophyte Limonium bicolor (Bunge) O. Kuntze

International Nuclear Information System (INIS)

Yuan, F.; Chen, M.; Yang, J.; Wang, B.

2015-01-01

Limonium bicolor (Bunge) O. Kuntze is a typical exo-recretohalophyte with multi-cellular salt glands. It is often used to improve saline-alkali soil. Seeds of L. bicolor were treated with different doses of 60Co gamma irradiation to determine the LD50; the goal was to produce a relatively high number of mutants in salt gland development and salt secretion with a relatively low level of mortality. 60Co gamma irradiation did not greatly affect germination, but an increase in gamma dose prevented the development of true leaves and reduced the percentage of seedlings that emerged from soil. The LD50 for 60Co gamma irradiation was 120 Gy. Two mutants (few and many) were obtained at the LD50 using two screening methods: differential interference contrast microscopy and a leaf disc excretion model. Compared with the wild type (WT), few and many had mutations in salt gland development, and many showed a lower salt secretion rate per salt gland than WT. These mutants would provide insight into the molecular mechanisms of salt gland development and salt secretion and into the development of salt-tolerant crop plants. (author)
The Vigna Genome Server, 'VigGS': A Genomic Knowledge Base of the Genus Vigna Based on High-Quality, Annotated Genome Sequence of the Azuki Bean, Vigna angularis (Willd.) Ohwi & Ohashi.

Science.gov (United States)

Sakai, Hiroaki; Naito, Ken; Takahashi, Yu; Sato, Toshiyuki; Yamamoto, Toshiya; Muto, Isamu; Itoh, Takeshi; Tomooka, Norihiko

2016-01-01

The genus Vigna includes legume crops such as cowpea, mungbean and azuki bean, as well as >100 wild species. A number of the wild species are highly tolerant to severe environmental conditions including high-salinity, acid or alkaline soil; drought; flooding; and pests and diseases. These features of the genus Vigna make it a good target for investigation of genetic diversity in adaptation to stressful environments; however, a lack of genomic information has hindered such research in this genus. Here, we present a genome database of the genus Vigna, Vigna Genome Server ('VigGS', http://viggs.dna.affrc.go.jp), based on the recently sequenced azuki bean genome, which incorporates annotated exon-intron structures, along with evidence for transcripts and proteins, visualized in GBrowse. VigGS also facilitates user construction of multiple alignments between azuki bean genes and those of six related dicot species. In addition, the database displays sequence polymorphisms between azuki bean and its wild relatives and enables users to design primer sequences targeting any variant site. VigGS offers a simple keyword search in addition to sequence similarity searches using BLAST and BLAT. To incorporate up to date genomic information, VigGS automatically receives newly deposited mRNA sequences of pre-set species from the public database once a week. Users can refer to not only gene structures mapped on the azuki bean genome on GBrowse but also relevant literature of the genes. VigGS will contribute to genomic research into plant biotic and abiotic stresses and to the future development of new stress-tolerant crops. © The Author 2015. Published by Oxford University Press on behalf of Japanese Society of Plant Physiologists. All rights reserved. For permissions, please email: journals.permissions@oup.com.
Microbial Genome Analysis and Comparisons: Web-based Protocols and Resources

Science.gov (United States)

Fully annotated genome sequences of many microorganisms are publicly available as a resource. However, in-depth analysis of these genomes using specialized tools is required to derive meaningful information. We describe here the utility of three powerful publicly available genome databases and ana...
DEFINING THE CHEMICAL SPACE OF PUBLIC GENOMIC ...

Science.gov (United States)

The current project aims to chemically index the genomics content of public genomic databases to make these data accessible in relation to other publicly available, chemically-indexed toxicological information. By defining the chemical space of public genomic data, it is possible to identify classes of chemicals on which to develop methodologies for the integration of chemogenomic data into predictive toxicology. The chemical space of public genomic data will be presented as well as the methodologies and tools developed to identify this chemical space.
Using nanopore sequencing to get complete genomes from complex samples

DEFF Research Database (Denmark)

Kirkegaard, Rasmus Hansen; Karst, Søren Michael; Nielsen, Per Halkjær

The advantages of “next generation sequencing” have come at the cost of genome finishing. The dominant sequencing technology provides short reads of 150-300 bp, which has made genome assembly very difficult as the reads do not span important repeat regions. Genomes have thus been added...... to the databases as fragmented assemblies and not as finished contigs that resemble the chromosomes in which the DNA is organised within the cells. This is especially troublesome for genomes derived from complex metagenome sequencing. Databases with incomplete genomes can lead to false conclusions about...... the absence of genes and functional predictions of the organisms. Furthermore, it is common that repetitive elements and marker genes such as the 16S rRNA gene are missing completely from these genome bins. Using nanopore long reads, we demonstrate that it is possible to span these regions and make complete...
Toward the automated generation of genome-scale metabolic networks in the SEED.

Science.gov (United States)

DeJongh, Matthew; Formsma, Kevin; Boillot, Paul; Gould, John; Rycenga, Matthew; Best, Aaron

2007-04-26

Current methods for the automated generation of genome-scale metabolic networks focus on genome annotation and preliminary biochemical reaction network assembly, but do not adequately address the process of identifying and filling gaps in the reaction network, and verifying that the network is suitable for systems level analysis. Thus, current methods are only sufficient for generating draft-quality networks, and refinement of the reaction network is still largely a manual, labor-intensive process. We have developed a method for generating genome-scale metabolic networks that produces substantially complete reaction networks, suitable for systems level analysis. Our method partitions the reaction space of central and intermediary metabolism into discrete, interconnected components that can be assembled and verified in isolation from each other, and then integrated and verified at the level of their interconnectivity. We have developed a database of components that are common across organisms, and have created tools for automatically assembling appropriate components for a particular organism based on the metabolic pathways encoded in the organism's genome. This focuses manual efforts on that portion of an organism's metabolism that is not yet represented in the database. We have demonstrated the efficacy of our method by reverse-engineering and automatically regenerating the reaction network from a published genome-scale metabolic model for Staphylococcus aureus. Additionally, we have verified that our method capitalizes on the database of common reaction network components created for S. aureus, by using these components to generate substantially complete reconstructions of the reaction networks from three other published metabolic models (Escherichia coli, Helicobacter pylori, and Lactococcus lactis). We have implemented our tools and database within the SEED, an open-source software environment for comparative genome annotation and analysis. Our method sets the
Toward the automated generation of genome-scale metabolic networks in the SEED

Directory of Open Access Journals (Sweden)

Gould John

2007-04-01

Full Text Available Background: Current methods for the automated generation of genome-scale metabolic networks focus on genome annotation and preliminary biochemical reaction network assembly, but do not adequately address the process of identifying and filling gaps in the reaction network, and verifying that the network is suitable for systems level analysis. Thus, current methods are only sufficient for generating draft-quality networks, and refinement of the reaction network is still largely a manual, labor-intensive process. Results: We have developed a method for generating genome-scale metabolic networks that produces substantially complete reaction networks, suitable for systems level analysis. Our method partitions the reaction space of central and intermediary metabolism into discrete, interconnected components that can be assembled and verified in isolation from each other, and then integrated and verified at the level of their interconnectivity. We have developed a database of components that are common across organisms, and have created tools for automatically assembling appropriate components for a particular organism based on the metabolic pathways encoded in the organism's genome. This focuses manual efforts on that portion of an organism's metabolism that is not yet represented in the database. We have demonstrated the efficacy of our method by reverse-engineering and automatically regenerating the reaction network from a published genome-scale metabolic model for Staphylococcus aureus. Additionally, we have verified that our method capitalizes on the database of common reaction network components created for S. aureus, by using these components to generate substantially complete reconstructions of the reaction networks from three other published metabolic models (Escherichia coli, Helicobacter pylori, and Lactococcus lactis). We have implemented our tools and database within the SEED, an open-source software environment for comparative
SPTEdb: a database for transposable elements in salicaceous plants

Science.gov (United States)

Jia, Zirui; Xiao, Yao; Ma, Wenjun; Wang, Junhui

2018-01-01

Although transposable elements (TEs) play significant roles in the structural, functional and evolutionary dynamics of salicaceous plant genomes, the accurate identification, definition and classification of TEs are still inadequate. In this study, we identified 18 393 TEs from Populus trichocarpa, Populus euphratica and Salix suchowensis using a combination of signature-based, similarity-based and de novo methods, and annotated them into 1621 families. A comprehensive and user-friendly web-based database, SPTEdb, was constructed and made available to researchers. SPTEdb enables users to browse, retrieve and download the TE sequences from the database. Meanwhile, several analysis tools, including BLAST, HMMER, GetORF and Cut sequence, were also integrated into SPTEdb to help users mine the TE data easily and effectively. In summary, SPTEdb will facilitate the study of TE biology and functional genomics in salicaceous plants. Database URL: http://genedenovoweb.ticp.net:81/SPTEdb/index.php PMID:29688371
SOILS, FERTILIZATION AND MANAGEMENT OF WATER Halotolerant/alkalophilic bacteria associated with the cyanobacterium Arthrospira platensis (Nordstedt) Gomont that promote early growth in Sorghum bicolor (L.) Moench

Directory of Open Access Journals (Sweden)

Liliana Gómez G

2012-01-01

Full Text Available Arthrospira platensis associated bacteria (APAB), identified through molecular biology as Bacillus okhensis, Indibacter alkaliphilus and Halomonas sp., also produce indole-3-acetic acid (IAA). These bacteria were used in early plant growth promotion tests on Sorghum bicolor; this bioassay was considered indirect evidence that APAB may also naturally stimulate the growth of A. platensis. I. alkaliphilus and B. okhensis enhanced early germination of S. bicolor seeds, with better results than those achieved by Azospirillum brasilense, a bacterium used as a reference because it is a common plant growth promoting rhizobacterium. The three APAB produced significant differences (P≤0.05) in morphoagronomic parameters; I. alkaliphilus and B. okhensis exhibited better results in elongation stimulation and in root and foliage dry weight. This evidence suggests that these bacteria are plant growth promoters, and testing with axenic cultures of A. platensis and its associated bacteria is recommended to understand the true interaction between them.
Benchmarking distributed data warehouse solutions for storing genomic variant information

Science.gov (United States)

Wiewiórka, Marek S.; Wysakowicz, Dawid P.; Okoniewski, Michał J.

2017-01-01

Genomic-based personalized medicine encompasses storing, analysing and interpreting genomic variants as its central issues. At a time when thousands of patients' sequenced exomes and genomes are becoming available, there is a growing need for efficient database storage and querying. The answer could be the application of modern distributed storage systems and query engines. However, the application of large genomic variant databases to this problem has not been sufficiently explored so far in the literature. To investigate the effectiveness of modern columnar storage [column-oriented Database Management System (DBMS)] and query engines, we have developed a prototypic genomic variant data warehouse, populated with large generated content of genomic variants and phenotypic data. Next, we have benchmarked performance of a number of combinations of distributed storages and query engines on a set of SQL queries that address biological questions essential for both research and medical applications. In addition, a non-distributed, analytical database (MonetDB) has been used as a baseline. Comparison of query execution times confirms that distributed data warehousing solutions outperform classic relational DBMSs. Moreover, pre-aggregation and further denormalization of data, which reduce the number of distributed join operations, significantly improve query performance by several orders of magnitude. Most of the distributed back-ends offer good performance for complex analytical queries, while the Optimized Row Columnar (ORC) format paired with Presto and Parquet with Spark 2 query engines provide, on average, the lowest execution times. Apache Kudu, on the other hand, is the only solution that guarantees sub-second performance for simple genome range queries returning a small subset of data, where low-latency response is expected, while still offering decent performance for running analytical queries. In summary, research and clinical applications that require
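The two query patterns this abstract distinguishes, a simple genome range query and an analytical query served from pre-aggregated data, can be illustrated with a small sketch. It uses Python's built-in `sqlite3` module as a single-node stand-in, not any of the distributed engines that were benchmarked, and the `variants` and `variant_counts` tables, their columns, and the sample rows are assumptions made purely for illustration.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory variant table with a few hypothetical rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE variants "
        "(sample_id TEXT, chrom TEXT, pos INTEGER, ref TEXT, alt TEXT)"
    )
    rows = [
        ("s1", "chr1", 1000, "A", "G"),
        ("s2", "chr1", 1000, "A", "G"),
        ("s1", "chr1", 5000, "C", "T"),
        ("s3", "chr2", 1500, "G", "A"),
    ]
    conn.executemany("INSERT INTO variants VALUES (?, ?, ?, ?, ?)", rows)
    # An index on (chrom, pos) keeps range lookups from scanning the whole table.
    conn.execute("CREATE INDEX idx_region ON variants (chrom, pos)")
    # Pre-aggregation: store per-variant carrier counts once, so analytical
    # queries do not have to regroup the per-sample rows every time.
    conn.execute(
        "CREATE TABLE variant_counts AS "
        "SELECT chrom, pos, ref, alt, COUNT(DISTINCT sample_id) AS n_samples "
        "FROM variants GROUP BY chrom, pos, ref, alt"
    )
    return conn


def variants_in_range(conn, chrom, start, end):
    """Genome range query: all variant calls within [start, end] on one chromosome."""
    cur = conn.execute(
        "SELECT sample_id, pos, ref, alt FROM variants "
        "WHERE chrom = ? AND pos BETWEEN ? AND ? ORDER BY pos, sample_id",
        (chrom, start, end),
    )
    return cur.fetchall()


def carrier_count(conn, chrom, pos, ref, alt):
    """Look up the pre-aggregated number of samples carrying a variant."""
    cur = conn.execute(
        "SELECT n_samples FROM variant_counts "
        "WHERE chrom = ? AND pos = ? AND ref = ? AND alt = ?",
        (chrom, pos, ref, alt),
    )
    row = cur.fetchone()
    return row[0] if row else 0


if __name__ == "__main__":
    db = build_demo_db()
    print(variants_in_range(db, "chr1", 900, 2000))
    print(carrier_count(db, "chr1", 1000, "A", "G"))
```

The design point is the same one the abstract makes: the range query benefits from low-latency indexed access, while the aggregate is answered from a denormalized table that avoids a join or regrouping at query time.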
ASGDB: a specialised genomic resource for interpreting Anopheles sinensis insecticide resistance.

Science.gov (United States)

Zhou, Dan; Xu, Yang; Zhang, Cheng; Hu, Meng-Xue; Huang, Yun; Sun, Yan; Ma, Lei; Shen, Bo; Zhu, Chang-Liang

2018-01-10

Anopheles sinensis is an important malaria vector in Southeast Asia. The widespread emergence of insecticide resistance in this mosquito species poses a serious threat to the efficacy of malaria control measures, particularly in China. Recently, the whole-genome sequencing and de novo assembly of An. sinensis (China strain) has been finished. A series of insecticide resistance studies in An. sinensis have also been reported. There is a growing need to integrate these valuable data to provide a comprehensive database for further studies on insecticide resistance management of An. sinensis. A bioinformatics database named An. sinensis genome database (ASGDB) was built. In addition to being a searchable database of published An. sinensis genome sequences and annotation, ASGDB provides in-depth analytical platforms for further understanding of the genomic and genetic data, including visualization of genomic data, orthologous relationship analysis, GO analysis, pathway analysis, expression analysis and resistance-related gene analysis. Moreover, ASGDB provides a panoramic view of insecticide resistance studies in An. sinensis in China. In total, 551 insecticide-resistant phenotypic and genotypic reports on An. sinensis distributed in Chinese malaria-endemic areas since the mid-1980s have been collected, manually edited in the same format and integrated into an OpenLayers map-based interface, which allows the international community to assess and exploit the high volume of scattered data much more easily. The database has been given the URL: http://www.asgdb.org/. ASGDB was built to help users mine data from the genome sequence of An. sinensis easily and effectively, especially with its advantages in insecticide resistance surveillance and control.
HERVd: database of human endogenous retroviruses

Czech Academy of Sciences Publication Activity Database

Pačes, Jan; Pavlíček, Adam; Pačes, Václav

2002-01-01

Vol. 30, No. 1 (2002), pp. 205-206 ISSN 0305-1048 R&D Projects: GA MŠk LN00A079; GA ČR GA301/99/M023 Keywords: HERV * database * human genome Subject RIV: EB - Genetics; Molecular Biology Impact factor: 7.051, year: 2002
Correcting Inconsistencies and Errors in Bacterial Genome Metadata Using an Automated Curation Tool in Excel (AutoCurE).

Science.gov (United States)

Schmedes, Sarah E; King, Jonathan L; Budowle, Bruce

2015-01-01

Whole-genome data are invaluable for large-scale comparative genomic studies. Current sequencing technologies have made it feasible to sequence entire bacterial genomes with relative ease and speed and at a substantially reduced cost per nucleotide, hence cost per genome. More than 3,000 bacterial genomes have been sequenced and are available at the finished status. Publicly available genomes can be readily downloaded; however, it is challenging to verify the specific supporting data contained within the download and to identify errors and inconsistencies that may be present within the organizational data content and metadata. AutoCurE, an automated tool for bacterial genome database curation in Excel, was developed to facilitate local database curation of supporting data that accompany downloaded genomes from the National Center for Biotechnology Information. AutoCurE provides an automated approach to curate local genomic databases by flagging inconsistencies or errors by comparing the downloaded supporting data to the genome reports to verify genome name, RefSeq accession numbers, the presence of archaea, BioProject/UIDs, and sequence file descriptions. Flags are generated for nine metadata fields if there are inconsistencies between the downloaded genomes and genome reports and if erroneous or missing data are evident. AutoCurE is an easy-to-use tool for local database curation for large-scale genome data prior to downstream analyses.
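The flagging approach described in this abstract, comparing downloaded supporting data against genome reports field by field, can be sketched as follows. This is not AutoCurE itself (which is an Excel tool) and does not reproduce its nine metadata fields; the record layout, field names, identifiers, and the `flag_inconsistencies` helper are hypothetical.

```python
def flag_inconsistencies(
    downloaded,
    report,
    fields=("genome_name", "refseq_accession", "bioproject"),
):
    """Compare two metadata tables keyed by genome ID and return flags.

    Both inputs map a genome ID to a dict of metadata fields. The result maps
    each problematic genome ID to a list of human-readable flag strings.
    """
    flags = {}
    for genome_id in sorted(set(downloaded) | set(report)):
        d = downloaded.get(genome_id)
        r = report.get(genome_id)
        if d is None or r is None:
            # The genome appears in only one of the two sources.
            flags[genome_id] = ["missing_record"]
            continue
        issues = []
        for field in fields:
            d_val, r_val = d.get(field), r.get(field)
            if not d_val or not r_val:
                issues.append(f"missing:{field}")
            elif d_val != r_val:
                issues.append(f"mismatch:{field}")
        if issues:
            flags[genome_id] = issues
    return flags


if __name__ == "__main__":
    # Hypothetical identifiers, chosen so they cannot be mistaken for real ones.
    downloaded = {
        "g1": {"genome_name": "Strain A", "refseq_accession": "ACC_0001", "bioproject": "PRJ_1"},
        "g2": {"genome_name": "Strain B", "refseq_accession": "ACC_0002", "bioproject": ""},
        "g3": {"genome_name": "Strain C", "refseq_accession": "ACC_0003", "bioproject": "PRJ_3"},
    }
    report = {
        "g1": {"genome_name": "Strain A", "refseq_accession": "ACC_0001", "bioproject": "PRJ_1"},
        "g2": {"genome_name": "Strain B2", "refseq_accession": "ACC_0002", "bioproject": "PRJ_2"},
    }
    for genome_id, issues in flag_inconsistencies(downloaded, report).items():
        print(genome_id, issues)
```

Returning flags rather than silently correcting values keeps the curator in control, which matches the abstract's emphasis on flagging before downstream analyses.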
Genomic research in Eucalyptus.

Science.gov (United States)

Poke, Fiona S; Vaillancourt, René E; Potts, Brad M; Reid, James B

2005-09-01

Eucalyptus L'Hérit. is a genus comprised of more than 700 species that is of vital importance ecologically to Australia and to the forestry industry world-wide, being grown in plantations for the production of solid wood products as well as pulp for paper. With the sequencing of the genomes of Arabidopsis thaliana and Oryza sativa and the recent completion of the first tree genome sequence, Populus trichocarpa, attention has turned to the current status of genomic research in Eucalyptus. For several eucalypt species, large segregating families have been established, high-resolution genetic maps constructed and large EST databases generated. Collaborative efforts have been initiated for the integration of diverse genomic projects and will provide the framework for future research including exploiting the sequence of the entire eucalypt genome which is currently being sequenced. This review summarises the current position of genomic research in Eucalyptus and discusses the direction of future research.
The Global Genome Biodiversity Network (GGBN) Data Standard specification

Science.gov (United States)

Droege, G.; Barker, K.; Seberg, O.; Coddington, J.; Benson, E.; Berendsohn, W. G.; Bunk, B.; Butler, C.; Cawsey, E. M.; Deck, J.; Döring, M.; Flemons, P.; Gemeinholzer, B.; Güntsch, A.; Hollowell, T.; Kelbert, P.; Kostadinov, I.; Kottmann, R.; Lawlor, R. T.; Lyal, C.; Mackenzie-Dodds, J.; Meyer, C.; Mulcahy, D.; Nussbeck, S. Y.; O'Tuama, É.; Orrell, T.; Petersen, G.; Robertson, T.; Söhngen, C.; Whitacre, J.; Wieczorek, J.; Yilmaz, P.; Zetzsche, H.; Zhang, Y.; Zhou, X.

2016-01-01

Genomic samples of non-model organisms are becoming increasingly important in a broad range of studies, from developmental biology and biodiversity analyses to conservation. Genomic sample definition, description, quality, voucher information and metadata all need to be digitized and disseminated across scientific communities. This information needs to be concise and consistent in today’s ever-increasing bioinformatic era, for complementary data aggregators to easily map databases to one another. In order to facilitate exchange of information on genomic samples and their derived data, the Global Genome Biodiversity Network (GGBN) Data Standard is intended to provide a platform based on a documented agreement to promote the efficient sharing and usage of genomic sample material and associated specimen information in a consistent way. The new data standard presented here builds upon existing standards commonly used within the community, extending them with the capability to exchange data on tissue, environmental and DNA samples as well as sequences. The GGBN Data Standard will reveal and democratize the hidden contents of biodiversity biobanks, for the convenience of everyone in the wider biobanking community. Technical tools exist for data providers to easily map their databases to the standard. Database URL: http://terms.tdwg.org/wiki/GGBN_Data_Standard PMID:27694206
Data Cleaning and Semantic Improvement in Biological Databases

Directory of Open Access Journals (Sweden)

Apiletti Daniele

2006-12-01

Full Text Available Public genomic and proteomic databases can be affected by a variety of errors. These errors may involve either the description or the meaning of data (namely, syntactic or semantic errors). We focus our analysis on the detection of semantic errors, in order to verify the accuracy of the stored information. In particular, we address the issue of data constraints and functional dependencies among attributes in a given relational database. Constraints and dependencies show semantics among attributes in a database schema and their knowledge may be exploited to improve data quality and integration in database design, and to perform query optimization and dimensional reduction.
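A functional dependency X → Y holds in a table when rows that agree on the attributes X always agree on Y; rows that break it are candidate semantic errors. The sketch below checks one such dependency over rows held as dictionaries. It is a generic illustration, not the authors' method, and the attribute names and sample rows are hypothetical.

```python
def fd_violations(rows, lhs, rhs):
    """Return violations of the functional dependency lhs -> rhs.

    rows is an iterable of dicts; lhs and rhs are sequences of attribute names.
    Each violation is a tuple (lhs_value, first_rhs_value, conflicting_rhs_value).
    """
    first_seen = {}
    violations = []
    for row in rows:
        key = tuple(row[attr] for attr in lhs)
        value = tuple(row[attr] for attr in rhs)
        if key not in first_seen:
            first_seen[key] = value
        elif first_seen[key] != value:
            # Same determinant, different dependent value: the dependency fails.
            violations.append((key, first_seen[key], value))
    return violations


if __name__ == "__main__":
    # Hypothetical protein records; protein_id is expected to determine organism.
    records = [
        {"protein_id": "P1", "organism": "E. coli", "length": 310},
        {"protein_id": "P2", "organism": "H. pylori", "length": 120},
        {"protein_id": "P1", "organism": "E. coil", "length": 310},  # likely typo
    ]
    print(fd_violations(records, ["protein_id"], ["organism"]))
    print(fd_violations(records, ["protein_id"], ["length"]))
```

A dependency that holds for almost all rows but fails for a few is often more informative for cleaning than one that holds everywhere, since the exceptions point at specific suspect records.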

Relational Databases: A Transparent Framework for Encouraging Biology Students to Think Informatically

Science.gov (United States)

Rice, Michael; Gladstone, William; Weir, Michael

2004-01-01

We discuss how relational databases constitute an ideal framework for representing and analyzing large-scale genomic data sets in biology. As a case study, we describe a Drosophila splice-site database that we recently developed at Wesleyan University for use in research and teaching. The database stores data about splice sites computed by a…
Goodbye genome paper, hello genome report: the increasing popularity of 'genome announcements' and their impact on science.

Science.gov (United States)

Smith, David Roy

2017-05-01

Next-generation sequencing technologies have revolutionized genomics and altered the scientific publication landscape. Life-science journals abound with genome papers, which are peer-reviewed descriptions of newly sequenced chromosomes. Although they once filled the pages of Nature and Science, genome papers are now mostly relegated to journals with low impact factors. Some have forecast the death of the genome paper and argued that they are using up valuable resources and not advancing science. However, the publication rate of genome papers is on the rise. This increase is largely because some journals have created a new category of manuscript called genome reports, which are short, fast-tracked papers describing a chromosome sequence(s), its GenBank accession number and little else. In 2015, for example, more than 2000 genome reports were published, and 2016 is poised to bring even more. Here, I highlight the growing popularity of genome reports and discuss their merits, drawbacks and impact on science and the academic publication infrastructure. Genome reports can be excellent assets for the research community, but they are also being used as quick and easy routes to a publication, and in some instances they are not peer reviewed. One of the best arguments for genome reports is that they are a citable, user-generated genomic resource providing essential methodological and biological information, which may not be present in the sequence database. But they are expensive and time-consuming avenues for achieving such a goal. © The Author 2016. Published by Oxford University Press.
Analysis of the Genome and Chromium Metabolism-Related Genes of Serratia sp. S2.

Science.gov (United States)

Dong, Lanlan; Zhou, Simin; He, Yuan; Jia, Yan; Bai, Qunhua; Deng, Peng; Gao, Jieying; Li, Yingli; Xiao, Hong

2018-05-01

This study investigated the genome sequence of Serratia sp. S2. The genomic DNA of Serratia sp. S2 was extracted and a sequencing library was constructed. Sequencing was carried out by Illumina 2000 and complete genomic sequences were obtained. Gene function annotation and bioinformatics analysis were performed by comparison with known databases. The genome size of Serratia sp. S2 was 5,604,115 bp and the G+C content was 57.61%. There were 5373 protein coding genes, of which 3732, 3614, and 3942 were annotated in the GO, KEGG, and COG databases, respectively. There were 12 genes related to chromium metabolism in the Serratia sp. S2 genome. The whole genome sequence of Serratia sp. S2 has been submitted to the GenBank database under accession number LNRP00000000. Our findings may provide a theoretical basis for the subsequent development of new biotechnology to remediate environmental chromium pollution.
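Genome size and G+C content, the two summary statistics reported in this abstract, are straightforward to compute from a sequence. The sketch below shows one common convention, counting G and C over unambiguous bases only; the `gc_content` helper and the toy sequences are illustrative assumptions, and the study's own pipeline is not described in the abstract.

```python
def gc_content(sequence):
    """Return the G+C percentage of a DNA sequence.

    Only unambiguous bases (A, C, G, T) are counted in the denominator, so
    runs of N or other ambiguity codes do not dilute the result.
    """
    seq = sequence.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence contains no unambiguous bases")
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / counted


def genome_stats(contigs):
    """Return (total length in bp, G+C percentage) for a list of contig strings."""
    total_length = sum(len(contig) for contig in contigs)
    return total_length, gc_content("".join(contigs))


if __name__ == "__main__":
    # Toy contigs, not real Serratia sequence.
    toy_contigs = ["GGCCAATT", "GCGCGCAT"]
    length, gc = genome_stats(toy_contigs)
    print(f"{length} bp, {gc:.2f}% G+C")
```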
Human Contamination in Public Genome Assemblies.

Science.gov (United States)

Kryukov, Kirill; Imanishi, Tadashi

2016-01-01

Contamination in a genome assembly can lead to wrong or confusing results when such a genome is used as a reference in sequence comparison. Although bacterial contamination is well known, the problem of human-originated contamination has received little attention. In this study we surveyed 45,735 available genome assemblies for evidence of human contamination. We used lineage specificity to distinguish between contamination and conservation. We found that 154 genome assemblies contain fragments that with high confidence originate as contamination from human DNA. The majority of the contaminating human sequences had been present in the reference human genome assembly for over a decade. We recommend that existing contaminated genomes be revised to remove contaminated sequence, and that new assemblies be thoroughly checked for the presence of human DNA before they are submitted to public databases.
Nutritional status and ion uptake response of Gynura bicolor DC. between Porous-tube and traditional hydroponic growth systems

Science.gov (United States)

Wang, Minjuan; Fu, Yuming; Liu, Hong

2015-08-01

Hydroponic culture has traditionally been used for Bioregenerative Life Support Systems (BLSS) because the optimal environment for roots supports high growth rates. Recent developments in Porous-tube Nutrient Delivery System (PTNDS) also offer high control of the root environment which is designed to provide a means for accurate environmental control and to allow for two-phase flow separation in microgravity. This study compared the effects of PTNDS and traditional hydroponic cultures on biomass yield, nutritional composition and antioxidant defense system (T-AOC, GSH, H2O2 and MDA) of G. bicolor, and ionic concentration (NH4+, K+, Mg2+, Ca2+, NO3-, H2PO4-, SO42-) of nutrient solution during planting period in controlled environment chambers. The results indicated that the biomass production and yield of G. bicolor grown in PTNDS were higher than in hydroponic culture, although relative water content (RWC), leaf length and shoot height were not significantly different. PTNDS cultivation enhanced calories from 139.5 to 182.3 kJ/100 g dry matter, and carbohydrate from 4.8 to 7.3 g/100 g dry matter and reduced the amount of protein from 7.3 to 4.8 g/100 g dry matter and ash from 1.4 to 1.0 g/100 g dry matter, compared with hydroponic culture. PTNDS cultivation accumulated the nutrition elements of Ca, Cu, Fe and Zn, and reduced Na concentration. T-AOC and GSH contents were significantly lower in PTNDS than in hydroponic culture in the first harvest. After the first harvest, the contents of MDA and H2O2 were significantly higher in PTNDS than in hydroponic culture. However, the activity of T-AOC and GSH and H2O2 and MDA contents had no significant differences under both cultures after the second and third harvest. Higher concentrations of K+, Mg2+ and Ca2+ were found in nutrient solution of plants grown in hydroponics culture compared to PTNDS, wherein lower concentrations of NO3-, H2PO4- and SO42- occurred. Our results demonstrate that PTNDS culture has more
The Porcelain Crab Transcriptome and PCAD, the Porcelain Crab Microarray and Sequence Database

Energy Technology Data Exchange (ETDEWEB)

Tagmount, Abderrahmane; Wang, Mei; Lindquist, Erika; Tanaka, Yoshihiro; Teranishi, Kristen S.; Sunagawa, Shinichi; Wong, Mike; Stillman, Jonathon H.

2010-01-27

Background: With the emergence of a completed genome sequence of the freshwater crustacean Daphnia pulex, construction of genomic-scale sequence databases for additional crustacean sequences is important for comparative genomics and annotation. Porcelain crabs, genus Petrolisthes, have been powerful crustacean models for environmental and evolutionary physiology with respect to thermal adaptation and understanding responses of marine organisms to climate change. Here, we present a large-scale EST sequencing and cDNA microarray database project for the porcelain crab Petrolisthes cinctipes. Methodology/Principal Findings: A set of ~30K unique sequences (UniSeqs) representing ~19K clusters were generated from ~98K high quality ESTs from a set of tissue specific non-normalized and mixed-tissue normalized cDNA libraries from the porcelain crab Petrolisthes cinctipes. Homology for each UniSeq was assessed using BLAST, InterProScan, GO and KEGG database searches. Approximately 66% of the UniSeqs had homology in at least one of the databases. All EST and UniSeq sequences along with annotation results and coordinated cDNA microarray datasets have been made publicly accessible at the Porcelain Crab Array Database (PCAD), a feature-enriched version of the Stanford and Longhorn Array Databases. Conclusions/Significance: The EST project presented here represents the third largest sequencing effort for any crustacean, and the largest effort for any crab species. Our assembly and clustering results suggest that our porcelain crab EST data set is as diverse as the much larger EST set generated in the Daphnia pulex genome sequencing project, and thus will be an important resource to the Daphnia research community. Our homology results support the pancrustacea hypothesis and suggest that Malacostraca may be ancestral to Branchiopoda and Hexapoda. Our results also suggest that our cDNA microarrays cover as much of the transcriptome as can reasonably be captured in
MSDB: A Comprehensive Database of Simple Sequence Repeats.

Science.gov (United States)

Avvaru, Akshay Kumar; Saxena, Saketh; Sowpati, Divya Tej; Mishra, Rakesh Kumar

2017-06-01

Microsatellites, also known as Simple Sequence Repeats (SSRs), are short tandem repeats of 1-6 nt motifs present in all genomes, particularly eukaryotes. Besides their usefulness as genome markers, SSRs have been shown to perform important regulatory functions, and variations in their length at coding regions are linked to several disorders in humans. Microsatellites show a taxon-specific enrichment in eukaryotic genomes, and some may be functional. MSDB (Microsatellite Database) is a collection of >650 million SSRs from 6,893 species including Bacteria, Archaea, Fungi, Plants, and Animals. This database is by far the most exhaustive resource to access and analyze SSR data of multiple species. In addition to exploring data in a customizable tabular format, users can view and compare the data of multiple species simultaneously using our interactive plotting system. MSDB is developed using the Django framework and MySQL. It is freely available at http://tdb.ccmb.res.in/msdb. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
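
The definition above (tandem repeats of 1-6 nt motifs) can be illustrated with a minimal Python sketch. This is not MSDB's pipeline: the function name `find_ssrs` and the minimum repeat counts are assumptions chosen for illustration, and only perfect (uninterrupted) repeats are reported.

```python
import re

def find_ssrs(seq, min_repeats=None):
    """Return sorted (start, motif, repeat_count) tuples for perfect tandem repeats."""
    # Hypothetical minimum repeat counts per motif length (not MSDB's thresholds).
    if min_repeats is None:
        min_repeats = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}
    seq = seq.upper()
    hits = []
    for size, reps in sorted(min_repeats.items()):
        # Capture a motif of `size` bases, then require at least reps-1 more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, reps - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are copies of a shorter unit (e.g. "AA", "ATAT").
            if any(motif == motif[:k] * (size // k)
                   for k in range(1, size) if size % k == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // size))
    return sorted(hits)

# Toy sequence: (AT)7 starting at offset 2 and (A)12 starting at offset 18.
print(find_ssrs("GG" + "AT" * 7 + "CC" + "A" * 12 + "T"))
```

Positions are 0-based offsets into the input string. Real SSR surveys typically also handle imperfect repeats and both strands, which this sketch leaves out.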
Effects of Urin Cow Dosage on Growth and Production of Sorgum Plant (Sorghum Bicolor L) on Peat Land

Science.gov (United States)

Utami Lestari, Sri; Andrian, Andi

2017-12-01

Sweet sorghum (Sorghum bicolor (L)) is a promising crop, especially in marginal and dry areas, where it plays an important role as a source of carbohydrates. Sorghum is expected to be an alternative choice for peatland cultivation, and using peatlands in this way is also expected to raise environmental awareness by promoting more environmentally friendly crops. The aim of this research was to determine the effect of cow urine, and to identify the best dose, on the growth and production of sorghum (Sorghum bicolor L) on peat soil. The experiment used a Completely Randomized Design (RAL) with one factor, cow urine administration, applied in 5 treatments and 4 replications, giving 20 experimental units. Each experimental unit consisted of 4 plants, of which 2 were sampled. The treatments were A0 = cow urine dose of 0 cc / 1, A1 = cow urine dose of 25 cc / 1, A2 = cow urine dose of 50 cc / 1, A3 = cow urine dose of 75 cc / 1, and A4 = cow urine dose of 100 cc / 1. In conclusion, cow urine had a significant effect on the growth and production of sorghum, as seen in plant height, leaf length and leaf width, whereas it had no significant effect on the wet weight or dry weight of 100 seeds. The best results were obtained with treatment A4, at a dose of 100 cc / 1.
Functional role of bacteriophage transfer RNAs: codon usage analysis of genomic sequences stored in the GENBANK/EMBL/DDBJ databases

Directory of Open Access Journals (Sweden)

T Kunisawa

2006-01-01

Complete genomic sequence data are stored in the public GenBank/EMBL/DDBJ databases so that any investigator can make use of the data. This report describes a comparative analysis of codon usage that is impossible without such a public and open data system. A limited number of bacteriophages harbor their own transfer RNAs. Based on a comparison between T4 phage-encoded tRNA species and the relative cellular amounts of host Escherichia coli tRNAs, it is hypothesized that T4 tRNAs could serve to supplement host isoacceptor tRNA species that are present in minor amounts and thus enhance the translational efficiency of phage proteins. When compared to their respective host bacteria, the codon usage data of bacteriophages D3, φC31, HP1, D29 and 933W all show an increased frequency of synonymous codons or amino acids that correspond to phage tRNA species, suggesting their supplemental role in the efficient production of phage proteins. The data-analysis presents an example in which the availability of an open and fully accessible database system would allow one to obtain comprehensive insights into a fundamental problem in molecular biology.
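
A codon usage comparison of the kind described above can be sketched in a few lines of Python. This is an illustrative sketch, not the author's analysis: the function names `codon_usage` and `usage_shift` are hypothetical, the inputs are assumed to be in-frame coding sequences, and no statistics are applied.

```python
from collections import Counter

def codon_usage(cds):
    """Relative frequency of each codon in an in-frame coding sequence."""
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3          # ignore a trailing partial codon
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))
    total = sum(counts.values())
    return {codon: n / total for codon, n in counts.items()}

def usage_shift(phage_cds, host_cds):
    """Per-codon difference in relative frequency (phage minus host)."""
    phage, host = codon_usage(phage_cds), codon_usage(host_cds)
    return {c: phage.get(c, 0.0) - host.get(c, 0.0)
            for c in set(phage) | set(host)}

# Toy sequences only; a positive value means the codon is used
# more often in the phage sequence than in the host sequence.
print(usage_shift("AAAAAA", "AAAGGG"))
```

In a real comparison the inputs would be concatenated coding regions from the database records, and the codons of interest would be those read by the phage-encoded tRNAs.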
TMC-SNPdb: an Indian germline variant database derived from whole exome sequences.

Science.gov (United States)

Upadhyay, Pawan; Gardi, Nilesh; Desai, Sanket; Sahoo, Bikram; Singh, Ankita; Togar, Trupti; Iyer, Prajish; Prasad, Ratnam; Chandrani, Pratik; Gupta, Sudeep; Dutt, Amit

2016-01-01

Cancer is predominantly a somatic disease. A mutant allele present in a cancer cell genome is considered somatic when it's absent in the paired normal genome along with public SNP databases. The current build of dbSNP, the most comprehensive public SNP database, however, inadequately represents several non-European Caucasian populations, posing a limitation in cancer genomic analyses of data from these populations. We present the Tata Memorial Centre-SNP Database (TMC-SNPdb) as the first open source, flexible, upgradable, and freely available SNP database (accessible through dbSNP build 149 and ANNOVAR), representing 114 309 unique germline variants generated from whole exome data of 62 normal samples derived from cancer patients of Indian origin. The TMC-SNPdb is presented with a companion subtraction tool that can be executed with command line option or using an easy-to-use graphical user interface with the ability to deplete additional Indian population specific SNPs over and above dbSNP and 1000 Genomes databases. Using an institutional generated whole exome data set of 132 samples of Indian origin, we demonstrate that TMC-SNPdb could deplete 42, 33 and 28% false positive somatic events post dbSNP depletion in Indian origin tongue, gallbladder, and cervical cancer samples, respectively. Beyond cancer somatic analyses, we anticipate utility of the TMC-SNPdb in several Mendelian germline diseases. In addition to dbSNP build 149 and ANNOVAR, the TMC-SNPdb along with the subtraction tool is available for download in the public domain at the following: Database URL: http://www.actrec.gov.in/pi-webpages/AmitDutt/TMCSNP/TMCSNPdp.html. © The Author(s) 2016. Published by Oxford University Press.
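
The depletion step described above amounts to removing candidate somatic calls that already appear in a germline resource. The following Python sketch shows that idea only; it is not the TMC-SNPdb subtraction tool, and the function name, the `(chrom, pos, ref, alt)` key layout and the coordinates are all assumptions made for illustration.

```python
def subtract_known_variants(candidates, *known_sets):
    """Keep candidate somatic calls that are absent from every germline set."""
    known = set().union(*known_sets)
    return [v for v in candidates if v not in known]

# Hypothetical variants keyed as (chromosome, position, ref allele, alt allele).
calls = [("chr1", 1000, "A", "G"), ("chr2", 2000, "C", "T"), ("chr3", 3000, "G", "A")]
public_db = {("chr1", 1000, "A", "G")}          # stands in for a public SNP database
population_db = {("chr3", 3000, "G", "A")}      # stands in for a population-specific set

print(subtract_known_variants(calls, public_db, population_db))
```

Passing the population-specific set as an extra argument mirrors the abstract's point that it is applied over and above the public databases.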
The path to enlightenment: making sense of genomic and proteomic information.

Science.gov (United States)

Maurer, Martin H

2004-05-01

Whereas genomics describes the study of genome, mainly represented by its gene expression on the DNA or RNA level, the term proteomics denotes the study of the proteome, which is the protein complement encoded by the genome. In recent years, the number of proteomic experiments increased tremendously. While all fields of proteomics have made major technological advances, the biggest step was seen in bioinformatics. Biological information management relies on sequence and structure databases and powerful software tools to translate experimental results into meaningful biological hypotheses and answers. In this resource article, I provide a collection of databases and software available on the Internet that are useful to interpret genomic and proteomic data. The article is a toolbox for researchers who have genomic or proteomic datasets and need to put their findings into a biological context.
Protein structure database search and evolutionary classification.

Science.gov (United States)

Yang, Jinn-Moon; Tung, Chi-Hua

2006-01-01

As more protein structures become available and structural genomics efforts provide structural models in a genome-wide strategy, there is a growing need for fast and accurate methods for discovering homologous proteins and evolutionary classifications of newly determined structures. We have developed 3D-BLAST, in part, to address these issues. 3D-BLAST is as fast as BLAST and calculates the statistical significance (E-value) of an alignment to indicate the reliability of the prediction. Using this method, we first identified 23 states of the structural alphabet that represent pattern profiles of the backbone fragments and then used them to represent protein structure databases as structural alphabet sequence databases (SADB). Our method enhanced BLAST as a search method, using a new structural alphabet substitution matrix (SASM) to find the longest common substructures with high-scoring structured segment pairs from an SADB database. Using personal computers with Intel Pentium4 (2.8 GHz) processors, our method searched more than 10 000 protein structures in 1.3 s and achieved a good agreement with search results from detailed structure alignment methods. [3D-BLAST is available at http://3d-blast.life.nctu.edu.tw].
DATABASES DEVELOPED IN INDIA FOR BIOLOGICAL SCIENCES

Directory of Open Access Journals (Sweden)

Gitanjali Yadav

2017-09-01

The complexity of biological systems requires use of a variety of experimental methods with ever increasing sophistication to probe various cellular processes at molecular and atomic resolution. The availability of technologies for determining nucleic acid sequences of genes and atomic resolution structures of biomolecules prompted development of major biological databases like GenBank and PDB almost four decades ago. India was one of the few countries to realize early the utility of such databases for progress in modern biology/biotechnology. The Department of Biotechnology (DBT), India established the Biotechnology Information System (BTIS) network in the late eighties. Starting with the genome sequencing revolution at the turn of the century, application of high-throughput sequencing technologies in biology and medicine for analysis of genomes, transcriptomes, epigenomes and microbiomes has generated massive volumes of sequence data. The BTIS network has not only provided state of the art computational infrastructure to research institutes and universities for utilizing various biological databases developed abroad in their research, it has also actively promoted research and development (R&D) projects in Bioinformatics to develop a variety of biological databases in diverse areas. It is encouraging to note that a large number of biological databases or data driven software tools developed in India have been published in leading peer reviewed international journals like Nucleic Acids Research, Bioinformatics, Database, BMC, PLoS and NPG series publications. Some of these databases are not only unique, they are also highly accessed as reflected in number of citations. Apart from databases developed by individual research groups, BTIS has initiated consortium projects to develop major India centric databases on Mycobacterium tuberculosis, Rice and Mango, which can potentially have practical applications in health and agriculture. Many of these biological
Pollen foraging in colonies of Melipona bicolor (Apidae, Meliponini): effects of season, colony size and queen number.

Science.gov (United States)

Hilário, S D; Imperatriz-Fonseca, V L

2009-01-01

We evaluated the ratio between the number of pollen foragers and the total number of bees entering colonies of Melipona bicolor, a facultative polygynous species of stingless bees. The variables considered in our analysis were: seasonality, colony size and the number of physogastric queens in each colony. The pollen forager ratios varied significantly between seasons; the ratio was higher in winter than in summer. However, colony size and number of queens per colony had no significant effect. We conclude that seasonal differences in pollen harvest are related to the production of sexuals and to the number of individuals and their body size.
Prospecting sugarcane resistance to Sugarcane yellow leaf virus by genome-wide association.

Science.gov (United States)

Debibakas, S; Rocher, S; Garsmeur, O; Toubi, L; Roques, D; D'Hont, A; Hoarau, J-Y; Daugrois, J H

2014-08-01

Using GWAS approaches, we detected independent resistant markers in sugarcane towards a vectored virus disease. Based on comparative genomics, several candidate genes potentially involved in virus/aphid/plant interactions were pinpointed. Yellow leaf of sugarcane is an emerging viral disease whose causal agent is a Polerovirus, the Sugarcane yellow leaf virus (SCYLV) transmitted by aphids. To identify quantitative trait loci controlling resistance to yellow leaf which are of direct relevance for breeding, we undertook a genome-wide association study (GWAS) on a sugarcane cultivar panel (n = 189) representative of current breeding germplasm. This panel was fingerprinted with 3,949 polymorphic markers (DArT and AFLP). The panel was phenotyped for SCYLV infection in leaves and stalks in two trials for two crop cycles, under natural disease pressure prevalent in Guadeloupe. Mixed linear models including co-factors representing population structure fixed effects and pairwise kinship random effects provided an efficient control of the risk of inflated type-I error at a genome-wide level. Six independent markers were significantly detected in association with SCYLV resistance phenotype. These markers explained individually between 9 and 14 % of the disease variation of the cultivar panel. Their frequency in the panel was relatively low (8-20 %). Among them, two markers were detected repeatedly across the GWAS exercises based on the different disease resistance parameters. These two markers could be blasted on Sorghum bicolor genome and candidate genes potentially involved in plant-aphid or plant-virus interactions were localized in the vicinity of sorghum homologs of sugarcane markers. Our results illustrate the potential of GWAS approaches to prospect among sugarcane germplasm for accessions likely bearing resistance alleles of significant effect useful in breeding programs.
DFAST and DAGA: web-based integrated genome annotation tools and resources.

Science.gov (United States)

Tanizawa, Yasuhiro; Fujisawa, Takatomo; Kaminuma, Eli; Nakamura, Yasukazu; Arita, Masanori

2016-01-01

Quality assurance and correct taxonomic affiliation of data submitted to public sequence databases have been an everlasting problem. The DDBJ Fast Annotation and Submission Tool (DFAST) is a newly developed genome annotation pipeline with quality and taxonomy assessment tools. To enable annotation of ready-to-submit quality, we also constructed curated reference protein databases tailored for lactic acid bacteria. DFAST was developed so that all the procedures required for DDBJ submission could be done seamlessly online. The online workspace would be especially useful for users not familiar with bioinformatics skills. In addition, we have developed a genome repository, DFAST Archive of Genome Annotation (DAGA), which currently includes 1,421 genomes covering 179 species and 18 subspecies of two genera, Lactobacillus and Pediococcus, obtained from both DDBJ/ENA/GenBank and Sequence Read Archive (SRA). All the genomes deposited in DAGA were annotated consistently and assessed using DFAST. To assess the taxonomic position based on genomic sequence information, we used the average nucleotide identity (ANI), which showed high discriminative power to determine whether two given genomes belong to the same species. We corrected mislabeled or misidentified genomes in the public database and deposited the curated information in DAGA. The repository will improve the accessibility and reusability of genome resources for lactic acid bacteria. By exploiting the data deposited in DAGA, we found intraspecific subgroups in Lactobacillus gasseri and Lactobacillus jensenii, whose variation between subgroups is larger than the well-accepted ANI threshold of 95% to differentiate species. DFAST and DAGA are freely accessible at https://dfast.nig.ac.jp.
DEDB: a database of Drosophila melanogaster exons in splicing graph form

Directory of Open Access Journals (Sweden)

Tan Tin

2004-12-01

Background: A wealth of quality genomic and mRNA/EST sequences in recent years has provided the data required for large-scale genome-wide analysis of alternative splicing. We have capitalized on this by constructing a database that contains alternative splicing information organized as splicing graphs, where all transcripts arising from a single gene are collected, organized and classified. The splicing graph then serves as the basis for the classification of the various types of alternative splicing events. Description: DEDB (http://proline.bic.nus.edu.sg/dedb/index.html) is a database of Drosophila melanogaster exons obtained from FlyBase arranged in a splicing graph form that permits the creation of simple rules allowing for the classification of alternative splicing events. Pfam domains were also mapped onto the protein sequences, allowing users to assess the impact of alternative splicing events on domain organization. Conclusions: DEDB's catalogue of splicing graphs facilitates genome-wide classification of alternative splicing events for genome analysis. The splicing graph viewer brings together genome, transcript, protein and domain information to help biologists understand the implications of alternative splicing.
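
The splicing graph idea above, in which all transcripts of one gene are merged and simple rules classify events, can be sketched as follows. This is a minimal Python illustration, not DEDB's data model: exons are assumed to be `(start, end)` tuples on the forward strand, the function names are hypothetical, and only one rule (exon skipping) is shown.

```python
def build_splicing_graph(transcripts):
    """Merge transcripts of one gene: nodes are exons, edges are observed junctions."""
    graph = {}
    for exons in transcripts.values():
        ordered = sorted(exons)
        for exon in ordered:
            graph.setdefault(exon, set())      # register every exon as a node
        for upstream, downstream in zip(ordered, ordered[1:]):
            graph[upstream].add(downstream)    # consecutive exons share a junction
    return graph

def skipped_exons(graph):
    """Rule: if A->B, B->C and A->C all exist, exon B is skipped in some transcript."""
    skipped = set()
    for a, targets in graph.items():
        for b in targets:
            if any(c in targets for c in graph.get(b, ())):
                skipped.add(b)
    return sorted(skipped)

# Hypothetical gene with two transcripts; the second omits the middle exon.
gene = {
    "t1": [(100, 200), (300, 400), (500, 600)],
    "t2": [(100, 200), (500, 600)],
}
print(skipped_exons(build_splicing_graph(gene)))
```

Other event types (alternative donor or acceptor sites, intron retention) would need rules that compare overlapping exon boundaries, which this sketch does not attempt.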
Geminivirus data warehouse: a database enriched with machine learning approaches.

Science.gov (United States)

Silva, Jose Cleydson F; Carvalho, Thales F M; Basso, Marcos F; Deguchi, Michihito; Pereira, Welison A; Sobrinho, Roberto R; Vidigal, Pedro M P; Brustolini, Otávio J B; Silva, Fabyano F; Dal-Bianco, Maximiller; Fontes, Renildes L F; Santos, Anésia A; Zerbini, Francisco Murilo; Cerqueira, Fabio R; Fontes, Elizabeth P B

2017-05-05

The Geminiviridae family encompasses a group of single-stranded DNA viruses with twinned and quasi-isometric virions, which infect a wide range of dicotyledonous and monocotyledonous plants and are responsible for significant economic losses worldwide. Geminiviruses are divided into nine genera, according to their insect vector, host range, genome organization, and phylogeny reconstruction. Using rolling-circle amplification approaches along with high-throughput sequencing technologies, thousands of full-length geminivirus and satellite genome sequences were amplified and have become available in public databases. As a consequence, many important challenges have emerged, namely, how to classify, store, and analyze massive datasets as well as how to extract information or new knowledge. Data mining approaches, mainly supported by machine learning (ML) techniques, are a natural means for high-throughput data analysis in the context of genomics, transcriptomics, proteomics, and metabolomics. Here, we describe the development of a data warehouse enriched with ML approaches, designated geminivirus.org. We implemented search modules, bioinformatics tools, and ML methods to retrieve high precision information, demarcate species, and create classifiers for genera and open reading frames (ORFs) of geminivirus genomes. The use of data mining techniques such as ETL (Extract, Transform, Load) to feed our database, as well as algorithms based on machine learning for knowledge extraction, allowed us to obtain a database with quality data and suitable tools for bioinformatics analysis. The Geminivirus Data Warehouse (geminivirus.org) offers a simple and user-friendly environment for information retrieval and knowledge discovery related to geminiviruses.
Ensembl 2002: accommodating comparative genomics.

Science.gov (United States)

Clamp, M; Andrews, D; Barker, D; Bevan, P; Cameron, G; Chen, Y; Clark, L; Cox, T; Cuff, J; Curwen, V; Down, T; Durbin, R; Eyras, E; Gilbert, J; Hammond, M; Hubbard, T; Kasprzyk, A; Keefe, D; Lehvaslaiho, H; Iyer, V; Melsopp, C; Mongin, E; Pettett, R; Potter, S; Rust, A; Schmidt, E; Searle, S; Slater, G; Smith, J; Spooner, W; Stabenau, A; Stalker, J; Stupka, E; Ureta-Vidal, A; Vastrik, I; Birney, E

2003-01-01

The Ensembl (http://www.ensembl.org/) database project provides a bioinformatics framework to organise biology around the sequences of large genomes. It is a comprehensive source of stable automatic annotation of human, mouse and other genome sequences, available as either an interactive web site or as flat files. Ensembl also integrates manually annotated gene structures from external sources where available. As well as being one of the leading sources of genome annotation, Ensembl is an open source software engineering project to develop a portable system able to handle very large genomes and associated requirements. These range from sequence analysis to data storage and visualisation, and installations exist around the world in both companies and at academic sites. With both human and mouse genome sequences available and more vertebrate sequences to follow, many of the recent developments in Ensembl have focused on developing automatic comparative genome analysis and visualisation.
Reproductive success and contaminant associations in tree swallows (Tachycineta bicolor) used to assess a beneficial use impairment in U.S. and Binational Great Lakes’ Areas of Concern

Science.gov (United States)

During 2010-2014, tree swallow (Tachycineta bicolor) reproductive success was monitored at 68 sites across all 5 Great Lakes, including 58 sites located within Great Lakes Areas of concern (AOCs) and 10 non-AOCs. Sample eggs were collected from tree swallow clutches and analyzed ...

WormBase: Annotating many nematode genomes.

Science.gov (United States)

Howe, Kevin; Davis, Paul; Paulini, Michael; Tuli, Mary Ann; Williams, Gary; Yook, Karen; Durbin, Richard; Kersey, Paul; Sternberg, Paul W

2012-01-01

WormBase (www.wormbase.org) has been serving the scientific community for over 11 years as the central repository for genomic and genetic information for the soil nematode Caenorhabditis elegans. The resource has evolved from its beginnings as a database housing the genomic sequence and genetic and physical maps of a single species, and now represents the breadth and diversity of nematode research, currently serving genome sequence and annotation for around 20 nematodes. In this article, we focus on WormBase's role of genome sequence annotation, describing how we annotate and integrate data from a growing collection of nematode species and strains. We also review our approaches to sequence curation, and discuss the impact on annotation quality of large functional genomics projects such as modENCODE.
Nuclear-like Seq in mt Genome - RMG | LSDB Archive [Life Science Database Archive metadata

Lifescience Database Archive (English)

SNUGB: a versatile genome browser supporting comparative and functional fungal genomics

Directory of Open Access Journals (Sweden)

Kim Seungill

2008-12-01

Background: Since the full genome sequences of Saccharomyces cerevisiae were released in 1996, genome sequences of over 90 fungal species have become publicly available. The heterogeneous formats of genome sequences archived in different sequencing centers hampered the integration of the data for efficient and comprehensive comparative analyses. The Comparative Fungal Genomics Platform (CFGP) was developed to archive these data via a single standardized format that can support multifaceted and integrated analyses of the data. To facilitate efficient data visualization and utilization within and across species based on the architecture of CFGP and associated databases, a new genome browser was needed. Results: The Seoul National University Genome Browser (SNUGB) integrates various types of genomic information derived from 98 fungal/oomycete (137 datasets) and 34 plant and animal (38 datasets) species, graphically presents germane features and properties of each genome, and supports comparison between genomes. The SNUGB provides three different forms of the data presentation interface, including diagram, table, and text, and six different display options to support visualization and utilization of the stored information. Information for individual species can be quickly accessed via a new tool named the taxonomy browser. In addition, SNUGB offers four useful data annotation/analysis functions, including 'BLAST annotation.' The modular design of SNUGB makes its adoption to support other comparative genomic platforms easy and facilitates continuous expansion. Conclusion: The SNUGB serves as a powerful platform supporting comparative and functional genomics within the fungal kingdom and also across other kingdoms. All data and functions are available at the web site http://genomebrowser.snu.ac.kr/.
ArthropodaCyc: a CycADS powered collection of BioCyc databases to analyse and compare metabolism of arthropods.

Science.gov (United States)

Baa-Puyoulet, Patrice; Parisot, Nicolas; Febvay, Gérard; Huerta-Cepas, Jaime; Vellozo, Augusto F; Gabaldón, Toni; Calevro, Federica; Charles, Hubert; Colella, Stefano

2016-01-01

Arthropods interact with humans at different levels with highly beneficial roles (e.g. as pollinators), as well as with a negative impact, for example as vectors of human or animal diseases, or as agricultural pests. Several arthropod genomes are available at present and many others will be sequenced in the near future in the context of the i5K initiative, offering opportunities for reconstructing, modelling and comparing their metabolic networks. In-depth analysis of these genomic data through metabolism reconstruction is expected to contribute to a better understanding of the biology of arthropods, thereby allowing the development of new strategies to control harmful species. In this context, we present here ArthropodaCyc, a dedicated BioCyc collection of databases using the Cyc annotation database system (CycADS), allowing researchers to perform reliable metabolism comparisons of fully sequenced arthropod genomes. Since the annotation quality is a key factor when performing such global genome comparisons, all proteins from the genomes included in the ArthropodaCyc database were re-annotated using several annotation tools and orthology information. All functional/domain annotation results and their sources were integrated in the databases for user access. Currently, ArthropodaCyc offers a centralized repository of metabolic pathways, protein sequence domains, Gene Ontology annotations as well as evolutionary information for 28 arthropod species. Such database collection allows metabolism analysis both with integrated tools and through extraction of data in formats suitable for systems biology studies. Database URL: http://arthropodacyc.cycadsys.org/. © The Author(s) 2016. Published by Oxford University Press.
The Arab genome: Health and wealth.

Science.gov (United States)

Zayed, Hatem

2016-11-05

The 22 Arab nations have a unique genetic structure, which reflects both conserved and diverse gene pools due to the prevalent endogamous and consanguineous marriage culture and the long history of admixture among different ethnic subcultures descended from the Asian, European, and African continents. Human genome sequencing has enabled large-scale genomic studies of different populations and has become a powerful tool for studying disease predictions and diagnosis. Despite the importance of the Arab genome for better understanding the dynamics of the human genome, discovering rare genetic variations, and studying early human migration out of Africa, it is poorly represented in human genome databases, such as HapMap and the 1000 Genomes Project. In this review, I demonstrate the significance of sequencing the Arab genome and setting an Arab genome reference(s) for better understanding the molecular pathogenesis of genetic diseases, discovering novel/rare variants, and identifying a meaningful genotype-phenotype correlation for complex diseases. Copyright © 2016. Published by Elsevier B.V.
Integr8: enhanced inter-operability of European molecular biology databases.

Science.gov (United States)

Kersey, P J; Morris, L; Hermjakob, H; Apweiler, R

2003-01-01

The increasing production of molecular biology data in the post-genomic era, and the proliferation of databases that store it, require the development of an integrative layer in database services to facilitate the synthesis of related information. The solution of this problem is made more difficult by the absence of universal identifiers for biological entities, and the breadth and variety of available data. Integr8 was modelled using UML (Unified Modeling Language). Integr8 is being implemented as an n-tier system using a modern object-oriented programming language (Java). An object-relational mapping tool, OJB, is being used to specify the interface between the upper layers and an underlying relational database. The European Bioinformatics Institute is launching the Integr8 project. Integr8 will be an automatically populated database in which we will maintain stable identifiers for biological entities, describe their relationships with each other (in accordance with the central dogma of biology), and store equivalences between identified entities in the source databases. Only core data will be stored in Integr8, with web links to the source databases providing further information. Integr8 will provide the integrative layer of the next generation of bioinformatics services from the EBI. Web-based interfaces will be developed to offer gene-centric views of the integrated data, presenting (where known) the links between genome, proteome and phenotype.
FmMDb: a versatile database of foxtail millet markers for millets and bioenergy grasses research.

Directory of Open Access Journals (Sweden)

Venkata Suresh B

The prominent attributes of foxtail millet (Setaria italica L.), including its small genome size, short life cycle, inbreeding nature, and phylogenetic proximity to various biofuel crops, have made this crop an excellent model system to investigate various aspects of architectural, evolutionary and physiological significance in Panicoid bioenergy grasses. After release of its whole genome sequence, large-scale genomic resources in terms of molecular markers were generated for the improvement of both foxtail millet and its related species. Hence it is now essential to congregate, curate and make available these genomic resources for the benefit of researchers and breeders working towards crop improvement. In view of this, we have constructed the Foxtail millet Marker Database (FmMDb; http://www.nipgr.res.in/foxtail.html), a comprehensive online database for information retrieval, visualization and management of large-scale marker datasets with unrestricted public access. FmMDb is the first database which provides complete marker information to the plant science community attempting to produce elite cultivars of millet and bioenergy grass species, thus addressing global food insecurity.
The Effect of Silicon on some Morpho-physiological Characteristics and Grain Yield of Sorghum (Sorghum bicolor L.) under Salt Stress

OpenAIRE

S Hasibi; H Farahbakhsh; Gh Khajoeinejad

2016-01-01

Introduction: Nowadays, salinity is one of the limiting factors for crop production in arid and semi-arid regions. On the other hand, sorghum (Sorghum bicolor L.) is a self-pollinated, short-day plant that is partly adapted to salinity and water stress conditions; it also plays an important role in human, livestock and poultry nutrition. All studies have shown the positive effects of silicon on growth and yield of plants in both normal and stress conditions. The aim of this exp...
MICA: desktop software for comprehensive searching of DNA databases

Directory of Open Access Journals (Sweden)

Glick Benjamin S

2006-10-01

Background: Molecular biologists work with DNA databases that often include entire genomes. A common requirement is to search a DNA database to find exact matches for a nondegenerate or partially degenerate query. The software programs available for such purposes are normally designed to run on remote servers, but an appealing alternative is to work with DNA databases stored on local computers. We describe a desktop software program termed MICA (K-Mer Indexing with Compact Arrays) that allows large DNA databases to be searched efficiently using very little memory. Results: MICA rapidly indexes a DNA database. On a Macintosh G5 computer, the complete human genome could be indexed in about 5 minutes. The indexing algorithm recognizes all 15 characters of the DNA alphabet and fully captures the information in any DNA sequence, yet for a typical sequence of length L, the index occupies only about 2L bytes. The index can be searched to return a complete list of exact matches for a nondegenerate or partially degenerate query of any length. A typical search of a long DNA sequence involves reading only a small fraction of the index into memory. As a result, searches are fast even when the available RAM is limited. Conclusion: MICA is suitable as a search engine for desktop DNA analysis software.
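The abstract does not describe MICA's compact-array layout, so the following is only a minimal sketch of the general idea of k-mer indexing with exact matching of partially degenerate queries. It uses a plain Python dictionary rather than compact arrays, assumes the indexed sequence contains only A, C, G and T, and the function names (`build_index`, `search`) are hypothetical, not part of MICA.

```python
from collections import defaultdict
from itertools import product

# IUPAC nucleotide codes mapped to the bases they stand for (15 symbols).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def build_index(seq, k):
    """Map every k-mer in seq to the list of its start positions."""
    index = defaultdict(list)
    for i in range(len(seq) - k + 1):
        index[seq[i:i + k]].append(i)
    return index

def search(seq, index, k, query):
    """Return sorted start positions where the query matches seq.

    The query may contain degenerate IUPAC codes; the sequence is
    assumed (for this sketch) to contain only A, C, G and T.
    """
    if len(query) < k:
        raise ValueError("query must be at least k characters long")
    # Expand the first k query characters into concrete seed k-mers.
    seeds = ("".join(p) for p in product(*(IUPAC[c] for c in query[:k])))
    hits = []
    for seed in seeds:
        for pos in index.get(seed, ()):
            window = seq[pos:pos + len(query)]
            # Verify the full query against the sequence at this seed hit.
            if len(window) == len(query) and all(
                s in IUPAC[q] for q, s in zip(query, window)
            ):
                hits.append(pos)
    return sorted(hits)
```

Only the positions listed under the seed k-mers are inspected, which is the property that lets an index-based search avoid scanning the whole sequence. For example, `search(seq, build_index(seq, 3), 3, "TWG")` looks up the seeds `TAG` and `TTG` and checks only their recorded positions.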
Annotation-Based Whole Genomic Prediction and Selection

DEFF Research Database (Denmark)

Kadarmideen, Haja; Do, Duy Ngoc; Janss, Luc

Genomic selection is widely used in both animal and plant species, however, it is performed with no input from known genomic or biological role of genetic variants and therefore is a black box approach in a genomic era. This study investigated the role of different genomic regions and detected QTLs...... in their contribution to estimated genomic variances and in prediction of genomic breeding values by applying SNP annotation approaches to feed efficiency. Ensembl Variant Predictor (EVP) and Pig QTL database were used as the source of genomic annotation for 60K chip. Genomic prediction was performed using the Bayes...... classes. Predictive accuracy was 0.531, 0.532, 0.302, and 0.344 for DFI, RFI, ADG and BF, respectively. The contribution per SNP to total genomic variance was similar among annotated classes across different traits. Predictive performance of SNP classes did not significantly differ from randomized SNP...
Alternatives to relational databases in precision medicine: Comparison of NoSQL approaches for big data storage using supercomputers

Science.gov (United States)

Velazquez, Enrique Israel

Improvements in medical and genomic technologies have dramatically increased the production of electronic data over the last decade. As a result, data management is rapidly becoming a major determinant, and urgent challenge, for the development of Precision Medicine. Although successful data management is achievable using Relational Database Management Systems (RDBMS), exponential data growth is a significant contributor to failure scenarios. Growing amounts of data can also be observed in other sectors, such as economics and business, which, together with the previous facts, suggests that alternate database approaches (NoSQL) may soon be required for efficient storage and management of big databases. However, this hypothesis has been difficult to test in the Precision Medicine field since alternate database architectures are complex to assess and means to integrate heterogeneous electronic health records (EHR) with dynamic genomic data are not easily available. In this dissertation, we present a novel set of experiments for identifying NoSQL database approaches that enable effective data storage and management in Precision Medicine using patients' clinical and genomic information from The Cancer Genome Atlas (TCGA). The first experiment draws on performance and scalability from biologically meaningful queries with differing complexity and database sizes. The second experiment measures performance and scalability in database updates without schema changes. The third experiment assesses performance and scalability in database updates with schema modifications due to dynamic data. We have identified two NoSQL approaches, based on Cassandra and Redis, which seem to be the ideal database management systems for our precision medicine queries in terms of performance and scalability. We present NoSQL approaches and show how they can be used to manage clinical and genomic big data. Our research is relevant to public health since we are focusing on one of the main
Characterizing and annotating the genome using RNA-seq data.

Science.gov (United States)

Chen, Geng; Shi, Tieliu; Shi, Leming

2017-02-01

Bioinformatics methods for various RNA-seq data analyses are evolving rapidly with the improvement of sequencing technologies. However, many challenges still exist in how to efficiently process the RNA-seq data to obtain accurate and comprehensive results. Here we reviewed the strategies for improving diverse transcriptomic studies and the annotation of genetic variants based on RNA-seq data. Mapping RNA-seq reads to the genome and mapping them to the transcriptome are two distinct methods for quantifying the expression of genes/transcripts. Besides the known genes annotated in current databases, many novel genes/transcripts (especially long noncoding RNAs) can still be identified on the reference genome using RNA-seq. Moreover, owing to the incompleteness of current reference genomes, some novel genes are missing from them. Genome-guided and de novo transcriptome reconstruction are two effective and complementary strategies for identifying those novel genes/transcripts on or beyond the reference genome. In addition, integrating the genes of distinct databases to conduct transcriptomics and genetics studies can improve the results of corresponding analyses.
Development of polymorphic microsatellite loci for conservation genetic studies of the coral reef fish Centropyge bicolor

KAUST Repository

Herrera Sarrias, Marcela

2015-08-14

A total of 23 novel polymorphic microsatellite marker loci were developed for the angelfish Centropyge bicolor through 454 sequencing, and further tested on two spatially separated populations (90 individuals each) from Kimbe Bay in Papua New Guinea. The mean ± s.e. number of alleles per locus was 14·65 ± 1·05, and mean ± s.e. observed (HO) and expected (HE) heterozygosity frequencies were 0·676 ± 0·021 and 0·749 ± 0·018, respectively. The markers reported here constitute the first specific set for this genus and will be useful for future conservation genetic studies in the Indo-Pacific region. © 2015 The Fisheries Society of the British Isles.
Development of polymorphic microsatellite loci for conservation genetic studies of the coral reef fish Centropyge bicolor

KAUST Repository

Herrera Sarrias, Marcela; Saenz-Agudelo, P.; Nanninga, Gerrit B.; Berumen, Michael L.

2015-01-01

A total of 23 novel polymorphic microsatellite marker loci were developed for the angelfish Centropyge bicolor through 454 sequencing, and further tested on two spatially separated populations (90 individuals each) from Kimbe Bay in Papua New Guinea. The mean ± s.e. number of alleles per locus was 14·65 ± 1·05, and mean ± s.e. observed (HO) and expected (HE) heterozygosity frequencies were 0·676 ± 0·021 and 0·749 ± 0·018, respectively. The markers reported here constitute the first specific set for this genus and will be useful for future conservation genetic studies in the Indo-Pacific region. © 2015 The Fisheries Society of the British Isles.
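The abstract reports observed (HO) and expected (HE) heterozygosity without stating the estimator used. As a minimal sketch, assuming the standard definitions (HO as the fraction of heterozygous individuals, HE as one minus the sum of squared allele frequencies, with no small-sample correction), the per-locus values could be computed as follows. The genotype data and function names are hypothetical.

```python
from collections import Counter

def observed_heterozygosity(genotypes):
    """Fraction of individuals carrying two different alleles at the locus."""
    return sum(a != b for a, b in genotypes) / len(genotypes)

def expected_heterozygosity(genotypes):
    """1 - sum of squared allele frequencies (no small-sample correction)."""
    alleles = [allele for pair in genotypes for allele in pair]
    total = len(alleles)
    return 1.0 - sum((n / total) ** 2 for n in Counter(alleles).values())

# Hypothetical microsatellite genotypes (allele sizes in base pairs).
locus = [(150, 150), (150, 154), (154, 158), (150, 158)]
h_obs = observed_heterozygosity(locus)
h_exp = expected_heterozygosity(locus)
```

Averaging these per-locus values over all loci gives the kind of mean HO and HE figures quoted in the abstract; the published study may have used a bias-corrected estimator instead.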
Reefgenomics.Org - a repository for marine genomics data.

Science.gov (United States)

Liew, Yi Jin; Aranda, Manuel; Voolstra, Christian R

2016-01-01

Over the last decade, technological advancements have substantially decreased the cost and time of obtaining large amounts of sequencing data. Paired with the exponentially increased computing power, individual labs are now able to sequence genomes or transcriptomes to investigate biological questions of interest. This has led to a significant increase in available sequence data. Although the bulk of data published in articles are stored in public sequence databases, very often, only raw sequencing data are available; miscellaneous data such as assembled transcriptomes, genome annotations etc. are not easily obtainable through the same means. Here, we introduce our website (http://reefgenomics.org) that aims to centralize genomic and transcriptomic data from marine organisms. Besides providing convenient means to download sequences, we provide (where applicable) a genome browser to explore available genomic features, and a BLAST interface to search through the hosted sequences. Through the interface, multiple datasets can be queried simultaneously, allowing for the retrieval of matching sequences from organisms of interest. The minimalistic, no-frills interface reduces visual clutter, making it convenient for end-users to search and explore processed sequence data. DATABASE URL: http://reefgenomics.org. © The Author(s) 2016. Published by Oxford University Press.
BrassicaTED - a public database for utilization of miniature transposable elements in Brassica species.

Science.gov (United States)

Murukarthick, Jayakodi; Sampath, Perumal; Lee, Sang Choon; Choi, Beom-Soon; Senthil, Natesan; Liu, Shengyi; Yang, Tae-Jin

2014-06-20

MITE, TRIM and SINEs are miniature form transposable elements (mTEs) that are ubiquitous and dispersed throughout entire plant genomes. Tens of thousands of members cause insertion polymorphism at both the inter- and intra- species level. Therefore, mTEs are valuable targets and resources for development of markers that can be utilized for breeding, genetic diversity and genome evolution studies. Taking advantage of the completely sequenced genomes of Brassica rapa and B. oleracea, characterization of mTEs and building a curated database are prerequisite to extending their utilization for genomics and applied fields in Brassica crops. We have developed BrassicaTED as a unique web portal containing detailed characterization information for mTEs of Brassica species. At present, BrassicaTED has datasets for 41 mTE families, including 5894 and 6026 members from 20 MITE families, 1393 and 1639 members from 5 TRIM families, 1270 and 2364 members from 16 SINE families in B. rapa and B. oleracea, respectively. BrassicaTED offers different sections to browse structural and positional characteristics for every mTE family. In addition, we have added data on 289 MITE insertion polymorphisms from a survey of seven Brassica relatives. Genes with internal mTE insertions are shown with detailed gene annotation and microarray-based comparative gene expression data in comparison with their paralogs in the triplicated B. rapa genome. This database also includes a novel tool, K BLAST (Karyotype BLAST), for clear visualization of the locations for each member in the B. rapa and B. oleracea pseudo-genome sequences. BrassicaTED is a newly developed database of information regarding the characteristics and potential utility of mTEs including MITE, TRIM and SINEs in B. rapa and B. oleracea. The database will promote the development of desirable mTE-based markers, which can be utilized for genomics and breeding in Brassica species. BrassicaTED will be a valuable repository for scientists
HEROD: a human ethnic and regional specific omics database.

Science.gov (United States)

Zeng, Xian; Tao, Lin; Zhang, Peng; Qin, Chu; Chen, Shangying; He, Weidong; Tan, Ying; Xia Liu, Hong; Yang, Sheng Yong; Chen, Zhe; Jiang, Yu Yang; Chen, Yu Zong

2017-10-15

Genetic and gene expression variations within and between populations and across geographical regions have substantial effects on the biological phenotypes, diseases, and therapeutic response. The development of precision medicines can be facilitated by the OMICS studies of the patients of specific ethnicity and geographic region. However, there is an inadequate facility for broadly and conveniently accessing the ethnic and regional specific OMICS data. Here, we introduced a new free database, HEROD, a human ethnic and regional specific OMICS database. Its first version contains the gene expression data of 53 070 patients of 169 diseases in seven ethnic populations from 193 cities/regions in 49 nations curated from the Gene Expression Omnibus (GEO), the ArrayExpress Archive of Functional Genomics Data (ArrayExpress), the Cancer Genome Atlas (TCGA) and the International Cancer Genome Consortium (ICGC). Geographic region information of curated patients was mainly manually extracted from referenced publications of each original study. These data can be accessed and downloaded via keyword search, World map search, and menu-bar search of disease name, the international classification of disease code, geographical region, location of sample collection, ethnic population, gender, age, sample source organ, patient type (patient or healthy), sample type (disease or normal tissue) and assay type on the web interface. The HEROD database is freely accessible at http://bidd2.nus.edu.sg/herod/index.php. The database and web interface are implemented in MySQL, PHP and HTML with all major browsers supported. Contact: phacyz@nus.edu.sg. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com
cuticleDB: a relational database of Arthropod cuticular proteins

Directory of Open Access Journals (Sweden)

Willis Judith H

2004-09-01

Background: The insect exoskeleton or cuticle is a bi-partite composite of proteins and chitin that provides protective, skeletal and structural functions. Little information is available about the molecular structure of this important complex that exhibits a helicoidal architecture. Scores of sequences of cuticular proteins have been obtained from direct protein sequencing, from cDNAs, and from genomic analyses. Most of these cuticular protein sequences contain motifs found only in arthropod proteins. Description: cuticleDB is a relational database containing all structural proteins of Arthropod cuticle identified to date. Many come from direct sequencing of proteins isolated from cuticle and from sequences from cDNAs that share common features with these authentic cuticular proteins. It also includes proteins from the Drosophila melanogaster and the Anopheles gambiae genomes that have been predicted to be cuticular proteins, based on a Pfam motif (PF00379) responsible for chitin binding in Arthropod cuticle. The total number of the database entries is 445: 370 derive from insects, 60 from Crustacea and 15 from Chelicerata. The database can be accessed from our web server at http://bioinformatics.biol.uoa.gr/cuticleDB. Conclusions: CuticleDB was primarily designed to contain correct and full annotation of cuticular protein data. The database will be of help to future genome annotators. Users will be able to test hypotheses for the existence of known and also of yet unknown motifs in cuticular proteins. An analysis of motifs may contribute to understanding how proteins contribute to the physical properties of cuticle as well as to the precise nature of their interaction with chitin.
A comparative cellular and molecular biology of longevity database.

Science.gov (United States)

Stuart, Jeffrey A; Liang, Ping; Luo, Xuemei; Page, Melissa M; Gallagher, Emily J; Christoff, Casey A; Robb, Ellen L

2013-10-01

Discovering key cellular and molecular traits that promote longevity is a major goal of aging and longevity research. One experimental strategy is to determine which traits have been selected during the evolution of longevity in naturally long-lived animal species. This comparative approach has been applied to lifespan research for nearly four decades, yielding hundreds of datasets describing aspects of cell and molecular biology hypothesized to relate to animal longevity. Here, we introduce a Comparative Cellular and Molecular Biology of Longevity Database, available at http://genomics.brocku.ca/ccmbl/, as a compendium of comparative cell and molecular data presented in the context of longevity. This open access database will facilitate the meta-analysis of amalgamated datasets using standardized maximum lifespan (MLSP) data (from AnAge). The first edition contains over 800 data records describing experimental measurements of cellular stress resistance, reactive oxygen species metabolism, membrane composition, protein homeostasis, and genome homeostasis as they relate to vertebrate species MLSP. The purpose of this review is to introduce the database and briefly demonstrate its use in the meta-analysis of combined datasets.
DistiLD Database

DEFF Research Database (Denmark)

Palleja, Albert; Horn, Heiko; Eliasson, Sabrina

2012-01-01

Genome-wide association studies (GWAS) have identified thousands of single nucleotide polymorphisms (SNPs) associated with the risk of hundreds of diseases. However, there is currently no database that enables non-specialists to answer the following simple questions: which SNPs associated...... with diseases are in linkage disequilibrium (LD) with a gene of interest? Which chromosomal regions have been associated with a given disease, and which are the potentially causal genes in each region? To answer these questions, we use data from the HapMap Project to partition each chromosome into so-called LD...... blocks, so that SNPs in LD with each other are preferentially in the same block, whereas SNPs not in LD are in different blocks. By projecting SNPs and genes onto LD blocks, the DistiLD database aims to increase usage of existing GWAS results by making it easy to query and visualize disease...

Microbial genome analysis: the COG approach.

Science.gov (United States)

Galperin, Michael Y; Kristensen, David M; Makarova, Kira S; Wolf, Yuri I; Koonin, Eugene V

2017-09-14

For the past 20 years, the Clusters of Orthologous Genes (COG) database has been a popular tool for microbial genome annotation and comparative genomics. Initially created for the purpose of evolutionary classification of protein families, the COGs have been used, apart from straightforward functional annotation of sequenced genomes, for such tasks as (i) unification of genome annotation in groups of related organisms; (ii) identification of missing and/or undetected genes in complete microbial genomes; (iii) analysis of genomic neighborhoods, in many cases allowing prediction of novel functional systems; (iv) analysis of metabolic pathways and prediction of alternative forms of enzymes; (v) comparison of organisms by COG functional categories; and (vi) prioritization of targets for structural and functional characterization. Here we review the principles of the COG approach and discuss its key advantages and drawbacks in microbial genome analysis. Published by Oxford University Press 2017. This work is written by US Government employees and is in the public domain in the US.
Genome analysis methods - PGDBj Registered plant list, Marker list, QTL list, Plant DB link & Genome analysis methods | LSDB Archive [Life Science Database Archive metadata

Lifescience Database Archive (English)

PGDBj Registered plant list, Marker list, QTL list, Plant DB link & Genome analysis methods Genome analysis... methods Data detail Data name Genome analysis methods DOI 10.18908/lsdba.nbdc01194-01-005 De...scription of data contents The current status and related information of the genomic analysis about each org...anism (March, 2014). In the case of organisms carried out genomic analysis, the d...e File name: pgdbj_dna_marker_linkage_map_genome_analysis_methods_en.zip File URL: ftp://ftp.biosciencedbc.j
High-efficient, bicolor-emitting GdVO₄:Dy³⁺ phosphor under near ultraviolet excitation

International Nuclear Information System (INIS)

Lu, Jinjin; Zhou, Jia; Jia, Huayu; Tian, Yue

2015-01-01

Bicolor-emitting GdVO₄:Dy³⁺ phosphor with a short columnar shape was prepared via a simple co-precipitation process. The optimal doping concentration for obtaining maximal luminescent intensity was confirmed to be 0.3 mol%, and the electric dipole–dipole interaction is responsible for concentration quenching of Dy³⁺ emission in GdVO₄ phosphor. In order to evaluate the luminescent performance of the as-prepared phosphor, the luminescent efficiency and color coordinates were studied. The results show that the luminescent efficiency of this phosphor is very high under near UV excitation and twice as high as that of commercial Y₂O₂S:Eu³⁺ phosphor. In addition, the color coordinates for the optimal Dy³⁺ concentration are (0.339, 0.379), which are close to the equal-energy point. Therefore, the GdVO₄:Dy³⁺ phosphor may have potential application for solid state lighting.
Applying Shannon's information theory to bacterial and phage genomes and metagenomes

Science.gov (United States)

Akhter, Sajia; Bailey, Barbara A.; Salamon, Peter; Aziz, Ramy K.; Edwards, Robert A.

2013-01-01

All sequence data contain inherent information that can be measured by Shannon's uncertainty theory. Such measurement is valuable in evaluating large data sets, such as metagenomic libraries, to prioritize their analysis and annotation, thus saving computational resources. Here, Shannon's index of complete phage and bacterial genomes was examined. The information content of a genome was found to be highly dependent on the genome length, GC content, and sequence word size. In metagenomic sequences, the amount of information correlated with the number of matches found by comparison to sequence databases. A sequence with more information (higher uncertainty) has a higher probability of being significantly similar to other sequences in the database. Measuring uncertainty may be used for rapid screening for sequences with matches in available databases, prioritizing computational resources, and indicating which sequences with no known similarities are likely to be important for more detailed analysis.
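The abstract does not give the exact formula or normalization the authors used. As a minimal sketch, assuming the standard Shannon uncertainty H = -Σ p log2 p over the frequencies of overlapping words of a chosen size, the index of a sequence could be computed like this (the function name `shannon_index` is hypothetical):

```python
import math
from collections import Counter

def shannon_index(seq, word_size=1):
    """Shannon uncertainty (bits) of the overlapping words of length word_size."""
    n_words = len(seq) - word_size + 1
    if n_words <= 0:
        raise ValueError("sequence shorter than word size")
    counts = Counter(seq[i:i + word_size] for i in range(n_words))
    # H = -sum(p * log2(p)) over the observed word frequencies p.
    return -sum((c / n_words) * math.log2(c / n_words) for c in counts.values())
```

Because the word counts depend on both sequence length and `word_size`, this sketch also illustrates why the abstract reports that the measured information content varies with genome length and word size.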
Data Mining Supercomputing with SAS JMP® Genomics

Directory of Open Access Journals (Sweden)

Richard S. Segall

2011-02-01

JMP® Genomics is statistical discovery software that can uncover meaningful patterns in high-throughput genomics and proteomics data. JMP® Genomics is designed for biologists, biostatisticians, statistical geneticists, and those engaged in analyzing the vast stores of data that are common in genomic research (SAS, 2009). Data mining was performed using JMP® Genomics on the two collections of microarray databases available from the National Center for Biotechnology Information (NCBI) for lung cancer and breast cancer. The Gene Expression Omnibus (GEO) of NCBI serves as a public repository for a wide range of high-throughput experimental data, including the two collections of lung cancer and breast cancer that were used for this research. The results of applying data mining using JMP® Genomics software are shown in this paper with numerous screen shots.
The whole genome sequences and experimentally phased haplotypes of over 100 personal genomes.

Science.gov (United States)

Mao, Qing; Ciotlos, Serban; Zhang, Rebecca Yu; Ball, Madeleine P; Chin, Robert; Carnevali, Paolo; Barua, Nina; Nguyen, Staci; Agarwal, Misha R; Clegg, Tom; Connelly, Abram; Vandewege, Ward; Zaranek, Alexander Wait; Estep, Preston W; Church, George M; Drmanac, Radoje; Peters, Brock A

2016-10-11

Since the completion of the Human Genome Project in 2003, it is estimated that more than 200,000 individual whole human genomes have been sequenced. A stunning accomplishment in such a short period of time. However, most of these were sequenced without experimental haplotype data and are therefore missing an important aspect of genome biology. In addition, much of the genomic data is not available to the public and lacks phenotypic information. As part of the Personal Genome Project, blood samples from 184 participants were collected and processed using Complete Genomics' Long Fragment Read technology. Here, we present the experimental whole genome haplotyping and sequencing of these samples to an average read coverage depth of 100X. This is approximately three-fold higher than the read coverage applied to most whole human genome assemblies and ensures the highest quality results. Currently, 114 genomes from this dataset are freely available in the GigaDB repository and are associated with rich phenotypic data; the remaining 70 should be added in the near future as they are approved through the PGP data release process. For reproducibility analyses, 20 genomes were sequenced at least twice using independent LFR barcoded libraries. Seven genomes were also sequenced using Complete Genomics' standard non-barcoded library process. In addition, we report 2.6 million high-quality, rare variants not previously identified in the Single Nucleotide Polymorphisms database or the 1000 Genomes Project Phase 3 data. These genomes represent a unique source of haplotype and phenotype data for the scientific community and should help to expand our understanding of human genome evolution and function.
Complete Mitochondrial Genomes of the Cherskii’s Sculpin and Siberian Taimen Reveal GenBank Entry Errors: Incorrect Species Identification and Recombinant Mitochondrial Genome

Directory of Open Access Journals (Sweden)

Evgeniy S Balakirev

2017-08-01

The complete mitochondrial (mt) genome is sequenced in 2 individuals of the Cherskii’s sculpin Cottus czerskii. A surprisingly high level of sequence divergence (10.3%) has been detected between the 2 genomes of C. czerskii studied here and the GenBank mt genome of C. czerskii (KJ956027). At the same time, a surprisingly low level of divergence (1.4%) has been detected between the GenBank C. czerskii (KJ956027) and the Amur sculpin Cottus szanaga (KX762049, KX762050). We argue that the observed discrepancies are due to incorrect taxonomic identification, so that the GenBank accession number KJ956027 actually represents the mt genome of C. szanaga erroneously identified as C. czerskii. Our results are of consequence concerning the GenBank database quality, highlighting the potential negative consequences of entry errors, which once they are introduced tend to be propagated among databases and subsequent publications. We illustrate the premise with the data on the recombinant mt genome of the Siberian taimen Hucho taimen (NCBI Reference Sequence Database NC_016426.1; GenBank accession number HQ897271.1), bearing 2 introgressed fragments (≈0.9 kb [kilobase]) from 2 lenok subspecies, Brachymystax lenok and Brachymystax lenok tsinlingensis, submitted to GenBank on June 12, 2011. Since the time of submission, the H. taimen recombinant mt genome leading to incorrect phylogenetic inferences was propagated in multiple subsequent publications despite the fact that nonrecombinant H. taimen genomes were also available (submitted to GenBank on August 2, 2014; KJ711549, KJ711550). Other examples of recombinant sequences persisting in GenBank are also considered. A GenBank Entry Error Depositary is urgently needed to monitor and avoid a progressive accumulation of wrong biological information.
Detection and Toxicity Evaluation of Pyrrolizidine Alkaloids in Medicinal Plants Gynura bicolor and Gynura divaricata Collected from Different Chinese Locations.

Science.gov (United States)

Chen, Jian; Lü, Han; Fang, Lian-Xiang; Li, Wei-Lin; Verschaeve, Luc; Wang, Zheng-Tao; De Kimpe, Norbert; Mangelinckx, Sven

2017-02-01

Two edible plants in Southeast Asia, Gynura bicolor and G. divaricata, are not only known to be nutritive but also useful as medicinal herbs. Previous phytochemical investigation of Gynura species showed the presence of hepatotoxic pyrrolizidine alkaloids (PAs), indicating the toxic risk of using these two plants. The present study was designed to analyze the distribution of PA components and to make a preliminary evaluation of the toxicity of these two Gynura species. Eight samples of G. bicolor and G. divaricata from five different Chinese locations were collected and their specific PAs were qualitatively characterized by applying a UPLC/MS/MS spectrometry method. Using a pre-column derivatization HPLC method, the total retronecine ester-type PAs in their alkaloid extracts were quantitatively estimated as well. Finally, their genotoxicity was investigated with an effective high-throughput screening method referred to as the Vitotox™ test and their potential cytotoxicity was tested on HepG2 cells. It was found that different types of PAs were widely present in Gynura species collected from the south of China. Among them, no significant genotoxic effects were detected with serial concentrations through the present in vitro assay. However, the cytotoxicity assay of Gynura plants collected from Jiangsu displayed weak activity at the concentration of 100 mg/ml. It is important to note that this research validates in part the indication that the use of Gynura species requires caution. © 2017 Wiley-VHCA AG, Zurich, Switzerland.
The Banana Genome Hub

Science.gov (United States)

Droc, Gaëtan; Larivière, Delphine; Guignon, Valentin; Yahiaoui, Nabila; This, Dominique; Garsmeur, Olivier; Dereeper, Alexis; Hamelin, Chantal; Argout, Xavier; Dufayard, Jean-François; Lengelle, Juliette; Baurens, Franc-Christophe; Cenci, Alberto; Pitollat, Bertrand; D’Hont, Angélique; Ruiz, Manuel; Rouard, Mathieu; Bocs, Stéphanie

2013-01-01

Banana is one of the world’s favorite fruits and one of the most important crops for developing countries. The banana reference genome sequence (Musa acuminata) was recently released. Given the taxonomic position of Musa, the completed genomic sequence has particular comparative value to provide fresh insights about the evolution of the monocotyledons. The study of the banana genome has been enhanced by a number of tools and resources that allow harnessing its sequence. First, we set up essential tools such as a Community Annotation System, phylogenomics resources and metabolic pathways. Then, to support post-genomic efforts, we improved existing banana systems (e.g. web front end, query builder), we integrated available Musa data into generic systems (e.g. markers and genetic maps, synteny blocks), we made other existing systems containing Musa data (e.g. transcriptomics, rice reference genome, workflow manager) interoperable with the banana hub and finally, we generated new results from sequence analyses (e.g. SNP and polymorphism analysis). Several use cases illustrate how the Banana Genome Hub can be used to study gene families. Overall, with this collaborative effort, we discuss the importance of the interoperability toward data integration between existing information systems. Database URL: http://banana-genome.cirad.fr/ PMID:23707967
Heterologous expression and characterization of a laccase from Laccaria bicolor in Pichia pastoris

Directory of Open Access Journals (Sweden)

Bo Wang

2016-01-01

Synthetic dyes are known to be highly toxic to mammalian cells and mutagenic and carcinogenic to humans and, therefore, should be detoxified and removed from industrial effluents. Different approaches for removal and detoxification are extensively sought. Biochemical methods are considered the most economical and effective methods of dye decolourization. In this research, the laccase gene from Laccaria bicolor was modified and expressed in Pichia pastoris. The properties of the recombinant laccase and its ability to degrade synthetic dyes were studied. The laccase activity was optimal at pH 2.2 and 50 °C. Its Km value was 0.187 mmol/L for ABTS [2,2'-azino-bis(3-ethylbenzthiazoline-6-sulphonic acid)]. The laccase obtained was shown to decolorize the synthetic dyes malachite green, crystal violet and orange G, with ABTS as a mediator. These results indicated that the laccase obtained may be used to treat industrial effluents containing artificial dyes.
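To make the reported Km concrete, the following is a minimal sketch that assumes the recombinant laccase follows simple Michaelis-Menten kinetics for ABTS (the abstract reports a Km but does not state the kinetic model or a Vmax). The function name and the example concentrations are illustrative only.

```python
# Reported in the abstract: Km for ABTS, in mmol/L.
KM_ABTS_MMOL_PER_L = 0.187


def fraction_of_vmax(substrate_mmol_per_l: float, km: float = KM_ABTS_MMOL_PER_L) -> float:
    """Return v / Vmax = [S] / (Km + [S]) under the Michaelis-Menten assumption."""
    if substrate_mmol_per_l < 0:
        raise ValueError("substrate concentration must be non-negative")
    return substrate_mmol_per_l / (km + substrate_mmol_per_l)


if __name__ == "__main__":
    # Hypothetical ABTS concentrations, chosen only to show the saturation curve.
    for s in (0.05, 0.187, 0.5, 2.0):
        print(f"[ABTS] = {s:5.3f} mmol/L -> v/Vmax = {fraction_of_vmax(s):.3f}")
```

Under this assumption the enzyme runs at half its maximal rate when the ABTS concentration equals Km, and a roughly tenfold excess over Km brings it above 90% of Vmax.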
Citrus sinensis annotation project (CAP): a comprehensive database for sweet orange genome.

Science.gov (United States)

Wang, Jia; Chen, Dijun; Lei, Yang; Chang, Ji-Wei; Hao, Bao-Hai; Xing, Feng; Li, Sen; Xu, Qiang; Deng, Xiu-Xin; Chen, Ling-Ling

2014-01-01

Citrus is one of the most important and widely grown fruit crops, with global production ranking first among all the fruit crops in the world. Sweet orange accounts for more than half of the Citrus production both in fresh fruit and processed juice. We have sequenced the draft genome of a double-haploid sweet orange (C. sinensis cv. Valencia), and constructed the Citrus sinensis annotation project (CAP) to store and visualize the sequenced genomic and transcriptome data. CAP provides GBrowse-based organization of sweet orange genomic data, which integrates ab initio gene prediction, EST, RNA-seq and RNA-paired end tag (RNA-PET) evidence-based gene annotation. Furthermore, we provide a user-friendly web interface to show the predicted protein-protein interactions (PPIs) and metabolic pathways in sweet orange. CAP provides comprehensive information beneficial to the researchers of sweet orange and other woody plants, and is freely available at http://citrus.hzau.edu.cn/.
An Ontology-Based GIS for Genomic Data Management of Rumen Microbes.

Science.gov (United States)

Jelokhani-Niaraki, Saber; Tahmoorespur, Mojtaba; Minuchehr, Zarrin; Nassiri, Mohammad Reza

2015-03-01

During recent years, there has been exponential growth in biological information. With the emergence of large datasets in biology, life scientists are encountering bottlenecks in handling the biological data. This study presents an integrated geographic information system (GIS)-ontology application for handling microbial genome data. The application uses a linear referencing technique as one of the GIS functionalities to represent genes as linear events on the genome layer, where users can define/change the attributes of genes in an event table and interactively see the gene events on a genome layer. Our application adopted ontology to portray and store genomic data in a semantic framework, which facilitates data-sharing among biology domains, applications, and experts. The application was developed in two steps. In the first step, the genome annotated data were prepared and stored in a MySQL database. The second step involved the connection of the database to both ArcGIS and Protégé as the GIS engine and ontology platform, respectively. We have designed this application specifically to manage the genome-annotated data of rumen microbial populations. Such a GIS-ontology application offers powerful capabilities for visualizing, managing, reusing, sharing, and querying genome-related data.
dEMBF: A Comprehensive Database of Enzymes of Microalgal Biofuel Feedstock.

Science.gov (United States)

Misra, Namrata; Panda, Prasanna Kumar; Parida, Bikram Kumar; Mishra, Barada Kanta

2016-01-01

Microalgae have attracted wide attention as one of the most versatile renewable feedstocks for production of biofuel. To develop genetically engineered high lipid yielding algal strains, a thorough understanding of the lipid biosynthetic pathway and the underpinning enzymes is essential. In this work, we have systematically mined the genomes of fifteen diverse algal species belonging to Chlorophyta, Heterokontophyta, Rhodophyta, and Haptophyta, to identify and annotate the putative enzymes of lipid metabolic pathway. Consequently, we have also developed a database, dEMBF (Database of Enzymes of Microalgal Biofuel Feedstock), which catalogues the complete list of identified enzymes along with their computed annotation details including length, hydrophobicity, amino acid composition, subcellular location, gene ontology, KEGG pathway, orthologous group, Pfam domain, intron-exon organization, transmembrane topology, and secondary/tertiary structural data. Furthermore, to facilitate functional and evolutionary study of these enzymes, a collection of built-in applications for BLAST search, motif identification, sequence and phylogenetic analysis have been seamlessly integrated into the database. dEMBF is the first database that brings together all enzymes responsible for lipid synthesis from available algal genomes, and provides an integrative platform for enzyme inquiry and analysis. This database will be extremely useful for algal biofuel research. It can be accessed at http://bbprof.immt.res.in/embf.
The STRING database in 2017

DEFF Research Database (Denmark)

Szklarczyk, Damian; Morris, John H; Cook, Helen

2017-01-01

A system-wide understanding of cellular function requires knowledge of all functional interactions between the expressed proteins. The STRING database aims to collect and integrate this information, by consolidating known and predicted protein-protein association data for a large number of organisms. The associations in STRING include direct (physical) interactions, as well as indirect (functional) interactions, as long as both are specific and biologically meaningful. Apart from collecting and reassessing available experimental data on protein-protein interactions, and importing known pathways and protein complexes from curated databases, interaction predictions are derived from the following sources: (i) systematic co-expression analysis, (ii) detection of shared selective signals across genomes, (iii) automated text-mining of the scientific literature and (iv) computational transfer...
FGF: A web tool for Fishing Gene Family in a whole genome database

DEFF Research Database (Denmark)

Zheng, Hongkun; Shi, Junjie; Fang, Xiaodong

2007-01-01

Gene duplication is an important process in evolution. The availability of genome sequences of a number of organisms has made it possible to conduct comprehensive searches for duplicated genes enabling informative studies of their evolution. We have established the FGF (Fishing Gene Family) progr...... is freely available on a web server at http://fgf.genomics.org.cn/...
Investigation of genome sequences within the family Pasteurellaceae

DEFF Research Database (Denmark)

Angen, Øystein; Ussery, David

Introduction The bacterial genome sequences are now available for an increasing number of strains within the family Pasteurellaceae. At present, 24 Pasteurellaceae genomes are publicly available through internet databases, and another 40 genomes are being sequenced. This investigation will describe...... the core genome for both the family Pasteurellaceae and for the species Haemophilus influenzae. Methods Twenty genome sequences from the following species were included: Haemophilus influenzae (11 strains), Haemophilus ducreyi (1 strain), Histophilus somni (2 strains), Haemophilus parasuis (1 strain......), Actinobacillus pleuropneumoniae (2 strains), Actinobacillus succinogenes (1 strain), Mannheimia succiniciproducens (1 strain), and Pasteurella multocida (1 strain). The predicted proteins for each genome were BLASTed against each other, and a set of conserved core gene families was determined as described...
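The core-genome idea in this record can be sketched as a set intersection. This is a simplified illustration, not the authors' procedure (the abstract is truncated before the method is described): it assumes each predicted protein has already been assigned to a gene family, for example by clustering all-against-all BLAST hits. The genome labels and family identifiers below are hypothetical.

```python
from typing import Dict, Iterable, Set


def core_families(genome_to_families: Dict[str, Iterable[str]]) -> Set[str]:
    """Return the gene families that are present in every genome."""
    family_sets = [set(families) for families in genome_to_families.values()]
    if not family_sets:
        return set()
    return set.intersection(*family_sets)


if __name__ == "__main__":
    # Hypothetical family assignments for three genomes (toy data).
    toy = {
        "genome_A": ["fam1", "fam2", "fam3", "fam7"],
        "genome_B": ["fam1", "fam3", "fam4"],
        "genome_C": ["fam1", "fam3", "fam5", "fam7"],
    }
    print(sorted(core_families(toy)))
```

In practice the result depends heavily on how families are defined (identity and coverage cut-offs for the BLAST hits), which this sketch deliberately leaves out.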
Sequencing intractable DNA to close microbial genomes.

Directory of Open Access Journals (Sweden)

Richard A Hurt

Advancement in high throughput DNA sequencing technologies has supported a rapid proliferation of microbial genome sequencing projects, providing the genetic blueprint for in-depth studies. Oftentimes, difficult to sequence regions in microbial genomes are ruled "intractable" resulting in a growing number of genomes with sequence gaps deposited in databases. A procedure was developed to sequence such problematic regions in the "non-contiguous finished" Desulfovibrio desulfuricans ND132 genome (6 intractable gaps) and the Desulfovibrio africanus genome (1 intractable gap). The polynucleotides surrounding each gap formed GC rich secondary structures making the regions refractory to amplification and sequencing. Strand-displacing DNA polymerases used in concert with a novel ramped PCR extension cycle supported amplification and closure of all gap regions in both genomes. The developed procedures support accurate gene annotation, and provide a step-wise method that reduces the effort required for genome finishing.
Sequencing Intractable DNA to Close Microbial Genomes

Energy Technology Data Exchange (ETDEWEB)

Hurt, Jr., Richard Ashley [ORNL; Brown, Steven D [ORNL; Podar, Mircea [ORNL; Palumbo, Anthony Vito [ORNL; Elias, Dwayne A [ORNL

2012-01-01

Advancement in high throughput DNA sequencing technologies has supported a rapid proliferation of microbial genome sequencing projects, providing the genetic blueprint for in-depth studies. Oftentimes, difficult to sequence regions in microbial genomes are ruled intractable resulting in a growing number of genomes with sequence gaps deposited in databases. A procedure was developed to sequence such difficult regions in the non-contiguous finished Desulfovibrio desulfuricans ND132 genome (6 intractable gaps) and the Desulfovibrio africanus genome (1 intractable gap). The polynucleotides surrounding each gap formed GC rich secondary structures making the regions refractory to amplification and sequencing. Strand-displacing DNA polymerases used in concert with a novel ramped PCR extension cycle supported amplification and closure of all gap regions in both genomes. These developed procedures support accurate gene annotation, and provide a step-wise method that reduces the effort required for genome finishing.
Genome Variation Map: a data repository of genome variations in BIG Data Center.

Science.gov (United States)

Song, Shuhui; Tian, Dongmei; Li, Cuiping; Tang, Bixia; Dong, Lili; Xiao, Jingfa; Bao, Yiming; Zhao, Wenming; He, Hang; Zhang, Zhang

2018-01-04

The Genome Variation Map (GVM; http://bigd.big.ac.cn/gvm/) is a public data repository of genome variations. As a core resource in the BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, GVM is dedicated to collecting, integrating and visualizing genome variations for a wide range of species, accepts submissions of different types of genome variations from all over the world and provides free open access to all publicly available data in support of worldwide research activities. Unlike existing related databases, GVM features integration of a large number of genome variations for a broad diversity of species including human, cultivated plants and domesticated animals. Specifically, the current implementation of GVM not only houses a total of ∼4.9 billion variants for 19 species including chicken, dog, goat, human, poplar, rice and tomato, but also incorporates 8669 individual genotypes and 13 262 manually curated high-quality genotype-to-phenotype associations for non-human species. In addition, GVM provides friendly, intuitive web interfaces for data submission, browsing, search and visualization. Collectively, GVM serves as an important resource for archiving genomic variation data, helpful for better understanding population genetic diversity and deciphering complex mechanisms associated with different phenotypes. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
Genome Variation Map: a data repository of genome variations in BIG Data Center

Science.gov (United States)

Tian, Dongmei; Li, Cuiping; Tang, Bixia; Dong, Lili; Xiao, Jingfa; Bao, Yiming; Zhao, Wenming; He, Hang

2018-01-01

The Genome Variation Map (GVM; http://bigd.big.ac.cn/gvm/) is a public data repository of genome variations. As a core resource in the BIG Data Center, Beijing Institute of Genomics, Chinese Academy of Sciences, GVM is dedicated to collecting, integrating and visualizing genome variations for a wide range of species, accepts submissions of different types of genome variations from all over the world and provides free open access to all publicly available data in support of worldwide research activities. Unlike existing related databases, GVM features integration of a large number of genome variations for a broad diversity of species including human, cultivated plants and domesticated animals. Specifically, the current implementation of GVM not only houses a total of ∼4.9 billion variants for 19 species including chicken, dog, goat, human, poplar, rice and tomato, but also incorporates 8669 individual genotypes and 13 262 manually curated high-quality genotype-to-phenotype associations for non-human species. In addition, GVM provides friendly, intuitive web interfaces for data submission, browsing, search and visualization. Collectively, GVM serves as an important resource for archiving genomic variation data, helpful for better understanding population genetic diversity and deciphering complex mechanisms associated with different phenotypes. PMID: 29069473

Clusters of orthologous genes for 41 archaeal genomes and implications for evolutionary genomics of archaea

Directory of Open Access Journals (Sweden)

Wolf Yuri I

2007-11-01

Background: An evolutionary classification of genes from sequenced genomes that distinguishes between orthologs and paralogs is indispensable for genome annotation and evolutionary reconstruction. Shortly after multiple genome sequences of bacteria, archaea, and unicellular eukaryotes became available, an attempt at such a classification was implemented in Clusters of Orthologous Groups of proteins (COGs). Rapid accumulation of genome sequences creates opportunities for refining COGs but also represents a challenge because of error amplification. One of the practical strategies involves construction of refined COGs for phylogenetically compact subsets of genomes. Results: New Archaeal Clusters of Orthologous Genes (arCOGs) were constructed for 41 archaeal genomes (13 Crenarchaeota, 27 Euryarchaeota and one Nanoarchaeon) using an improved procedure that employs a similarity tree between smaller, group-specific clusters, semi-automatically partitions orthology domains in multidomain proteins, and uses profile searches for identification of remote orthologs. The annotation of arCOGs is a consensus between three assignments based on the COGs, the CDD database, and the annotations of homologs in the NR database. The 7538 arCOGs, on average, cover ~88% of the genes in a genome compared to a ~76% coverage in COGs. The finer granularity of ortholog identification in the arCOGs is apparent from the fact that 4538 arCOGs correspond to 2362 COGs; ~40% of the arCOGs are new. The archaeal gene core (protein-coding genes found in all 41 genomes) consists of 166 arCOGs. The arCOGs were used to reconstruct gene loss and gene gain events during archaeal evolution and gene sets of ancestral forms. The Last Archaeal Common Ancestor (LACA) is conservatively estimated to possess 996 genes compared to 1245 and 1335 genes for the last common ancestors of Crenarchaeota and Euryarchaeota, respectively. It is inferred that LACA was a chemoautotrophic hyperthermophile
PrionScan: an online database of predicted prion domains in complete proteomes.

Science.gov (United States)

Espinosa Angarica, Vladimir; Angulo, Alfonso; Giner, Arturo; Losilla, Guillermo; Ventura, Salvador; Sancho, Javier

2014-02-05

Prions are a particular type of amyloid related to a large variety of important processes in cells, but also responsible for serious diseases in mammals and humans. The number of experimentally characterized prions is still low and corresponds to a handful of examples in microorganisms and mammals. Prion aggregation is mediated by specific protein domains with a remarkable compositional bias towards glutamine/asparagine and against charged residues and prolines. These compositional features have been used to predict new prion proteins in the genomes of different organisms. Despite these efforts, there are only a few available data sources containing prion predictions at a genomic scale. Here we present PrionScan, a new database of predicted prion-like domains in complete proteomes. We have previously developed a predictive methodology to identify and score prionogenic stretches in protein sequences. In the present work, we exploit this approach to scan all the protein sequences in public databases and compile a repository containing relevant information on proteins bearing prion-like domains. The database is updated regularly alongside UniprotKB and in its present version contains approximately 28000 predictions in proteins from different functional categories in more than 3200 organisms from all the taxonomic subdivisions. PrionScan can be used in two different ways: database query and analysis of protein sequences submitted by the users. In the first mode, simple queries retrieve a detailed description of the properties of a defined protein. Queries can also be combined to generate more complex and specific search patterns. In the second mode, users can submit and analyze their own sequences. It is expected that this database will provide relevant insights into prion functions and regulation from a genome-wide perspective, allowing researchers to perform cross-species prion biology studies. Our database might also be useful for guiding experimentalists
Cyclebase 3.0: a multi-organism database on cell-cycle regulation and phenotypes

DEFF Research Database (Denmark)

Santos Delgado, Alberto; Wernersson, Rasmus; Jensen, Lars Juhl

2015-01-01

3.0, we have updated the content of the database to reflect changes to genome annotation, added new mRNA and protein expression data, and integrated cell-cycle phenotype information from high-content screens and model-organism databases. The new version of Cyclebase also features a new web interface...
CyanOmics: an integrated database of omics for the model cyanobacterium Synechococcus sp. PCC 7002.

Science.gov (United States)

Yang, Yaohua; Feng, Jie; Li, Tao; Ge, Feng; Zhao, Jindong

2015-01-01

Cyanobacteria are an important group of organisms that carry out oxygenic photosynthesis and play vital roles in both the carbon and nitrogen cycles of the Earth. The annotated genome of Synechococcus sp. PCC 7002, an ideal model cyanobacterium, is available. A series of transcriptomic and proteomic studies of Synechococcus sp. PCC 7002 cells grown under different conditions have been reported. However, no database of such integrated omics studies has been constructed. Here we present CyanOmics, a database based on the results of Synechococcus sp. PCC 7002 omics studies. CyanOmics comprises one genomic dataset, 29 transcriptomic datasets and one proteomic dataset and should prove useful for systematic and comprehensive analysis of all those data. Powerful browsing and searching tools are integrated to help users directly access information of interest with enhanced visualization of the analytical results. Furthermore, Blast is included for sequence-based similarity searching, and Cluster 3.0 and the R hclust function are provided for cluster analyses, to increase CyanOmics's usefulness. To the best of our knowledge, it is the first integrated omics analysis database for cyanobacteria. This database should further understanding of the transcriptional patterns and proteomic profiling of Synechococcus sp. PCC 7002 and other cyanobacteria. Additionally, the entire database framework is applicable to any sequenced prokaryotic genome and could be applied to other integrated omics analysis projects. Database URL: http://lag.ihb.ac.cn/cyanomics. © The Author(s) 2015. Published by Oxford University Press.
Database Description - PGDBj Registered plant list, Marker list, QTL list, Plant DB link & Genome analysis methods | LSDB Archive [Life Science Database Archive metadata]

Lifescience Database Archive (English)

PGDBj Registered plant list, Marker list, QTL list, Plant DB link & Genome analysis methods ... Alternative name - DOI 10.18908/lsdba.nbdc01194-01-000 Cr...ers and QTLs are curated manually from the published literature. The marker information includes marker sequences, genotyping methods...
Quality scores for 32,000 genomes

DEFF Research Database (Denmark)

Land, Miriam L.; Hyatt, Doug; Jun, Se-Ran

2014-01-01

Background More than 80% of the microbial genomes in GenBank are of ‘draft’ quality (12,553 draft vs. 2,679 finished, as of October, 2013). We have examined all the microbial DNA sequences available for complete, draft, and Sequence Read Archive genomes in GenBank as well as three other major...... public databases, and assigned quality scores for more than 30,000 prokaryotic genome sequences. Results Scores were assigned using four categories: the completeness of the assembly, the presence of full-length rRNA genes, tRNA composition and the presence of a set of 102 conserved genes in prokaryotes....... Most (~88%) of the genomes had quality scores of 0.8 or better and can be safely used for standard comparative genomics analysis. We compared genomes across factors that may influence the score. We found that although sequencing depth coverage of over 100x did not ensure a better score, sequencing read...
Molecular signatures database (MSigDB) 3.0.

Science.gov (United States)

Liberzon, Arthur; Subramanian, Aravind; Pinchback, Reid; Thorvaldsdóttir, Helga; Tamayo, Pablo; Mesirov, Jill P

2011-06-15

Well-annotated gene sets representing the universe of the biological processes are critical for meaningful and insightful interpretation of large-scale genomic data. The Molecular Signatures Database (MSigDB) is one of the most widely used repositories of such sets. We report the availability of a new version of the database, MSigDB 3.0, with over 6700 gene sets, a complete revision of the collection of canonical pathways and experimental signatures from publications, enhanced annotations and upgrades to the web site. MSigDB is freely available for non-commercial use at http://www.broadinstitute.org/msigdb.
A reference methylome database and analysis pipeline to facilitate integrative and comparative epigenomics.

Directory of Open Access Journals (Sweden)

Qiang Song

DNA methylation is implicated in a surprising diversity of regulatory, evolutionary processes and diseases in eukaryotes. The introduction of whole-genome bisulfite sequencing has enabled the study of DNA methylation at a single-base resolution, revealing many new aspects of DNA methylation and highlighting the usefulness of methylome data in understanding a variety of genomic phenomena. As the number of publicly available whole-genome bisulfite sequencing studies reaches into the hundreds, reliable and convenient tools for comparing and analyzing methylomes become increasingly important. We present MethPipe, a pipeline for both low and high-level methylome analysis, and MethBase, an accompanying database of annotated methylomes from the public domain. Together these resources enable researchers to extract interesting features from methylomes and compare them with those identified in public methylomes in our database.
HOLLYWOOD: a comparative relational database of alternative splicing.

Science.gov (United States)

Holste, Dirk; Huo, George; Tung, Vivian; Burge, Christopher B

2006-01-01

RNA splicing is an essential step in gene expression, and is often variable, giving rise to multiple alternatively spliced mRNA and protein isoforms from a single gene locus. The design of effective databases to support experimental and computational investigations of alternative splicing (AS) is a significant challenge. In an effort to integrate accurate exon and splice site annotation with current knowledge about splicing regulatory elements and predicted AS events, and to link information about the splicing of orthologous genes in different species, we have developed the Hollywood system. This database was built upon genomic annotation of splicing patterns of known genes derived from spliced alignment of complementary DNAs (cDNAs) and expressed sequence tags, and links features such as splice site sequence and strength, exonic splicing enhancers and silencers, conserved and non-conserved patterns of splicing, and cDNA library information for inferred alternative exons. Hollywood was implemented as a relational database and currently contains comprehensive information for human and mouse. It is accompanied by a web query tool that allows searches for sets of exons with specific splicing characteristics or splicing regulatory element composition, or gives a graphical or sequence-level summary of splicing patterns for a specific gene. A streamlined graphical representation of gene splicing patterns is provided, and these patterns can alternatively be layered onto existing information in the UCSC Genome Browser. The database is accessible at http://hollywood.mit.edu.
ProteinWorldDB: querying radical pairwise alignments among protein sets from complete genomes.

Science.gov (United States)

Otto, Thomas Dan; Catanho, Marcos; Tristão, Cristian; Bezerra, Márcia; Fernandes, Renan Mathias; Elias, Guilherme Steinberger; Scaglia, Alexandre Capeletto; Bovermann, Bill; Berstis, Viktors; Lifschitz, Sergio; de Miranda, Antonio Basílio; Degrave, Wim

2010-03-01

Many analyses in modern biological research are based on comparisons between biological sequences, resulting in functional, evolutionary and structural inferences. When large numbers of sequences are compared, heuristics are often used resulting in a certain lack of accuracy. In order to improve and validate results of such comparisons, we have performed radical all-against-all comparisons of 4 million protein sequences belonging to the RefSeq database, using an implementation of the Smith-Waterman algorithm. This extremely intensive computational approach was made possible with the help of World Community Grid, through the Genome Comparison Project. The resulting database, ProteinWorldDB, which contains coordinates of pairwise protein alignments and their respective scores, is now made available. Users can download, compare and analyze the results, filtered by genomes, protein functions or clusters. ProteinWorldDB is integrated with annotations derived from Swiss-Prot, Pfam, KEGG, NCBI Taxonomy database and gene ontology. The database is a unique and valuable asset, representing a major effort to create a reliable and consistent dataset of cross-comparisons of the whole protein content encoded in hundreds of completely sequenced genomes using a rigorous dynamic programming approach. The database can be accessed through http://proteinworlddb.org
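The Smith-Waterman algorithm named in this abstract can be sketched in a few lines. This is a minimal illustration of the dynamic-programming recurrence only; the scoring values (match, mismatch, linear gap penalty) are assumptions chosen for readability, whereas protein comparisons of the kind described would normally use a substitution matrix and affine gap penalties, which the abstract does not specify.

```python
def smith_waterman_score(a: str, b: str, match: int = 2,
                         mismatch: int = -1, gap: int = -2) -> int:
    """Return the best local-alignment score between sequences a and b.

    Uses the recurrence H[i][j] = max(0, diagonal + substitution score,
    up + gap, left + gap), keeping only two rows of the matrix in memory.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                0,                             # start a new local alignment
                prev[j - 1] + substitution,    # align a[i-1] with b[j-1]
                prev[j] + gap,                 # gap in b
                curr[j - 1] + gap,             # gap in a
            )
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    # Toy sequences; the shared substring "ACG" drives the score.
    print(smith_waterman_score("TTACGTTT", "GGACGGG"))
```

The cost is proportional to the product of the two sequence lengths, which is why an exhaustive all-against-all comparison of millions of sequences calls for grid-scale computing rather than the heuristics commonly used at that scale.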
Efficient privacy-preserving string search and an application in genomics.

Science.gov (United States)

Shimizu, Kana; Nuida, Koji; Rätsch, Gunnar

2016-06-01

Personal genomes carry inherent privacy risks and protecting privacy poses major social and technological challenges. We consider the case where a user searches for genetic information (e.g. an allele) on a server that stores a large genomic database and aims to receive allele-associated information. The user would like to keep the query and result private and the server the database. We propose a novel approach that combines efficient string data structures such as the Burrows-Wheeler transform with cryptographic techniques based on additive homomorphic encryption. We assume that the sequence data is searchable in efficient iterative query operations over a large indexed dictionary, for instance, from large genome collections and employing the (positional) Burrows-Wheeler transform. We use a technique called oblivious transfer that is based on additive homomorphic encryption to conceal the sequence query and the genomic region of interest in positional queries. We designed and implemented an efficient algorithm for searching sequences of SNPs in large genome databases. During search, the user can only identify the longest match while the server does not learn which sequence of SNPs the user queried. In an experiment based on 2184 aligned haploid genomes from the 1000 Genomes Project, our algorithm was able to perform typical queries within [Formula: see text] 4.6 s and [Formula: see text] 10.8 s for client and server side, respectively, on laptop computers. The presented algorithm is at least one order of magnitude faster than an exhaustive baseline algorithm. Availability and implementation: https://github.com/iskana/PBWT-sec and https://github.com/ratschlab/PBWT-sec. Contact: shimizu-kana@aist.go.jp or Gunnar.Ratsch@ratschlab.org. Supplementary information: Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press.
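The iterative query operations over a Burrows-Wheeler index that this abstract relies on can be illustrated with plain backward search. This sketch covers only the unencrypted string-matching core on a single text: it does not implement the positional BWT, oblivious transfer, or homomorphic encryption described in the abstract, and it is not taken from the PBWT-sec code. The function names and the toy text are assumptions for illustration.

```python
def bwt_index(text: str):
    """Build a simple BWT index for text (a '$' terminator is appended).

    Returns (C, occ, n) where C[c] counts characters smaller than c,
    occ[c][i] counts occurrences of c in the first i BWT characters,
    and n is the length of the terminated text.
    """
    text += "$"
    # Naive suffix array construction; adequate for a short illustration.
    suffix_array = sorted(range(len(text)), key=lambda i: text[i:])
    bwt = "".join(text[i - 1] for i in suffix_array)
    alphabet = sorted(set(text))
    C, total = {}, 0
    for c in alphabet:
        C[c] = total
        total += text.count(c)
    occ = {c: [0] * (len(bwt) + 1) for c in alphabet}
    for i, ch in enumerate(bwt):
        for c in alphabet:
            occ[c][i + 1] = occ[c][i] + (1 if c == ch else 0)
    return C, occ, len(bwt)


def count_occurrences(pattern: str, C, occ, n: int) -> int:
    """Count matches of pattern by backward search, one character per step."""
    lo, hi = 0, n  # half-open interval of suffix-array rows
    for ch in reversed(pattern):
        if ch not in C:
            return 0
        lo = C[ch] + occ[ch][lo]
        hi = C[ch] + occ[ch][hi]
        if lo >= hi:
            return 0
    return hi - lo


if __name__ == "__main__":
    C, occ, n = bwt_index("GATTACATTA")  # toy text
    print(count_occurrences("TTA", C, occ, n))
```

Each step of the loop needs only two table lookups determined by the current query character, which is the kind of iterative query the abstract proposes to hide from the server with oblivious transfer.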
CrusView: a Java-based visualization platform for comparative genomics analyses in Brassicaceae species.

Science.gov (United States)

Chen, Hao; Wang, Xiangfeng

2013-09-01

In plants and animals, chromosomal breakage and fusion events based on conserved syntenic genomic blocks lead to conserved patterns of karyotype evolution among species of the same family. However, karyotype information has not been well utilized in genomic comparison studies. We present CrusView, a Java-based bioinformatic application utilizing Standard Widget Toolkit/Swing graphics libraries and a SQLite database for performing visualized analyses of comparative genomics data in Brassicaceae (crucifer) plants. Compared with similar software and databases, one of the unique features of CrusView is its integration of karyotype information when comparing two genomes. This feature allows users to perform karyotype-based genome assembly and karyotype-assisted genome synteny analyses with preset karyotype patterns of the Brassicaceae genomes. Additionally, CrusView is a local program, which gives its users high flexibility when analyzing unpublished genomes and allows users to upload self-defined genomic information so that they can visually study the associations between genome structural variations and genetic elements, including chromosomal rearrangements, genomic macrosynteny, gene families, high-frequency recombination sites, and tandem and segmental duplications between related species. This tool will greatly facilitate karyotype, chromosome, and genome evolution studies using visualized comparative genomics approaches in Brassicaceae species. CrusView is freely available at http://www.cmbb.arizona.edu/CrusView/.
Molecular markers associated with aluminium tolerance in Sorghum bicolor.

Science.gov (United States)

Too, Emily Jepkosgei; Onkware, Augustino Osoro; Were, Beatrice Ang'iyo; Gudu, Samuel; Carlsson, Anders; Geleta, Mulatu

2018-01-01

Sorghum (Sorghum bicolor (L.) Moench) production in many agro-ecologies is constrained by a variety of stresses, including high levels of aluminium (Al) commonly found in acid soils. Therefore, for such soils, growing Al tolerant cultivars is imperative for high productivity. In this study, molecular markers associated with Al tolerance were identified using a mapping population developed by crossing two contrasting genotypes for this trait. Four SSR (Xtxp34, Sb5_236, Sb6_34, and Sb6_342), one STS (CTG29_3b) and three ISSR (811_1400, 835_200 and 884_200) markers produced alleles that showed significant association with Al tolerance. CTG29_3b, 811_1400, Xtxp34 and Sb5_236 are located on chromosome 3, with the first two markers located close to AltSB, a locus that underlies the Al tolerance gene (SbMATE), implying that their association with Al tolerance is due to their linkage to this gene. Although CTG29_3b and 811_1400 are located closer to AltSB, Xtxp34 and Sb5_236 explained higher phenotypic variance of Al tolerance indices. Markers 835_200, 884_200, Sb6_34 and Sb6_342 are located on different chromosomes, which implies the presence of several genes involved in Al tolerance in addition to SbMATE in sorghum. These molecular markers have a high potential for use in breeding for Al tolerance in sorghum.
FARE-CAFE: a database of functional and regulatory elements of cancer-associated fusion events.

Science.gov (United States)

Korla, Praveen Kumar; Cheng, Jack; Huang, Chien-Hung; Tsai, Jeffrey J P; Liu, Yu-Hsuan; Kurubanjerdjit, Nilubon; Hsieh, Wen-Tsong; Chen, Huey-Yi; Ng, Ka-Lok

2015-01-01

Chromosomal translocation (CT) is of enormous clinical interest because this disorder is associated with various major solid tumors and leukemia. A tumor-specific fusion gene event may occur when a translocation joins two separate genes. Currently, various CT databases provide information about fusion genes and their genomic elements. However, no database of the roles of fusion genes, in terms of essential functional and regulatory elements in oncogenesis, is available. FARE-CAFE is a unique combination of CTs, fusion proteins, protein domains, domain-domain interactions, protein-protein interactions, transcription factors and microRNAs, with subsequent experimental information, which cannot be found in any other CT database. Genomic DNA information including, for example, manually collected exact locations of the first and second break points, sequences and karyotypes of fusion genes are included. FARE-CAFE will substantially facilitate the cancer biologist's mission of elucidating the pathogenesis of various types of cancer. This database will ultimately help to develop 'novel' therapeutic approaches. Database URL: http://ppi.bioinfo.asia.edu.tw/FARE-CAFE. © The Author(s) 2015. Published by Oxford University Press.
Laying the Foundation for a Genomic Rosetta Stone: Creating Information Hubs through the Use of Consensus Identifiers

Energy Technology Data Exchange (ETDEWEB)

Van Brabant, Bart; Kyrpides, Nikos; Glockner, Frank Oliver; Gray, Tanya; Field, Dawn; De Vos, Paul; De Baets, Bernard; Dawyndt, Peter

2007-05-01

This paper presents a holistic approach that illustrates how the semantic hurdle for integration of biological databases might be overcome when mapping sources that provide information on individual genes and complete genomes to sources that provide information on the biological resources from which these sequences were derived, and vice versa. In particular, we will explain how each of the completed and ongoing whole-genome sequencing projects in the Genomes OnLine Database and each of the ribosomal RNA sequences in the SILVA ribosomal RNA database have been persistently cross-referenced with the StrainInfo.net bioportal, serving both a genome-centric and an organism-centric view of the life on our blue planet as one more stepping stone towards the establishment of fully integrated and flexible biological information networks.
Genomic Resource and Genome Guided Comparison of Twenty Type Strains of the Genus Methylobacterium

Directory of Open Access Journals (Sweden)

Vasvi Chaudhry

2017-12-01

Bacteria of the genus Methylobacterium are widespread in diverse habitats, including soil, water and plants (phyllosphere, rhizosphere and endosphere). In the present study, we generated an in-house genomic data resource of six type strains and combined it with fourteen database genomes of the genus Methylobacterium to carry out phylogenomic, taxonomic, comparative and ecological studies of this genus. Overall, the genus shows high diversity and genetic variation, primarily due to its ability to acquire genetic material from diverse sources through horizontal gene transfer. The majority of species identified in this study are plant-associated, with genomes equipped with methylotrophy- and photosynthesis-related genes along with genes for plant probiotic traits. Most of the species' genomes are equipped with genes for adaptation to and defense against UV radiation, oxidative stress and desiccation. The genus has an open pan-genome, and we predicted the role of gain/loss of prophages and CRISPR elements in diversity and evolution. Our genomic resource, with annotation and analysis, provides a platform for interspecies genomic comparisons in the genus Methylobacterium, for unravelling their natural genome diversity, and for studying how natural selection shapes their genomes through the adaptive mechanisms that allow them to adopt diverse habitat lifestyles. These type-strain genomic data display the power of next-generation sequencing in rapidly creating resources, paving the way for studies on phylogeny and taxonomy as well as for basic and applied research on this important genus.
MIPS: analysis and annotation of genome information in 2007.

Science.gov (United States)

Mewes, H W; Dietmann, S; Frishman, D; Gregory, R; Mannhaupt, G; Mayer, K F X; Münsterkötter, M; Ruepp, A; Spannagl, M; Stümpflen, V; Rattei, T

2008-01-01

The Munich Information Center for Protein Sequences (MIPS-GSF, Neuherberg, Germany) combines automatic processing of large amounts of sequences with manual annotation of selected model genomes. Due to the massive growth of the available data, the depth of annotation varies widely between independent databases. Also, the criteria for the transfer of information from known to orthologous sequences are diverse. Coping with the task of global in-depth genome annotation has become unfeasible. Therefore, our efforts are dedicated to three levels of annotation: (i) the curation of selected genomes, in particular from fungal and plant taxa (e.g. CYGD, MNCDB, MatDB), (ii) the comprehensive, consistent, automatic annotation employing exhaustive methods for the computation of sequence similarities and sequence-related attributes as well as the classification of individual sequences (SIMAP, PEDANT and FunCat) and (iii) the compilation of manually curated databases for protein interactions based on scrutinized information from the literature to serve as an accepted set of reliable annotated interaction data (MPACT, MPPI, CORUM). All databases and tools described as well as the detailed descriptions of our projects can be accessed through the MIPS web server (http://mips.gsf.de).
High-density rhesus macaque oligonucleotide microarray design using early-stage rhesus genome sequence information and human genome annotations

Directory of Open Access Journals (Sweden)

Magness Charles L

2007-01-01

Background: Until recently, few genomic reagents specific for non-human primate research have been available. To address this need, we have constructed a macaque-specific high-density oligonucleotide microarray by using highly fragmented low-pass sequence contigs from the rhesus genome project together with the detailed sequence and exon structure of the human genome. Using this method, we designed oligonucleotide probes to over 17,000 distinct rhesus/human gene orthologs and increased by four-fold the number of available genes relative to our first-generation expressed sequence tag (EST)-derived array. Results: We constructed a database containing 248,000 exon sequences from 23,000 human RefSeq genes and compared each human exon with its best matching sequence in the January 2005 version of the rhesus genome project list of 486,000 DNA contigs. Best matching rhesus exon sequences for each of the 23,000 human genes were then concatenated in the proper order and orientation to produce a rhesus "virtual transcriptome." Microarray probes were designed, one per gene, to the region closest to the 3' untranslated region (UTR) of each rhesus virtual transcript. Each probe was compared to a composite rhesus/human transcript database to test for cross-hybridization potential, yielding a final probe set representing 18,296 rhesus/human gene orthologs, including transcript variants, and over 17,000 distinct genes. We hybridized mRNA from rhesus brain and spleen to both the EST- and genome-derived microarrays. Besides four-fold greater gene coverage, the genome-derived array also showed greater mean signal intensities for genes present on both arrays. Genome-derived probes showed 99.4% identity when compared to 4,767 rhesus GenBank sequence tag site (STS) sequences, indicating that early stage low-pass versions of complex genomes are of sufficient quality to yield valuable functional genomic information when combined with finished genome information from
Open TG-GATEs: a large-scale toxicogenomics database

Science.gov (United States)

Igarashi, Yoshinobu; Nakatsu, Noriyuki; Yamashita, Tomoya; Ono, Atsushi; Ohno, Yasuo; Urushidani, Tetsuro; Yamada, Hiroshi

2015-01-01

Toxicogenomics focuses on assessing the safety of compounds using gene expression profiles. Gene expression signatures from large toxicogenomics databases are expected to perform better than small databases in identifying biomarkers for the prediction and evaluation of drug safety based on a compound's toxicological mechanisms in animal target organs. Over the past 10 years, the Japanese Toxicogenomics Project consortium (TGP) has been developing a large-scale toxicogenomics database consisting of data from 170 compounds (mostly drugs) with the aim of improving and enhancing drug safety assessment. Most of the data generated by the project (e.g. gene expression, pathology, lot number) are freely available to the public via Open TG-GATEs (Toxicogenomics Project-Genomics Assisted Toxicity Evaluation System). Here, we provide a comprehensive overview of the database, including both gene expression data and metadata, with a description of experimental conditions and procedures used to generate the database. Open TG-GATEs is available from http://toxico.nibio.go.jp/english/index.html. PMID:25313160
CHOgenome.org 2.0: Genome resources and website updates.

Science.gov (United States)

Kremkow, Benjamin G; Baik, Jong Youn; MacDonald, Madolyn L; Lee, Kelvin H

2015-07-01

Chinese hamster ovary (CHO) cells are a major host cell line for the production of therapeutic proteins, and CHO cell and Chinese hamster (CH) genomes have recently been sequenced using next-generation sequencing methods. CHOgenome.org was launched in 2011 (version 1.0) to serve as a database repository and to provide bioinformatics tools for the CHO community. CHOgenome.org (version 1.0) maintained GenBank CHO-K1 genome data, identified CHO-omics literature, and provided a CHO-specific BLAST service. Recent major updates to CHOgenome.org (version 2.0) include new sequence and annotation databases for both CHO and CH genomes, a more user-friendly website, and new research tools, including a proteome browser and a genome viewer. CHO cell-line specific sequences and annotations facilitate cell line development opportunities, several of which are discussed. Moving forward, CHOgenome.org will host the increasing amount of CHO-omics data and continue to make useful bioinformatics tools available to the CHO community. Copyright © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

Resources for Functional Genomics Studies in Drosophila melanogaster

Science.gov (United States)

Mohr, Stephanie E.; Hu, Yanhui; Kim, Kevin; Housden, Benjamin E.; Perrimon, Norbert

2014-01-01

Drosophila melanogaster has become a system of choice for functional genomic studies. Many resources, including online databases and software tools, are now available to support design or identification of relevant fly stocks and reagents or analysis and mining of existing functional genomic, transcriptomic, proteomic, etc. datasets. These include large community collections of fly stocks and plasmid clones, “meta” information sites like FlyBase and FlyMine, and an increasing number of more specialized reagents, databases, and online tools. Here, we introduce key resources useful to plan large-scale functional genomics studies in Drosophila and to analyze, integrate, and mine the results of those studies in ways that facilitate identification of highest-confidence results and generation of new hypotheses. We also discuss ways in which existing resources can be used and might be improved and suggest a few areas of future development that would further support large- and small-scale studies in Drosophila and facilitate use of Drosophila information by the research community more generally. PMID:24653003
Genomics of pear and other Rosaceae fruit trees.

Science.gov (United States)

Yamamoto, Toshiya; Terakami, Shingo

2016-01-01

The family Rosaceae includes many economically important fruit trees, such as pear, apple, peach, cherry, quince, apricot, plum, raspberry, and loquat. Over the past few years, whole-genome sequences have been released for Chinese pear, European pear, apple, peach, Japanese apricot, and strawberry. These sequences help us to conduct functional and comparative genomics studies and to develop new cultivars with desirable traits by marker-assisted selection in breeding programs. These genomics resources also allow identification of evolutionary relationships in Rosaceae, development of genome-wide SNP and SSR markers, and construction of reference genetic linkage maps, which are available through the Genome Database for the Rosaceae website. Here, we review the recent advances in genomics studies and their practical applications for Rosaceae fruit trees, particularly pear, apple, peach, and cherry.
Genome-wide identification of specific oligonucleotides using artificial neural network and computational genomic analysis

Directory of Open Access Journals (Sweden)

Chen Jiun-Ching

2007-05-01

Background: Genome-wide identification of specific oligonucleotides (oligos) is a computationally-intensive task and is a requirement for designing microarray probes, primers, and siRNAs. An artificial neural network (ANN) is a machine learning technique that can effectively process complex and high noise data. Here, ANNs are applied to process the unique subsequence distribution for prediction of specific oligos. Results: We present a novel and efficient algorithm, named the integration of ANN and BLAST (IAB) algorithm, to identify specific oligos. We establish the unique marker database for human and rat gene index databases using the hash table algorithm. We then create the input vectors, via the unique marker database, to train and test the ANN. The trained ANN predicted the specific oligos with high efficiency, and these oligos were subsequently verified by BLAST. To improve the prediction performance, the ANN over-fitting issue was avoided by early stopping with the best observed error and a k-fold validation was also applied. The IAB algorithm was about 5.2, 7.1, and 6.7 times faster than the BLAST search without ANN for experimental results of 70-mer, 50-mer, and 25-mer specific oligos, respectively. In addition, the results of polymerase chain reactions showed that the primers predicted by the IAB algorithm could specifically amplify the corresponding genes. The IAB algorithm has been integrated into a previously published comprehensive web server to support microarray analysis and genome-wide iterative enrichment analysis, through which users can identify a group of desired genes and then discover the specific oligos of these genes. Conclusion: The IAB algorithm has been developed to construct SpecificDB, a web server that provides a specific and valid oligo database of the probe, siRNA, and primer design for the human genome. We also demonstrate the ability of the IAB algorithm to predict specific oligos through
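The hash-table step mentioned in this abstract can be pictured with a short Python sketch. It assumes that a "unique marker" means a fixed-length subsequence (k-mer) found in exactly one gene of the index; the abstract does not define the term, so that reading, the function name `unique_markers`, and the parameter `k` are all assumptions. The ANN training and BLAST verification stages are not shown.

```python
from collections import defaultdict

def unique_markers(genes: dict[str, str], k: int) -> dict[str, set[str]]:
    """Map each gene ID to the k-mers that occur in that gene and in no other."""
    owners = defaultdict(set)  # hash table: k-mer -> IDs of genes containing it
    for gene_id, seq in genes.items():
        for i in range(len(seq) - k + 1):
            owners[seq[i:i + k]].add(gene_id)

    result = {gene_id: set() for gene_id in genes}
    for kmer, ids in owners.items():
        if len(ids) == 1:  # seen in a single gene only, so it is a unique marker
            (gene_id,) = ids
            result[gene_id].add(kmer)
    return result
```

For two toy sequences `{"g1": "ACGTA", "g2": "CGTAC"}` with `k=3`, the shared 3-mers `CGT` and `GTA` are discarded and each gene keeps one marker of its own.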
Utilizing linkage disequilibrium information from Indian Genome ...

Indian Academy of Sciences (India)

Using LD information derived from Indian Genome Variation database (IGVdb) on populations .... Line diagram represents the SNPs selected in Indian (upper panel) and CEPH .... out procedure for extracting DNA from human nucleated cells.
Database resources for the tuberculosis community.

Science.gov (United States)

Lew, Jocelyne M; Mao, Chunhong; Shukla, Maulik; Warren, Andrew; Will, Rebecca; Kuznetsov, Dmitry; Xenarios, Ioannis; Robertson, Brian D; Gordon, Stephen V; Schnappinger, Dirk; Cole, Stewart T; Sobral, Bruno

2013-01-01

Access to online repositories for genomic and associated "-omics" datasets is now an essential part of everyday research activity. It is important therefore that the Tuberculosis community is aware of the databases and tools available to them online, as well as for the database hosts to know what the needs of the research community are. One of the goals of the Tuberculosis Annotation Jamboree, held in Washington DC on March 7th-8th 2012, was therefore to provide an overview of the current status of three key Tuberculosis resources, TubercuList (tuberculist.epfl.ch), TB Database (www.tbdb.org), and Pathosystems Resource Integration Center (PATRIC, www.patricbrc.org). Here we summarize some key updates and upcoming features in TubercuList, and provide an overview of the PATRIC site and its online tools for pathogen RNA-Seq analysis. Copyright © 2012 Elsevier Ltd. All rights reserved.
Automated genome mining of ribosomal peptide natural products

Energy Technology Data Exchange (ETDEWEB)

Mohimani, Hosein; Kersten, Roland; Liu, Wei; Wang, Mingxun; Purvine, Samuel O.; Wu, Si; Brewer, Heather M.; Pasa-Tolic, Ljiljana; Bandeira, Nuno; Moore, Bradley S.; Pevzner, Pavel A.; Dorrestein, Pieter C.

2014-07-31

Ribosomally synthesized and posttranslationally modified peptides (RiPPs), especially from microbial sources, are a large group of bioactive natural products that are a promising source of new (bio)chemistry and bioactivity (1). In light of exponentially increasing microbial genome databases and improved mass spectrometry (MS)-based metabolomic platforms, there is a need for computational tools that connect natural product genotypes predicted from microbial genome sequences with their corresponding chemotypes from metabolomic datasets. Here, we introduce RiPPquest, a tandem mass spectrometry database search tool for identification of microbial RiPPs and apply it for lanthipeptide discovery. RiPPquest uses genomics to limit search space to the vicinity of RiPP biosynthetic genes and proteomics to analyze extensive peptide modifications and compute p-values of peptide-spectrum matches (PSMs). We highlight RiPPquest by connection of multiple RiPPs from extracts of Streptomyces to their gene clusters and by the discovery of a new class III lanthipeptide, informatipeptin, from Streptomyces viridochromogenes DSM 40736 as the first natural product to be identified in an automated fashion by genome mining. The presented tool is available at cy-clo.ucsd.edu.
Effect of Agromorphological Diversity and Botanical Race on Biochemical Composition in Sweet Grains Sorghum [Sorghum bicolor (L.) Moench] of Burkina Faso

Directory of Open Access Journals (Sweden)

Nerbéwendé Sawadogo

2017-05-01

Sorghum bicolor (L.) Moench is an under-harvested crop in Burkina Faso. It is grown mainly for its sweet grains in the pasty stage. However, the precocity of the cycle and the sweet grains at the pasty stage make it an interesting plant with agro-alimentary potential during the lean season. This study was carried out to identify the main sugars responsible for the sweetness of the grains at the pasty stage and their variation according to the agro-morphological group and the botanical race. Thus, the grains harvested at the pasty stage of fifteen (15) accessions selected according to the agro-morphological group and botanical race were lyophilized and analyzed by High Performance Liquid Chromatography (HPLC). The results reveal the presence of four (4) main carbohydrates at the pasty stage of the grains: fructose, glucose, sucrose and starch. Analysis of variance revealed that these carbohydrates discriminate significantly between the agro-morphological groups and the botanical races. Moreover, with the exception of sucrose, the coefficient of determination (R²) values show that the agro-morphological group factor has a greater effect on the expression of glucose, fructose and starch than the botanical race. Group III and the caudatum race have the highest levels of fructose and would be the sweetest, while group IV and the guinea-bicolor race, with low fructose values, would be the least sweet. Fructose is therefore the main sugar responsible for the sweetness of the pasty grains of sweet grain sorghum.
VariVis: a visualisation toolkit for variation databases

Directory of Open Access Journals (Sweden)

Smith Timothy D

2008-04-01

Background: With the completion of the Human Genome Project and recent advancements in mutation detection technologies, the volume of data available on genetic variations has risen considerably. These data are stored in online variation databases and provide important clues to the cause of diseases and potential side effects or resistance to drugs. However, the data presentation techniques employed by most of these databases make them difficult to use and understand. Results: Here we present a visualisation toolkit that can be employed by online variation databases to generate graphical models of gene sequence with corresponding variations and their consequences. The VariVis software package can run on any web server capable of executing Perl CGI scripts and can interface with numerous Database Management Systems and "flat-file" data files. VariVis produces two easily understandable graphical depictions of any gene sequence and matches these with variant data. While developed with the goal of improving the utility of human variation databases, the VariVis package can be used in any variation database to enhance utilisation of, and access to, critical information.
Improved bacteriophage genome data is necessary for integrating viral and bacterial ecology.

Science.gov (United States)

Bibby, Kyle

2014-02-01

The recent rise in "omics"-enabled approaches has led to improved understanding in many areas of microbial ecology. However, despite the importance that viruses play in a broad microbial ecology context, viral ecology remains largely not integrated into high-throughput microbial ecology studies. A fundamental hindrance to the integration of viral ecology into omics-enabled microbial ecology studies is the lack of suitable reference bacteriophage genomes in reference databases; currently, only 0.001% of bacteriophage diversity is represented in genome sequence databases. This commentary serves to highlight this issue and to promote bacteriophage genome sequencing as a valuable scientific undertaking to both better understand bacteriophage diversity and move towards a more holistic view of microbial ecology.
An Ontology-Based GIS for Genomic Data Management of Rumen Microbes

Directory of Open Access Journals (Sweden)

Saber Jelokhani-Niaraki

2015-03-01

During recent years, there has been exponential growth in biological information. With the emergence of large datasets in biology, life scientists are encountering bottlenecks in handling the biological data. This study presents an integrated geographic information system (GIS)-ontology application for handling microbial genome data. The application uses a linear referencing technique as one of the GIS functionalities to represent genes as linear events on the genome layer, where users can define/change the attributes of genes in an event table and interactively see the gene events on a genome layer. Our application adopted ontology to portray and store genomic data in a semantic framework, which facilitates data-sharing among biology domains, applications, and experts. The application was developed in two steps. In the first step, the genome annotated data were prepared and stored in a MySQL database. The second step involved the connection of the database to both ArcGIS and Protégé as the GIS engine and ontology platform, respectively. We have designed this application specifically to manage the genome-annotated data of rumen microbial populations. Such a GIS-ontology application offers powerful capabilities for visualizing, managing, reusing, sharing, and querying genome-related data.
An Ontology-Based GIS for Genomic Data Management of Rumen Microbes

Science.gov (United States)

Jelokhani-Niaraki, Saber; Minuchehr, Zarrin; Nassiri, Mohammad Reza

2015-01-01

During recent years, there has been exponential growth in biological information. With the emergence of large datasets in biology, life scientists are encountering bottlenecks in handling the biological data. This study presents an integrated geographic information system (GIS)-ontology application for handling microbial genome data. The application uses a linear referencing technique as one of the GIS functionalities to represent genes as linear events on the genome layer, where users can define/change the attributes of genes in an event table and interactively see the gene events on a genome layer. Our application adopted ontology to portray and store genomic data in a semantic framework, which facilitates data-sharing among biology domains, applications, and experts. The application was developed in two steps. In the first step, the genome annotated data were prepared and stored in a MySQL database. The second step involved the connection of the database to both ArcGIS and Protégé as the GIS engine and ontology platform, respectively. We have designed this application specifically to manage the genome-annotated data of rumen microbial populations. Such a GIS-ontology application offers powerful capabilities for visualizing, managing, reusing, sharing, and querying genome-related data. PMID:25873847
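The idea of storing genes as linear events in an event table can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the standard-library `sqlite3` module stands in for the MySQL database named in the abstract, no ArcGIS or Protégé interface is used, and the table name, column names, gene names and coordinates are invented placeholders.

```python
import sqlite3

def build_event_table(conn, events):
    """Create a gene event table: each row is a linear event on a genome 'route'."""
    conn.execute(
        "CREATE TABLE gene_events ("
        "route_id TEXT, gene TEXT, from_measure INTEGER, to_measure INTEGER)"
    )
    conn.executemany("INSERT INTO gene_events VALUES (?, ?, ?, ?)", events)

def genes_in_window(conn, route_id, start, end):
    """Return genes whose event interval overlaps [start, end] on the given route."""
    rows = conn.execute(
        "SELECT gene FROM gene_events "
        "WHERE route_id = ? AND from_measure <= ? AND to_measure >= ? "
        "ORDER BY from_measure",
        (route_id, end, start),
    )
    return [r[0] for r in rows]

# Placeholder data: (route, gene, start position, end position)
conn = sqlite3.connect(":memory:")
build_event_table(conn, [
    ("contig1", "geneA", 100, 900),
    ("contig1", "geneB", 1200, 2000),
    ("contig2", "geneC", 50, 400),
])
print(genes_in_window(conn, "contig1", 800, 1300))
```

Keeping position in the event row rather than in a geometry column is what lets a user edit gene attributes in the table and have the genome layer reflect the change.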
A curated database of cyanobacterial strains relevant for modern taxonomy and phylogenetic studies

OpenAIRE

Ramos, Vitor; Morais, João; Vasconcelos, Vitor M.

2017-01-01

The dataset herein described lays the groundwork for an online database of relevant cyanobacterial strains, named CyanoType (http://lege.ciimar.up.pt/cyanotype). It is a database that includes categorized cyanobacterial strains useful for taxonomic, phylogenetic or genomic purposes, with associated information obtained by means of a literature-based curation. The dataset lists 371 strains and represents the first version of the database (CyanoType v.1). Information for each strain includes st...
Zebrafish Database: Customizable, Free, and Open-Source Solution for Facility Management.

Science.gov (United States)

Yakulov, Toma Antonov; Walz, Gerd

2015-12-01

Zebrafish Database is a web-based customizable database solution, which can be easily adapted to serve both single laboratories and facilities housing thousands of zebrafish lines. The database allows the users to keep track of details regarding the various genomic features, zebrafish lines, zebrafish batches, and their respective locations. Advanced search and reporting options are available. Unique features are the ability to upload files and images that are associated with the respective records and an integrated calendar component that supports multiple calendars and categories. Built on the basis of the Joomla content management system, the Zebrafish Database is easily extendable without the need for advanced programming skills.
Towards understanding the first genome sequence of a crenarchaeon by genome annotation using clusters of orthologous groups of proteins (COGs).

Science.gov (United States)

Natale, D A; Shankavaram, U T; Galperin, M Y; Wolf, Y I; Aravind, L; Koonin, E V

2000-01-01

Standard archival sequence databases have not been designed as tools for genome annotation and are far from being optimal for this purpose. We used the database of Clusters of Orthologous Groups of proteins (COGs) to reannotate the genomes of two archaea, Aeropyrum pernix, the first member of the Crenarchaea to be sequenced, and Pyrococcus abyssi. A. pernix and P. abyssi proteins were assigned to COGs using the COGNITOR program; the results were verified on a case-by-case basis and augmented by additional database searches using the PSI-BLAST and TBLASTN programs. Functions were predicted for over 300 proteins from A. pernix, which could not be assigned a function using conventional methods with a conservative sequence similarity threshold, an approximately 50% increase compared to the original annotation. A. pernix shares most of the conserved core of proteins that were previously identified in the Euryarchaeota. Cluster analysis or distance matrix tree construction based on the co-occurrence of genomes in COGs showed that A. pernix forms a distinct group within the archaea, although grouping with the two species of Pyrococci, indicative of similar repertoires of conserved genes, was observed. No indication of a specific relationship between Crenarchaeota and eukaryotes was obtained in these analyses. Several proteins that are conserved in Euryarchaeota and most bacteria are unexpectedly missing in A. pernix, including the entire set of de novo purine biosynthesis enzymes, the GTPase FtsZ (a key component of the bacterial and euryarchaeal cell-division machinery), and the tRNA-specific pseudouridine synthase, previously considered universal. A. pernix is represented in 48 COGs that do not contain any euryarchaeal members. Many of these proteins are TCA cycle and electron transport chain enzymes, reflecting the aerobic lifestyle of A. pernix. Special-purpose databases organized on the basis of phylogenetic analysis and carefully curated with respect to known and
Complete Mitochondrial Genomes of the Cherskii's Sculpin Cottus czerskii and Siberian Taimen Hucho taimen Reveal GenBank Entry Errors: Incorrect Species Identification and Recombinant Mitochondrial Genome.

Science.gov (United States)

Balakirev, Evgeniy S; Saveliev, Pavel A; Ayala, Francisco J

2017-01-01

The complete mitochondrial (mt) genome is sequenced in 2 individuals of the Cherskii's sculpin Cottus czerskii. A surprisingly high level of sequence divergence (10.3%) has been detected between the 2 genomes of C. czerskii studied here and the GenBank mt genome of C. czerskii (KJ956027). At the same time, a surprisingly low level of divergence (1.4%) has been detected between the GenBank C. czerskii (KJ956027) and the Amur sculpin Cottus szanaga (KX762049, KX762050). We argue that the observed discrepancies are due to incorrect taxonomic identification, so that the GenBank accession number KJ956027 actually represents the mt genome of C. szanaga erroneously identified as C. czerskii. Our results are of consequence concerning the GenBank database quality, highlighting the potential negative consequences of entry errors, which once they are introduced tend to be propagated among databases and subsequent publications. We illustrate the premise with the data on recombinant mt genome of the Siberian taimen Hucho taimen (NCBI Reference Sequence Database NC_016426.1; GenBank accession number HQ897271.1), bearing 2 introgressed fragments (≈0.9 kb [kilobase]) from 2 lenok subspecies, Brachymystax lenok and Brachymystax lenok tsinlingensis, submitted to GenBank on June 12, 2011. Since the time of submission, the H. taimen recombinant mt genome leading to incorrect phylogenetic inferences was propagated in multiple subsequent publications despite the fact that nonrecombinant H. taimen genomes were also available (submitted to GenBank on August 2, 2014; KJ711549, KJ711550). Other examples of recombinant sequences persisting in GenBank are also considered. A GenBank Entry Error Depositary is urgently needed to monitor and avoid a progressive accumulation of wrong biological information.
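Divergence percentages of the kind quoted in this abstract are commonly obtained from aligned sequences. The abstract does not say which measure the authors used, so the following Python sketch simply assumes an uncorrected proportion of differing sites (p-distance); the function name `p_distance` and the choice to skip gap and `N` positions are illustrative assumptions.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Positions where either sequence has a gap ('-') or an ambiguous base ('N')
    are skipped rather than counted as differences.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "-N" or b in "-N":
            continue
        compared += 1
        differing += a != b
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared
```

Multiplying the result by 100 gives a percentage; comparing a query genome against several reference accessions in this way is one simple check for a mislabeled database entry.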
BIGSdb: Scalable analysis of bacterial genome variation at the population level

Directory of Open Access Journals (Sweden)

Maiden Martin CJ

2010-12-01

Background: The opportunities for bacterial population genomics that are being realised by the application of parallel nucleotide sequencing require novel bioinformatics platforms. These must be capable of the storage, retrieval, and analysis of linked phenotypic and genotypic information in an accessible, scalable and computationally efficient manner. Results: The Bacterial Isolate Genome Sequence Database (BIGSdb) is a scalable, open source, web-accessible database system that meets these needs, enabling phenotype and sequence data, which can range from a single sequence read to whole genome data, to be efficiently linked for a limitless number of bacterial specimens. The system builds on the widely used mlstdbNet software, developed for the storage and distribution of multilocus sequence typing (MLST) data, and incorporates the capacity to define and identify any number of loci and genetic variants at those loci within the stored nucleotide sequences. These loci can be further organised into 'schemes' for isolate characterisation or for evolutionary or functional analyses. Isolates and loci can be indexed by multiple names and any number of alternative schemes can be accommodated, enabling cross-referencing of different studies and approaches. LIMS functionality of the software enables linkage to and organisation of laboratory samples. The data are easily linked to external databases, and fine-grained authentication of access permits multiple users to participate in community annotation by setting up or contributing to different schemes within the database. Some of the applications of BIGSdb are illustrated with the genera Neisseria and Streptococcus. The BIGSdb source code and documentation are available at http://pubmlst.org/software/database/bigsdb/. Conclusions: Genomic data can be used to characterise bacterial isolates in many different ways but it can also be efficiently exploited for evolutionary or functional studies. BIGSdb
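
The idea of loci with known variants, grouped into 'schemes' that yield a profile for each isolate, can be pictured with a small Python sketch. This is not BIGSdb's actual data model or code: the class names `Locus` and `Scheme`, the exact-substring matching, and the toy sequences are hypothetical and chosen only to illustrate the concept in this record.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Locus:
    """A named locus with its known alleles (sequence -> allele id)."""
    name: str
    alleles: Dict[str, int] = field(default_factory=dict)

    def designate(self, contig: str) -> Optional[int]:
        """Return the id of the first known allele found as an exact
        substring of the contig, or None if no known allele is present."""
        for allele_seq, allele_id in self.alleles.items():
            if allele_seq in contig:
                return allele_id
        return None


@dataclass
class Scheme:
    """A named group of loci used together to characterise an isolate."""
    name: str
    loci: List[Locus]

    def profile(self, contig: str) -> Dict[str, Optional[int]]:
        """Map each locus name to the allele id designated in the contig."""
        return {locus.name: locus.designate(contig) for locus in self.loci}


# Hypothetical loci and sequences, for illustration only.
scheme = Scheme(
    name="toy_scheme",
    loci=[
        Locus("locusA", {"ACGTAC": 1, "ACGTTC": 2}),
        Locus("locusB", {"GGAGGC": 1}),
        Locus("locusC", {"CCCCCC": 1}),
    ],
)
contig = "TTGACGTTCGGAGGCTTAA"
print(scheme.profile(contig))
```

A real system would search both strands, tolerate partial matches, and store designations in a database rather than in memory.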
Minimum information about a single amplified genome (MISAG) and a metagenome-assembled genome (MIMAG) of bacteria and archaea

Energy Technology Data Exchange (ETDEWEB)

Bowers, Robert M.; Kyrpides, Nikos C.; Stepanauskas, Ramunas; Harmon-Smith, Miranda; Doud, Devin; Reddy, T. B. K.; Schulz, Frederik; Jarett, Jessica; Rivers, Adam R.; Eloe-Fadrosh, Emiley A.; Tringe, Susannah G.; Ivanova, Natalia N.; Copeland, Alex; Clum, Alicia; Becraft, Eric D.; Malmstrom, Rex R.; Birren, Bruce; Podar, Mircea; Bork, Peer; Weinstock, George M.; Garrity, George M.; Dodsworth, Jeremy A.; Yooseph, Shibu; Sutton, Granger; Glöckner, Frank O.; Gilbert, Jack A.; Nelson, William C.; Hallam, Steven J.; Jungbluth, Sean P.; Ettema, Thijs J. G.; Tighe, Scott; Konstantinidis, Konstantinos T.; Liu, Wen-Tso; Baker, Brett J.; Rattei, Thomas; Eisen, Jonathan A.; Hedlund, Brian; McMahon, Katherine D.; Fierer, Noah; Knight, Rob; Finn, Rob; Cochrane, Guy; Karsch-Mizrachi, Ilene; Tyson, Gene W.; Rinke, Christian; Kyrpides, Nikos C.; Schriml, Lynn; Garrity, George M.; Hugenholtz, Philip; Sutton, Granger; Yilmaz, Pelin; Meyer, Folker; Glöckner, Frank O.; Gilbert, Jack A.; Knight, Rob; Finn, Rob; Cochrane, Guy; Karsch-Mizrachi, Ilene; Lapidus, Alla; Meyer, Folker; Yilmaz, Pelin; Parks, Donovan H.; Eren, A. M.; Schriml, Lynn; Banfield, Jillian F.; Hugenholtz, Philip; Woyke, Tanja

2017-08-08

The number of genomes from uncultivated microbes will soon surpass the number of isolate genomes in public databases (Hugenholtz, Skarshewski, & Parks, 2016). Technological advancements in high-throughput sequencing and assembly, including single-cell genomics and the computational extraction of genomes from metagenomes (GFMs), are largely responsible. Here we propose community standards for reporting the Minimum Information about a Single-Cell Genome (MIxS-SCG) and Minimum Information about Genomes extracted From Metagenomes (MIxS-GFM) specific for Bacteria and Archaea. The standards have been developed in the context of the International Genomics Standards Consortium (GSC) community (Field et al., 2014) and can be viewed as a supplement to other GSC checklists including the Minimum Information about a Genome Sequence (MIGS), Minimum information about a Metagenomic Sequence(s) (MIMS) (Field et al., 2008) and Minimum Information about a Marker Gene Sequence (MIMARKS) (P. Yilmaz et al., 2011). Community-wide acceptance of MIxS-SCG and MIxS-GFM for Bacteria and Archaea will enable broad comparative analyses of genomes from the majority of taxa that remain uncultivated, improving our understanding of microbial function, ecology, and evolution.
Putative Microsatellite DNA Marker-Based Wheat Genomic Resource for Varietal Improvement and Management

Directory of Open Access Journals (Sweden)

Sarika Jaiswal

2017-11-01

Wheat fulfills 20% of the global caloric requirement. The world will need 60% more wheat for a population of 9 billion by 2050, but climate change with increasing temperature is projected to affect wheat productivity adversely. Trait improvement and management of wheat germplasm require genomic resources. Simple Sequence Repeats (SSRs), being highly polymorphic and ubiquitously distributed in the genome, can be a marker of choice, but there is no structured marker database with options to generate primer pairs for genotyping on a desired chromosome/physical location. Markers previously associated with different wheat traits are also not available in any database. Limitations of in vitro SSR discovery can be overcome by genome-wide in silico mining of SSRs. The Triticum aestivum SSR database (TaSSRDb) is an integrated online database with three-tier architecture, developed using PHP and MySQL and accessible at http://webtom.cabgrid.res.in/wheatssr/. For genotyping, Primer3 standalone code computes primers on user request. Chromosome-wise SSR calling for all three subgenomes, along with a choice of motif types, is provided in addition to primer generation for the desired marker. We report here a database of the highest number of SSRs (476,169) from the complex, hexaploid wheat genome (~17 GB), along with 268 previously reported SSR markers associated with 11 traits. The highest (116.93 SSRs/Mb) and lowest (74.57 SSRs/Mb) SSR densities were found on chromosomes 2D and 3A, respectively. To obtain homozygous loci, e-PCR was performed. Thirty such loci were randomly selected for PCR validation in a panel of 18 wheat Advance Varietal Trial (AVT) lines. TaSSRDb can be a valuable genomic resource tool for linkage mapping, gene/QTL (quantitative trait locus) discovery, diversity analysis, traceability and variety identification. Varietal specific profiling and differentiation can supplement DUS (Distinctiveness, Uniformity, and Stability) testing, EDV (Essentially Derived Variety)/IV (Initial Variety) disputes, seed
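
Genome-wide in silico SSR mining and the SSRs/Mb density figures in this record can be illustrated with a minimal Python sketch. It finds perfect tandem repeats with regular expressions and reports a density per megabase. The minimum repeat counts, the function names, and the toy sequence are assumptions for illustration; they are not the settings or code behind TaSSRDb.

```python
import re
from typing import Dict, List, Tuple


def find_ssrs(sequence: str, min_repeats: Dict[int, int]) -> List[Tuple[int, str, int]]:
    """Return (start, motif, repeat_count) for perfect SSRs.

    min_repeats maps motif length -> minimum number of tandem copies.
    """
    hits = []
    seq = sequence.upper()
    for motif_len, min_rep in sorted(min_repeats.items()):
        # One motif copy followed by at least (min_rep - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are themselves repeats of a shorter unit,
            # e.g. "AA" (reported as mono "A") or "ATAT" (reported as "AT").
            if any(
                motif == motif[:k] * (motif_len // k)
                for k in range(1, motif_len)
                if motif_len % k == 0
            ):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return sorted(hits)


def ssr_density_per_mb(ssr_count: int, length_bp: int) -> float:
    """SSR density expressed as SSRs per megabase of sequence."""
    return ssr_count / (length_bp / 1_000_000)


# Toy sequence and thresholds (hypothetical, for illustration only).
toy = "GGC" + "A" * 12 + "CGT" + "AG" * 7 + "TTC" + "ACG" * 5 + "GA"
thresholds = {1: 10, 2: 6, 3: 5}  # motif length -> minimum copies
hits = find_ssrs(toy, thresholds)
for start, motif, copies in hits:
    print(start, motif, copies)
print(f"{ssr_density_per_mb(len(hits), len(toy)):.0f} SSRs/Mb")
```

Because the toy sequence is only tens of bases long, the printed density is far higher than any real chromosome-scale value; the function is meant to show the unit conversion, not a realistic figure.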
CrusView: A Java-Based Visualization Platform for Comparative Genomics Analyses in Brassicaceae Species

Science.gov (United States)

Chen, Hao; Wang, Xiangfeng

2013-01-01

In plants and animals, chromosomal breakage and fusion events based on conserved syntenic genomic blocks lead to conserved patterns of karyotype evolution among species of the same family. However, karyotype information has not been well utilized in genomic comparison studies. We present CrusView, a Java-based bioinformatic application utilizing Standard Widget Toolkit/Swing graphics libraries and a SQLite database for performing visualized analyses of comparative genomics data in Brassicaceae (crucifer) plants. Compared with similar software and databases, one of the unique features of CrusView is its integration of karyotype information when comparing two genomes. This feature allows users to perform karyotype-based genome assembly and karyotype-assisted genome synteny analyses with preset karyotype patterns of the Brassicaceae genomes. Additionally, CrusView is a local program, which gives its users high flexibility when analyzing unpublished genomes and allows users to upload self-defined genomic information so that they can visually study the associations between genome structural variations and genetic elements, including chromosomal rearrangements, genomic macrosynteny, gene families, high-frequency recombination sites, and tandem and segmental duplications between related species. This tool will greatly facilitate karyotype, chromosome, and genome evolution studies using visualized comparative genomics approaches in Brassicaceae species. CrusView is freely available at http://www.cmbb.arizona.edu/CrusView/. PMID:23898041
Genomic Sequence Variation Markup Language (GSVML).

Science.gov (United States)

Nakaya, Jun; Kimura, Michio; Hiroi, Kaei; Ido, Keisuke; Yang, Woosung; Tanaka, Hiroshi

2010-02-01

To make good use of internationally accumulated genomic sequence variation data, which is increasing rapidly due to the explosive growth of genomic research at present, the development of an interoperable data exchange format and its international standardization are necessary. Genomic Sequence Variation Markup Language (GSVML) focuses on genomic sequence variation data and human health applications, such as gene-based medicine or pharmacogenomics. We developed GSVML through eight steps, based on case analysis and domain investigations. By restricting the design scope to human health applications and genomic sequence variation, we attempted to eliminate ambiguity and to ensure practicability. We intended to satisfy the requirements derived from the use case analysis of human-based clinical genomic applications. Based on database investigations, we attempted to minimize the redundancy of the data format while maximizing the range of data covered. We also attempted to ensure communication and interface ability with other Markup Languages, for exchange of omics data among various omics researchers or facilities. The interface ability with developing clinical standards, such as the Health Level Seven Genotype Information model, was analyzed. We developed the human health-oriented GSVML comprising variation data, direct annotation, and indirect annotation categories; the variation data category is required, while the direct and indirect annotation categories are optional. The annotation categories contain omics and clinical information, and have internal relationships. For the design, we examined 6 cases against three criteria for human health applications and 15 data elements against three criteria for data formats for genomic sequence variation data exchange. The data formats of five international SNP databases and six Markup Languages, and the interface ability to the Health Level Seven Genotype Model in terms of 317 items, were investigated. GSVML was developed as
