WorldWideScience

Sample records for genomic hybridisation analysis

  1. High resolution microarray comparative genomic hybridisation analysis using spotted oligonucleotides.

    NARCIS (Netherlands)

    Carvalho, B; Ouwerkerk, E; Meijer, G.A.; Ylstra, B.

    2004-01-01

    BACKGROUND: Currently, comparative genomic hybridisation array (array CGH) is the method of choice for studying genome wide DNA copy number changes. To date, either amplified representations of bacterial artificial chromosomes (BACs)/phage artificial chromosomes (PACs) or cDNAs have been spotted as

  2. Microarray comparative genomic hybridisation analysis incorporating genomic organisation, and application to enterobacterial plant pathogens.

    Directory of Open Access Journals (Sweden)

    Leighton Pritchard

    2009-08-01

    Microarray comparative genomic hybridisation (aCGH) provides an estimate of the relative abundance of genomic DNA (gDNA) taken from comparator and reference organisms by hybridisation to a microarray containing probes that represent sequences from the reference organism. The experimental method is used in a number of biological applications, including the detection of human chromosomal aberrations, and in comparative genomic analysis of bacterial strains, but optimisation of the analysis is desirable in each problem domain. We present a method for analysis of bacterial aCGH data that encodes spatial information from the reference genome in a hidden Markov model. This technique is the first such method to be validated in comparisons of sequenced bacteria that diverge at the strain and at the genus level: Pectobacterium atrosepticum SCRI1043 (Pba1043) and Dickeya dadantii 3937 (Dda3937); and Lactococcus lactis subsp. lactis IL1403 and L. lactis subsp. cremoris MG1363. In all cases our method is found to outperform common and widely used aCGH analysis methods that do not incorporate spatial information. This analysis is applied to comparisons between commercially important plant pathogenic soft-rotting enterobacteria (SRE) Pba1043, P. atrosepticum SCRI1039, P. carotovorum 193, and Dda3937. Our analysis indicates that it should not be assumed that hybridisation strength is a reliable proxy for sequence identity in aCGH experiments, and robustly extends the applicability of aCGH to bacterial comparisons at the genus level. Our results in the SRE further provide evidence for a dynamic, plastic 'accessory' genome, revealing major genomic islands encoding gene products that provide insight into, and may play a direct role in determining, variation amongst the SRE in terms of their environmental survival, host range and aetiology, such as phytotoxin synthesis, multidrug resistance, and nitrogen fixation.
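    The abstract above describes encoding the order of probes along the reference genome in a hidden Markov model. The following is only a minimal sketch of that general idea, not the authors' implementation: it assumes two hidden states, Gaussian emissions on the log2 ratio, and a "sticky" transition probability. The names `EMISSION`, `STAY` and `viterbi`, and all numeric values, are hypothetical.

```python
import math

STATES = ("present", "absent")
# Hypothetical emission parameters: mean and standard deviation of the
# log2(comparator/reference) ratio in each state.
EMISSION = {"present": (0.0, 0.5), "absent": (-2.0, 0.5)}
STAY = 0.9  # hypothetical probability that neighbouring probes share a state


def log_normal_pdf(x, mean, sd):
    """Log density of a normal distribution at x."""
    return -0.5 * ((x - mean) / sd) ** 2 - math.log(sd * math.sqrt(2.0 * math.pi))


def log_transition(prev, cur):
    """Log probability of moving from state `prev` to state `cur`."""
    return math.log(STAY if prev == cur else 1.0 - STAY)


def viterbi(log_ratios):
    """Most likely state path for probes listed in reference-genome order."""
    if not log_ratios:
        return []
    # Uniform prior over the two states for the first probe.
    score = {s: math.log(0.5) + log_normal_pdf(log_ratios[0], *EMISSION[s])
             for s in STATES}
    back = []
    for x in log_ratios[1:]:
        new_score, pointers = {}, {}
        for cur in STATES:
            best_prev = max(STATES, key=lambda p: score[p] + log_transition(p, cur))
            pointers[cur] = best_prev
            new_score[cur] = (score[best_prev] + log_transition(best_prev, cur)
                              + log_normal_pdf(x, *EMISSION[cur]))
        score = new_score
        back.append(pointers)
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]
```

    In this sketch a single weak probe (for example a log2 ratio of -1.2) flanked by strongly hybridising neighbours stays in the "present" state, whereas a per-probe threshold halfway between the two means would call it absent. That is the kind of difference spatial information can make, under the stated assumptions.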

  3. A process for analysis of microarray comparative genomics hybridisation studies for bacterial genomes

    Directory of Open Access Journals (Sweden)

    Woodward Martin J

    2008-01-01

    Background: Microarray based comparative genomic hybridisation (CGH) experiments have been used to study numerous biological problems including understanding genome plasticity in pathogenic bacteria. Typically such experiments produce large data sets that are difficult for biologists to handle. Although there are some programmes available for interpretation of bacterial transcriptomics data and CGH microarray data for looking at genetic stability in oncogenes, there are none specifically to understand the mosaic nature of bacterial genomes. Consequently a bottleneck still persists in accurate processing and mathematical analysis of these data. To address this shortfall we have produced a simple and robust CGH microarray data analysis process that may be automated in the future to understand bacterial genomic diversity. Results: The process involves five steps: cleaning, normalisation, estimating gene presence and absence or divergence, validation, and analysis of data from test against three reference strains simultaneously. Each stage of the process is described and we have compared a number of methods available for characterising bacterial genomic diversity and for calculating the cut-off between gene presence and absence or divergence, and shown that a simple dynamic approach using a kernel density estimator performed better than both established methods and a more sophisticated mixture modelling technique. We have also shown that current methods commonly used for CGH microarray analysis in tumour and cancer cell lines are not appropriate for analysing our data. Conclusion: After carrying out the analysis and validation for three sequenced Escherichia coli strains, CGH microarray data from 19 E. coli O157 pathogenic test strains were used to demonstrate the benefits of applying this simple and robust process to CGH microarray studies using bacterial genomes.
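    The abstract above mentions a kernel density estimator for placing the cut-off between gene presence and absence or divergence. One plausible reading, sketched here under stated assumptions, is to estimate the density of the log ratios and put the cut-off at the minimum between the two main modes. The Gaussian kernel, Silverman's rule-of-thumb bandwidth, the grid size and the function names are all assumptions of this sketch and are not taken from the paper.

```python
import math
import statistics


def gaussian_kde(values, bandwidth):
    """Return a function estimating the density of `values` at a point x."""
    norm = 1.0 / (len(values) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in values)

    return density


def presence_cutoff(log_ratios, grid_points=200):
    """Place the cut-off at the density minimum between the two main modes."""
    # Silverman's rule of thumb for the bandwidth (an assumption of this sketch).
    bandwidth = 1.06 * statistics.stdev(log_ratios) * len(log_ratios) ** (-0.2)
    density = gaussian_kde(log_ratios, bandwidth)
    lo, hi = min(log_ratios), max(log_ratios)
    xs = [lo + (hi - lo) * i / (grid_points - 1) for i in range(grid_points)]
    ys = [density(x) for x in xs]
    # Local maxima of the density curve, including the grid endpoints.
    peaks = [i for i in range(grid_points)
             if (i == 0 or ys[i] > ys[i - 1])
             and (i == grid_points - 1 or ys[i] >= ys[i + 1])]
    if len(peaks) < 2:
        raise ValueError("density is not bimodal; no cut-off can be placed")
    # Keep the two tallest peaks and search for the valley between them.
    left, right = sorted(sorted(peaks, key=lambda i: ys[i], reverse=True)[:2])
    valley = min(range(left, right + 1), key=lambda i: ys[i])
    return xs[valley]
```

    Because the cut-off is derived from each data set's own distribution rather than fixed in advance, it adapts to arrays with different overall signal levels, which may be what "dynamic" refers to in the abstract.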

  4. [Sotos syndrome diagnosed by comparative genomic hybridisation].

    Science.gov (United States)

    Saldarriaga, Wilmar; Molina-Barrera, Laura Camila; Ramírez-Cheyne, Julián

    2016-01-01

    Sotos Syndrome (SS) is a genetic disease with an autosomal dominant pattern caused by haploinsufficiency of the NSD1 gene secondary to point mutations or microdeletion of the 5q35 locus where the gene is located. It is a rare syndrome, occurring in 7 out of every 100,000 births. The objective of this report is to present the case of a 4-year-old patient with a global developmental delay, as well as specific physical findings suggesting a syndrome of genetic origin. The patient was a 4-year-old girl with thinning hair, triangular facies, long palpebral fissures, arched palate, prominent jaw, winged scapula and clinodactyly of the fifth finger of both hands. Comparative genomic hybridisation by microarray was subsequently performed and showed a microdeletion of 2.082 Mb in the 5q35.2-q35.3 region, including the NSD1 gene. Finally, this article also proposes performing comparative genomic hybridisation as the first diagnostic option in cases where clinical findings are suggestive of SS. Copyright © 2015 Sociedad Chilena de Pediatría. Publicado por Elsevier España, S.L.U. All rights reserved.

  5. [Comparative genomic hybridisation as a first option in genetic diagnosis: 1,000 cases and a cost-benefit analysis].

    Science.gov (United States)

    Castells-Sarret, Neus; Cueto-González, Anna M; Borregan, Mar; López-Grondona, Fermina; Miró, Rosa; Tizzano, Eduardo; Plaja, Alberto

    2017-09-25

    Conventional cytogenetics diagnoses 3-5% of patients with unexplained developmental delay/intellectual disability and/or multiple congenital anomalies. Multiplex ligation-dependent probe amplification increases diagnostic rates by between 2.4% and 5.8%. Currently the comparative genomic hybridisation array, or aCGH, is the highest performing diagnostic tool in patients with developmental delay/intellectual disability, congenital anomalies and autism spectrum disorders. Our aim is to evaluate the efficiency of the use of aCGH as a first-line test in these and other indications (epilepsy, short stature). A total of 1000 patients referred due to one or more of the abovementioned disorders were analysed by aCGH. Pathogenic genomic imbalances were detected in 14% of the cases, with a variable distribution of diagnosis according to the phenotypes: 18.9% of patients with developmental delay/intellectual disability, 13.7% of patients with multiple congenital anomalies, 9.76% of patients with psychiatric pathologies, 7.02% of patients with epilepsy, and 13.3% of patients with short stature. Within the multiple congenital anomalies, central nervous system abnormalities and congenital heart diseases accounted for 14.9% and 10.6% of diagnoses, respectively. Among the psychiatric disorders, patients with autism spectrum disorders accounted for 8.9% of the diagnoses. Our results demonstrate the effectiveness and efficiency of the use of aCGH as the first-line test in genetic diagnosis of patients suspected of genomic imbalances, supporting its inclusion within the National Health System. Copyright © 2017. Publicado por Elsevier España, S.L.U.

  6. Introgression of tomato chromosomes into the potato genome: an analysis through molecular marker and in situ hybridisation techniques.

    NARCIS (Netherlands)

    Calderé, F.G.

    1998-01-01

    Transfer of alien chromosomes and genes across intergeneric boundaries can be useful not only for the introgression of desirable characters but also for fundamental genetic studies. The successful demonstration of hybridisation of potato ( Solanum tuberosum ) and tomato ( Lycopersicon esculentum ) t

  7. Array comparative genomic hybridisation analysis of boys with X linked hypopituitarism identifies a 3.9 Mb duplicated critical region at Xq27 containing SOX3.

    NARCIS (Netherlands)

    Solomon, N.M.; Ross, S.; Morgan, T.; Belsky, J.L.; Hol, F.A.; Karnes, P.; Hopwood, N.J.; Myers, S.E.; Tan, A.; Warne, G.L.; Forrest, S.M.; Thomas, P.Q.

    2004-01-01

    INTRODUCTION: Array comparative genomic hybridisation (array CGH) is a powerful method that detects alteration of gene copy number with greater resolution and efficiency than traditional methods. However, its ability to detect disease causing duplications in constitutional genomic DNA has not been s

  8. Genomic confirmation of hybridisation and recent inbreeding in a vector-isolated Leishmania population.

    Directory of Open Access Journals (Sweden)

    Matthew B Rogers

    2014-01-01

    Although asexual reproduction via clonal propagation has been proposed as the principal reproductive mechanism across parasitic protozoa of the Leishmania genus, sexual recombination has long been suspected, based on hybrid marker profiles detected in field isolates from different geographical locations. The recent experimental demonstration of a sexual cycle in Leishmania within sand flies has confirmed the occurrence of hybridisation, but knowledge of the parasite life cycle in the wild still remains limited. Here, we use whole genome sequencing to investigate the frequency of sexual reproduction in Leishmania, by sequencing the genomes of 11 Leishmania infantum isolates from sand flies and 1 patient isolate in a focus of cutaneous leishmaniasis in the Çukurova province of southeast Turkey. This is the first genome-wide examination of a vector-isolated population of Leishmania parasites. A genome-wide pattern of patchy heterozygosity and SNP density was observed both within individual strains and across the whole group. Comparisons with other Leishmania donovani complex genome sequences suggest that these isolates are derived from a single cross of two diverse strains with subsequent recombination within the population. This interpretation is supported by a statistical model of the genomic variability for each strain compared to the L. infantum reference genome strain as well as genome-wide scans for recombination within the population. Further analysis of these heterozygous blocks indicates that the two parents were phylogenetically distinct. Patterns of linkage disequilibrium indicate that this population reproduced primarily clonally following the original hybridisation event, but that some recombination also occurred. This observation allowed us to estimate the relative rates of sexual and asexual reproduction within this population, to our knowledge the first quantitative estimate of these events during the Leishmania life cycle.

  9. Genomic Confirmation of Hybridisation and Recent Inbreeding in a Vector-Isolated Leishmania Population

    Science.gov (United States)

    Smith, Barbara A.; Imamura, Hideo; Sanders, Mandy; Svobodova, Milena; Volf, Petr; Berriman, Matthew; Cotton, James A.; Smith, Deborah F.

    2014-01-01

    Although asexual reproduction via clonal propagation has been proposed as the principal reproductive mechanism across parasitic protozoa of the Leishmania genus, sexual recombination has long been suspected, based on hybrid marker profiles detected in field isolates from different geographical locations. The recent experimental demonstration of a sexual cycle in Leishmania within sand flies has confirmed the occurrence of hybridisation, but knowledge of the parasite life cycle in the wild still remains limited. Here, we use whole genome sequencing to investigate the frequency of sexual reproduction in Leishmania, by sequencing the genomes of 11 Leishmania infantum isolates from sand flies and 1 patient isolate in a focus of cutaneous leishmaniasis in the Çukurova province of southeast Turkey. This is the first genome-wide examination of a vector-isolated population of Leishmania parasites. A genome-wide pattern of patchy heterozygosity and SNP density was observed both within individual strains and across the whole group. Comparisons with other Leishmania donovani complex genome sequences suggest that these isolates are derived from a single cross of two diverse strains with subsequent recombination within the population. This interpretation is supported by a statistical model of the genomic variability for each strain compared to the L. infantum reference genome strain as well as genome-wide scans for recombination within the population. Further analysis of these heterozygous blocks indicates that the two parents were phylogenetically distinct. Patterns of linkage disequilibrium indicate that this population reproduced primarily clonally following the original hybridisation event, but that some recombination also occurred. This observation allowed us to estimate the relative rates of sexual and asexual reproduction within this population, to our knowledge the first quantitative estimate of these events during the Leishmania life cycle. PMID:24453988

  10. Genomic and transcriptomic alterations following hybridisation and genome doubling in trigenomic allohexaploid Brassica carinata × Brassica rapa.

    Science.gov (United States)

    Xu, Y; Zhao, Q; Mei, S; Wang, J

    2012-09-01

    Allopolyploidisation is a prominent evolutionary force that involves two major events: interspecific hybridisation and genome doubling. Both events have important functional consequences in shaping the genomic architecture of the neo-allopolyploids. The respective effects of hybridisation and genome doubling upon genomic and transcriptomic changes in Brassica allopolyploids are unresolved. In this study, amplified fragment length polymorphism (AFLP), methylation-sensitive amplification polymorphism (MSAP) and cDNA-AFLP approaches were used to track genetic, epigenetic and transcriptional changes in both allohexaploid Brassica (ArArBcBcCcCc genome) and triploid hybrids (ArBcCc genome). Results from these groups were compared with each other and also to their parents Brassica carinata (BBCC genome) and Brassica rapa (AA genome). Rapid and dramatic genetic, DNA methylation and gene expression changes were detected in the triploid hybrids. During the shift from triploidy to allohexaploidy, some of the hybridisation-induced alterations underwent reversion. Additionally, novel genetic, epigenetic and transcriptional alterations were also detected. The proportions of A-genome-specific DNA methylation and gene expression alterations were significantly greater than those of BC-genome-specific alterations in the triploid hybrids. However, the two parental genomes were equally affected during the ploidy shift. Hemi-CCG methylation changes induced by hybridisation were recovered after genome doubling. Full-CG methylation changes were a more general process initiated in the hybrid and continued after genome doubling. These results indicate that genome doubling could ameliorate genomic and transcriptomic alterations induced by hybridisation and instigate additional alterations in trigenomic Brassica allohexaploids. Moreover, genome doubling also modified hybridisation-induced progenitor genome-biased alterations and epigenetic alteration characteristics.

  11. Validation and implementation of array comparative genomic hybridisation as a first line test in place of postnatal karyotyping for genome imbalance

    Directory of Open Access Journals (Sweden)

    Docherty Zoe

    2010-04-01

    Background: Several studies have demonstrated that array comparative genomic hybridisation (CGH) for genome-wide imbalance provides a substantial increase in diagnostic yield for patients traditionally referred for karyotyping by G-banded chromosome analysis. The purpose of this study was to demonstrate the feasibility of, and strategies for, the use of array CGH in place of karyotyping for genome imbalance, and to report on the results of the implementation of this approach. Results: Following a validation period, an oligoarray platform was chosen. In order to minimise costs and increase efficiency, a patient/patient hybridisation strategy was used, and analysis criteria were set to optimise detection of pathogenic imbalance. A customised database application with direct links to a number of online resources was developed to allow efficient management and tracking of patient samples and facilitate interpretation of results. Following introduction into our routine diagnostic service for patients with suspected genome imbalance, array CGH as a follow-on test for patients with normal karyotypes (n = 1245) and as a first-line test (n = 1169) gave imbalance detection rates of 26% and 22% respectively (excluding common, benign variants). At least 89% of the abnormalities detected by first-line testing would not have been detected by standard karyotype analysis. The average reporting time for first-line tests was 25 days from receipt of sample. Conclusions: Array CGH can be used in a diagnostic service setting in place of G-banded chromosome analysis, providing a more comprehensive and objective test for patients with suspected genome imbalance. The increase in consumable costs can be minimised by employing appropriate hybridisation strategies; the use of robotics and a customised database application to process multiple samples reduces staffing costs and streamlines analysis, interpretation and reporting of results. Array CGH provides a

  12. High frequency of submicroscopic chromosomal imbalances in patients with syndromic craniosynostosis detected by a combined approach of microsatellite segregation analysis, multiplex ligation-dependent probe amplification and array-based comparative genome hybridisation.

    NARCIS (Netherlands)

    Jehee, F.S.; Krepischi-Santos, A.C.; Rocha, K.M.; Cavalcanti, D.P.; Kim, C.A.; Bertola, D.R.; Alonso, L.G.; D'Angelo, C.S.; Mazzeu, J.F.; Froyen, G.; Lugtenberg, D.; Vianna-Morgante, A.M.; Rosenberg, C.; Passos-Bueno, M.R.

    2008-01-01

    We present the first comprehensive study, to our knowledge, on genomic chromosomal analysis in syndromic craniosynostosis. In total, 45 patients with craniosynostotic disorders were screened with a variety of methods including conventional karyotype, microsatellite segregation analysis, subtelomeric

  13. Experimental analysis of Hybridised Energy Storage Systems for automotive applications

    Science.gov (United States)

    Sarwar, Wasim; Engstrom, Timothy; Marinescu, Monica; Green, Nick; Taylor, Nigel; Offer, Gregory J.

    2016-08-01

    The requirements of the Energy Storage System (ESS) for an electrified vehicle portfolio consisting of a range of vehicles from micro Hybrid Electric Vehicle (mHEV) to a Battery Electric Vehicle (BEV) vary considerably. To reduce development cost of an electrified powertrain portfolio, a modular system would ideally be scaled across each vehicle; however, the conflicting requirements of a mHEV and BEV prevent this. This study investigates whether it is possible to combine supercapacitors suitable for an mHEV with high-energy batteries suitable for use in a BEV to create a Hybridised Energy Storage System (HESS) suitable for use in a HEV. A passive HESS is found to be capable of meeting the electrical demands of a HEV drive cycle; the operating principles of HESSs are discussed and factors limiting system performance are explored. The performance of the HESS is found to be significantly less temperature dependent than battery-only systems, however the heat generated suggests a requirement for thermal management. As the HESS degrades (at a similar rate to a specialised high-power-battery), battery resistance rises faster than supercapacitor resistance; as a result, the supercapacitor provides a greater current contribution, therefore the energy throughput, temperature rise and degradation of the batteries is reduced.

  14. Genome composition of 'Elatior'-begonias hybrids analyzed by genomic in situ hybridisation

    NARCIS (Netherlands)

    Marasek Ciolakowska, A.R.; Ramanna, M.S.; Laak, W.A.; Tuyl, van J.M.

    2010-01-01

    Interspecific hybridization of various tuberous Begonia species hybrids with Begonia socotrana results in so-called 'Elatior'-begonias hybrids (B. x hiemalis Fotsch). In our study, genomic in situ hybridization (GISH) has been employed to assess the genome composition in eleven 'Elatior'-begonias

  15. 2q23.1 microdeletion identified by array comparative genomic hybridisation: an emerging phenotype with Angelman-like features?

    Science.gov (United States)

    Jaillard, S; Dubourg, C; Gérard-Blanluet, M; Delahaye, A; Pasquier, L; Dupont, C; Henry, C; Tabet, A-C; Lucas, J; Aboura, A; David, V; Benzacken, B; Odent, S; Pipiras, E

    2009-12-01

    Genome-wide screening of patients with mental retardation using array comparative genomic hybridisation (CGH) has identified several novel imbalances. With this genotype-first approach, the 2q22.3q23.3 deletion was recently described as a novel microdeletion syndrome. The authors report two unrelated patients with a de novo interstitial deletion mapping in this genomic region and presenting similar "pseudo-Angelman" phenotypes, including severe psychomotor retardation, speech impairment, epilepsy, microcephaly, ataxia, and behavioural disabilities. The microdeletions were identified by array CGH using oligonucleotide and bacterial artificial chromosome (BAC) arrays, and further confirmed by fluorescence in situ hybridisation (FISH) and semi-quantitative polymerase chain reaction (PCR). The boundaries and sizes of the deletions in the two patients were different but an overlapping region of about 250 kb was defined, which mapped to 2q23.1 and included two genes: MBD5 and EPC2. The SIP1 gene associated with the Mowat-Wilson syndrome was not included in the deleted genomic region. Haploinsufficiency of one of the deleted genes (MBD5 or EPC2) could be responsible for the common clinical features observed in the 2q23.1 microdeletion syndrome, and this hypothesis needs further investigation.

  16. Karyotype analysis of Lilium longiflorum and Lilium rubellum by chromosome banding and fluorescence in situ hybridisation

    NARCIS (Netherlands)

    Lim, K.B.; Wennekes, J.; Jong, de J.H.S.G.M.; Jacobsen, E.; Tuyl, van J.M.

    2001-01-01

    Detailed karyotypes of Lilium longiflorum and L. rubellum were constructed on the basis of chromosome arm lengths, C-banding, AgNO3 staining, and PI-DAPI banding, together with fluorescence in situ hybridisation (FISH) with the 5S and 45S rDNA sequences as probes. The C-banding patterns that were

  17. Karyotype analysis of Lilium longiflorum and Lilium rubellum by chromosome banding and fluorescence in situ hybridisation

    NARCIS (Netherlands)

    Lim, K.B.; Wennekes, J.; Jong, de J.H.S.G.M.; Jacobsen, E.; Tuyl, van J.M.

    2001-01-01

    Detailed karyotypes of Lilium longiflorum and L. rubellum were constructed on the basis of chromosome arm lengths, C-banding, AgNO3 staining, and PI-DAPI banding, together with fluorescence in situ hybridisation (FISH) with the 5S and 45S rDNA sequences as probes. The C-banding patterns that were ob

  18. A High-Throughput Computational Framework for Identifying Significant Copy Number Aberrations from Array Comparative Genomic Hybridisation Data

    Directory of Open Access Journals (Sweden)

    Ian Roberts

    2012-01-01

    Reliable identification of copy number aberrations (CNA) from comparative genomic hybridization data would be improved by the availability of a generalised method for processing large datasets. To this end, we developed swatCGH, a data analysis framework and region detection heuristic for computational grids. swatCGH analyses sequentially displaced (sliding) windows of neighbouring probes and applies adaptive thresholds of varying stringency to identify the 10% of each chromosome that contains the most frequently occurring CNAs. We used the method to analyse a published dataset, comparing data preprocessed using four different DNA segmentation algorithms, and two methods for prioritising the detected CNAs. The consolidated list of the most commonly detected aberrations confirmed the value of swatCGH as a simplified high-throughput method for identifying biologically significant CNA regions of interest.

  19. A High-Throughput Computational Framework for Identifying Significant Copy Number Aberrations from Array Comparative Genomic Hybridisation Data

    Science.gov (United States)

    Roberts, Ian; Carter, Stephanie A.; Scarpini, Cinzia G.; Karagavriilidou, Konstantina; Barna, Jenny C. J.; Calleja, Mark; Coleman, Nicholas

    2012-01-01

    Reliable identification of copy number aberrations (CNA) from comparative genomic hybridization data would be improved by the availability of a generalised method for processing large datasets. To this end, we developed swatCGH, a data analysis framework and region detection heuristic for computational grids. swatCGH analyses sequentially displaced (sliding) windows of neighbouring probes and applies adaptive thresholds of varying stringency to identify the 10% of each chromosome that contains the most frequently occurring CNAs. We used the method to analyse a published dataset, comparing data preprocessed using four different DNA segmentation algorithms, and two methods for prioritising the detected CNAs. The consolidated list of the most commonly detected aberrations confirmed the value of swatCGH as a simplified high-throughput method for identifying biologically significant CNA regions of interest. PMID:23008709
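    The sliding-window idea in the abstract above can be illustrated with a short sketch. This is not swatCGH itself: it assumes the input is a per-probe CNA frequency along one chromosome, uses a simple window mean, and relaxes a single threshold in equal steps until about 10% of probes are flagged. The function names, the window width and the step count are hypothetical.

```python
def window_means(freqs, width):
    """Mean CNA frequency for every window of `width` neighbouring probes."""
    if width < 1 or width > len(freqs):
        raise ValueError("width must be between 1 and len(freqs)")
    total = sum(freqs[:width])
    means = [total / width]
    for i in range(width, len(freqs)):
        # Slide the window by one probe: add the new probe, drop the oldest.
        total += freqs[i] - freqs[i - width]
        means.append(total / width)
    return means


def top_fraction_probes(freqs, width=3, fraction=0.10, steps=20):
    """Relax a threshold stepwise until at least `fraction` of probes are flagged."""
    means = window_means(freqs, width)
    target = fraction * len(freqs)
    hi, lo = max(means), min(means)
    flagged = set()
    for k in range(steps + 1):
        # Start at the most stringent threshold and lower it each step.
        threshold = hi - (hi - lo) * k / steps
        flagged = set()
        for start, mean in enumerate(means):
            if mean >= threshold:
                flagged.update(range(start, start + width))
        if len(flagged) >= target:
            return threshold, sorted(flagged)
    return lo, sorted(flagged)
```

    For example, with 20 probes of which only probes 8 to 10 carry a frequent aberration, the most stringent threshold already flags those three probes, which exceeds the 10% target, so the search stops at the first step.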

  20. Fluorescence in situ hybridisation analysis of bone marrow trephine biopsy specimens; an additional tool in the diagnostic armoury.

    Science.gov (United States)

    Neat, Michael J; Moonim, Mufaddal T; Dunn, Robert G; Geoghegan, Helen; Foot, Nicola J

    2013-01-01

    Fluorescence in situ hybridisation (FISH) analysis is now widely employed in the diagnosis and risk stratification of a wide range of malignant diseases. While this technique is used successfully with formalin-fixed paraffin-embedded (FFPE) sections from numerous tissue types, FISH analysis of FFPE tissue sections from trephine biopsy specimens has been less widely reported, possibly due to technical limitations relating to the decalcification protocols employed. During the last 4 years FISH analysis has been carried out successfully in 42 out of 55 (76%) consecutive trephine biopsy specimens received as part of the standard diagnostic service at our institution. Samples decalcified using EDTA-based protocols were analysed successfully in 31/31 cases (100%), whereas only 11/24 samples (46%) decalcified using formic acid-based protocols were successful. In our experience, FISH analysis of trephine biopsy specimens is a highly reproducible technique and a very useful adjunctive tool in the diagnostic armoury; however, its use in a standard diagnostic setting relies on the use of EDTA-based decalcification protocols.
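    The success rates quoted in the abstract above can be checked with simple arithmetic. The sketch below only recomputes the reported percentages from the reported counts; the variable and function names are illustrative.

```python
def success_rate(successes, total):
    """Percentage of specimens analysed successfully, rounded to a whole number."""
    return round(100 * successes / total)


# Counts as reported in the abstract: (successful, total) per protocol.
edta = (31, 31)
formic = (11, 24)
overall = (edta[0] + formic[0], edta[1] + formic[1])

for label, (ok, n) in (("EDTA", edta), ("formic acid", formic), ("overall", overall)):
    print(f"{label}: {ok}/{n} = {success_rate(ok, n)}%")
```

    The two protocol groups add up to the 42 of 55 specimens reported overall, and the rounded percentages match the 100%, 46% and 76% given in the abstract.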

  1. Deciphering the hybridisation history leading to the Lager lineage based on the mosaic genomes of Saccharomyces bayanus strains NBRC1948 and CBS380.

    Directory of Open Access Journals (Sweden)

    Huu-Vang Nguyen

    Saccharomyces bayanus is a yeast species described as one of the two parents of the hybrid brewing yeast S. pastorianus. Strains CBS380(T) and NBRC1948 have been retained successively as pure-line representatives of S. bayanus. In the present study, sequence analyses confirmed and upgraded our previous finding: S. bayanus type strain CBS380(T) harbours a mosaic genome. The genome of strain NBRC1948 was also revealed to be mosaic. Both genomes were characterized by amplification and sequencing of different markers, including genes involved in maltotriose utilization or genes detected by array-CGH mapping. Sequence comparisons with public Saccharomyces spp. nucleotide sequences revealed that the CBS380(T) and NBRC1948 genomes are composed of: a predominant non-cerevisiae genetic background belonging to S. uvarum, a second unidentified species provisionally named S. lagerae, and several introgressed S. cerevisiae fragments. The largest cerevisiae-introgressed DNA common to both genomes totals 70 kb in length and is distributed in three contigs, cA, cB and cC. These vary in terms of length and presence of MAL31 or MTY1 (maltotriose-transporter gene). In NBRC1948, two additional cerevisiae-contigs, cD and cE, totaling 12 kb in length, as well as several smaller cerevisiae fragments were identified. All of these contigs were partially detected in the genomes of S. pastorianus lager strains CBS1503 (S. monacensis) and CBS1513 (S. carlsbergensis), explaining the noticeable common ability of S. bayanus and S. pastorianus to metabolize maltotriose. NBRC1948 was shown to be inter-fertile with S. uvarum CBS7001. The cross involving these two strains produced F1 segregants resembling the strains CBS380(T) or NRRLY-1551. This demonstrates that these S. bayanus strains were the offspring of a cross between S. uvarum and a strain similar to NBRC1948. Phylogenies established with selected cerevisiae and non-cerevisiae genes allowed us to decipher the complex hybridisation

  2. Solar PV-CSP Hybridisation for Baseload Generation : A Techno-economic Analysis for the Chilean Market

    OpenAIRE

    Larchet, Kevin

    2015-01-01

    The development of high capacity factor solar power plants is an interesting topic, especially when considering the climate and economic conditions of a location such as the Chilean Atacama Desert. The hybridisation of solar photovoltaic (PV) and concentrating solar power (CSP) technologies for such an application is a promising collaboration. The low cost of PV and dispatchability of CSP, integrated with thermal energy storage (TES), has the promise of delivering baseload electricity at a lo...

  3. De novo monosomy 9p24.3-pter and trisomy 17q24.3-qter characterised by microarray comparative genomic hybridisation in a fetus with an increased nuchal translucency.

    Science.gov (United States)

    Brisset, Sophie; Kasakyan, Serdar; L'Herminé, Aurore Coulomb; Mairovitz, Valérie; Gautier, Evelyne; Aubry, Marie-Cécile; Benkhalifa, Moncef; Tachdjian, Gérard

    2006-03-01

    Increased nuchal translucency (NT) during the first trimester of pregnancy is a useful marker to detect chromosomal abnormalities. Here, we report a prenatal case with molecular cytogenetic characterisation of an abnormal derivative chromosome 9 identified through NT. Amniocentesis was performed because of an increased NT (4.4 mm) and showed an abnormal de novo 46,XX,add(9)(p24.3) karyotype. To characterise the origin of the small additional material on 9p, we performed a microarray comparative genomic hybridisation (microarray CGH) using a genomic DNA array providing an average of 1 Mb resolution. Microarray CGH showed a deletion of distal 9p and a trisomy of distal 17q. These results were confirmed by FISH analyses. Microarray CGH provided accurate information on the breakpoint regions and the size of both distal 9p deletion and distal 17q trisomy. The fetus was therefore a carrier of a de novo derivative chromosome 9 arising from a t(9;17)(p24.3;q24.3) translocation and generating a monosomy 9p24.3-pter and a trisomy 17q24.3-qter. This case illustrates that microarray CGH is a rapid, powerful and sensitive technology to identify small de novo unbalanced chromosomal abnormalities and can be applied in prenatal diagnosis. Copyright © 2006 John Wiley & Sons, Ltd.

  4. Genomic gains and losses in malignant mesothelioma demonstrated by FISH analysis of paraffin-embedded tissues.

    Science.gov (United States)

    Takeda, Maiko; Kasai, Takahiko; Enomoto, Yasunori; Takano, Masato; Morita, Kouhei; Kadota, Eiji; Iizuka, Norishige; Maruyama, Hiroshi; Nonomura, Akitaka

    2012-01-01

    Malignant mesothelioma (MM) results from the accumulation of a number of acquired genetic events at the onset. In MM, the most frequent changes were losses in 9p21, 1p36, 14q32 and 22q12, and gains in 5p, 7p and 8q24 by comparative genomic hybridisation analysis. Although the diagnostic utility of 9p21 homozygous deletion by fluorescence in situ hybridisation (FISH) analysis in MM has been reported recently, alterations of other genes have not been examined to any great extent. This study analysed the frequency of various genomic gains and losses in MM using FISH analysis. The authors performed a FISH analysis using paraffin-embedded tissues from 42 cases of MM. Chromosomal losses in MM were found at 9p21 (83%), 1p36 (43%), 14q32 (43%) and 22q12 (38%), whereas gains were found at 5p15 (48%), 7p12 (38%) and 8q24 (45%). There were no cases of adenomatoid tumour, benign mesothelial multicystic tumour, reactive mesothelial hyperplasia or pleuritis showing any gains or losses. At least one genomic abnormality was identified in all cases of MM. Among various histological subtypes, the chromosomal abnormality tended to be more common in cases showing sarcomatous elements (biphasic or pure sarcomatoid) than in cases showing an epithelioid histology. The authors found various genomic gains and losses in MM by FISH analysis. The frequency of each genomic gain or loss examined in MM by FISH analysis almost agreed with the comparative genomic hybridisation technique in previous studies. This study suggests that genomic evaluation by FISH analysis might be helpful in distinguishing MM from benign mesothelial proliferation.

  5. Representational difference analysis reveals genomic differences between Q. robur and Q. suber: implications for the study of genome evolution in the genus Quercus.

    Science.gov (United States)

    Zoldos, V; Siljak-Yakovlev, S; Papes, D; Sarr, A; Panaud, O

    2001-04-01

    Very similar genome sizes, similar karyotypes and heterochromatin organisation, and identical number/position of ribosomal loci characterise the common oak (Q. robur) and the cork oak (Q. suber), two distantly related oak species. Representational Difference Analysis (RDA) was used to subtract the genome of Q. suber from the genome of Q. robur in order to search for genome differentiation. A library of 400 clones (bearing RDA fragments) representing genome differences between the two species was obtained. Seven Q. robur-specific DNA sequences were analysed with respect to their molecular and chromosome organisation. All belong to the dispersed repetitive component of the genome, as revealed by Southern hybridisation and in situ hybridisation. They are present in the Q. robur genome in between 100 and 700 copies, and are distributed along the length of almost all chromosomes. A search for homologies between RDA fragments and sequences in GenBank revealed similarities of all RDA fragments with known retrotransposons. The RDA fragments were also tested for their presence/absence in the genomes of six additional oak species belonging to different phylogenetic groups, in order to examine the evolutionary dynamics of these DNA sequences.

  6. Sexing the human fetus and identification of polyploid nuclei by DNA-DNA in situ hybridisation in interphase nuclei.

    Science.gov (United States)

    West, J D; Gosden, C M; Gosden, J R; West, K M; Davidson, Z; Davidson, C; Nicolaides, K H

    1989-01-01

    Samples of human adult lymphocytes, fetal lymphocytes, amniotic fluid cells, and chorionic villus cells were sexed independently by cytogenetics and DNA-DNA in situ hybridisation to a tritiated Y probe. For the in situ hybridisation analysis, the presence of Y bodies (hybridisation bodies) in 100 interphase nuclei was scored after autoradiography. In all, 82/83 samples were sexed in this way (one technical failure) and 78/82 were sexed by both in situ hybridisation and cytogenetics. There was complete agreement between the two methods. There was considerable variation (40-100%) in the percentage of interphase nuclei with a hybridisation body among the male samples, but very few nuclei from female samples showed significant hybridisation. In situ hybridisation could be used to sex the conceptus when males but not females are at risk for various X-linked genetic disorders and may also be useful for detecting 45,X/46,XY mosaicism or polyploid/diploid mosaicism. This would be particularly useful for direct preparations of chorionic villus samples, which often prove difficult to analyse cytogenetically but offer the best means of avoiding maternal contamination. Some interphase nuclei had more than one hybridisation body, and this was most commonly found among amniotic fluid cells. Comparison of sizes of nuclei with one or two hybridisation bodies strongly suggested that most of the amniotic fluid cell nuclei with two hybridisation bodies were tetraploid.

  7. Improved technique for fluorescence in situ hybridisation analysis of isolated nuclei from archival, B5 or formalin fixed, paraffin wax embedded tissue.

    Science.gov (United States)

    Schurter, M J; LeBrun, D P; Harrison, K J

    2002-04-01

    Fluorescence in situ hybridisation (FISH) is an effective method to detect chromosomal alterations in a variety of tissue types, including archived paraffin wax embedded specimens fixed in B5 or formalin. However, precipitating fixatives such as B5 have been known to produce unsatisfactory results in comparison with formalin when used for FISH. This study describes an effective nuclear isolation and FISH procedure for B5 and formalin fixed tissue, optimising the nuclear isolation step and nuclei pretreatments using tonsil and mantle cell lymphoma specimens. The protocol presented can be used to isolate nuclei and perform FISH on B5 or formalin fixed, paraffin wax embedded samples from a variety of tissue types.

  8. Hybridisation between two cyprinid fishes in a novel habitat: genetics, morphology and life-history traits

    Directory of Open Access Journals (Sweden)

    Caffrey Joe

    2010-06-01

    Full Text Available Background: The potential role of hybridisation in adaptive radiation and the evolution of new lineages has received much recent attention. Hybridisation between roach (Rutilus rutilus L.) and bream (Abramis brama L.) is well documented throughout Europe; however, hybrids in Ireland occur at an unprecedented frequency, often exceeding that of both parental species. Utilising an integrated approach, which incorporates geometric morphometrics, life history and molecular genetic analyses, we identify the levels and processes of hybridisation present, while also determining the direction of hybridisation through the analysis of mitochondrial DNA. Results: The presence of F2 hybrids was found to be unlikely in the studied populations, although significant levels of backcrossing, involving both parental taxa, were observed in some lakes. Hybridisation represents a viable conduit for introgression of genes between roach and bream. The vast majority of hybrids in all populations studied exhibited bream mitochondrial DNA, indicating that bream are maternal in the majority of crosses. Conclusions: The success of roach × bream hybrids in Ireland is not due to a successful self-reproducing lineage. The potential causes of widespread hybridisation between both species, along with the considerations regarding the role of hybridisation in evolution and conservation, are also discussed.

  9. Comparative genomics in chicken and Pekin duck using FISH mapping and microarray analysis

    Directory of Open Access Journals (Sweden)

    Fowler Katie E

    2009-08-01

    Full Text Available Background: The availability of the complete chicken (Gallus gallus) genome sequence, as well as a large number of chicken probes for fluorescent in-situ hybridization (FISH) and microarray resources, facilitates comparative genomic studies between chicken and other bird species. In a previous study, we provided a comprehensive cytogenetic map for the turkey (Meleagris gallopavo) and the first analysis of copy number variants (CNVs) in birds. Here, we extend this approach to the Pekin duck (Anas platyrhynchos), an obvious target for comparative genomic studies due to its agricultural importance and resistance to avian flu. Results: We provide a detailed molecular cytogenetic map of the duck genome through FISH assignment of 155 chicken clones. We identified one inter- and six intrachromosomal rearrangements between chicken and duck macrochromosomes and demonstrated conserved synteny among all microchromosomes analysed. Array comparative genomic hybridisation revealed 32 CNVs, of which 5 overlap previously designated "hotspot" regions between chicken and turkey. Conclusion: Our results suggest extensive conservation of avian genomes across 90 million years of evolution in both macro- and microchromosomes. The data on CNVs between chicken and duck extend previous analyses in chicken and turkey and support the hypotheses that avian genomes contain fewer CNVs than mammalian genomes and that genomes of evolutionarily distant species share regions of copy number variation ("CNV hotspots"). Our results will expedite duck genomics, assist marker development and highlight areas of interest for future evolutionary and functional studies.

  10. Spectrogram Analysis of Genomes

    Directory of Open Access Journals (Sweden)

    David Sussillo

    2004-01-01

    Full Text Available We performed frequency-domain analysis in the genomes of various organisms using tricolor spectrograms, identifying several types of distinct visual patterns characterizing specific DNA regions. We relate patterns and their frequency characteristics to the sequence characteristics of the DNA. At times, the spectrogram patterns could be related to the structure of the corresponding protein region by using various public databases such as GenBank. Some patterns are explained by the biological nature of the corresponding regions, which relates to chromosome structure and protein coding, and some patterns have as yet unknown biological significance. We found biologically meaningful patterns on scales ranging from millions of base pairs down to a few hundred base pairs. Chromosome-wide patterns include periodicities ranging from 2 to 300. The color of the spectrogram depends on the nucleotide content at specific frequencies, and therefore can be used as a local indicator of CG content and other measures of relative base content. Several smaller-scale patterns were found to represent different types of domains made up of various tandem repeats.
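
    The frequency-domain idea in the record above can be illustrated with a minimal sketch. It is not the authors' tricolor spectrogram: it assumes the common approach of turning a DNA string into four 0/1 indicator sequences (one per base) and summing their Fourier power at a chosen period, window by window. The function names, the window and step parameters, and the normalisation by window length are all hypothetical choices made for illustration.

```python
import cmath


def indicator(seq, base):
    """Return a 0/1 sequence marking where `base` occurs in `seq`."""
    return [1.0 if ch == base else 0.0 for ch in seq.upper()]


def power_at_period(seq, period):
    """Summed Fourier power of the four base-indicator sequences at 1/period.

    Assumption: power is normalised by sequence length; other
    normalisations are equally plausible.
    """
    n = len(seq)
    total = 0.0
    for base in "ACGT":
        x = indicator(seq, base)
        # Single DFT coefficient at frequency 1/period (cycles per base).
        coeff = sum(
            v * cmath.exp(-2j * cmath.pi * i / period) for i, v in enumerate(x)
        )
        total += abs(coeff) ** 2
    return total / n


def windowed_power(seq, period, window, step):
    """Power at `period` in successive windows: one row of a spectrogram."""
    return [
        power_at_period(seq[start:start + window], period)
        for start in range(0, len(seq) - window + 1, step)
    ]


if __name__ == "__main__":
    repeat = "ATG" * 30  # a perfect period-3 tandem repeat
    print(power_at_period(repeat, 3), power_at_period(repeat, 4))
```

    A perfect tandem repeat concentrates its power at the repeat period, which is one way periodic patterns of the kind described could show up; stacking `windowed_power` rows for many periods would give a crude spectrogram.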

  11. Genome-wide Studies of Mycolic Acid Bacteria: Computational Identification and Analysis of a Minimal Genome

    KAUST Repository

    Kamanu, Frederick Kinyua

    2012-12-01

    The mycolic acid bacteria are a distinct suprageneric group of asporogenous Gram-positive, high GC-content bacteria, distinguished by the presence of mycolic acids in their cell envelope. They exhibit great diversity in their cell morphology; although primarily non-pathogens, this group contains three major pathogens: Mycobacterium leprae, Mycobacterium tuberculosis complex, and Corynebacterium diphtheriae. Although the mycolic acid bacteria are a clearly defined group of bacteria, the taxonomic relationships between its constituent genera and species are less well defined. Two approaches were tested for their suitability in describing the taxonomy of the group. First, a Multilocus Sequence Typing (MLST) experiment was assessed and found to be superior to monophyletic (16S small ribosomal subunit) analysis in delineating a total of 52 mycolic acid bacterial species. Phylogenetic inference was performed using the neighbor-joining method. To further refine phylogenetic analysis and to take advantage of the widespread availability of bacterial genome data, a computational framework that simulates DNA-DNA hybridisation was developed and validated using multiscale bootstrap resampling. The tool classifies microbial genomes based on whole genome DNA, and was deployed as a web application using PHP and JavaScript. It is accessible online at http://cbrc.kaust.edu.sa/dna_hybridization/. A third study applied computational and statistical methods to the identification and analysis of a putative minimal mycolic acid bacterial genome so as to better understand (1) the genomic requirements to encode a mycolic acid bacterial cell and (2) the role and type of genes and genetic elements that lead to the massive increase in genome size in environmental mycolic acid bacteria. Using a reciprocal comparison approach, a total of 690 orthologous gene clusters forming a putative minimal genome were identified across 24 mycolic acid bacterial species. In order to identify new potential drug

  12. Analysis of the microbial community from a saline aquifer during CO2 storage in Ketzin using improved Fluorescence in situ Hybridisation method

    Science.gov (United States)

    Morozova, Daria; Let, Daniela; Zettlitzer, Michael; Würdemann, Hilke

    2013-04-01

    In order to investigate the possibility of underground CO2 storage, a research facility is operated in Ketzin (Germany, west of Berlin), where CO2 is stored in a subsurface saline aquifer. Three 700-850 m deep wells were drilled: one injection well containing the injection tubing and two observation wells housing measuring equipment. Since the Earth's subsurface is known to be a major habitat for a high number of different groups of microorganisms, our working group aims at characterising microbial reactions between the gas (either dissolved in water or in the supercritical state), the fluid and the mineral content of both the reservoir rock and the cap rock. The main purpose of the microbial monitoring is to analyse the composition and activity of the microbial communities in order to characterize microbial life in extreme habitats and its influence on corrosion and on mineral dissolution and precipitation. Analyses of microbial community composition and its changes provide information about the effectiveness and reliability of the long-term CO2 storage technique. Our previous study revealed that up to 10^6 cells ml^-1 were detected in the first observation well, where CO2 breakthrough occurred after injection of 500 t (Morozova et al., 2010). For the identification and enumeration of the microorganisms, a widely applied fluorescence in situ hybridisation (FISH) method was used. FISH coupled with rRNA-targeted oligonucleotide probes allows direct visualisation, identification and localisation of bacterial cells from selected phylogenetic groups. However, its application to the samples from the second observation well, where CO2 arrived after injection of approximately 11,000 t, was hampered. The presence of solids and particles in the reservoir fluids significantly interfered with cell visualization using epifluorescence microscopy. Since it is difficult to distinguish cells among particles and this strongly hinders the identification and enumeration of bacteria, an

  13. Interspecific Hybridisation in Campanula

    DEFF Research Database (Denmark)

    Röper, Anna Catharina

    In the present thesis, economically important Campanula species were selected for interspecific hybridisation to increase the genetic viability in plant breeding material. To reach this goal, ovule culture was established as an embryo rescue technique to overcome post-fertilisation barriers...

  14. Comparative Genome Analysis and Genome Evolution

    NARCIS (Netherlands)

    Snel, Berend

    2002-01-01

    This thesis described a collection of bioinformatic analyses on complete genome sequence data. We have studied the evolution of gene content and find that vertical inheritance dominates over horizontal gene transfer, even to the extent that we can use the gene content to make genome phylogenies. Usi

  15. Comparative Genome Analysis and Genome Evolution

    NARCIS (Netherlands)

    Snel, Berend

    2003-01-01

    This thesis described a collection of bioinformatic analyses on complete genome sequence data. We have studied the evolution of gene content and find that vertical inheritance dominates over horizontal gene transfer, even to the extent that we can use the gene content to make genome phylogenies. Usi

  16. Whole cell hybridisation for monitoring harmful marine microalgae.

    Science.gov (United States)

    Toebe, Kerstin

    2013-10-01

    Fluorescence in situ hybridisation (FISH) is a powerful molecular biological tool to detect and enumerate harmful microorganisms in the marine environment. Different FISH methods are available, and especially in combination with automated counting techniques, routine monitoring of harmful marine microalgae is attainable. Various oligonucleotide probes have been developed for detecting harmful microalgae. However, FISH-based methods are not yet regularly included in monitoring programmes tracking the presence of harmful marine microalgae. A limiting factor of the FISH technique is the number of suitable fluorochromes currently available for attachment to FISH probes, which restricts how many harmful species can be detected in one environmental sample at a time. However, coupled automated techniques, like flow cytometry or solid-phase cytometry, can facilitate the analysis of numerous field samples and help to overcome this drawback. A great benefit of FISH in contrast to other molecular biological detection methods for harmful marine microalgae is the direct visualisation of the hybridised target cells, which is not possible in cell-free formats such as DNA-based analysis methods. Therefore, an additional validation of the FISH-generated results is simultaneously given.

  17. The integrated microbial genome resource of analysis.

    Science.gov (United States)

    Checcucci, Alice; Mengoni, Alessio

    2015-01-01

    Integrated Microbial Genomes and Metagenomes (IMG) is a biocomputational system that provides information and support for annotation and comparative analysis of microbial genomes and metagenomes. IMG has been developed by the US Department of Energy (DOE)-Joint Genome Institute (JGI). The IMG platform contains both draft and complete genomes, sequenced by the Joint Genome Institute, together with other publicly available genomes. Genomes of strains belonging to the Archaea, Bacteria, and Eukarya domains are present, as well as those of viruses and plasmids. Here, we describe some essential features of the IMG system and a case study for pangenome analysis.

  18. Interspecific Hybridisation in Campanula

    DEFF Research Database (Denmark)

    Röper, Anna Catharina

    In the present thesis, economically important Campanula species were selected for interspecific hybridisation to increase the genetic viability in plant breeding material. To reach this goal, ovule culture was established as an embryo rescue technique to overcome post-fertilisation barriers...... was confirmed for most of the interspecific hybrids. Intermediate phenotypes compared to the parental species were found for several interspecific hybrids. The results from this PhD study demonstrated that the genetic viability could be increased in Campanula and reveal valuable information for interspecific...

  19. Coronavirus Genomics and Bioinformatics Analysis

    Directory of Open Access Journals (Sweden)

    Kwok-Yung Yuen

    2010-08-01

    Full Text Available The drastic increase in the number of coronaviruses discovered and coronavirus genomes being sequenced has given us an unprecedented opportunity to perform genomics and bioinformatics analysis on this family of viruses. Coronaviruses possess the largest genomes (26.4 to 31.7 kb) among all known RNA viruses, with G + C contents varying from 32% to 43%. Variable numbers of small ORFs are present between the various conserved genes (ORF1ab, spike, envelope, membrane and nucleocapsid) and downstream of the nucleocapsid gene in different coronavirus lineages. Phylogenetically, three genera, Alphacoronavirus, Betacoronavirus and Gammacoronavirus, with Betacoronavirus consisting of subgroups A, B, C and D, exist. A fourth genus, Deltacoronavirus, which includes bulbul coronavirus HKU11, thrush coronavirus HKU12 and munia coronavirus HKU13, is emerging. Molecular clock analysis using various gene loci revealed the time of the most recent common ancestor of human/civet SARS related coronavirus to be 1999-2002, with an estimated substitution rate of 4 × 10^-4 to 2 × 10^-2 substitutions per site per year. Recombination in coronaviruses was most notable between different strains of murine hepatitis virus (MHV), between different strains of infectious bronchitis virus, between MHV and bovine coronavirus, between feline coronavirus (FCoV) type I and canine coronavirus generating FCoV type II, and between the three genotypes of human coronavirus HKU1 (HCoV-HKU1). Codon usage bias in coronaviruses was observed, with HCoV-HKU1 showing the most extreme bias, and cytosine deamination and selection of CpG suppressed clones are the two major independent biological forces that shape such codon usage bias in coronaviruses.

  20. The Hybridisation of Higher Education in Canada

    Directory of Open Access Journals (Sweden)

    Douglas Shale

    2002-01-01

    Full Text Available Canada's postsecondary institutions are becoming increasingly involved with technology enhanced learning, generally under the rubric of distance education. Growth and activity in distance education stems from rapid developments in communication and information technologies such as videoconferencing and the Internet. This case study focuses on the use of new technologies, primarily within the context of higher education institutions operating in Canada's English speaking provinces. Capitalising on the interactive capabilities of "new" learning technologies, some distance education providers are starting to behave more like conventional educational institutions in terms of forming study groups and student cohorts. Conversely, new telecommunications technologies are having a reverse impact on traditional classroom settings, and as a result conventional universities are beginning to establish administrative structures reflective of those used by distance education providers. When viewed in tandem, these trends reflect growing convergence between conventional and distance learning modes, leading to the hybridisation of higher education in Canada.

  1. Climate change promotes hybridisation between deeply divergent species

    Science.gov (United States)

    Chiocchio, Andrea; Zampiglia, Mauro; Nascetti, Giuseppe

    2017-01-01

    Rare hybridisations between deeply divergent animal species have been reported for decades in a wide range of taxa, but have often remained unexplained, mainly considered chance events and reported as anecdotal. Here, we combine field observations with long-term data concerning natural hybridisations, climate, land-use, and field-validated species distribution models for two deeply divergent and naturally sympatric toad species in Europe (Bufo bufo and Bufotes viridis species groups). We show that climate warming and seasonal extreme temperatures are conspiring to set the scene for these maladaptive hybridisations, by differentially affecting life-history traits of both species. Our results identify and provide evidence of an ultimate cause for such events, and reveal that the potential influence of climate change on interspecific hybridisations goes far beyond closely related species. Furthermore, climate projections suggest that the chances for these events will steadily increase in the near future. PMID:28348926

  2. Specificity assessment from fractionation experiments (SAFE): a novel method to evaluate microarray probe specificity based on hybridisation stringencies.

    Science.gov (United States)

    Drobyshev, Alexei L; Machka, Christine; Horsch, Marion; Seltmann, Matthias; Liebscher, Volkmar; Hrabé de Angelis, Martin; Beckers, Johannes

    2003-01-15

    The cDNA-chip technology is a highly versatile tool for the comprehensive analysis of gene expression at the transcript level. Although it has been applied successfully in expression profiling projects, there is an ongoing dispute concerning the quality of such expression data. The latter critically depends on the specificity of hybridisation. SAFE (specificity assessment from fractionation experiments) is a novel method to discriminate between non-specific cross-hybridisation and specific signals. We applied in situ fractionation of hybridised target on DNA-chips by means of repeated washes with increasing stringencies. Different fractions of hybridised target are washed off at defined stringencies and the collected fluorescence intensity data at each step comprise the fractionation curve. Based on characteristic features of the fractionation curve, unreliable data can be filtered and eliminated from subsequent analyses. The approach described here provides a novel experimental tool to identify probes that produce specific hybridisation signals in DNA-chip expression profiling approaches. The iterative use of the SAFE procedure will result in increasingly reliable sets of probes for microarray experiments and significantly improve the overall efficiency and reliability of RNA expression profiling data from DNA-chip experiments.

  3. High genetic differentiation with no evidence of hybridisation between four limpet species (Patella spp.) revealed by allozyme loci

    Directory of Open Access Journals (Sweden)

    Alexandra Sá-Pinto

    2007-12-01

    Full Text Available The occurrence of hybridisation between limpet species of the genus Patella has always been a contentious issue. Although a previous allozyme study reported high differentiation and no hybridisation between Patella vulgata Linnaeus, 1758, Patella depressa Pennant, 1777 and Patella ulyssiponensis Gmelin, 1791 along English shores, the recent finding of an mtDNA haplotype of P. depressa in a P. vulgata individual raised new doubts on this issue. To further study the possibility of hybridisation between limpet species and their level of genetic differentiation, ten allozyme loci were screened using starch gel electrophoresis for P. ulyssiponensis, P. depressa, P. vulgata and Patella rustica Linnaeus, 1758, from the Atlantic coast of the Iberian Peninsula. Our results show high differentiation between species, which could be clearly separated into different clusters with a Bayesian clustering algorithm. No significant signs of hybridisation were detected between any of the four species. Thus, the hypothesis of hybridisation between P. vulgata and P. depressa across their sympatric distribution is not supported. Two sympatric clusters were recovered within P. vulgata that could be related to Hardy-Weinberg disequilibrium found in locus MPI. Finally, due to the high level of intraspecific variability, the studied loci are interesting tools for the analysis of population structure and stock identification.

  4. Classifying Genomic Sequences by Sequence Feature Analysis

    Institute of Scientific and Technical Information of China (English)

    Zhi-Hua Liu; Dian Jiao; Xiao Sun

    2005-01-01

    Traditional sequence analysis depends on sequence alignment. In this study, we analyzed various functional regions of the human genome based on sequence features, including word frequency, dinucleotide relative abundance, and base-base correlation. We analyzed human chromosome 22 and classified the upstream, exon, intron, downstream, and intergenic regions by principal component analysis and discriminant analysis of these features. The results show that we could classify the functional regions of the genome based on sequence features and discriminant analysis.
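
    Two of the sequence features named in the record above can be sketched briefly. The code below is an illustration under stated assumptions, not the authors' pipeline: word frequency is taken as the relative frequency of overlapping k-mers, and dinucleotide relative abundance uses the usual odds-ratio definition rho_xy = f_xy / (f_x * f_y), computed on a single strand without strand symmetrisation. The function names are hypothetical, and the input is assumed to contain only A, C, G and T.

```python
from collections import Counter


def word_frequencies(seq, k):
    """Relative frequency of each overlapping k-mer ("word") in seq."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def dinucleotide_relative_abundance(seq):
    """Odds ratio rho_xy = f_xy / (f_x * f_y) for each observed dinucleotide.

    Values near 1 mean the dinucleotide occurs about as often as its
    base composition predicts; values well below 1 indicate suppression.
    """
    mono = word_frequencies(seq, 1)
    di = word_frequencies(seq, 2)
    return {xy: f / (mono[xy[0]] * mono[xy[1]]) for xy, f in di.items()}


if __name__ == "__main__":
    example = "ACGTACGTTTGACA"  # made-up sequence for demonstration only
    print(word_frequencies(example, 3))
    print(dinucleotide_relative_abundance(example))
```

    Feature vectors built this way for each region could then be passed to a principal component or discriminant analysis routine; that step is omitted here because the record does not specify it.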

  5. Sequence Analysis of SSR-Flanking Regions Identifies Genome Affinities between Pasture Grass Fungal Endophyte Taxa

    Directory of Open Access Journals (Sweden)

    Eline van Zijll de Jong

    2011-01-01

    Full Text Available Fungal species of the Neotyphodium and Epichloë genera are endophytes of pasture grasses showing complex differences of life-cycle and genetic architecture. Simple sequence repeat (SSR) markers have been developed from endophyte-derived expressed sequence tag (EST) collections. Although SSR array size polymorphisms are appropriate for phenetic analysis to distinguish between taxa, the capacity to resolve phylogenetic relationships is limited by both homoplasy and heteroploidy effects. In contrast, nonrepetitive sequence regions that flank SSRs have been effectively implemented in this study to demonstrate a common evolutionary origin of grass fungal endophytes. Consistent patterns of relationships between specific taxa were apparent across multiple target loci, confirming previous studies of genome evolution based on variation of individual genes. Evidence was obtained for the definition of endophyte taxa not only through genomic affinities but also by relative gene content. Results were compatible with the current view that some asexual Neotyphodium species arose following interspecific hybridisation between sexual Epichloë ancestors. Phylogenetic analysis of SSR-flanking regions, in combination with the results of previous studies with other EST-derived SSR markers, further permitted characterisation of Neotyphodium isolates that could not be assigned to known taxa on the basis of morphological characteristics.

  6. Comparative genomic analysis of esophageal cancers.

    Science.gov (United States)

    Caygill, Christine P J; Gatenby, Piers A C; Herceg, Zdenko; Lima, Sheila C S; Pinto, Luis F R; Watson, Anthony; Wu, Ming-Shiang

    2014-09-01

    The following, from the 12th OESO World Conference: Cancers of the Esophagus, includes commentaries on comparative genomic analysis of esophageal cancers: genomic polymorphisms, the genetic and epigenetic drivers in esophageal cancers, and the collection of data in the UK Barrett's Oesophagus Registry.

  7. Structural and functional analysis of rice genome

    Indian Academy of Sciences (India)

    Akhilesh K. Tyagi; Jitendra P. Khurana; Paramjit Khurana; Saurabh Raghuvanshi; Anupama Gaur; Anita Kapur; Vikrant Gupta; Dibyendu Kumar; V. Ravi; Shubha Vij; Parul Khurana; Sulabha Sharma

    2004-04-01

    Rice is an excellent system for plant genomics as it represents a modest size genome of 430 Mb. It feeds more than half the population of the world. Draft sequences of the rice genome, derived by whole-genome shotgun approach at relatively low coverage (4–6 X), were published and the International Rice Genome Sequencing Project (IRGSP) declared high quality (>10 X), genetically anchored, phase 2 level sequence in 2002. In addition, phase 3 level finished sequence of chromosomes 1, 4 and 10 (out of 12 chromosomes of rice) has already been reported by scientists from IRGSP consortium. Various estimates of genes in rice place the number at > 50,000. Already, over 28,000 full-length cDNAs have been sequenced, most of which map to genetically anchored genome sequence. Such information is very useful in revealing novel features of macro- and micro-level synteny of rice genome with other cereals. Microarray analysis is unraveling the identity of rice genes expressing in temporal and spatial manner and should help target candidate genes useful for improving traits of agronomic importance. Simultaneously, functional analysis of rice genome has been initiated by marker-based characterization of useful genes and employing functional knock-outs created by mutation or gene tagging. Integration of this enormous information is expected to catalyze tremendous activity on basic and applied aspects of rice genomics.

  8. Genome sequence and analysis of Lactobacillus helveticus

    Directory of Open Access Journals (Sweden)

    Paola Cremonesi

    2013-01-01

    Full Text Available The microbiological characterization of lactobacilli is historically well developed, but the genomic analysis is recent. Because of the widespread use of L. helveticus in cheese technology, information concerning the heterogeneity in this species is accumulating rapidly. Recently, the genomes of five L. helveticus strains were sequenced to completion and compared with other genomically characterized lactobacilli. The genomic analysis of the first sequenced strain, L. helveticus DPC 4571, isolated from cheese and selected for its characteristics of rapid lysis and high proteolytic activity, has revealed a plethora of genes with industrial potential including those responsible for key metabolic functions such as proteolysis, lipolysis, and cell lysis. These genes and their derived enzymes can facilitate the production of cheese and cheese derivatives with potential for use as ingredients in consumer foods. In addition, L. helveticus has the potential to produce peptides with a biological function, such as angiotensin converting enzyme (ACE) inhibitory activity, in fermented dairy products, demonstrating the therapeutic value of this species. A most intriguing feature of the genome of L. helveticus is the remarkable similarity in gene content with many intestinal lactobacilli. Comparative genomics has allowed the identification of key gene sets that facilitate a variety of lifestyles including adaptation to food matrices or the gastrointestinal tract. As genome sequence and functional genomic information continues to explode, key features of the genomes of L. helveticus strains continue to be discovered, answering many questions but also raising many new ones.

  9. Whole genome analysis of a Vietnamese trio

    Indian Academy of Sciences (India)

    Dang Thanh Hai; Nguyen Dai Thanh; Pham Thi Minh Trang; Le Si Quang; Phan Thi Thu Hang; Dang Cao Cuong; Hoang Kim Phuc; Nguyen Huu Duc; Do Duc Dong; Bui Quang Minh; Pham Bao Son; Le Sy Vinh

    2015-03-01

    We here present the first whole genome analysis of an anonymous Kinh Vietnamese (KHV) trio whose genomes were deeply sequenced to 30-fold average coverage. The resulting short reads covered 99.91% of the human reference genome (GRCh37d5). We identified 4,719,412 SNPs and 827,385 short indels that satisfied the Mendelian inheritance law. Among them, 109,914 (2.3%) SNPs and 59,119 (7.1%) short indels were novel. We also detected 30,171 structural variants of which 27,604 (91.5%) were large indels. There were 6,681 large indels in the range 0.1–100 kbp occurring in the child genome that were also confirmed in either the father or mother genome. We compared these large indels against the DGV database and found that 1,499 (22.44%) were KHV specific. De novo assembly of high-quality unmapped reads yielded 789 contigs with length ≥ 300 bp. There were 235 contigs from the child genome of which 199 (84.7%) were significantly matched with at least one contig from the father or mother genome. Blasting these 199 contigs against other alternative human genomes revealed 4 novel contigs. The novel variants identified from our study demonstrated the necessity of conducting more genome-wide studies not only for Kinh but also for other ethnic groups in Vietnam.

  10. Comparative genomic analysis of eutherian kallikrein genes

    Directory of Open Access Journals (Sweden)

    Marko Premzl

    2017-03-01

    Full Text Available The present study attempted to update and revise eutherian kallikrein genes implicated in major physiological and pathological processes and in medical molecular diagnostics. Using a eutherian comparative genomic analysis protocol and freely available genomic sequence assemblies, tests of reliability of eutherian public genomic sequences were used to annotate the most comprehensive curated third-party gene data set of eutherian kallikrein genes, including 121 complete coding sequences among 335 potential coding sequences. The present analysis first described 13 major gene clusters of eutherian kallikrein genes, and explained their differential gene expansion patterns. One updated classification and nomenclature of eutherian kallikrein genes was proposed as a new framework for future experiments.

  11. Big Data Analysis of Human Genome Variations

    KAUST Repository

    Gojobori, Takashi

    2016-01-25

    Since the human genome draft sequence was in public for the first time in 2000, genomic analyses have been intensively extended to the population level. The following three international projects are good examples for large-scale studies of human genome variations: 1) HapMap Data (1,417 individuals) (http://hapmap.ncbi.nlm.nih.gov/downloads/genotypes/2010-08_phaseII+III/forward/), 2) HGDP (Human Genome Diversity Project) Data (940 individuals) (http://www.hagsc.org/hgdp/files.html), 3) 1000 genomes Data (2,504 individuals) http://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/ If we can integrate all three data into a single volume of data, we should be able to conduct a more detailed analysis of human genome variations for a total number of 4,861 individuals (= 1,417+940+2,504 individuals). In fact, we successfully integrated these three data sets by use of information on the reference human genome sequence, and we conducted the big data analysis. In particular, we constructed a phylogenetic tree of about 5,000 human individuals at the genome level. As a result, we were able to identify clusters of ethnic groups, with detectable admixture, that were not possible by an analysis of each of the three data sets. Here, we report the outcome of this kind of big data analyses and discuss evolutionary significance of human genomic variations. Note that the present study was conducted in collaboration with Katsuhiko Mineta and Kosuke Goto at KAUST.

  12. Positive Emotional Responses to Hybridised Writing about a Socio-Scientific Issue

    Science.gov (United States)

    Tomas, Louisa; Ritchie, Stephen M.

    2012-01-01

    In order to understand better the role of affect in learning about socio-scientific issues (SSI), this study investigated Year 12 students' emotional arousal as they participated in an online writing-to-learn science project about the socio-scientific issue of biosecurity. Students wrote a series of hybridised scientific narratives, or BioStories, that integrate scientific information about biosecurity with narrative storylines, and uploaded these to a dedicated website. Throughout their participation in the project, students recorded their emotional responses to the various activities ( N = 50). Four case students were also video recorded during selected science lessons as they researched, composed and uploaded their BioStories for peer review. Analysis of these data, as well as interview data obtained from the case students, revealed that pride, strength, determination, interest and alertness were among the positive emotions most strongly elicited by the project. These emotions reflected students' interest in learning about a new socio-scientific issue, and their enhanced feelings of self-efficacy in successfully writing hybridised scientific narratives in science. The results of this study suggest that the elicitation of positive emotional responses as students engage in hybridised writing about SSI with strong links to environmental education, such as biosecurity, can be valuable in engaging students in education for sustainability.

  13. Mathematical Analysis of Genomic Evolution

    Directory of Open Access Journals (Sweden)

    Cedric Green

    2011-01-01

    Full Text Available Changes in nucleotide sequences, or mutations, accumulate from generation to generation in the genomes of all living organisms. The mutations can be advantageous, deleterious, or neutral. The goal of this project is to determine the number of advantageous mutations it takes to get human (Homo sapiens) DNA from the DNA of genetically distinct organisms. We do this by collecting the genomic data of such organisms, and estimating the number of mutations it takes to transform yeast (Saccharomyces cerevisiae) DNA to the DNA of a human. We calculate the typical number of mutations occurring annually through the organism's average life span and the average mutation rate. This allows us to determine the total number of mutations as well as the probability of advantageous mutations. Not surprisingly, this probability proves to be fairly small. A more precise estimate can be determined by accounting for the differences in the chromosomal structure and phenomena like horizontal gene transfer.

  14. A Ploidy Difference Represents an Impassable Barrier for Hybridisation in Animals. Is There an Exception among Botiid Loaches (Teleostei: Botiidae)?

    Directory of Open Access Journals (Sweden)

    Jörg Bohlen

    Full Text Available One of the most efficient mechanisms to keep animal lineages separate is a difference in ploidy level (number of whole genome copies), since hybrid offspring from parents with different ploidy level are functionally sterile. In the freshwater fish family Botiidae, ploidy difference has been held responsible for the separation of its two subfamilies, the evolutionary tetraploid Botiinae and the diploid Leptobotiinae. Diploid and tetraploid species coexist in the upper Yangtze, the Pearl River and the Red River basins in China. Interestingly, the species 'Botia' zebra from the Pearl River basin combines a number of morphological characters that otherwise are found in the diploid genus Leptobotia with morphological characters of the tetraploid genus Sinibotia; therefore, the aim of the present study is to test whether 'B.' zebra is the result of a hybridisation event between species from different subfamilies with different ploidy level. A closer morphological examination indeed demonstrates a high similarity of 'B.' zebra to two co-occurring species, the diploid Leptobotia guilinensis and the tetraploid Sinibotia pulchra. These two species thus could have been the potential parental species in case of a hybrid origin of 'B.' zebra. The morphologic analysis further reveals that 'B.' zebra bears even the diagnostic characters of the genera Leptobotia (Leptobotiinae) and Sinibotia (Botiinae). In contrast, a comparison of six allozyme loci between 'B.' zebra, L. guilinensis and S. pulchra showed only similarities between 'B.' zebra and S. pulchra, not between 'B.' zebra and L. guilinensis. Six specimens of 'B.' zebra that were cytogenetically analysed were tetraploid with 4n = 100. The composition of the karyotype (18% metacentric, 18% submetacentric, 36% subtelocentric and 28% acrocentric chromosomes) differs from those of L. guilinensis (12%, 24%, 20% and 44%) and S. pulchra (20%, 26%, 28% and 26%), and cannot be obtained by any combination of genomes from

  15. Comparative genomic analysis of sixty mycobacteriophage genomes: Genome clustering, gene acquisition and gene size

    Science.gov (United States)

    Hatfull, Graham F.; Jacobs-Sera, Deborah; Lawrence, Jeffrey G.; Pope, Welkin H.; Russell, Daniel A.; Ko, Ching-Chung; Weber, Rebecca J.; Patel, Manisha C.; Germane, Katherine L.; Edgar, Robert H.; Hoyte, Natasha N.; Bowman, Charles A.; Tantoco, Anthony T.; Paladin, Elizabeth C.; Myers, Marlana S.; Smith, Alexis L.; Grace, Molly S.; Pham, Thuy T.; O'Brien, Matthew B.; Vogelsberger, Amy M.; Hryckowian, Andrew J.; Wynalek, Jessica L.; Donis-Keller, Helen; Bogel, Matt W.; Peebles, Craig L.; Cresawn, Steve G.; Hendrix, Roger W.

    2010-01-01

    Mycobacteriophages are viruses that infect mycobacterial hosts. Expansion of a collection of sequenced phage genomes to a total of sixty – all infecting a common bacterial host – provides further insight into their diversity and evolution. Of the sixty phage genomes, 55 can be grouped into nine clusters according to their nucleotide sequence similarities, five of which can be further divided into subclusters; five genomes do not cluster with other phages. The sequence diversity between genomes within a cluster varies greatly; for example, the six genomes in cluster D share more than 97.5% average nucleotide similarity with each other. In contrast, similarity between the two genomes in Cluster I is barely detectable by diagonal plot analysis. The total of 6,858 predicted ORFs have been grouped into 1523 phamilies (phams) of related sequences, 46% of which possess only a single member. Only 18.8% of the phams have sequence similarity to non-mycobacteriophage database entries and fewer than 10% of all phams can be assigned functions based on database searching or synteny. Genome clustering facilitates the identification of genes that are in greatest genetic flux and are more likely to have been exchanged horizontally in relatively recent evolutionary time. Although mycobacteriophage genes exhibit smaller average size than genes of their host (205 residues compared to 315), phage genes in higher flux average only ∼100 amino acids, suggesting that the primary units of genetic exchange correspond to single protein domains. PMID:20064525

  16. A Distance Measure for Genome Phylogenetic Analysis

    Science.gov (United States)

    Cao, Minh Duc; Allison, Lloyd; Dix, Trevor

    Phylogenetic analyses of species based on single genes or parts of the genomes are often inconsistent because of factors such as variable rates of evolution and horizontal gene transfer. The availability of more and more sequenced genomes allows phylogeny construction from complete genomes that is less sensitive to such inconsistency. For such long sequences, construction methods like maximum parsimony and maximum likelihood are often not possible due to their intensive computational requirement. Another class of tree construction methods, namely distance-based methods, require a measure of distances between any two genomes. Some measures such as evolutionary edit distance of gene order and gene content are computational expensive or do not perform well when the gene content of the organisms are similar. This study presents an information theoretic measure of genetic distances between genomes based on the biological compression algorithm expert model. We demonstrate that our distance measure can be applied to reconstruct the consensus phylogenetic tree of a number of Plasmodium parasites from their genomes, the statistical bias of which would mislead conventional analysis methods. Our approach is also used to successfully construct a plausible evolutionary tree for the γ-Proteobacteria group whose genomes are known to contain many horizontally transferred genes.
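
    As a rough illustration of the idea of a compression-based distance between genomes, the sketch below computes the normalized compression distance (NCD) with Python's general-purpose zlib compressor. This is a stand-in chosen for illustration only: it is not the expert model named in the abstract, and the function names are hypothetical.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input (compression level 9)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance between two byte sequences.

    Values near 0 suggest the sequences share most of their content;
    values near 1 suggest they share little. zlib is an illustrative
    substitute for a DNA-specific compressor.
    """
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)  # cost of describing both sequences together
    return (cxy - min(cx, cy)) / max(cx, cy)
```

    A matrix of such pairwise values could then be passed to any distance-based tree-building method, such as neighbour joining. A general-purpose compressor only sees repeats inside its sliding window (32 KB for zlib), so this sketch is suited to short sequences rather than whole genomes.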

  17. Genome-wide linkage and copy number variation analysis reveals 710 kb duplication on chromosome 1p31.3 responsible for autosomal dominant omphalocele

    Science.gov (United States)

    Radhakrishna, Uppala; Nath, Swapan K; McElreavey, Ken; Ratnamala, Uppala; Sun, Celi; Maiti, Amit K; Gagnebin, Maryline; Béna, Frédérique; Newkirk, Heather L; Sharp, Andrew J; Everman, David B; Murray, Jeffrey C; Schwartz, Charles E; Antonarakis, Stylianos E; Butler, Merlin G

    2017-01-01

    Background Omphalocele is a congenital birth defect characterised by the presence of internal organs located outside of the ventral abdominal wall. The purpose of this study was to identify the underlying genetic mechanisms of a large autosomal dominant Caucasian family with omphalocele. Methods and findings A genetic linkage study was conducted in a large family with an autosomal dominant transmission of an omphalocele using a genome-wide single nucleotide polymorphism (SNP) array. The analysis revealed significant evidence of linkage (non-parametric NPL = 6.93, p=0.0001; parametric logarithm of odds (LOD) = 2.70 under a fully penetrant dominant model) at chromosome band 1p31.3. Haplotype analysis narrowed the locus to a 2.74 Mb region between markers rs2886770 (63014807 bp) and rs1343981 (65757349 bp). Molecular characterisation of this interval using array comparative genomic hybridisation followed by quantitative microsphere hybridisation analysis revealed a 710 kb duplication located at 63.5–64.2 Mb. All affected individuals who had an omphalocele and shared the haplotype were positive for this duplicated region, while the duplication was absent from all normal individuals of this family. Multipoint linkage analysis using the duplication as a marker yielded a maximum LOD score of 3.2 at 1p31.3 under a dominant model. The 710 kb duplication at 1p31.3 band contains seven known genes including FOXD3, ALG6, ITGB3BP, KIAA1799, DLEU2L, PGM1, and the proximal portion of ROR1. Importantly, this duplication is absent from the database of genomic variants. Conclusions The present study suggests that development of an omphalocele in this family is controlled by overexpression of one or more genes in the duplicated region. To the authors’ knowledge, this is the first reported association of an inherited omphalocele condition with a chromosomal rearrangement. PMID:22499347

  18. Comparative Genome Analysis in the Integrated Microbial Genomes (IMG) System

    Energy Technology Data Exchange (ETDEWEB)

    Kyrpides, Nikos C.; Markowitz, Victor M.

    2006-03-01

    Comparative genome analysis is critical for the effective exploration of a rapidly growing number of complete and draft sequences for microbial genomes. The Integrated Microbial Genomes (IMG) system (img.jgi.doe.gov) has been developed as a community resource that provides support for comparative analysis of microbial genomes in an integrated context. IMG allows users to navigate the multidimensional microbial genome data space and focus their analysis on a subset of genes, genomes, and functions of interest. IMG provides graphical viewers, summaries and occurrence profile tools for comparing genes, pathways and functions (terms) across specific genomes. Genes can be further examined using gene neighborhoods and compared with sequence alignment tools.

  19. Re-analysis by fluorescence in situ hybridisation of spare embryos cultured until Day 5 after preimplantation genetic diagnosis for a 47, XYY infertile patient demonstrates a high incidence of diploid mosaic embryos: a case report.

    Science.gov (United States)

    Emiliani, S; Merino, E G; Van den Bergh, M; Abramowicz, M; Vassart, G; Englert, Y; Delneste, D

    2000-12-01

    Mosaicism in 4-8-cell human embryos analysed by fluorescence in situ hybridisation (FISH) has been widely reported, but few studies have addressed the incidence of mosaicism in more advanced embryonic stages. In the present study we analysed spare human embryos in a case of preimplantation genetic diagnosis (PGD) for increased risk of aneuploidy because of an infertile 47,XYY man. After replacement of two embryos typed as 1818XX at PGD, six spare embryos (not frozen because of their low quality) were re-analysed on Day 5 for PGD confirmation. Out of five embryos typed as 1818XY at PGD, four were diploid mosaic (DM) and one was normal in all cells. The sixth embryo, typed as 18XYY/1818181818X at PGD, was a DM. In spite of the bias of our small series of morphologically low-quality embryos, the surprisingly high proportion of mosaics (which confirms previous findings) questions the validity of PGD, but supports the strategy of transferring only the embryos where two blastomeres gave normal and concordant results at PGD. More data are required to understand the clinical significance of early diploid mosaicism (and its impact on implantation rate) and to determine whether some diploid mosaic embryos might be considered safe for transfer.

  20. Whole genome sequence analysis of Mycobacterium suricattae

    KAUST Repository

    Dippenaar, Anzaan

    2015-10-21

    Tuberculosis occurs in various mammalian hosts and is caused by a range of different lineages of the Mycobacterium tuberculosis complex (MTBC). A recently described member, Mycobacterium suricattae, causes tuberculosis in meerkats (Suricata suricatta) in Southern Africa and preliminary genetic analysis showed this organism to be closely related to an MTBC pathogen of rock hyraxes (Procavia capensis), the dassie bacillus. Here we make use of whole genome sequencing to describe the evolution of the genome of M. suricattae, including known and novel regions of difference, SNPs and IS6110 insertion sites. We used genome-wide phylogenetic analysis to show that M. suricattae clusters with the chimpanzee bacillus, previously isolated from a chimpanzee (Pan troglodytes) in West Africa. We propose an evolutionary scenario for the Mycobacterium africanum lineage 6 complex, showing the evolutionary relationship of M. africanum and chimpanzee bacillus, and the closely related members M. suricattae, dassie bacillus and Mycobacterium mungi.

  1. Next generation haplotyping to decipher nuclear genomic interspecific admixture in Citrus species: analysis of chromosome 2.

    Science.gov (United States)

    Curk, Franck; Ancillo, Gema; Garcia-Lor, Andres; Luro, François; Perrier, Xavier; Jacquemoud-Collet, Jean-Pierre; Navarro, Luis; Ollitrault, Patrick

    2014-12-29

    The most economically important Citrus species originated by natural interspecific hybridization between four ancestral taxa (Citrus reticulata, Citrus maxima, Citrus medica, and Citrus micrantha) and from limited subsequent interspecific recombination as a result of apomixis and vegetative propagation. Such reticulate evolution coupled with vegetative propagation results in mosaic genomes with large chromosome fragments from the basic taxa in frequent interspecific heterozygosity. Modern breeding of these species is hampered by their complex heterozygous genomic structures that determine species phenotype and are broken by sexual hybridisation. Nevertheless, a large amount of diversity is present in the citrus gene pool, and breeding to allow inclusion of desirable traits is of paramount importance. However, the efficient mobilization of citrus biodiversity in innovative breeding schemes requires previous understanding of Citrus origins and genomic structures. Haplotyping of multiple gene fragments along the whole genome is a powerful approach to reveal the admixture genomic structure of current species and to resolve the evolutionary history of the gene pools. In this study, the efficiency of parallel sequencing with 454 methodology to decipher the hybrid structure of modern citrus species was assessed by analysis of 16 gene fragments on chromosome 2. 454 amplicon libraries were established using the Fluidigm array system for 48 genotypes and 16 gene fragments from chromosome 2. Haplotypes were established from the reads of each accession and phylogenetic analyses were performed using the haplotypic data for each gene fragment. The length of 454 reads and the level of differentiation between the ancestral taxa of modern citrus allowed efficient haplotype phylogenetic assignations for 12 of the 16 gene fragments. The analysis of the mixed genomic structure of modern species and cultivars (i) revealed C. maxima introgressions in modern mandarins, (ii) was

  2. AcCNET (Accessory Genome Constellation Network): comparative genomics software for accessory genome analysis using bipartite networks.

    Science.gov (United States)

    Lanza, Val F; Baquero, Fernando; de la Cruz, Fernando; Coque, Teresa M

    2017-01-15

    AcCNET (Accessory genome Constellation Network) is a Perl application that aims to compare accessory genomes of a large number of genomic units, both at qualitative and quantitative levels. Using the proteomes extracted from the analysed genomes, AcCNET creates a bipartite network compatible with standard network analysis platforms. AcCNET allows merging phylogenetic and functional information about the concerned genomes, thus improving the capability of current methods of network analysis. The AcCNET bipartite network opens a new perspective to explore the pangenome of bacterial species, focusing on the accessory genome behind the idiosyncrasy of a particular strain and/or population.
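
    To make the bipartite-network idea concrete, the sketch below builds genome-to-cluster edges from a presence table and keeps only clusters missing from at least one genome, which is one simple definition of the accessory genome. It is an illustration in Python under assumed inputs, not AcCNET's own Perl code; the function name and the toy data are hypothetical.

```python
from collections import defaultdict


def accessory_bipartite_edges(presence: dict[str, set[str]]) -> list[tuple[str, str]]:
    """Return sorted (genome, cluster) edges for accessory clusters only.

    `presence` maps each genome name to the set of protein clusters found
    in it. A cluster present in every genome is treated as core and is
    dropped; this threshold is an assumption made for the example.
    """
    n_genomes = len(presence)
    members: dict[str, set[str]] = defaultdict(set)
    for genome, clusters in presence.items():
        for cluster in clusters:
            members[cluster].add(genome)
    return sorted(
        (genome, cluster)
        for cluster, genomes in members.items()
        if len(genomes) < n_genomes  # keep accessory clusters only
        for genome in genomes
    )
```

    The resulting edge list has one node set for genomes and one for clusters, so it can be written out as a two-column table and loaded into a general network analysis tool.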

  3. AGAPE (Automated Genome Analysis PipelinE) for pan-genome analysis of Saccharomyces cerevisiae.

    Directory of Open Access Journals (Sweden)

    Giltae Song

    Full Text Available The characterization and public release of genome sequences from thousands of organisms is expanding the scope for genetic variation studies. However, understanding the phenotypic consequences of genetic variation remains a challenge in eukaryotes due to the complexity of the genotype-phenotype map. One approach to this is the intensive study of model systems for which diverse sources of information can be accumulated and integrated. Saccharomyces cerevisiae is an extensively studied model organism, with well-known protein functions and thoroughly curated phenotype data. To develop and expand the available resources linking genomic variation with function in yeast, we aim to model the pan-genome of S. cerevisiae. To initiate the yeast pan-genome, we newly sequenced or re-sequenced the genomes of 25 strains that are commonly used in the yeast research community using advanced sequencing technology at high quality. We also developed a pipeline for automated pan-genome analysis, which integrates the steps of assembly, annotation, and variation calling. To assign strain-specific functional annotations, we identified genes that were not present in the reference genome. We classified these according to their presence or absence across strains and characterized each group of genes with known functional and phenotypic features. The functional roles of novel genes not found in the reference genome and associated with strains or groups of strains appear to be consistent with anticipated adaptations in specific lineages. As more S. cerevisiae strain genomes are released, our analysis can be used to collate genome data and relate it to lineage-specific patterns of genome evolution. Our new tool set will enhance our understanding of genomic and functional evolution in S. cerevisiae, and will be available to the yeast genetics and molecular biology community.

  4. AGAPE (Automated Genome Analysis PipelinE) for pan-genome analysis of Saccharomyces cerevisiae.

    Science.gov (United States)

    Song, Giltae; Dickins, Benjamin J A; Demeter, Janos; Engel, Stacia; Gallagher, Jennifer; Choe, Kisurb; Dunn, Barbara; Snyder, Michael; Cherry, J Michael

    2015-01-01

    The characterization and public release of genome sequences from thousands of organisms is expanding the scope for genetic variation studies. However, understanding the phenotypic consequences of genetic variation remains a challenge in eukaryotes due to the complexity of the genotype-phenotype map. One approach to this is the intensive study of model systems for which diverse sources of information can be accumulated and integrated. Saccharomyces cerevisiae is an extensively studied model organism, with well-known protein functions and thoroughly curated phenotype data. To develop and expand the available resources linking genomic variation with function in yeast, we aim to model the pan-genome of S. cerevisiae. To initiate the yeast pan-genome, we newly sequenced or re-sequenced the genomes of 25 strains that are commonly used in the yeast research community using advanced sequencing technology at high quality. We also developed a pipeline for automated pan-genome analysis, which integrates the steps of assembly, annotation, and variation calling. To assign strain-specific functional annotations, we identified genes that were not present in the reference genome. We classified these according to their presence or absence across strains and characterized each group of genes with known functional and phenotypic features. The functional roles of novel genes not found in the reference genome and associated with strains or groups of strains appear to be consistent with anticipated adaptations in specific lineages. As more S. cerevisiae strain genomes are released, our analysis can be used to collate genome data and relate it to lineage-specific patterns of genome evolution. Our new tool set will enhance our understanding of genomic and functional evolution in S. cerevisiae, and will be available to the yeast genetics and molecular biology community.

  5. Fluorescent labelling of in situ hybridisation probes through the copper-catalysed azide-alkyne cycloaddition reaction.

    Science.gov (United States)

    Hesse, Susann; Manetto, Antonio; Cassinelli, Valentina; Fuchs, Jörg; Ma, Lu; Raddaoui, Nada; Houben, Andreas

    2016-09-01

    In situ hybridisation is a powerful tool to investigate the genome and chromosome architecture. Nick translation (NT) is widely used to label DNA probes for fluorescence in situ hybridisation (FISH). However, NT is limited to the use of long double-stranded DNA and does not allow the labelling of single-stranded and short DNA, e.g. oligonucleotides. An alternative technique is the copper(I)-catalysed azide-alkyne cycloaddition (CuAAC), in which azide and alkyne functional groups react in a multistep process catalysed by copper(I) ions to give 1,4-disubstituted 1,2,3-triazoles at a high yield (also called 'click reaction'). We successfully applied this technique to label short single-stranded DNA probes as well as long PCR-derived double-stranded probes and tested them by FISH on plant chromosomes and nuclei. The hybridisation efficiency of differently labelled probes was compared to those obtained by conventional labelling techniques. We show that copper(I)-catalysed azide-alkyne cycloaddition-labelled probes are reliable tools to detect different types of repetitive sequences on chromosomes, opening new promising routes for the detection of single-copy genes. Moreover, a combination of FISH using such probes with other techniques, e.g. immunohistochemistry (IHC) and cell proliferation assays using 5-ethynyl-deoxyuridine, is herein shown to be easily feasible.

  6. Integrative bayesian network analysis of genomic data.

    Science.gov (United States)

    Ni, Yang; Stingo, Francesco C; Baladandayuthapani, Veerabhadran

    2014-01-01

    Rapid development of genome-wide profiling technologies has made it possible to conduct integrative analysis on genomic data from multiple platforms. In this study, we develop a novel integrative Bayesian network approach to investigate the relationships between genetic and epigenetic alterations as well as how these mutations affect a patient's clinical outcome. We take a Bayesian network approach that admits a convenient decomposition of the joint distribution into local distributions. Exploiting the prior biological knowledge about regulatory mechanisms, we model each local distribution as linear regressions. This allows us to analyze multi-platform genome-wide data in a computationally efficient manner. We illustrate the performance of our approach through simulation studies. Our methods are motivated by and applied to a multi-platform glioblastoma dataset, from which we reveal several biologically relevant relationships that have been validated in the literature as well as new genes that could potentially be novel biomarkers for cancer progression.
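
    The decomposition mentioned in this abstract can be written out generically as follows. This is the textbook factorization of a Bayesian network with linear local models, not the authors' exact specification: the symbols x_i, pa(i), β_ij and ε_i are introduced here for illustration, and the Gaussian error term is an assumption.

```latex
% Joint distribution factorized over the directed acyclic graph:
p(x_1, \dots, x_p) \;=\; \prod_{i=1}^{p} p\bigl(x_i \mid x_{\mathrm{pa}(i)}\bigr)

% Each local distribution modelled as a linear regression on the parents of node i:
x_i \;=\; \sum_{j \in \mathrm{pa}(i)} \beta_{ij}\, x_j \;+\; \varepsilon_i,
\qquad \varepsilon_i \sim \mathcal{N}\bigl(0, \sigma_i^2\bigr)
```

    Because each factor involves only a node and its parents, each regression can be fitted separately, which is what makes the approach computationally tractable for genome-wide data.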

  7. Comparative genome analysis of Basidiomycete fungi

    Energy Technology Data Exchange (ETDEWEB)

    Riley, Robert; Salamov, Asaf; Henrissat, Bernard; Nagy, Laszlo; Brown, Daren; Held, Benjamin; Baker, Scott; Blanchette, Robert; Boussau, Bastien; Doty, Sharon L.; Fagnan, Kirsten; Floudas, Dimitris; Levasseur, Anthony; Manning, Gerard; Martin, Francis; Morin, Emmanuelle; Otillar, Robert; Pisabarro, Antonio; Walton, Jonathan; Wolfe, Ken; Hibbett, David; Grigoriev, Igor

    2013-08-07

    Fungi of the phylum Basidiomycota (basidiomycetes) make up some 37% of the described fungi, and are important in forestry, agriculture, medicine, and bioenergy. This diverse phylum includes symbionts, pathogens, and saprotrophs including the majority of wood decaying and ectomycorrhizal species. To better understand the genetic diversity of this phylum we compared the genomes of 35 basidiomycetes including 6 newly sequenced genomes. These genomes span extremes of genome size, gene number, and repeat content. Analysis of core genes reveals that some 48% of basidiomycete proteins are unique to the phylum with nearly half of those (22%) found in only one organism. Correlations between lifestyle and certain gene families are evident. Phylogenetic patterns of plant biomass-degrading genes in Agaricomycotina suggest a continuum rather than a dichotomy between the white rot and brown rot modes of wood decay. Based on phylogenetically-informed PCA analysis of wood decay genes, we predict that Botryobasidium botryosum and Jaapia argillacea have properties similar to white rot species, although neither has typical ligninolytic class II fungal peroxidases (PODs). This prediction is supported by growth assays in which both fungi exhibit wood decay with white rot-like characteristics. Based on this, we suggest that the white/brown rot dichotomy may be inadequate to describe the full range of wood decaying fungi. Analysis of the rate of discovery of proteins with no or few homologs suggests the value of continued sequencing of basidiomycete fungi.

  8. Combing genomic DNA for structural and functional studies.

    Science.gov (United States)

    Schurra, Catherine; Bensimon, Aaron

    2009-01-01

    Molecular combing is a process whereby single DNA molecules bind by their extremities to a silanised surface and are then uniformly stretched and aligned by a receding air/water interface (1). This method, with a high resolution ranging from a few kilobases to megabases, has many applications in the field of molecular cytogenetics, allowing structural and functional analysis at the genome level. Here we describe protocols for preparing DNA for combing and for the use of fluorescent hybridisation (FH) applied to combed DNA to conduct physical mapping or genomic structural analysis. We also present the methodology for visualising and studying DNA replication using combed DNA.

  9. Mating, hybridisation and introgression in Lasius ants (Hymenoptera: Formicidae)

    DEFF Research Database (Denmark)

    Van der Have, Tom; Pedersen, Jes Søe; Boomsma, Jacobus Jan

    2011-01-01

    Recent reviews have shown that hybridisation among ant species is likely to be more common than previously appreciated, but that documented cases of introgression remain rare. After molecular phylogenetic work had shown that European Lasius niger (LINNAEUS, 1758) and L. psammophilus SEIFERT, 1992...... (formerly L. alienus (FOERSTER, 1850)) are unlikely to be very closely related, we decided to analyse an old data set confirming the conclusion by PEARSON (1983) that these two ants can indeed form viable hybrids. We show that signatures of introgression can be detected in a Danish site...... sympatrically. This would imply that multiple accessible field sites are available to study the molecular details of hybridisation and introgression between two ant species that have variable degrees of sympatry throughout their distributional ranges...

  10. Applied bioinformatics: Genome annotation and transcriptome analysis

    DEFF Research Database (Denmark)

    Gupta, Vikas

    and dhurrin, which have not previously been characterized in blueberries. There are more than 44,500 spider species with distinct habitats and unique characteristics. Spiders are masters of producing silk webs to catch prey and using venom to neutralize. The exploration of the genetics behind these properties...... japonicus (Lotus), Vaccinium corymbosum (blueberry), Stegodyphus mimosarum (spider) and Trifolium occidentale (clover). From a bioinformatics data analysis perspective, my work can be divided into three parts; genome annotation, small RNA, and gene expression analysis. Lotus is a legume of significant...... has just started. We have assembled and annotated the first two spider genomes to facilitate our understanding of spiders at the molecular level. The need for analyzing the large and increasing amount of sequencing data has increased the demand for efficient, user friendly, and broadly applicable...

  11. Comparative genomic analysis of soybean flowering genes.

    Directory of Open Access Journals (Sweden)

    Chol-Hee Jung

    Full Text Available Flowering is an important agronomic trait that determines crop yield. Soybean is a major oilseed legume crop used for human and animal feed. Legumes have unique vegetative and floral complexities. Our understanding of the molecular basis of flower initiation and development in legumes is limited. Here, we address this by using a computational approach to examine flowering regulatory genes in the soybean genome in comparison to the most studied model plant, Arabidopsis. For this comparison, a genome-wide analysis of orthologue groups was performed, followed by an in silico gene expression analysis of the identified soybean flowering genes. Phylogenetic analyses of the gene families highlighted the evolutionary relationships among these candidates. Our study identified key flowering genes in soybean and indicates that the vernalisation and the ambient-temperature pathways seem to be the most variant in soybean. A comparison of the orthologue groups containing flowering genes indicated that, on average, each Arabidopsis flowering gene has 2-3 orthologous copies in soybean. Our analysis highlighted that the CDF3, VRN1, SVP, AP3 and PIF3 genes are paralogue-rich genes in soybean. Furthermore, the genome mapping of the soybean flowering genes showed that these genes are scattered randomly across the genome. A paralogue comparison indicated that the soybean genes comprising the largest orthologue group are clustered in a 1.4 Mb region on chromosome 16 of soybean. Furthermore, a comparison with the undomesticated soybean (Glycine soja) revealed that there are hundreds of SNPs that are associated with putative soybean flowering genes and that there are structural variants that may affect the genes of the light-signalling and ambient-temperature pathways in soybean. Our study provides a framework for the soybean flowering pathway and insights into the relationship and evolution of flowering genes between a short-day soybean and the long-day plant

  12. PGSB/MIPS Plant Genome Information Resources and Concepts for the Analysis of Complex Grass Genomes.

    Science.gov (United States)

    Spannagl, Manuel; Bader, Kai; Pfeifer, Matthias; Nussbaumer, Thomas; Mayer, Klaus F X

    2016-01-01

    PGSB (Plant Genome and Systems Biology; formerly MIPS-Munich Institute for Protein Sequences) has been involved in developing, implementing and maintaining plant genome databases for more than a decade. Genome databases and analysis resources have focused on individual genomes and aim to provide flexible and maintainable datasets for model plant genomes as a backbone against which experimental data, e.g., from high-throughput functional genomics, can be organized and analyzed. In addition, genomes from both model and crop plants form a scaffold for comparative genomics, assisted by specialized tools such as the CrowsNest viewer to explore conserved gene order (synteny) between related species on macro- and micro-levels. The genomes of many economically important Triticeae plants such as wheat, barley, and rye present a great challenge for sequence assembly and bioinformatic analysis due to their enormous complexity and large genome size. Novel concepts and strategies have been developed to deal with these difficulties and have been applied to the genomes of wheat, barley, rye, and other cereals. This includes the GenomeZipper concept, reference-guided exome assembly, and "chromosome genomics" based on flow cytometry sorted chromosomes.

  13. Human and mouse genome analysis using array comparative genomic hybridization

    NARCIS (Netherlands)

    Snijders, Antoine Maria

    2004-01-01

    Almost all human cancers as well as developmental abnormalities are characterized by the presence of genetic alterations, most of which target a gene or a particular genomic locus resulting in altered gene expression and ultimately an altered phenotype. Different types of genetic alterations include

  14. Genome-wide analysis correlates Ayurveda Prakriti.

    Science.gov (United States)

    Govindaraj, Periyasamy; Nizamuddin, Sheikh; Sharath, Anugula; Jyothi, Vuskamalla; Rotti, Harish; Raval, Ritu; Nayak, Jayakrishna; Bhat, Balakrishna K; Prasanna, B V; Shintre, Pooja; Sule, Mayura; Joshi, Kalpana S; Dedge, Amrish P; Bharadwaj, Ramachandra; Gangadharan, G G; Nair, Sreekumaran; Gopinath, Puthiya M; Patwardhan, Bhushan; Kondaiah, Paturu; Satyamoorthy, Kapaettu; Valiathan, Marthanda Varma Sankaran; Thangaraj, Kumarasamy

    2015-10-29

    The practice of Ayurveda, the traditional medicine of India, is based on the concept of three major constitutional types (Vata, Pitta and Kapha) defined as "Prakriti". To the best of our knowledge, no study has convincingly correlated genomic variations with the classification of Prakriti. In the present study, we performed genome-wide SNP (single nucleotide polymorphism) analysis (Affymetrix, 6.0) of 262 well-classified male individuals (after screening 3416 subjects) belonging to three Prakritis. We found that 52 SNPs (p ≤ 1 × 10^-5) were significantly different between Prakritis, without any confounding effect of stratification, after 10^6 permutations. Principal component analysis (PCA) of these SNPs classified the 262 individuals into their respective groups (Vata, Pitta and Kapha) irrespective of their ancestry, which demonstrates its power in categorization. We further validated our finding with 297 Indian population samples with known ancestry. Subsequently, we found that PGM1 correlates with the phenotype of Pitta as described in the ancient text of Caraka Samhita, suggesting that the phenotypic classification of India's traditional medicine has a genetic basis, and that its Prakriti-based practice, in vogue for many centuries, resonates with personalized medicine.

  15. Enhancing genomics information retrieval through dimensional analysis.

    Science.gov (United States)

    Hu, Qinmin; Huang, Jimmy Xiangji

    2013-06-01

    We propose a novel dimensional analysis approach to employing meta information in order to find the relationships within the unstructured or semi-structured document/passages for improving genomics information retrieval performance. First, we make use of the auxiliary information as three basic dimensions, namely "temporal", "journal", and "author". The reference section is treated as a commensurable quantity of the three basic dimensions. Then, the sample space and subspaces are built up and a set of events are defined to meet the basic requirement of dimensional homogeneity to be commensurable quantities. After that, the classic graph analysis algorithm in the Web environments is applied on each dimension respectively to calculate the importance of each dimension. Finally, we integrate all the dimension networks and re-rank the outputs for evaluation. Our experimental results show the proposed approach is superior and promising.

  16. Genome-wide Analysis of Gene Regulation

    DEFF Research Database (Denmark)

    Chen, Yun

    cells are capable of regulating their gene expression, so that each cell can only express a particular set of genes yielding limited numbers of proteins with specialized functions. Therefore a rigid control of differential gene expression is necessary for cellular diversity. On the other hand, aberrant...... gene regulation will disrupt the cell’s fundamental processes, which in turn can cause disease. Hence, understanding gene regulation is essential for deciphering the code of life. Along with the development of high throughput sequencing (HTS) technology and the subsequent large-scale data analysis......, genome-wide assays have increased our understanding of gene regulation significantly. This thesis describes the integration and analysis of HTS data across different important aspects of gene regulation. Gene expression can be regulated at different stages when the genetic information is passed from gene...

  17. Multidimensional gene set analysis of genomic data.

    Directory of Open Access Journals (Sweden)

    David Montaner

    Full Text Available Understanding the functional implications of changes in gene expression, mutations, etc., is the aim of most genomic experiments. To achieve this, several functional profiling methods have been proposed. Such methods study the behaviour of different gene modules (e.g. gene ontology terms) in response to one particular variable (e.g. differential gene expression). In spite of the wealth of information provided by functional profiling methods, a common limitation to all of them is their inherent unidimensional nature. In order to overcome this restriction we present a multidimensional logistic model that allows studying the relationship of gene modules with different genome-scale measurements (e.g. differential expression, genotyping association, methylation, copy number alterations, heterozygosity, etc.) simultaneously. Moreover, the relationship of such functional modules with the interactions among the variables can also be studied, which produces novel results that cannot be derived from the conventional unidimensional functional profiling methods. We report sound results of gene set associations that remained undetected by the conventional one-dimensional gene set analysis in several examples. Our findings demonstrate the potential of the proposed approach for the discovery of new cell functionalities with complex dependences on more than one variable.

  18. Genome Data Exploration Using Correspondence Analysis.

    Science.gov (United States)

    Tekaia, Fredj

    2016-01-01

    Recent developments of sequencing technologies that allow the production of massive amounts of genomic and genotyping data have highlighted the need for synthetic data representation and pattern recognition methods that can mine and help discover biologically meaningful knowledge contained in such large data sets. Correspondence analysis (CA) is an exploratory descriptive method designed to analyze two-way data tables, including some measure of association between rows and columns. It constructs linear combinations of variables, known as factors. CA has been used for decades to study high-dimensional data, and remarkable inferences from large data tables were obtained by reducing the dimensionality to a few orthogonal factors that correspond to the largest amount of variability in the data. Herein, I review CA and highlight its use by considering examples in handling high-dimensional data that can be constructed from genomic and genetic studies. Examples in amino acid compositions of large sets of species (viruses, phages, yeast, and fungi) as well as an example related to pairwise shared orthologs in a set of yeast and fungal species, as obtained from their proteome comparisons, are considered. For the first time, results show striking segregations between yeasts and fungi as well as between viruses and phages. Distributions obtained from shared orthologs show clusters of yeast and fungal species corresponding to their phylogenetic relationships. A direct comparison with the principal component analysis method is discussed using a recently published example of genotyping data related to newly discovered traces of an ancient hominid that was compared to modern human populations in the search for ancestral similarities. CA offers more detailed results highlighting links between modern humans and the ancient hominid and their characterizations. Compared to the popular principal component analysis method, CA allows easier and more effective interpretation of results
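
    The starting point of CA on a two-way table can be sketched in a few lines. The example below is an illustration only and is not taken from the reviewed paper: the function name `ca_standardized_residuals` and the toy tables are hypothetical. It computes row and column masses, the matrix of standardized residuals, and the total inertia (the chi-square statistic divided by the table total). The CA factors themselves would come from a singular value decomposition of the residual matrix, which this standard-library sketch leaves out. It assumes no row or column of the table sums to zero.

```python
def ca_standardized_residuals(table):
    """First step of correspondence analysis on a two-way count table.

    Returns (row_masses, col_masses, residuals, total_inertia).
    Assumes every row and every column has a non-zero sum.
    """
    grand_total = sum(sum(row) for row in table)
    # Correspondence matrix: counts divided by the grand total.
    P = [[x / grand_total for x in row] for row in table]
    row_masses = [sum(row) for row in P]
    col_masses = [sum(col) for col in zip(*P)]
    # Standardized residuals: (observed - expected) / sqrt(expected),
    # where expected = row mass * column mass (independence model).
    residuals = [
        [
            (P[i][j] - row_masses[i] * col_masses[j])
            / (row_masses[i] * col_masses[j]) ** 0.5
            for j in range(len(col_masses))
        ]
        for i in range(len(row_masses))
    ]
    # Total inertia = sum of squared residuals = chi-square / grand total.
    total_inertia = sum(s * s for row in residuals for s in row)
    return row_masses, col_masses, residuals, total_inertia


# Hypothetical 2 x 2 table (e.g. counts of two amino acids in two species).
rows, cols, S, inertia = ca_standardized_residuals([[10, 20], [20, 10]])
```

    The squared singular values of the residual matrix sum to the total inertia, so this quantity bounds how much variability any set of factors can account for.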

  19. Pig genome sequence - analysis and publication strategy

    NARCIS (Netherlands)

    Archibald, A.L.; Bolund, L.; Churcher, C.; Fredholm, M.; Groenen, M.A.M.; Harlizius, B.

    2010-01-01

    Background - The pig genome is being sequenced and characterised under the auspices of the Swine Genome Sequencing Consortium. The sequencing strategy followed a hybrid approach combining hierarchical shotgun sequencing of BAC clones and whole genome shotgun sequencing. Results - Assemblies of the B

  20. Pig genome sequence - analysis and publication strategy

    DEFF Research Database (Denmark)

    Archibald, Alan L.; Bolund, Lars; Churcher, Carol;

    2010-01-01

    BACKGROUND: The pig genome is being sequenced and characterised under the auspices of the Swine Genome Sequencing Consortium. The sequencing strategy followed a hybrid approach combining hierarchical shotgun sequencing of BAC clones and whole genome shotgun sequencing. RESULTS: Assemblies......) is under construction and will incorporate whole genome shotgun sequence (WGS) data providing > 30x genome coverage. The WGS sequence, most of which comprise short Illumina/Solexa reads, were generated from DNA from the same single Duroc sow as the source of the BAC library from which clones were...

  1. Phylogenomic Analysis and Dynamic Evolution of Chloroplast Genomes in Salicaceae

    Directory of Open Access Journals (Sweden)

    Yuan Huang

    2017-06-01

    Full Text Available Chloroplast genomes of plants are highly conserved in both gene order and gene content. Analysis of the whole chloroplast genome is known to provide many more informative DNA sites and thus generates high resolution for plant phylogenies. Here, we report the complete chloroplast genomes of three Salix species in the family Salicaceae. The phylogeny of Salicaceae inferred from complete chloroplast genomes is generally consistent with previous studies but resolved with higher statistical support. Incongruences of phylogeny, however, are observed in the genus Populus, which most likely result from homoplasy. By comparing three Salix chloroplast genomes with the published chloroplast genomes of other Salicaceae species, we demonstrate that the synteny and length of chloroplast genomes in Salicaceae are highly conserved but have experienced dynamic evolution among species. We identify seven positively selected chloroplast genes in Salicaceae, which might be related to the adaptive evolution of Salicaceae species. Comparative chloroplast genome analysis within the family also indicates that some chloroplast genes were lost or became pseudogenes, from which we infer that these chloroplast genes were horizontally transferred to the nucleus genome. Based on the complete nucleus genome sequences from two Salicaceae species, we find, remarkably, that the entire chloroplast genome was indeed transferred and integrated into the nucleus genome at least once in the individual used for the reference genome of P. trichocarpa. This observation, along with the presence of large nuclear plastid DNAs (NUPTs) and NUPTs containing multiple chloroplast genes in their original order in the chloroplast genome, favors the DNA-mediated hypothesis of organelle-to-nucleus DNA transfer. Overall, the phylogenomic analysis using complete chloroplast genomes clearly elucidates the phylogeny of Salicaceae. The identification of positively selected chloroplast genes and dynamic chloroplast-to-nucleus gene transfers in

  2. The Complete Mitochondrial Genome of Gossypium hirsutum and Evolutionary Analysis of Higher Plant Mitochondrial Genomes

    Science.gov (United States)

    Su, Aiguo; Geng, Jianing; Grover, Corrinne E.; Hu, Songnian; Hua, Jinping

    2013-01-01

    Background: Mitochondria are the main manufacturers of cellular ATP in eukaryotes. The plant mitochondrial genome contains a large amount of foreign DNA and repeated sequences that undergo frequent intramolecular recombination. Upland cotton (Gossypium hirsutum L.) is one of the main natural fiber crops and also an important oil-producing plant. Sequencing of the cotton mitochondrial (mt) genome could be helpful for research on the evolution of plant mt genomes. Methodology/Principal Findings: We used 454 sequencing, combined with screening of a fosmid library of the Gossypium hirsutum mt genome and sequencing of positive clones, and conducted a series of evolutionary analyses on the mt genomes of Cycas taitungensis and 24 angiosperms. After data assembly and contig joining, the complete mitochondrial genome sequence of G. hirsutum was obtained. The completed G. hirsutum mt genome is 621,884 bp in length and contains 68 genes, including 35 protein genes, four rRNA genes and 29 tRNA genes. Five gene clusters are found conserved in all plant mt genomes; one and four clusters are specifically conserved in monocots and dicots, respectively. Homologous sequences are distributed along the plant mt genomes, and closely related species share the most homologous sequences. For species that have both mt and chloroplast genome sequences available, we checked the location of cp-like migration and found several fragments closely linked with mitochondrial genes. Conclusion: The G. hirsutum mt genome possesses most of the common characters of higher plant mt genomes. The existence of syntenic gene clusters, as well as the conservation of some intergenic sequences and genic content among the plant mt genomes, suggests that evolution of mt genomes is consistent with plant taxonomy but independent among different species. PMID:23940520

  3. The complete mitochondrial genome of Gossypium hirsutum and evolutionary analysis of higher plant mitochondrial genomes.

    Directory of Open Access Journals (Sweden)

    Guozheng Liu

    Full Text Available BACKGROUND: Mitochondria are the main manufacturers of cellular ATP in eukaryotes. The plant mitochondrial genome contains a large amount of foreign DNA and repeated sequences that undergo frequent intramolecular recombination. Upland cotton (Gossypium hirsutum L.) is one of the main natural fiber crops and also an important oil-producing plant. Sequencing of the cotton mitochondrial (mt) genome could be helpful for research on the evolution of plant mt genomes. METHODOLOGY/PRINCIPAL FINDINGS: We used 454 sequencing, combined with screening of a fosmid library of the Gossypium hirsutum mt genome and sequencing of positive clones, and conducted a series of evolutionary analyses on the mt genomes of Cycas taitungensis and 24 angiosperms. After data assembly and contig joining, the complete mitochondrial genome sequence of G. hirsutum was obtained. The completed G. hirsutum mt genome is 621,884 bp in length and contains 68 genes, including 35 protein genes, four rRNA genes and 29 tRNA genes. Five gene clusters are found conserved in all plant mt genomes; one and four clusters are specifically conserved in monocots and dicots, respectively. Homologous sequences are distributed along the plant mt genomes, and closely related species share the most homologous sequences. For species that have both mt and chloroplast genome sequences available, we checked the location of cp-like migration and found several fragments closely linked with mitochondrial genes. CONCLUSION: The G. hirsutum mt genome possesses most of the common characters of higher plant mt genomes. The existence of syntenic gene clusters, as well as the conservation of some intergenic sequences and genic content among the plant mt genomes, suggests that evolution of mt genomes is consistent with plant taxonomy but independent among different species.

  4. Millstone: software for multiplex microbial genome analysis and engineering.

    Science.gov (United States)

    Goodman, Daniel B; Kuznetsov, Gleb; Lajoie, Marc J; Ahern, Brian W; Napolitano, Michael G; Chen, Kevin Y; Chen, Changping; Church, George M

    2017-05-25

    Inexpensive DNA sequencing and advances in genome editing have made computational analysis a major rate-limiting step in adaptive laboratory evolution and microbial genome engineering. We describe Millstone, a web-based platform that automates genotype comparison and visualization for projects with up to hundreds of genomic samples. To enable iterative genome engineering, Millstone allows users to design oligonucleotide libraries and create successive versions of reference genomes. Millstone is open source and easily deployable to a cloud platform, local cluster, or desktop, making it a scalable solution for any lab.

  5. SIGMA: A System for Integrative Genomic Microarray Analysis of Cancer Genomes

    Directory of Open Access Journals (Sweden)

    Davies Jonathan J

    2006-12-01

    Full Text Available Abstract Background: The prevalence of high resolution profiling of genomes has created a need for the integrative analysis of information generated from multiple methodologies and platforms. Although the majority of data in the public domain are gene expression profiles, and expression analysis software is available, the increase in array CGH studies has enabled integration of high throughput genomic and gene expression datasets. However, tools for direct mining and analysis of array CGH data are limited. Hence, there is a great need for analytical and display software tailored to cross-platform integrative analysis of cancer genomes. Results: We have created a user-friendly Java application to facilitate sophisticated visualization and analysis such as cross-tumor and cross-platform comparisons. To demonstrate the utility of this software, we assembled array CGH data representing Affymetrix SNP chip, Stanford cDNA arrays and whole genome tiling path array platforms for cross comparison. This cancer genome database contains 267 profiles from commonly used cancer cell lines representing 14 different tissue types. Conclusion: In this study we have developed an application for the visualization and analysis of data from high resolution array CGH platforms that can be adapted for analysis of multiple types of high throughput genomic datasets. Furthermore, we invite researchers using array CGH technology to deposit both their raw and processed data, as this will be a continually expanding database of cancer genomes. This publicly available resource, the System for Integrative Genomic Microarray Analysis (SIGMA) of cancer genomes, can be accessed at http://sigma.bccrc.ca.

  6. Integrative genomic analysis by interoperation of bioinformatics tools in GenomeSpace

    Science.gov (United States)

    Thorvaldsdottir, Helga; Liefeld, Ted; Ocana, Marco; Borges-Rivera, Diego; Pochet, Nathalie; Robinson, James T.; Demchak, Barry; Hull, Tim; Ben-Artzi, Gil; Blankenberg, Daniel; Barber, Galt P.; Lee, Brian T.; Kuhn, Robert M.; Nekrutenko, Anton; Segal, Eran; Ideker, Trey; Reich, Michael; Regev, Aviv; Chang, Howard Y.; Mesirov, Jill P.

    2015-01-01

    Integrative analysis of multiple data types to address complex biomedical questions requires the use of multiple software tools in concert and remains an enormous challenge for most of the biomedical research community. Here we introduce GenomeSpace (http://www.genomespace.org), a cloud-based, cooperative community resource. Seeded as a collaboration of six of the most popular genomics analysis tools, GenomeSpace now supports the streamlined interaction of 20 bioinformatics tools and data resources. To help non-programming users leverage GenomeSpace in integrative analysis, it offers a growing set of ‘recipes’, short workflows involving a few tools and steps to guide investigators through high utility analysis tasks. PMID:26780094

  7. Pathway and network analysis of cancer genomes

    DEFF Research Database (Denmark)

    Creixell, Pau; Reimand, Jueri; Haider, Syed

    2015-01-01

    Genomic information on tumors from 50 cancer types cataloged by the International Cancer Genome Consortium (ICGC) shows that only a few well-studied driver genes are frequently mutated, in contrast to many infrequently mutated genes that may also contribute to tumor biology. Hence there has been...

  8. Identification of probable genomic packaging signal sequence from SARS-CoV genome by bioinformatics analysis

    Institute of Scientific and Technical Information of China (English)

    QIN Lei; XIONG Bin; LUO Cheng; GUO Zong-Ming; HAO Pei; SU Jiong; NAN Peng; FENG Ying; SHI Yi-Xiang; YU Xiao-Jing; LUO Xiao-Min; CHEN Kai-Xian; SHEN Xu; SHEN Jian-Hua; ZOU Jian-Ping; ZHAO Guo-Ping; SHI Tie-Liu; HE Wei-Zhong; ZHONG Yang; JIANG Hua-Liang; LI Yi-Xue

    2003-01-01

    AIM: To predict the probable genomic packaging signal of SARS-CoV by bioinformatics analysis. The derived packaging signal may be used to design antisense RNA and RNA interference (RNAi) drugs for treating SARS. METHODS: Based on studies of the genomic packaging signals of MHV and BCoV, especially the information about primary and secondary structures, the putative genomic packaging signal of SARS-CoV was analyzed using bioinformatic tools. Multiple alignment of the genomic sequences was performed among SARS-CoV, MHV, BCoV, PEDV and HCoV 229E. Secondary structures of RNA sequences were also predicted to identify the possible genomic packaging signals. Meanwhile, the N and M proteins of all five viruses were analyzed to study the evolutionary relationship with genomic packaging signals. RESULTS: The putative genomic packaging signal of SARS-CoV is located at the 3′ end of ORF1b, near that of MHV and BCoV, which is the most variable region of this gene. The RNA secondary structure of the SARS-CoV genomic packaging signal is very similar to that of MHV and BCoV. The same result was also obtained in studying the genomic packaging signals of PEDV and HCoV 229E. Furthermore, the genomic sequence multiple alignment indicated that the locations of the packaging signals of SARS-CoV, PEDV, and HCoV overlapped each other. It seems that the mutation rate of packaging signal sequences is much higher than that of the N protein, while the M protein shows only subtle variations. CONCLUSIONS: The probable genomic packaging signal of SARS-CoV is analogous to that of MHV and BCoV, with the corresponding secondary RNA structure located in a similar region of ORF1b. The positions where genomic packaging signals exist have undergone rounds of mutations, which may consequently influence the primary structures of the N and M proteins.

  9. Genome analysis methods - PGDBj Registered plant list, Marker list, QTL list, Plant DB link & Genome analysis methods | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    Full Text Available ...ear Year of genome analysis Sequencing method Sequencing method Read counts Read counts Covered genome region Covered...otation method Number of predicted genes Number of predicted genes Genome database Genome database informati...

  10. GenomePeek—an online tool for prokaryotic genome and metagenome analysis

    Directory of Open Access Journals (Sweden)

    Katelyn McNair

    2015-06-01

    Full Text Available As more and more prokaryotic sequencing takes place, a method to quickly and accurately analyze these data is needed. Previous tools are mainly designed for metagenomic analysis and have limitations, such as long runtimes and significant false positive error rates. The online tool GenomePeek (edwards.sdsu.edu/GenomePeek) was developed to analyze both single genome and metagenome sequencing files, quickly and with low error rates. GenomePeek uses a sequence assembly approach in which reads matching a set of conserved genes are extracted, assembled and then aligned against the highly specific reference database. GenomePeek was found to be faster than traditional approaches while still keeping error rates low, as well as offering unique data visualization options.

  11. Genomic analysis of plant chromosomes based on meiotic pairing

    Directory of Open Access Journals (Sweden)

    Lisete Chamma Davide

    2007-12-01

    Full Text Available This review presents the principles and applications of classical genomic analysis, with emphasis on plant breeding. The main mathematical models used to estimate preferential chromosome pairing in diploid or polyploid, interspecific or intergeneric hybrids are presented and discussed, with special reference to applications and studies that define genome relationships among species of the Poaceae family.

  12. Initial sequencing and analysis of the human genome.

    Science.gov (United States)

    Lander, E S; Linton, L M; Birren, B; Nusbaum, C; Zody, M C; Baldwin, J; Devon, K; Dewar, K; Doyle, M; FitzHugh, W; Funke, R; Gage, D; Harris, K; Heaford, A; Howland, J; Kann, L; Lehoczky, J; LeVine, R; McEwan, P; McKernan, K; Meldrim, J; Mesirov, J P; Miranda, C; Morris, W; Naylor, J; Raymond, C; Rosetti, M; Santos, R; Sheridan, A; Sougnez, C; Stange-Thomann, Y; Stojanovic, N; Subramanian, A; Wyman, D; Rogers, J; Sulston, J; Ainscough, R; Beck, S; Bentley, D; Burton, J; Clee, C; Carter, N; Coulson, A; Deadman, R; Deloukas, P; Dunham, A; Dunham, I; Durbin, R; French, L; Grafham, D; Gregory, S; Hubbard, T; Humphray, S; Hunt, A; Jones, M; Lloyd, C; McMurray, A; Matthews, L; Mercer, S; Milne, S; Mullikin, J C; Mungall, A; Plumb, R; Ross, M; Shownkeen, R; Sims, S; Waterston, R H; Wilson, R K; Hillier, L W; McPherson, J D; Marra, M A; Mardis, E R; Fulton, L A; Chinwalla, A T; Pepin, K H; Gish, W R; Chissoe, S L; Wendl, M C; Delehaunty, K D; Miner, T L; Delehaunty, A; Kramer, J B; Cook, L L; Fulton, R S; Johnson, D L; Minx, P J; Clifton, S W; Hawkins, T; Branscomb, E; Predki, P; Richardson, P; Wenning, S; Slezak, T; Doggett, N; Cheng, J F; Olsen, A; Lucas, S; Elkin, C; Uberbacher, E; Frazier, M; Gibbs, R A; Muzny, D M; Scherer, S E; Bouck, J B; Sodergren, E J; Worley, K C; Rives, C M; Gorrell, J H; Metzker, M L; Naylor, S L; Kucherlapati, R S; Nelson, D L; Weinstock, G M; Sakaki, Y; Fujiyama, A; Hattori, M; Yada, T; Toyoda, A; Itoh, T; Kawagoe, C; Watanabe, H; Totoki, Y; Taylor, T; Weissenbach, J; Heilig, R; Saurin, W; Artiguenave, F; Brottier, P; Bruls, T; Pelletier, E; Robert, C; Wincker, P; Smith, D R; Doucette-Stamm, L; Rubenfield, M; Weinstock, K; Lee, H M; Dubois, J; Rosenthal, A; Platzer, M; Nyakatura, G; Taudien, S; Rump, A; Yang, H; Yu, J; Wang, J; Huang, G; Gu, J; Hood, L; Rowen, L; Madan, A; Qin, S; Davis, R W; Federspiel, N A; Abola, A P; Proctor, M J; Myers, R M; Schmutz, J; Dickson, M; Grimwood, J; Cox, D R; Olson, M V; Kaul, R; Raymond, C; Shimizu, N; 
Kawasaki, K; Minoshima, S; Evans, G A; Athanasiou, M; Schultz, R; Roe, B A; Chen, F; Pan, H; Ramser, J; Lehrach, H; Reinhardt, R; McCombie, W R; de la Bastide, M; Dedhia, N; Blöcker, H; Hornischer, K; Nordsiek, G; Agarwala, R; Aravind, L; Bailey, J A; Bateman, A; Batzoglou, S; Birney, E; Bork, P; Brown, D G; Burge, C B; Cerutti, L; Chen, H C; Church, D; Clamp, M; Copley, R R; Doerks, T; Eddy, S R; Eichler, E E; Furey, T S; Galagan, J; Gilbert, J G; Harmon, C; Hayashizaki, Y; Haussler, D; Hermjakob, H; Hokamp, K; Jang, W; Johnson, L S; Jones, T A; Kasif, S; Kaspryzk, A; Kennedy, S; Kent, W J; Kitts, P; Koonin, E V; Korf, I; Kulp, D; Lancet, D; Lowe, T M; McLysaght, A; Mikkelsen, T; Moran, J V; Mulder, N; Pollara, V J; Ponting, C P; Schuler, G; Schultz, J; Slater, G; Smit, A F; Stupka, E; Szustakowki, J; Thierry-Mieg, D; Thierry-Mieg, J; Wagner, L; Wallis, J; Wheeler, R; Williams, A; Wolf, Y I; Wolfe, K H; Yang, S P; Yeh, R F; Collins, F; Guyer, M S; Peterson, J; Felsenfeld, A; Wetterstrand, K A; Patrinos, A; Morgan, M J; de Jong, P; Catanese, J J; Osoegawa, K; Shizuya, H; Choi, S; Chen, Y J; Szustakowki, J

    2001-02-15

    The human genome holds an extraordinary trove of information about human development, physiology, medicine and evolution. Here we report the results of an international collaboration to produce and make freely available a draft sequence of the human genome. We also present an initial analysis of the data, describing some of the insights that can be gleaned from the sequence.

  13. Analysis of Simple Sequence Repeats in Genomes of Rhizobia

    Institute of Scientific and Technical Information of China (English)

    GAO Ya-mei; HAN Yi-qiang; TANG Hui; SUN Dong-mei; WANG Yan-jie; WANG Wei-dong

    2008-01-01

    Simple sequence repeats (SSRs), or microsatellites, are genetic markers that are ubiquitous in the genomes of various organisms. Analysis of SSRs in rhizobial genomes provides useful information for a variety of applications in the population genetics of rhizobia. We analyzed the occurrence, relative abundance, and relative density of SSRs in the Bradyrhizobium japonicum, Mesorhizobium loti, and Sinorhizobium meliloti genomes sequenced in the microorganisms tandem repeats database, and compared the SSRs of the three species with each other. The results showed that there were 1,410, 859, and 638 SSRs in the B. japonicum, M. loti, and S. meliloti genomes, respectively. In all three genomes, tetranucleotide, pentanucleotide, and hexanucleotide repeats were more abundant, indicating higher mutation rates in these species. Mononucleotide repeats were the least abundant. SSR types and distributions were similar among these species.
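
    The kind of SSR survey summarised above can be sketched in a few lines. This is a minimal illustration, not the authors' pipeline: the abstract does not state the repeat thresholds or the exact definitions used, so the minimum repeat count, the per-kilobase definitions of relative abundance and relative density, and all function names below are assumptions.

```python
import re
from collections import Counter


def is_primitive(motif):
    """True if the motif is not itself a repeat of a shorter unit (e.g. 'ATAT')."""
    n = len(motif)
    return not any(n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n))


def find_ssrs(seq, min_repeats=3, max_motif_len=6):
    """Return (start, motif, repeat_count) for perfect SSRs with motifs of 1-6 bp.

    min_repeats is an assumed threshold; real surveys often use a different
    threshold for each motif length.
    """
    seq = seq.upper()
    hits = []
    for k in range(1, max_motif_len + 1):
        # A k-mer followed by at least (min_repeats - 1) further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if is_primitive(motif):  # skip e.g. 'AA' runs already counted as 'A'
                hits.append((m.start(), motif, len(m.group(0)) // k))
    return hits


def ssr_summary(seq, **kwargs):
    """Counts per motif length plus SSRs per kb and SSR bases per kb (assumed definitions)."""
    hits = find_ssrs(seq, **kwargs)
    kb = len(seq) / 1000.0
    counts = Counter(len(motif) for _, motif, _ in hits)
    ssr_bases = sum(len(motif) * n for _, motif, n in hits)
    return {
        "counts_by_motif_length": dict(counts),
        "relative_abundance_per_kb": len(hits) / kb,
        "relative_density_bp_per_kb": ssr_bases / kb,
    }
```

    Applied to a whole chromosome, the per-motif-length counts give the mono- to hexanucleotide comparison described in the abstract. Imperfect or compound repeats are not handled by this sketch.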

  14. Analysis of intra-genomic GC content homogeneity within prokaryotes

    DEFF Research Database (Denmark)

    Bohlin, J; Snipen, L; Hardy, S.P.

    2010-01-01

    Bacterial genomes possess varying GC content (total guanines (Gs) and cytosines (Cs) per total of the four bases within the genome) but within a given genome, GC content can vary locally along the chromosome, with some regions significantly more or less GC rich than on average. We have examined how the GC content varies within microbial genomes to assess whether this property can be associated with certain biological functions related to the organism's environment and phylogeny. We utilize a new quantity GCVAR, the intra-genomic GC content variability with respect to the average GC content... both aerobic and facultative microbes. Although an association has previously been found between mean genomic GC content and oxygen requirement, our analysis suggests that no such association exists when phylogenetic bias is accounted for. A significant association between GCVAR and mean GC content...

  15. Analysis of the Vibrionaceae pan-genome

    OpenAIRE

    Kahlke, Tim

    2013-01-01

    Paper 2 of this thesis is not available in Munin: 2. Tim Kahlke, Alexander Goesmann and Peik Haugen: 'The Vibrionaceae pan-genome hints at gene expression as the major driving force for unequal gene distributions on Vibrionaceae chromosomes' (manuscript) In the presented work the bacterial family Vibrionaceae was used as a model to investigate bacterial diversity on a gene level and to analyze the underlying concepts of bacterial niche adaptation and evolution. For this, the genomes ...

  16. Whole Genome Amplification in Genomic Analysis of Single Circulating Tumor Cells.

    Science.gov (United States)

    Gasch, Christin; Pantel, Klaus; Riethdorf, Sabine

    2015-01-01

    Investigation of the genome of organisms is one of the major basics in molecular biology to understand the complex organization of cells. While genomic DNA can easily be isolated from tissues or cell cultures of plant, animal or human origin, DNA extraction from single cells is still challenging. Here, we describe three techniques for the amplification of genomic DNA of fixed single circulating tumor cells (CTC) isolated from blood of cancer patients. This amplification is aimed to increase DNA amounts from those of one cell to yields sufficient for different DNA analyses such as mutational analysis including next-generation sequencing, array-comparative genome hybridization (CGH), and quantitative measurement of gene amplifications. Molecular analysis of CTC as liquid biopsy can be used to identify therapeutic targets in personalized medicine directed, e.g. against human epidermal growth factor receptor 2 (HER2) or epidermal growth factor receptor (EGFR) and to stratify the patients to those therapies.

  17. Hybridisation among groupers (genus Cephalopholis) at the eastern Indian Ocean suture zone: taxonomic and evolutionary implications

    KAUST Repository

    Payet, Samuel D.

    2016-08-05

    Hybridisation is a significant evolutionary process that until recently was considered rare in the marine environment. A suture zone in the eastern Indian Ocean is home to numerous hybridising sister species, providing an ideal opportunity to determine how hybridisation affects speciation and biodiversity in coral reef fishes. At this location, hybridisation between two grouper (Epinephelidae) species: Cephalopholis urodeta (Pacific Ocean) and C. nigripinnis (Indian Ocean) was investigated to determine the genetic basis of hybridisation and to compare the ecology and life history of hybrids and their parent species. This approach aimed to provide insights into the taxonomic and evolutionary consequences of hybridisation. Despite clear phenotypic differences, multiple molecular markers revealed hybrids, and their parent species were genetically homogenous within and (thousands of kilometres) outside of the hybrid zone. Hybrids were at least as fit as their parent species (in terms of growth, reproduction, and abundance) and were observed in a broad range of intermediate phenotypes. The two species appear to be interbreeding at Christmas Island due to inherent biological and ecological compatibilities, and the lack of genetic structure may be explained by three potential scenarios: (1) hybridisation and introgression; (2) discordance between morphology and genetics; and (3) incomplete lineage sorting. Further molecular analyses are necessary to discriminate these scenarios. Regardless of which applies, C. urodeta and C. nigripinnis are unlikely to evolve in reproductive isolation as they cohabit where they are common (Christmas Island) and will source congeneric mates where they are rare (Cocos Keeling Islands). Our results add to the growing body of evidence that hybridisation among coral reef fishes is a dynamic evolutionary factor. © 2016 Springer-Verlag Berlin Heidelberg

  18. Hybridisation among groupers (genus Cephalopholis) at the eastern Indian Ocean suture zone: taxonomic and evolutionary implications

    Science.gov (United States)

    Payet, Samuel D.; Hobbs, Jean-Paul A.; DiBattista, Joseph D.; Newman, Stephen J.; Sinclair-Taylor, Tane; Berumen, Michael L.; McIlwain, Jennifer L.

    2016-12-01

    Hybridisation is a significant evolutionary process that until recently was considered rare in the marine environment. A suture zone in the eastern Indian Ocean is home to numerous hybridising sister species, providing an ideal opportunity to determine how hybridisation affects speciation and biodiversity in coral reef fishes. At this location, hybridisation between two grouper (Epinephelidae) species: Cephalopholis urodeta (Pacific Ocean) and C. nigripinnis (Indian Ocean) was investigated to determine the genetic basis of hybridisation and to compare the ecology and life history of hybrids and their parent species. This approach aimed to provide insights into the taxonomic and evolutionary consequences of hybridisation. Despite clear phenotypic differences, multiple molecular markers revealed hybrids, and their parent species were genetically homogenous within and (thousands of kilometres) outside of the hybrid zone. Hybrids were at least as fit as their parent species (in terms of growth, reproduction, and abundance) and were observed in a broad range of intermediate phenotypes. The two species appear to be interbreeding at Christmas Island due to inherent biological and ecological compatibilities, and the lack of genetic structure may be explained by three potential scenarios: (1) hybridisation and introgression; (2) discordance between morphology and genetics; and (3) incomplete lineage sorting. Further molecular analyses are necessary to discriminate these scenarios. Regardless of which applies, C. urodeta and C. nigripinnis are unlikely to evolve in reproductive isolation as they cohabit where they are common (Christmas Island) and will source congeneric mates where they are rare (Cocos Keeling Islands). Our results add to the growing body of evidence that hybridisation among coral reef fishes is a dynamic evolutionary factor.

  19. Bisimilarity and refinement for hybrid(ised) logics

    Directory of Open Access Journals (Sweden)

    Alexandre Madeira

    2013-05-01

    Full Text Available The complexity of modern software systems entails the need for reconfiguration mechanisms governing the dynamic evolution of their execution configurations in response to either external stimuli or internal performance measures. Formally, such systems may be represented by transition systems whose nodes correspond to the different configurations they may assume. Therefore, each node is endowed with, for example, an algebra, or a first-order structure, to precisely characterise the semantics of the services provided in the corresponding configuration. Hybrid logics, which add to the modal description of transition structures the ability to refer to specific states, offer a generic framework to approach the specification and design of this sort of system. Therefore, the quest for suitable notions of equivalence and refinement between models of hybrid logic specifications becomes fundamental to any design discipline adopting this perspective. This paper contributes to this effort from a distinctive point of view: instead of focussing on a specific hybrid logic, the paper introduces notions of bisimilarity and refinement for hybridised logics, i.e. standard specification logics (e.g. propositional, equational, fuzzy, etc.) to which modal and hybrid features were added in a systematic way.

  20. Chromosomes in the flow to simplify genome analysis.

    Science.gov (United States)

    Doležel, Jaroslav; Vrána, Jan; Safář, Jan; Bartoš, Jan; Kubaláková, Marie; Simková, Hana

    2012-08-01

    Nuclear genomes of humans, animals, and plants are organized into subunits called chromosomes. When isolated into aqueous suspension, mitotic chromosomes can be classified using flow cytometry according to light scatter and fluorescence parameters. Chromosomes of interest can be purified by flow sorting if they can be resolved from other chromosomes in a karyotype. The analysis and sorting are carried out at rates of 10^2-10^4 chromosomes per second, and for complex genomes such as wheat the flow sorting technology has been ground-breaking in reducing genome complexity for genome sequencing. The high sample rate provides an attractive approach for karyotype analysis (flow karyotyping) and the purification of chromosomes in large numbers. In characterizing the chromosome complement of an organism, the high number that can be studied using flow cytometry allows for a statistically accurate analysis. Chromosome sorting plays a particularly important role in the analysis of nuclear genome structure and the analysis of particular and aberrant chromosomes. Other attractive but not well-explored features include the analysis of chromosomal proteins, chromosome ultrastructure, and high-resolution mapping using FISH. Recent results demonstrate that chromosome flow sorting can be coupled seamlessly with DNA array and next-generation sequencing technologies for high-throughput analyses. The main advantages are targeting the analysis to a genome region of interest and a significant reduction in sample complexity. As flow sorters can also sort single copies of chromosomes, shotgun sequencing DNA amplified from them enables the production of haplotype-resolved genome sequences. This review explains the principles of flow cytometric chromosome analysis and sorting (flow cytogenetics), discusses the major uses of this technology in genome analysis, and outlines future directions.

  1. Analysis of intra-genomic GC content homogeneity within prokaryotes

    Directory of Open Access Journals (Sweden)

    Bohlin Jon

    2010-08-01

    Full Text Available Abstract Background: Bacterial genomes possess varying GC content (total guanines (Gs) and cytosines (Cs) per total of the four bases within the genome), but within a given genome, GC content can vary locally along the chromosome, with some regions significantly more or less GC rich than on average. We have examined how the GC content varies within microbial genomes to assess whether this property can be associated with certain biological functions related to the organism's environment and phylogeny. We utilize a new quantity GCVAR, the intra-genomic GC content variability with respect to the average GC content of the total genome. A low GCVAR indicates intra-genomic GC homogeneity and a high GCVAR heterogeneity. Results: The regression analyses indicated that GCVAR was significantly associated with domain (i.e. archaea or bacteria), phylum, and oxygen requirement. GCVAR was significantly higher among anaerobes than among both aerobic and facultative microbes. Although an association has previously been found between mean genomic GC content and oxygen requirement, our analysis suggests that no such association exists when phylogenetic bias is accounted for. A significant association between GCVAR and mean GC content was also found but appears to be non-linear and varies greatly among phyla. Conclusions: Our findings show that GCVAR is linked with oxygen requirement, while mean genomic GC content is not. We therefore suggest that GCVAR should be used as a complement to mean GC content.
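
    To make the idea of intra-genomic GC variability concrete, here is a minimal sketch of one way such a quantity could be computed. The abstract does not give the GCVAR formula, so the choices below (non-overlapping 100 bp windows, mean absolute deviation of window GC from whole-genome GC, and the function names) are assumptions made for illustration and may differ from the published definition.

```python
def gc_fraction(seq):
    """Fraction of G and C among the unambiguous bases (A, C, G, T) of seq."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    return (seq.count("G") + seq.count("C")) / acgt if acgt else 0.0


def window_gc(seq, window=100):
    """GC fraction of consecutive non-overlapping windows (window size is an assumption)."""
    return [gc_fraction(seq[i:i + window])
            for i in range(0, len(seq) - window + 1, window)]


def gc_variability(seq, window=100):
    """Mean absolute deviation of window GC from the genome-wide GC fraction.

    Low values indicate a GC-homogeneous sequence, high values a heterogeneous
    one. This is an assumed stand-in for GCVAR, not the paper's formula.
    """
    windows = window_gc(seq, window)
    if not windows:
        raise ValueError("sequence is shorter than one window")
    mean_gc = gc_fraction(seq)
    return sum(abs(gc - mean_gc) for gc in windows) / len(windows)
```

    A sequence with uniform composition scores 0, while one made of an AT-only half and a GC-only half scores 0.5 with this definition.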

  2. Genomic analysis of hyperthermophilic archaea; Chokonetsusei kosaikin no genomu kaiseki

    Energy Technology Data Exchange (ETDEWEB)

    Kato, C. [Japan Marine Science and Technology Center, Kanagawa (Japan)

    1997-05-20

    Whole genome sequences of five strains of microorganisms have been reported to date, and many genome analysis projects are in progress around the world. Among archaea (archaebacteria), the genome analysis of Methanococcus jannaschii has been completed and the sequence data are publicly available. While 134 regulatory genes were identified in Synechocystis sp. PCC 6803 (eubacteria, 3.6 Mb genome size), only 7 regulatory genes were identified in M. jannaschii (1.7 Mb). The difference in genome size is believed to correspond to the quantity of environmental stresses. In Japan, the genome analysis project on a new hyperthermophilic archaeon, Pyrococcus horikoshii, is in progress. P. horikoshii was isolated from a deep-sea hydrothermal vent. It shows barophilic growth at a maximum temperature of 103 °C under a pressure of 30 MPa. Thus, the genome analysis of barophilic hyperthermophilic archaea is expected to contribute to the understanding of the origin of life and evolution. 19 refs., 4 figs., 1 tab.

  3. Comparative genomic analysis of eutherian interferon-γ-inducible GTPases.

    Science.gov (United States)

    Premzl, Marko

    2012-11-01

    The interferon-γ-inducible GTPases, IFGGs, are intracellular proteins involved in immune response against pathogens. A comprehensive comparative genomic review and analysis of eutherian IFGGs was carried out using public genomic sequences. The 64 eutherian IFGG genes were examined in detail and annotated. The eutherian IFGG promoter types were first catalogued followed by a phylogenetic analysis of eutherian IFGGs, which described five major IFGG clusters. The patterns of differential gene expansions and protein regions that may regulate IFGG catalytic features suggested a new classification of eutherian IFGGs. This mini-review has also provided new tests of reliability of public genomic sequences as well as tests of protein molecular evolution.

  4. Resequencing of the common marmoset genome improves genome assemblies and gene-coding sequence analysis.

    Science.gov (United States)

    Sato, Kengo; Kuroki, Yoko; Kumita, Wakako; Fujiyama, Asao; Toyoda, Atsushi; Kawai, Jun; Iriki, Atsushi; Sasaki, Erika; Okano, Hideyuki; Sakakibara, Yasubumi

    2015-11-20

    The first draft of the common marmoset (Callithrix jacchus) genome was published by the Marmoset Genome Sequencing and Analysis Consortium. The draft was based on whole-genome shotgun sequencing, and the current assembly version is Callithrix_jacches-3.2.1, but there still exist 187,214 undetermined gap regions and supercontigs and relatively short contigs that are unmapped to chromosomes in the draft genome. We performed resequencing and assembly of the genome of common marmoset by deep sequencing with high-throughput sequencing technology. Several different sequence runs using Illumina sequencing platforms were executed, and 181 Gbp of high-quality bases including mate-pairs with long insert lengths of 3, 8, 20, and 40 Kbp were obtained, that is, approximately 60× coverage. The resequencing significantly improved the MGSAC draft genome sequence. The N50 of the contigs, which is a statistical measure used to evaluate assembly quality, doubled. As a result, 51% of the contigs (total length: 299 Mbp) that were unmapped to chromosomes in the MGSAC draft were merged with chromosomal contigs, and the improved genome sequence helped to detect 5,288 new genes that are homologous to human cDNAs and the gaps in 5,187 transcripts of the Ensembl gene annotations were completely filled.
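
    The two summary figures quoted above, contig N50 and sequencing depth, can be illustrated with a short sketch. The N50 definition used here is the common one (the contig length at which the running total of contigs, sorted longest first, reaches half of the assembly). The genome size of roughly 3 Gbp is an assumption inferred from the quoted 181 Gbp and 60× coverage; it is not stated in the abstract. Function names are illustrative.

```python
def n50(contig_lengths):
    """Length L such that contigs of length >= L hold at least half of all assembled bases."""
    total = sum(contig_lengths)
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("no contigs given")


def mean_coverage(total_bases, genome_size):
    """Average sequencing depth: sequenced bases divided by genome size."""
    return total_bases / genome_size


# Toy contig set: total 20, and 6 + 5 = 11 reaches half of it, so N50 is 5.
toy_n50 = n50([2, 3, 4, 5, 6])

# 181 Gbp of reads over an ASSUMED genome size of about 3 Gbp.
depth = mean_coverage(181e9, 3.0e9)
```

    Merging short contigs into longer ones raises N50 without changing the total assembled length, which is why a doubled N50 is reported as an improvement in assembly contiguity.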

  5. Mycobacterial species as case-study of comparative genome analysis.

    Science.gov (United States)

    Zakham, F; Belayachi, L; Ussery, D; Akrim, M; Benjouad, A; El Aouad, R; Ennaji, M M

    2011-02-08

    The genus Mycobacterium comprises more than 120 species, including important human pathogens that cause major public health problems and illnesses. Further, with more than 100 genome sequences available from this genus, comparative genome analysis can provide new insights for better understanding the evolutionary events of these species and for improving drugs, vaccines, and diagnostic tools for controlling mycobacterial diseases. In the present study we aim to outline a comparative genome analysis of fourteen mycobacterial genomes: M. avium subsp. paratuberculosis K-10, M. bovis AF2122/97, M. bovis BCG str. Pasteur 1173P2, M. leprae Br4923, M. marinum M, M. sp. KMS, M. sp. MCS, M. tuberculosis CDC1551, M. tuberculosis F11, M. tuberculosis H37Ra, M. tuberculosis H37Rv, M. tuberculosis KZN 1435, M. ulcerans Agy99, and M. vanbaalenii PYR-1. For this purpose, the genomes were compared on the basis of genome length, GC content, and number of genes in different databases (GenBank, RefSeq, and Prodigal). A BLAST matrix of these genomes was constructed to summarise the similarity between species in a simple scheme. As a result of multiple genome analysis, the pan and core genomes have been defined for twelve mycobacterial species. We have also introduced the genome atlas of the reference strain M. tuberculosis H37Rv, which gives a good overview of this genome. To examine the phylogenetic relationships among these bacteria, a phylogenetic tree was constructed from the 16S rRNA gene for tuberculous and non-tuberculous mycobacteria to understand the evolutionary events of these species.

  6. Hyperstructures, genome analysis and I-cells

    DEFF Research Database (Denmark)

    Amar, P.; Ballet, P.; Barlovatz-Meimon, G.

    2002-01-01

    New concepts may prove necessary to profit from the avalanche of sequence data on the genome, transcriptome, proteome and interactome and to relate this information to cell physiology. Here, we focus on the concept of large activity-based structures, or hyperstructures, in which a variety of type...

  7. Whole genome sequencing analysis of Plasmodium vivax using whole genome capture

    Directory of Open Access Journals (Sweden)

    Bright A

    2012-06-01

    Full Text Available Abstract Background: Malaria caused by Plasmodium vivax is an experimentally neglected severe disease with a substantial burden on human health. Because of technical limitations, little is known about the biology of this important human pathogen. Whole genome analysis methods on patient-derived material are thus likely to have a substantial impact on our understanding of P. vivax pathogenesis and epidemiology. For example, they will allow study of the evolution and population biology of the parasite, allow parasite transmission patterns to be characterized, and may facilitate the identification of new drug resistance genes. Because parasitemias are typically low and the parasite cannot be readily cultured, on-site leukocyte depletion of blood samples is typically needed to remove human DNA that may be 1000X more abundant than parasite DNA. These features have precluded the analysis of archived blood samples and require the presence of laboratories in close proximity to the collection of field samples for optimal pre-cryopreservation sample preparation. Results: Here we show that in-solution hybridization capture can be used to extract P. vivax DNA from human contaminating DNA in the laboratory without the need for on-site leukocyte filtration. Using a whole genome capture method, we were able to enrich P. vivax DNA from bulk genomic DNA from less than 0.5% to a median of 55% (range 20%-80%). This level of enrichment allows for efficient analysis of the samples by whole genome sequencing and does not introduce any gross biases into the data. With this method, we obtained greater than 5X coverage across 93% of the P. vivax genome for four P. vivax strains from Iquitos, Peru, which is similar to our results using leukocyte filtration (greater than 5X coverage across 96%). Conclusion: The whole genome capture technique will enable more efficient whole genome analysis of P. vivax from a larger geographic region and from valuable archived sample collections.

  8. A novel statistic for genome-wide interaction analysis.

    Science.gov (United States)

    Wu, Xuesen; Dong, Hua; Luo, Li; Zhu, Yun; Peng, Gang; Reveille, John D; Xiong, Momiao

    2010-09-23

    Although great progress in genome-wide association studies (GWAS) has been made, the significant SNP associations identified by GWAS account for only a few percent of the genetic variance, leading many to question where and how we can find the missing heritability. There is increasing interest in genome-wide interaction analysis as a possible source of finding heritability unexplained by current GWAS. However, the existing statistics for testing interaction have low power for genome-wide interaction analysis. To meet challenges raised by genome-wide interactional analysis, we have developed a novel statistic for testing interaction between two loci (either linked or unlinked). The null distribution and the type I error rates of the new statistic for testing interaction are validated using simulations. Extensive power studies show that the developed statistic has much higher power to detect interaction than classical logistic regression. The results identified 44 and 211 pairs of SNPs showing significant evidence of interactions with FDRanalysis is a valuable tool for finding remaining missing heritability unexplained by the current GWAS, and the developed novel statistic is able to search significant interaction between SNPs across the genome. Real data analysis showed that the results of genome-wide interaction analysis can be replicated in two independent studies.

  9. Bovine Genome Database: supporting community annotation and analysis of the Bos taurus genome

    Directory of Open Access Journals (Sweden)

    Childs Kevin L

    2010-11-01

    Full Text Available Abstract Background: A goal of the Bovine Genome Database (BGD; http://BovineGenome.org) has been to support the Bovine Genome Sequencing and Analysis Consortium (BGSAC) in the annotation and analysis of the bovine genome. We were faced with several challenges, including the need to maintain consistent quality despite diversity in annotation expertise in the research community, the need to maintain consistent data formats, and the need to minimize the potential duplication of annotation effort. With new sequencing technologies allowing many more eukaryotic genomes to be sequenced, the demand for collaborative annotation is likely to increase. Here we present our approach, challenges and solutions facilitating a large distributed annotation project. Results and Discussion: BGD has provided annotation tools that supported 147 members of the BGSAC in contributing 3,871 gene models over a fifteen-week period, and these annotations have been integrated into the bovine Official Gene Set. Our approach has been to provide an annotation system, which includes a BLAST site, multiple genome browsers, an annotation portal, and the Apollo Annotation Editor configured to connect directly to our Chado database. In addition to implementing and integrating components of the annotation system, we have performed computational analyses to create gene evidence tracks and a consensus gene set, which can be viewed on individual gene pages at BGD. Conclusions: We have provided annotation tools that alleviate challenges associated with distributed annotation. Our system provides a consistent set of data to all annotators and eliminates the need for annotators to format data. Involving the bovine research community in genome annotation has allowed us to leverage expertise in various areas of bovine biology to provide biological insight into the genome sequence.

  10. The assisted prediction modelling frame with hybridisation and ensemble for business risk forecasting and an implementation

    Science.gov (United States)

    Li, Hui; Hong, Lu-Yao; Zhou, Qing; Yu, Hai-Jie

    2015-08-01

    The business failure of numerous companies results in financial crises. The high social costs associated with such crises have prompted a search for effective tools for business risk prediction, among which the support vector machine is very effective. Several modelling approaches, including single-technique modelling, hybrid modelling, and ensemble modelling, have been suggested for forecasting business risk with support vector machines. However, the existing literature seldom focuses on a general modelling frame for business risk prediction, and seldom investigates performance differences among the different modelling approaches. We reviewed research on forecasting business risk with support vector machines, proposed the general assisted prediction modelling frame with hybridisation and ensemble (APMF-WHAE), and finally investigated the use of principal components analysis, support vector machines, random sampling, and group decision under the general frame in forecasting business risk. Under the APMF-WHAE frame with the support vector machine as the base predictive model, four specific predictive models were produced, namely, a pure support vector machine, a hybrid support vector machine combined with principal components analysis, a support vector machine ensemble combined with random sampling and group decision, and an ensemble of hybrid support vector machines that uses group decision to integrate various hybrid support vector machines built on variables produced by principal components analysis and samples from random sampling. The experimental results indicate that the hybrid support vector machine and the ensemble of hybrid support vector machines performed better than the pure support vector machine and the support vector machine ensemble.

  11. Copy Number Variation Analysis by Array Analysis of Single Cells Following Whole Genome Amplification.

    Science.gov (United States)

    Dimitriadou, Eftychia; Zamani Esteki, Masoud; Vermeesch, Joris Robert

    2015-01-01

    Whole genome amplification is required to ensure the availability of sufficient material for copy number variation analysis of a genome derived from an individual cell. Here, we describe the protocols we use for copy number variation analysis of non-fixed single cells by array-based approaches following single-cell isolation and whole genome amplification. We focus on two alternative protocols, an isothermal and a PCR-based whole genome amplification method, followed by either array comparative genome hybridization (aCGH) or SNP array analysis, respectively.

  12. Microarray MAPH: accurate array-based detection of relative copy number in genomic DNA

    Directory of Open Access Journals (Sweden)

    Chan Alan

    2006-06-01

    Full Text Available Abstract Background: Current methods for measurement of copy number do not combine all the desirable qualities of convenience, throughput, economy, accuracy and resolution. In this study, to improve the throughput associated with Multiplex Amplifiable Probe Hybridisation (MAPH), we aimed to develop a modification based on the 3-Dimensional, Flow-Through Microarray Platform from PamGene International. In this new method, electrophoretic analysis of amplified products is replaced with photometric analysis of a probed oligonucleotide array. Copy number analysis of hybridised probes is based on a dual-label approach by comparing the intensity of Cy3-labelled MAPH probes amplified from test samples co-hybridised with similarly amplified Cy5-labelled reference MAPH probes. The key feature of using a hybridisation-based end point with MAPH is that discrimination of amplified probes is based on sequence and not fragment length. Results: In this study we showed that microarray MAPH measurement of PMP22 gene dosage correlates well with PMP22 gene dosage determined by capillary MAPH and that copy number was accurately reported in analyses of DNA from 38 individuals, 12 of whom were known to have Charcot-Marie-Tooth disease type 1A (CMT1A). Conclusion: Measurement of microarray-based endpoints for MAPH appears to be of comparable accuracy to electrophoretic methods, and holds the prospect of fully exploiting the potential multiplicity of MAPH. The technology has the potential to simplify copy number assays for genes with a large number of exons, or of expanded sets of probes from dispersed genomic locations.
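
    The dual-label comparison described above can be sketched as a simple ratio calculation. This is an illustrative sketch only: the abstract does not describe the normalisation used, so normalising to the median ratio of control probes, assuming two copies in the reference sample, and every probe name and intensity value below are hypothetical.

```python
from statistics import median


def copy_number_estimates(test_cy3, ref_cy5, control_probes, ref_copies=2):
    """Estimate copy number per probe from test (Cy3) and reference (Cy5) intensities.

    Each probe's Cy3/Cy5 ratio is divided by the median ratio of control probes
    (assumed to be present at the reference copy number in both samples) and
    scaled by the copy number assumed for the reference sample.
    """
    ratios = {probe: test_cy3[probe] / ref_cy5[probe] for probe in test_cy3}
    norm = median(ratios[probe] for probe in control_probes)
    return {probe: ref_copies * ratio / norm for probe, ratio in ratios.items()}


# Hypothetical intensities: the PMP22 probe is 1.5x brighter in the test sample.
test = {"PMP22_ex1": 1500.0, "ctrl1": 1000.0, "ctrl2": 2000.0, "ctrl3": 500.0}
ref = {"PMP22_ex1": 1000.0, "ctrl1": 1000.0, "ctrl2": 2000.0, "ctrl3": 500.0}
estimates = copy_number_estimates(test, ref, ["ctrl1", "ctrl2", "ctrl3"])
```

    With these made-up values the PMP22 probe comes out at about three copies, the pattern expected for the PMP22 duplication that underlies CMT1A, while the control probes stay at two.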

  13. MIPS: analysis and annotation of proteins from whole genomes.

    Science.gov (United States)

    Mewes, H W; Amid, C; Arnold, R; Frishman, D; Güldener, U; Mannhaupt, G; Münsterkötter, M; Pagel, P; Strack, N; Stümpflen, V; Warfsmann, J; Ruepp, A

    2004-01-01

    The Munich Information Center for Protein Sequences (MIPS-GSF), Neuherberg, Germany, provides protein sequence-related information based on whole-genome analysis. The main focus of the work is directed toward the systematic organization of sequence-related attributes as gathered by a variety of algorithms, primary information from experimental data together with information compiled from the scientific literature. MIPS maintains automatically generated and manually annotated genome-specific databases, develops systematic classification schemes for the functional annotation of protein sequences and provides tools for the comprehensive analysis of protein sequences. This report updates the information on the yeast genome (CYGD), the Neurospora crassa genome (MNCDB), the database of complete cDNAs (German Human Genome Project, NGFN), the database of mammalian protein-protein interactions (MPPI), the database of FASTA homologies (SIMAP), and the interface for the fast retrieval of protein-associated information (QUIPOS). The Arabidopsis thaliana database, the rice database, the plant EST databases (MATDB, MOsDB, SPUTNIK), as well as the databases for the comprehensive set of genomes (PEDANT genomes) are described elsewhere in the 2003 and 2004 NAR database issues, respectively. All databases described, and the detailed descriptions of our projects can be accessed through the MIPS web server (http://mips.gsf.de).

  14. Pan-cancer analysis of ROS1 genomic aberrations

    OpenAIRE

    Wang, Yidan; 王奕丹

    2015-01-01

    The ROS proto-oncogene 1 (ROS1) encodes the ROS1 receptor kinase. ROS1 rearrangements are known to be oncogenic in glioblastoma, non–small-cell lung carcinoma (NSCLC) and cholangiocarcinoma. The clinical relevance of ROS1 genomic aberrations in other human cancers is largely unexamined. Here, we performed a pan-cancer analysis of ROS1 genomic aberrations across 20 cancer sites by interrogating the whole-exome sequencing data of the Cancer Genome Atlas (TCGA) via the cBioportal (www.cbioportal...

  15. Arabidopsis transcription factors: genome-wide comparative analysis among eukaryotes.

    Science.gov (United States)

    Riechmann, J L; Heard, J; Martin, G; Reuber, L; Jiang, C; Keddie, J; Adam, L; Pineda, O; Ratcliffe, O J; Samaha, R R; Creelman, R; Pilgrim, M; Broun, P; Zhang, J Z; Ghandehari, D; Sherman, B K; Yu, G

    2000-12-15

    The completion of the Arabidopsis thaliana genome sequence allows a comparative analysis of transcriptional regulators across the three eukaryotic kingdoms. Arabidopsis dedicates over 5% of its genome to code for more than 1500 transcription factors, about 45% of which are from families specific to plants. Arabidopsis transcription factors that belong to families common to all eukaryotes do not share significant similarity with those of the other kingdoms beyond the conserved DNA binding domains, many of which have been arranged in combinations specific to each lineage. The genome-wide comparison reveals the evolutionary generation of diversity in the regulation of transcription.

  16. Single-cell analysis in cancer genomics

    Science.gov (United States)

    Saadatpour, Assieh; Lai, Shujing; Guo, Guoji; Yuan, Guo-Cheng

    2017-01-01

    Genetic changes and environmental differences result in cellular heterogeneity among cancer cells within the same tumor, thereby complicating treatment outcomes. Recent advances in single-cell technologies have opened new avenues to characterize the intra-tumor cellular heterogeneity, identify rare cell types, measure mutation rates, and, ultimately, guide diagnosis and treatment. In this paper, we review the recent single-cell technological and computational advances at the genomic, transcriptomic, and proteomic levels, and discuss their applications in cancer research. PMID:26450340

  17. Comparative Genomics via Wavelet Analysis for Closely Related Bacteria

    Directory of Open Access Journals (Sweden)

    Jiuzhou Song

    2004-01-01

    Comparative genomics has been a valuable method for extracting and extrapolating genome information among closely related bacteria. The efficiency of traditional methods is strongly influenced by the software used. To overcome this problem, we propose using wavelet analysis to perform comparative genomics. First, global comparison using wavelet analysis gives the difference at a quantitative level. Then local comparison using keto-excess or purine-excess plots shows precise positions of inversions, translocations, and horizontally transferred DNA fragments. We found, for the first time, that the level of energy spectra difference is related to the similarity of bacterial strains; it could serve as a quantitative index to describe the similarities of genomes. The strategy is described in detail by comparisons of closely related strains: S. typhi CT18, S. typhi Ty2, S. typhimurium LT2, H. pylori 26695, and H. pylori J99.

  18. Comparative Genomics via Wavelet Analysis for Closely Related Bacteria

    Science.gov (United States)

    Song, Jiuzhou; Ware, Tony; Liu, Shu-Lin; Surette, M.

    2004-12-01

    Comparative genomics has been a valuable method for extracting and extrapolating genome information among closely related bacteria. The efficiency of traditional methods is strongly influenced by the software used. To overcome this problem, we propose using wavelet analysis to perform comparative genomics. First, global comparison using wavelet analysis gives the difference at a quantitative level. Then local comparison using keto-excess or purine-excess plots shows precise positions of inversions, translocations, and horizontally transferred DNA fragments. We found, for the first time, that the level of energy spectra difference is related to the similarity of bacterial strains; it could serve as a quantitative index to describe the similarities of genomes. The strategy is described in detail by comparisons of closely related strains: S. typhi CT18, S. typhi Ty2, S. typhimurium LT2, H. pylori 26695, and H. pylori J99.
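
    The keto-excess and purine-excess plots mentioned in this abstract are cumulative walks along a sequence. The sketch below assumes the commonly used definitions (purines A/G counted against pyrimidines C/T, keto bases G/T counted against amino bases A/C). It does not reproduce the wavelet step, and the function names are illustrative rather than the authors' own.

```python
from itertools import accumulate

PURINES = set("AG")  # counted against pyrimidines C and T
KETO = set("GT")     # counted against amino bases A and C


def cumulative_excess(sequence, group):
    """Running total of (+1 for bases in `group`, -1 for the other bases)."""
    steps = []
    for base in sequence.upper():
        if base not in "ACGT":
            steps.append(0)   # ambiguous base: contributes nothing
        elif base in group:
            steps.append(1)
        else:
            steps.append(-1)
    return list(accumulate(steps))


def purine_excess(sequence):
    return cumulative_excess(sequence, PURINES)


def keto_excess(sequence):
    return cumulative_excess(sequence, KETO)
```

    Plotting either list against position gives the excess curve. Under these assumptions, abrupt changes of slope are the kind of feature such plots are used to localise.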

  19. Genomic analysis of epithelial ovarian cancer

    Institute of Scientific and Technical Information of China (English)

    John Farley; Laurent L Ozbun; Michael J Birrer

    2008-01-01

    Ovarian cancer is a major health problem for women in the United States. Despite evidence of considerable heterogeneity, most cases of ovarian cancer are treated in a similar fashion. The molecular basis for the clinicopathologic characteristics of these tumors remains poorly defined. Whole genome expression profiling is a genomic tool, which can identify dysregulated genes and uncover unique sub-classes of tumors. The application of this technology to ovarian cancer has provided a solid molecular basis for differences in histology and grade of ovarian tumors. Differentially expressed genes identified pathways implicated in cell proliferation, invasion, motility, chromosomal instability, and gene silencing and provided new insights into the origin and potential treatment of these cancers. The added knowledge provided by global gene expression profiling should allow for a more rational treatment of ovarian cancers. These techniques are leading to a paradigm shift from empirical treatment to an individually tailored approach. This review summarizes the new genomic data on epithelial ovarian cancers of different histology and grade and the impact it will have on our understanding and treatment of this disease.

  20. Complete genome sequence of Enterococcus faecium strain TX16 and comparative genomic analysis of Enterococcus faecium genomes

    Directory of Open Access Journals (Sweden)

    Qin Xiang

    2012-07-01

    Background: Enterococci are among the leading causes of hospital-acquired infections in the United States and Europe, with Enterococcus faecalis and Enterococcus faecium being the two most common species isolated from enterococcal infections. In the last decade, the proportion of enterococcal infections caused by E. faecium has steadily increased compared to other Enterococcus species. Although the underlying mechanism for the gradual replacement of E. faecalis by E. faecium in the hospital environment is not yet understood, many studies using genotyping and phylogenetic analysis have shown the emergence of a globally dispersed polyclonal subcluster of E. faecium strains in clinical environments. Systematic study of the molecular epidemiology and pathogenesis of E. faecium has been hindered by the lack of closed, complete E. faecium genomes that can be used as references. Results: In this study, we report the complete genome sequence of the E. faecium strain TX16, also known as DO, which belongs to multilocus sequence type (ST) 18, and was the first E. faecium strain ever sequenced. Whole genome comparison of the TX16 genome with 21 E. faecium draft genomes confirmed that most clinical, outbreak, and hospital-associated (HA) strains (including STs 16, 17, 18, and 78), in addition to strains of non-hospital origin, group in the same clade (referred to as the HA clade) and are evolutionarily considerably more closely related to each other by phylogenetic and gene content similarity analyses than to isolates in the community-associated (CA) clade, with approximately a 3–4% average nucleotide sequence difference between the two clades at the core genome level. Our study also revealed that many genomic loci in the TX16 genome are unique to the HA clade. 380 ORFs in TX16 are HA-clade specific and antibiotic resistance genes are enriched in HA-clade strains. Mobile elements such as IS16 and transposons were also found almost exclusively in HA strains

  1. Introducing a New Breed of Wine Yeast: Interspecific Hybridisation between a Commercial Saccharomyces cerevisiae Wine Yeast and Saccharomyces mikatae

    Science.gov (United States)

    Bellon, Jennifer R.; Schmid, Frank; Capone, Dimitra L.; Dunn, Barbara L.; Chambers, Paul J.

    2013-01-01

    Interspecific hybrids are commonplace in agriculture and horticulture; bread wheat and grapefruit are but two examples. The benefits derived from interspecific hybridisation include the potential of generating advantageous transgressive phenotypes. This paper describes the generation of a new breed of wine yeast by interspecific hybridisation between a commercial Saccharomyces cerevisiae wine yeast strain and Saccharomyces mikatae, a species hitherto not associated with industrial fermentation environs. While commercially available wine yeast strains provide consistent and reliable fermentations, wines produced using single inocula are thought to lack the sensory complexity and rounded palate structure obtained from spontaneous fermentations. In contrast, interspecific yeast hybrids have the potential to deliver increased complexity to wine sensory properties and alternative wine styles through the formation of novel, and wider ranging, yeast volatile fermentation metabolite profiles, whilst maintaining the robustness of the wine yeast parent. Screening of newly generated hybrids from a cross between a S. cerevisiae wine yeast and S. mikatae (closely-related but ecologically distant members of the Saccharomyces sensu stricto clade), has identified progeny with robust fermentation properties and winemaking potential. Chemical analysis showed that, relative to the S. cerevisiae wine yeast parent, hybrids produced wines with different concentrations of volatile metabolites that are known to contribute to wine flavour and aroma, including flavour compounds associated with non-Saccharomyces species. The new S. cerevisiae x S. mikatae hybrids have the potential to produce complex wines akin to products of spontaneous fermentation while giving winemakers the safeguard of an inoculated ferment. PMID:23614011

  2. Introducing a new breed of wine yeast: interspecific hybridisation between a commercial Saccharomyces cerevisiae wine yeast and Saccharomyces mikatae.

    Directory of Open Access Journals (Sweden)

    Jennifer R Bellon

    Interspecific hybrids are commonplace in agriculture and horticulture; bread wheat and grapefruit are but two examples. The benefits derived from interspecific hybridisation include the potential of generating advantageous transgressive phenotypes. This paper describes the generation of a new breed of wine yeast by interspecific hybridisation between a commercial Saccharomyces cerevisiae wine yeast strain and Saccharomyces mikatae, a species hitherto not associated with industrial fermentation environs. While commercially available wine yeast strains provide consistent and reliable fermentations, wines produced using single inocula are thought to lack the sensory complexity and rounded palate structure obtained from spontaneous fermentations. In contrast, interspecific yeast hybrids have the potential to deliver increased complexity to wine sensory properties and alternative wine styles through the formation of novel, and wider ranging, yeast volatile fermentation metabolite profiles, whilst maintaining the robustness of the wine yeast parent. Screening of newly generated hybrids from a cross between a S. cerevisiae wine yeast and S. mikatae (closely-related but ecologically distant members of the Saccharomyces sensu stricto clade), has identified progeny with robust fermentation properties and winemaking potential. Chemical analysis showed that, relative to the S. cerevisiae wine yeast parent, hybrids produced wines with different concentrations of volatile metabolites that are known to contribute to wine flavour and aroma, including flavour compounds associated with non-Saccharomyces species. The new S. cerevisiae x S. mikatae hybrids have the potential to produce complex wines akin to products of spontaneous fermentation while giving winemakers the safeguard of an inoculated ferment.

  3. The genome sequence of Blochmannia floridanus: Comparative analysis of reduced genomes

    Science.gov (United States)

    Gil, Rosario; Silva, Francisco J.; Zientz, Evelyn; Delmotte, François; González-Candelas, Fernando; Latorre, Amparo; Rausell, Carolina; Kamerbeek, Judith; Gadau, Jürgen; Hölldobler, Bert; van Ham, Roeland C. H. J.; Gross, Roy; Moya, Andrés

    2003-01-01

    Bacterial symbioses are widespread among insects, probably being one of the key factors of their evolutionary success. We present the complete genome sequence of Blochmannia floridanus, the primary endosymbiont of carpenter ants. Although these ants feed on a complex diet, this symbiosis very likely has a nutritional basis: Blochmannia is able to supply nitrogen and sulfur compounds to the host while it takes advantage of the host metabolic machinery. Remarkably, these bacteria lack all known genes involved in replication initiation (dnaA, priA, and recA). The phylogenetic analysis of a set of conserved protein-coding genes shows that Bl. floridanus is phylogenetically related to Buchnera aphidicola and Wigglesworthia glossinidia, the other endosymbiotic bacteria whose complete genomes have been sequenced so far. Comparative analysis of the five known genomes from insect endosymbiotic bacteria reveals they share only 313 genes, a number that may be close to the minimum gene set necessary to sustain endosymbiotic life. PMID:12886019

  4. The complete genome sequence and comparative genome analysis of the high pathogenicity Yersinia enterocolitica strain 8081.

    Directory of Open Access Journals (Sweden)

    Nicholas R Thomson

    2006-12-01

    The human enteropathogen, Yersinia enterocolitica, is a significant link in the range of Yersinia pathologies extending from mild gastroenteritis to bubonic plague. Comparison at the genomic level is a key step in our understanding of the genetic basis for this pathogenicity spectrum. Here we report the genome of Y. enterocolitica strain 8081 (serotype 0:8; biotype 1B) and extensive microarray data relating to the genetic diversity of the Y. enterocolitica species. Our analysis reveals that the genome of Y. enterocolitica strain 8081 is a patchwork of horizontally acquired genetic loci, including a plasticity zone of 199 kb containing an extraordinarily high density of virulence genes. Microarray analysis has provided insights into species-specific Y. enterocolitica gene functions and the intraspecies differences between the high, low, and nonpathogenic Y. enterocolitica biotypes. Through comparative genome sequence analysis we provide new information on the evolution of the Yersinia. We identify numerous loci that represent ancestral clusters of genes potentially important in enteric survival and pathogenesis, which have been lost or are in the process of being lost, in the other sequenced Yersinia lineages. Our analysis also highlights large metabolic operons in Y. enterocolitica that are absent in the related enteropathogen, Yersinia pseudotuberculosis, indicating major differences in niche and nutrients used within the mammalian gut. These include clusters directing the production of hydrogenases, tetrathionate respiration, cobalamin synthesis, and propanediol utilisation. Along with ancestral gene clusters, the genome of Y. enterocolitica has revealed species-specific and enteropathogen-specific loci. This has provided important insights into the pathology of this bacterium and, more broadly, into the evolution of the genus. Moreover, wider investigations looking at the patterns of gene loss and gain in the Yersinia have highlighted common

  5. Diversity of Pseudomonas Genomes, Including Populus-Associated Isolates, as Revealed by Comparative Genome Analysis.

    Science.gov (United States)

    Jun, Se-Ran; Wassenaar, Trudy M; Nookaew, Intawat; Hauser, Loren; Wanchai, Visanu; Land, Miriam; Timm, Collin M; Lu, Tse-Yuan S; Schadt, Christopher W; Doktycz, Mitchel J; Pelletier, Dale A; Ussery, David W

    2015-10-30

    The Pseudomonas genus contains a metabolically versatile group of organisms that are known to occupy numerous ecological niches, including the rhizosphere and endosphere of many plants. Their diversity influences the phylogenetic diversity and heterogeneity of these communities. On the basis of average amino acid identity, comparative genome analysis of >1,000 Pseudomonas genomes, including 21 Pseudomonas strains isolated from the roots of native Populus deltoides (eastern cottonwood) trees resulted in consistent and robust genomic clusters with phylogenetic homogeneity. All Pseudomonas aeruginosa genomes clustered together, and these were clearly distinct from other Pseudomonas species groups on the basis of pangenome and core genome analyses. In contrast, the genomes of Pseudomonas fluorescens were organized into 20 distinct genomic clusters, representing enormous diversity and heterogeneity. Most of our 21 Populus-associated isolates formed three distinct subgroups within the major P. fluorescens group, supported by pathway profile analysis, while two isolates were more closely related to Pseudomonas chlororaphis and Pseudomonas putida. Genes specific to Populus-associated subgroups were identified. Genes specific to subgroup 1 include several sensory systems that act in two-component signal transduction, a TonB-dependent receptor, and a phosphorelay sensor. Genes specific to subgroup 2 contain hypothetical genes, and genes specific to subgroup 3 were annotated with hydrolase activity. This study justifies the need to sequence multiple isolates, especially from P. fluorescens, which displays the most genetic variation, in order to study functional capabilities from a pangenomic perspective. This information will prove useful when choosing Pseudomonas strains for use to promote growth and increase disease resistance in plants.

  6. A novel statistic for genome-wide interaction analysis.

    Directory of Open Access Journals (Sweden)

    Xuesen Wu

    2010-09-01

    Although great progress in genome-wide association studies (GWAS) has been made, the significant SNP associations identified by GWAS account for only a few percent of the genetic variance, leading many to question where and how we can find the missing heritability. There is increasing interest in genome-wide interaction analysis as a possible source of finding heritability unexplained by current GWAS. However, the existing statistics for testing interaction have low power for genome-wide interaction analysis. To meet challenges raised by genome-wide interaction analysis, we have developed a novel statistic for testing interaction between two loci (either linked or unlinked). The null distribution and the type I error rates of the new statistic for testing interaction are validated using simulations. Extensive power studies show that the developed statistic has much higher power to detect interaction than classical logistic regression. The results identified 44 and 211 pairs of SNPs showing significant evidence of interactions with FDR<0.001 and 0.001 [...] Genome-wide interaction analysis is a valuable tool for finding remaining missing heritability unexplained by the current GWAS, and the developed novel statistic is able to search for significant interactions between SNPs across the genome. Real data analysis showed that the results of genome-wide interaction analysis can be replicated in two independent studies.

  7. Analysis of the Core Genome and Pan-Genome of Autotrophic Acetogenic Bacteria

    Science.gov (United States)

    Shin, Jongoh; Song, Yoseb; Jeong, Yujin; Cho, Byung-Kwan

    2016-01-01

    Acetogens are obligate anaerobic bacteria capable of reducing carbon dioxide (CO2) to multicarbon compounds coupled to the oxidation of inorganic substrates, such as hydrogen (H2) or carbon monoxide (CO), via the Wood-Ljungdahl pathway. Owing to the metabolic capability of CO2 fixation, much attention has been focused on understanding the unique pathways associated with acetogens, particularly their metabolic coupling of CO2 fixation to energy conservation. Most known acetogens are phylogenetically and metabolically diverse bacteria present in 23 different bacterial genera. With the increased volume of available genome information, acetogenic bacterial genomes can be analyzed by comparative genome analysis. Even with the genetic diversity that exists among acetogens, the Wood-Ljungdahl pathway, a central metabolic pathway, and cofactor biosynthetic pathways are highly conserved for autotrophic growth. Additionally, comparative genome analysis revealed that most genes in the acetogen-specific core genome were associated with the Wood-Ljungdahl pathway. The conserved enzymes and those predicted as missing can provide insight into biological differences between acetogens and allow for the discovery of promising candidates for industrial applications. PMID:27733845

  8. Analysis of the core genome and pan-genome of autotrophic acetogenic bacteria

    Directory of Open Access Journals (Sweden)

    JongOh Shin

    2016-09-01

    Acetogens are obligate anaerobic bacteria capable of reducing carbon dioxide (CO2) to multicarbon compounds coupled to the oxidation of inorganic substrates, such as hydrogen (H2) or carbon monoxide (CO), via the Wood-Ljungdahl pathway. Owing to the metabolic capability of CO2 fixation, much attention has been focused on understanding the unique pathways associated with acetogens, particularly their metabolic coupling of CO2 fixation to energy conservation. Most known acetogens are phylogenetically and metabolically diverse bacteria present in 23 different bacterial genera. With the increased volume of available genome information, acetogenic bacterial genomes can be analyzed by comparative genome analysis. Even with the genetic diversity that exists among acetogens, the Wood-Ljungdahl pathway, a central metabolic pathway, and cofactor biosynthetic pathways are highly conserved for autotrophic growth. Additionally, comparative genome analysis revealed that most genes in the acetogen-specific core genome were associated with the Wood-Ljungdahl pathway. The conserved enzymes and those predicted as missing can provide insight into biological differences between acetogens and allow for the discovery of promising candidates for industrial applications.
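
    The core-genome and pan-genome concepts used in this record reduce to set operations once each genome is represented by the gene families it contains. The sketch below assumes such a representation is already available (for example, from an orthologue clustering step that is not shown). The genome and gene-family labels are placeholders, not data from the paper.

```python
def core_and_pan_genome(gene_sets):
    """Return (core, pan, accessory) gene-family sets.

    gene_sets: dict mapping genome name -> set of gene-family identifiers.
    """
    genomes = list(gene_sets.values())
    if not genomes:
        return set(), set(), set()
    core = set.intersection(*genomes)   # families present in every genome
    pan = set.union(*genomes)           # families present in at least one genome
    accessory = pan - core              # families missing from at least one genome
    return core, pan, accessory


# Placeholder gene families for three hypothetical genomes.
example = {
    "genome_A": {"fam1", "fam2", "fam3"},
    "genome_B": {"fam1", "fam2", "fam4"},
    "genome_C": {"fam1", "fam2", "fam5"},
}
core, pan, accessory = core_and_pan_genome(example)
```

    In this toy input the core genome is the two shared families, and each genome contributes one accessory family to the pan-genome.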

  9. Enhancing genomic laboratory reports: A qualitative analysis of provider review

    Science.gov (United States)

    Rahm, Alanna Kulchak; Stuckey, Heather; Green, Jamie; Feldman, Lynn; Zallen, Doris T.; Bonhag, Michele; Segal, Michael M.; Fan, Audrey L.; Williams, Marc S.

    2016-01-01

    This study reports on the responses of physicians who reviewed provider and patient versions of a genomic laboratory report designed to communicate results of whole genome sequencing. Semi‐structured interviews addressed concept communication, elements, and format of example genome reports. Analysis of the coded transcripts resulted in recognition of three constructs around communication of genome sequencing results: (1) Providers agreed that whole genomic sequencing results are complex and they welcomed a report that provided supportive interpretation information to accompany sequencing results; (2) Providers strongly endorsed a report that included active clinical guidance, such as reference to practice guidelines, if available; and (3) Providers valued the genomic report as a resource that would serve as the basis to facilitate communication of genome sequencing results with their patients and families. Providers valued both versions of the report, though they affirmed the need for a provider‐oriented report. Critical elements of the report included clear language to explain the result, as well as consolidated yet comprehensive prognostic information with clear guidance over time for the clinical care of the patient. Most importantly, it appears a report with this design has the potential not only to return results but also to serve as a communication tool to help providers and patients discuss and coordinate care over time. © 2016 The Authors. American Journal of Medical Genetics Part A published by Wiley Periodicals, Inc. PMID:26842872

  10. Pan-genome sequence analysis using Panseq: an online tool for the rapid analysis of core and accessory genomic regions

    Directory of Open Access Journals (Sweden)

    Villegas Andre

    2010-09-01

    Background: The pan-genome of a bacterial species consists of a core and an accessory gene pool. The accessory genome is thought to be an important source of genetic variability in bacterial populations and is gained through lateral gene transfer, allowing subpopulations of bacteria to better adapt to specific niches. Low-cost and high-throughput sequencing platforms have created an exponential increase in genome sequence data and an opportunity to study the pan-genomes of many bacterial species. In this study, we describe a new online pan-genome sequence analysis program, Panseq. Results: Panseq was used to identify Escherichia coli O157:H7 and E. coli K-12 genomic islands. Within a population of 60 E. coli O157:H7 strains, the existence of 65 accessory genomic regions identified by Panseq analysis was confirmed by PCR. The accessory genome and binary presence/absence data, and core genome and single nucleotide polymorphisms (SNPs) of six L. monocytogenes strains were extracted with Panseq and hierarchically clustered and visualized. The nucleotide core and binary accessory data were also used to construct maximum parsimony (MP) trees, which were compared to the MP tree generated by multi-locus sequence typing (MLST). The topology of the accessory and core trees was identical but differed from the tree produced using seven MLST loci. The Loci Selector module found the most variable and discriminatory combinations of four loci within a 100 loci set among 10 strains in 1 s, compared to the 449 s required to exhaustively search for all possible combinations; it also found the most discriminatory 20 loci from a 96 loci E. coli O157:H7 SNP dataset. Conclusion: Panseq determines the core and accessory regions among a collection of genomic sequences based on user-defined parameters. It readily extracts regions unique to a genome or group of genomes, identifies SNPs within shared core genomic regions, constructs files for use in phylogeny programs

  11. Analysis of the genomic homologous recombination in Theilovirus based on complete genomes

    Directory of Open Access Journals (Sweden)

    Yi Maoli

    2011-09-01

    At present, Theilovirus is considered to comprise four distinct serotypes, including Theiler's murine encephalomyelitis virus, Vilyuisk human encephalomyelitis virus, Thera virus, and Saffold virus. So far, no systematic study has investigated genomic recombination in Theilovirus. The present study performed phylogenetic and recombination analysis of Theilovirus over the complete genomes. Seven potentially significant recombination events were identified. However, according to the strain information and the references related to the recombinants and their parental strains, four of the recombination events may not have occurred naturally. These results will provide valuable hints for future research on evolution and antigenic variability of Theilovirus.

  12. Analysis of the genomic homologous recombination in Theilovirus based on complete genomes.

    Science.gov (United States)

    Sun, Guangming; Zhang, Xiaodan; Yi, Maoli; Shao, Shihe; Zhang, Wen

    2011-09-17

    At present, Theilovirus is considered to comprise four distinct serotypes, including Theiler's murine encephalomyelitis virus, Vilyuisk human encephalomyelitis virus, Thera virus, and Saffold virus. So far, no systematic study has investigated genomic recombination in Theilovirus. The present study performed phylogenetic and recombination analysis of Theilovirus over the complete genomes. Seven potentially significant recombination events were identified. However, according to the strain information and the references related to the recombinants and their parental strains, four of the recombination events may not have occurred naturally. These results will provide valuable hints for future research on evolution and antigenic variability of Theilovirus.

  13. Cytogenetic analysis from DNA by comparative genomic hybridization.

    Science.gov (United States)

    Tachdjian, G; Aboura, A; Lapierre, J M; Viguié, F

    2000-01-01

    Comparative genomic hybridization (CGH) is a modified in situ hybridization technique which allows detection and mapping of DNA sequence copy differences between two genomes in a single experiment. In CGH analysis, two differentially labelled genomic DNAs (study and reference) are co-hybridized to normal metaphase spreads. Chromosomal locations of copy number changes in the DNA segments of the study genome are revealed by a variable fluorescence intensity ratio along each target chromosome. Since its development, CGH has been applied mostly as a research tool in the field of cancer cytogenetics to identify genetic changes in many previously unknown regions. CGH may also have a role in clinical cytogenetics for detection and identification of unbalanced chromosomal abnormalities.

  14. Genome wide copy number analysis of single cells

    Science.gov (United States)

    Baslan, Timour; Kendall, Jude; Rodgers, Linda; Cox, Hilary; Riggs, Mike; Stepansky, Asya; Troge, Jennifer; Ravi, Kandasamy; Esposito, Diane; Lakshmi, B.; Wigler, Michael; Navin, Nicholas; Hicks, James

    2016-01-01

    Copy number variation (CNV) is increasingly recognized as an important contributor to phenotypic variation in health and disease. Most methods for determining CNV rely on admixtures of cells, where information regarding genetic heterogeneity is lost. Here, we present a protocol that allows for the genome wide copy number analysis of single nuclei isolated from mixed populations of cells. Single nucleus sequencing (SNS) combines flow sorting of single nuclei based on DNA content and whole genome amplification (WGA), followed by next generation sequencing, to quantize genomic intervals in a genome wide manner. Multiplexing of single cells is discussed. Additionally, we outline informatic approaches that correct for biases inherent in the WGA procedure and allow for accurate determination of copy number profiles. Altogether, the protocol takes ~3 days from flow cytometry to sequence-ready DNA libraries. PMID:22555242
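
    The idea of quantizing genomic intervals from sequencing reads can be sketched in a few lines. This is a deliberately simplified illustration, not the protocol's informatics: it assumes fixed-width bins on a single chromosome, 0-based read start positions, a known baseline ploidy of 2, and no correction for amplification or GC bias. All function names and the toy data are hypothetical.

```python
from statistics import median


def bin_read_counts(read_positions, chrom_length, bin_size):
    """Count reads falling in consecutive fixed-width bins (0-based positions)."""
    n_bins = -(-chrom_length // bin_size)  # ceiling division
    counts = [0] * n_bins
    for pos in read_positions:
        counts[pos // bin_size] += 1
    return counts


def integer_copy_number(counts, ploidy=2):
    """Scale bin counts so the median bin equals `ploidy`, then round.

    Assumes the median bin has a non-zero read count.
    """
    centre = median(counts)
    return [round(ploidy * c / centre) for c in counts]


# Toy data: four bins of width 10 holding 4, 4, 6 and 2 reads respectively.
reads = [1, 2, 3, 4, 11, 12, 13, 14, 21, 22, 23, 24, 25, 26, 31, 32]
counts = bin_read_counts(reads, chrom_length=40, bin_size=10)
profile = integer_copy_number(counts)
```

    Here the third bin is called as a single-copy gain and the fourth as a single-copy loss relative to the assumed diploid baseline.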

  15. Differential DNA Methylation Analysis without a Reference Genome

    Directory of Open Access Journals (Sweden)

    Johanna Klughammer

    2015-12-01

    Full Text Available Genome-wide DNA methylation mapping uncovers epigenetic changes associated with animal development, environmental adaptation, and species evolution. To address the lack of high-throughput methods for DNA methylation analysis in non-model organisms, we developed an integrated approach for studying DNA methylation differences independent of a reference genome. Experimentally, our method relies on an optimized 96-well protocol for reduced representation bisulfite sequencing (RRBS), which we have validated in nine species (human, mouse, rat, cow, dog, chicken, carp, sea bass, and zebrafish). Bioinformatically, we developed the RefFreeDMA software to deduce ad hoc genomes directly from RRBS reads and to pinpoint differentially methylated regions between samples or groups of individuals (http://RefFreeDMA.computational-epigenetics.org). The identified regions are interpreted using motif enrichment analysis and/or cross-mapping to annotated genomes. We validated our method by reference-free analysis of cell-type-specific DNA methylation in the blood of human, cow, and carp. In summary, we present a cost-effective method for epigenome analysis in ecology and evolution, which enables epigenome-wide association studies in natural populations and species without a reference genome.

  16. Differential DNA Methylation Analysis without a Reference Genome.

    Science.gov (United States)

    Klughammer, Johanna; Datlinger, Paul; Printz, Dieter; Sheffield, Nathan C; Farlik, Matthias; Hadler, Johanna; Fritsch, Gerhard; Bock, Christoph

    2015-12-22

    Genome-wide DNA methylation mapping uncovers epigenetic changes associated with animal development, environmental adaptation, and species evolution. To address the lack of high-throughput methods for DNA methylation analysis in non-model organisms, we developed an integrated approach for studying DNA methylation differences independent of a reference genome. Experimentally, our method relies on an optimized 96-well protocol for reduced representation bisulfite sequencing (RRBS), which we have validated in nine species (human, mouse, rat, cow, dog, chicken, carp, sea bass, and zebrafish). Bioinformatically, we developed the RefFreeDMA software to deduce ad hoc genomes directly from RRBS reads and to pinpoint differentially methylated regions between samples or groups of individuals (http://RefFreeDMA.computational-epigenetics.org). The identified regions are interpreted using motif enrichment analysis and/or cross-mapping to annotated genomes. We validated our method by reference-free analysis of cell-type-specific DNA methylation in the blood of human, cow, and carp. In summary, we present a cost-effective method for epigenome analysis in ecology and evolution, which enables epigenome-wide association studies in natural populations and species without a reference genome.

  17. Use of a non-radioactive hybridisation assay for direct detection of gram-negative bacteria carrying TEM beta-lactamase genes in infected urine.

    Science.gov (United States)

    Carter, G I; Towner, K J; Pearson, N J; Slack, R C

    1989-02-01

    DNA in infected urines from 81 patients with urinary tract infection was hybridised directly with a non-radioactive DNA probe specific for bacterial genes coding for TEM-type beta-lactamase. The results were assessed by means of a computerised image analysis system and compared with those obtained following isolation of the infecting organism, conventional sensitivity testing and isoelectric focusing (IEF) procedures for the detection of TEM-type beta-lactamase. Of the 27 ampicillin-resistant gram-negative organisms isolated in pure culture from the urines, 14 were shown by both hybridisation and IEF to carry a gene for TEM beta-lactamase production. Only four discordant results were obtained: three "false positive" direct hybridisation results, one due to urine pigmentation, and one, possibly, to a TEM beta-lactamase gene which was not being expressed, and one "false negative" result due to insufficient cell numbers in the urine. The system is capable of screening large numbers of samples and is applicable to any gene for which a suitable DNA probe is available.

  18. What’s in the genome of a filamentous fungus? Analysis of the Neurospora genome sequence

    Science.gov (United States)

    Mannhaupt, Gertrud; Montrone, Corinna; Haase, Dirk; Mewes, H. Werner; Aign, Verena; Hoheisel, Jörg D.; Fartmann, Berthold; Nyakatura, Gerald; Kempken, Frank; Maier, Josef; Schulte, Ulrich

    2003-01-01

    The German Neurospora Genome Project has assembled sequences from ordered cosmid and BAC clones of linkage groups II and V of the genome of Neurospora crassa in 13 and 12 contigs, respectively. Including additional sequences located on other linkage groups, a total of 12 Mb were subjected to a manual gene extraction and annotation process. The genome comprises a small number of repetitive elements, a low degree of segmental duplications and very few paralogous genes. The analysis of the 3218 identified open reading frames provides a first overview of the protein equipment of a filamentous fungus. Significantly, N. crassa possesses a large variety of metabolic enzymes including a substantial number of enzymes involved in the degradation of complex substrates as well as secondary metabolism. While several of these enzymes are specific for filamentous fungi, many are shared exclusively with prokaryotes. PMID:12655011

  19. Savant Genome Browser 2: visualization and analysis for population-scale genomics.

    Science.gov (United States)

    Fiume, Marc; Smith, Eric J M; Brook, Andrew; Strbenac, Dario; Turner, Brian; Mezlini, Aziz M; Robinson, Mark D; Wodak, Shoshana J; Brudno, Michael

    2012-07-01

    High-throughput sequencing (HTS) technologies are providing an unprecedented capacity for data generation, and there is a corresponding need for efficient data exploration and analysis capabilities. Although most existing tools for HTS data analysis are developed for either automated (e.g. genotyping) or visualization (e.g. genome browsing) purposes, such tools are most powerful when combined. For example, integration of visualization and computation allows users to iteratively refine their analyses by updating computational parameters within the visual framework in real-time. Here we introduce the second version of the Savant Genome Browser, a standalone program for visual and computational analysis of HTS data. Savant substantially improves upon its predecessor and existing tools by introducing innovative visualization modes and navigation interfaces for several genomic datatypes, and synergizing visual and automated analyses in a way that is powerful yet easy even for non-expert users. We also present a number of plugins that were developed by the Savant Community, which demonstrate the power of integrating visual and automated analyses using Savant. The Savant Genome Browser is freely available (open source) at www.savantbrowser.com.

  20. Genome analysis and comparative genomics of a Giardia intestinalis assemblage E isolate

    Directory of Open Access Journals (Sweden)

    Andersson Jan O

    2010-10-01

    Full Text Available Background: Giardia intestinalis is a protozoan parasite that causes diarrhea in a wide range of mammalian species. To further understand the genetic diversity between the Giardia intestinalis species, we have performed genome sequencing and analysis of a wild-type Giardia intestinalis sample from the assemblage E group, isolated from a pig. Results: We identified 5012 protein coding genes, the majority of which are conserved compared to the previously sequenced genomes of the WB and GS strains in terms of microsynteny and sequence identity. Despite this, there is an unexpectedly large number of chromosomal rearrangements and several smaller structural changes that are present in all chromosomes. Novel members of the VSP, NEK Kinase and HCMP gene families were identified, which may reveal possible mechanisms for host specificity and new avenues for antigenic variation. We used comparative genomics of the three diverse Giardia intestinalis isolates P15, GS and WB to define a core proteome for this species complex and to identify lineage-specific genes. Extensive analyses of polymorphisms in the core proteome of Giardia revealed differential rates of divergence among cellular processes. Conclusions: Our results indicate that despite a well conserved core of genes there is significant genome variation between Giardia isolates, in terms of gene content, gene polymorphisms, structural chromosomal variations and surface molecule repertoires. This study improves the annotation of the Giardia genomes and enables the identification of functionally important variation.

  1. Yeast as a touchstone in post-genomic research: strategies for integrative analysis in functional genomics.

    Science.gov (United States)

    Castrillo, Juan I; Oliver, Stephen G

    2004-01-31

    The new complexity arising from the genome sequencing projects requires new comprehensive post-genomic strategies: advanced studies in regulatory mechanisms, application of new high-throughput technologies at a genome-wide scale, at the different levels of cellular complexity (genome, transcriptome, proteome and metabolome), efficient analysis of the results, and application of new bioinformatic methods in an integrative or systems biology perspective. This can be accomplished in studies with model organisms under controlled conditions. In this review a perspective of the favourable characteristics of yeast as a touchstone model in post-genomic research is presented. The state of the art, latest advances in the field and bottlenecks, new strategies, new regulatory mechanisms, applications (patents) and high-throughput technologies, most of them being developed and validated in yeast, are presented. The optimal characteristics of yeast as a well-defined system for comprehensive studies under controlled conditions make it a perfect model to be used in integrative, "systems biology" studies to get new insights into the mechanisms of regulation (regulatory networks) responsible for specific phenotypes under particular environmental conditions, to be applied to more complex organisms (e.g. plants, human).

  2. Sequencing and Analysis of a Genomic Fragment Provide an Insight into the Dunaliella viridis Genomic Sequence

    Institute of Scientific and Technical Information of China (English)

    Xiao-Ming SUN; Yuan-Ping TANG; Xiang-Zong MENG; Wen-Wen ZHANG; Shan LI; Zhi-Rui DENG; Zheng-Kai XU; Ren-Tao SONG

    2006-01-01

    Dunaliella is a genus of wall-less unicellular eukaryotic green algae. Its exceptional resistance to salt and various other stresses has made it an ideal model for stress tolerance study. However, very little is known about its genome and genomic sequences. In this study, we sequenced and analyzed a 29,268 bp genomic fragment from Dunaliella viridis. The fragment showed low sequence homology to the GenBank database. At the nucleotide level, only a segment with significant sequence homology to 18S rRNA was found. The fragment contained six putative genes, but only one gene showed significant homology at the protein level to the GenBank database. The average GC content of this sequence was 51.1%, which was much lower than that of the closely related green alga Chlamydomonas (65.7%). Significant segmental duplications were found within this fragment. The duplicated sequences accounted for about 35.7% of the entire region. Large numbers of simple sequence repeats (microsatellites) were found, with a strong bias towards the (AC)n type (76%). Analysis of other Dunaliella genomic sequences in the GenBank database (total 25,749 bp) was in agreement with these findings. These sequence features made it difficult to sequence Dunaliella genomic sequences. Further investigation should be made to reveal the biological significance of these unique sequence features.
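
    The two sequence features quantified in this abstract, GC content and (AC)n microsatellites, are straightforward to compute. The sketch below is only an illustration of those two measurements on a made-up sequence: the function names, the minimum of four repeat units, and the example sequence are assumptions and do not come from the study.

```python
import re

def gc_content(seq):
    """Percentage of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)

def find_repeats(seq, unit="AC", min_repeats=4):
    """Return (start index, number of units) for each run of `unit`.

    `min_repeats` is an illustrative cut-off (an assumption).
    """
    pattern = re.compile("(?:%s){%d,}" % (re.escape(unit), min_repeats))
    return [(m.start(), len(m.group()) // len(unit))
            for m in pattern.finditer(seq.upper())]

# Made-up 22 bp sequence containing one (AC)5 run and one shorter (AC)2 run
seq = "GGCCACACACACACTTAGACAC"
print(round(gc_content(seq), 1))
print(find_repeats(seq))
```

    Only the run of five AC units is reported, because the shorter run at the end falls below the chosen minimum.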

  3. Technological, mediatic and cultural hybridisation: Cultural mediations in the context of globalisation

    Directory of Open Access Journals (Sweden)

    Laan Mendes de Barros

    2009-12-01

    Full Text Available We live in a context of borders that are dissolving in many senses, of the convergence and hybridisation of technologies, mass media and cultures. The context is the resizing of practical time, of movements and links between the local and the global. In these times of interculturality, communication plays a very important role; not so much in its technological media dimension, but particularly in the dynamics of cultural mediations that are dividing off from mediatised relations. This article aims to reflect on the transformations in present-day communication processes, marked by strong movements of hybridisation, as well as examining how to consider interculturality in the context of cultural mediations, based on dialogue between Latin American and French authors. Also, using media material, the article presents illustrations of the Brazilian cultural scene, which is marked by a long history of hybridisation that is filled with intercultural dynamics.

  4. Hyperstructures, genome analysis and I-cells

    DEFF Research Database (Denmark)

    Amar, P.; Ballet, P.; Barlovatz-Meimon, G.

    2002-01-01

    New concepts may prove necessary to profit from the avalanche of sequence data on the genome, transcriptome, proteome and interactome and to relate this information to cell physiology. Here, we focus on the concept of large activity-based structures, or hyperstructures, in which a variety of types of molecules are brought together to perform a function. We review the evidence for the existence of hyperstructures responsible for the initiation of DNA replication, the sequestration of newly replicated origins of replication, cell division and for metabolism. The processes responsible for hyperstructure... familiar to biologists. Finally, we speculate on how a variety of in silico approaches involving cellular automata and multi-agent systems could be combined to develop new concepts in the form of an Integrated cell (I-cell) which would undergo selection for growth and survival in a world of artificial...

  5. Digital microarray analysis for digital artifact genomics

    Science.gov (United States)

    Jaenisch, Holger; Handley, James; Williams, Deborah

    2013-06-01

    We implement a Spatial Voting (SV) based analogy of microarray analysis for digital gene marker identification in malware code sections. We examine a famous set of malware formally analyzed by Mandiant and code-named Advanced Persistent Threat (APT1). APT1 is a Chinese organization formed with specific intent to infiltrate and exploit US resources. Mandiant provided a detailed behavior and string analysis report for the 288 malware samples available. We performed an independent analysis using a new alternative to the traditional dynamic analysis and static analysis that we call Spatial Analysis (SA). We perform unsupervised SA on the APT1 originating malware code sections and report our findings. We also show the results of SA performed on some members of the families associated by Mandiant. We conclude that SV based SA is a practical, fast alternative to dynamic analysis and static analysis.

  6. Genomic compositions and phylogenetic analysis of Shigella boydii subgroup

    Institute of Scientific and Technical Information of China (English)

    2006-01-01

    Comparative Genomic Hybridization (CGH) microarray analysis was used to compare the genomic compositions of all eighteen Shigella boydii serotype representative strains. The results indicated that the genomic "backbone" of this subgroup contained 2552 ORFs homologous to nonpathogenic E. coli K12. Compared with the genome of K12, 199 ORFs were found to be absent in all S. boydii serotype representatives, including mainly outer membrane protein genes and O-antigen biosynthesis genes. The ORFs specific to the S. boydii subgroup were mainly bacteriophage genes and function unknown (FUN) genes. Some iron metabolism, transport and type II secretion system related genes were found in most representative strains. According to the CGH phylogenetic analysis, the eighteen S. boydii serotype representatives were divided into four groups, in which the serotype C13 strain was remarkably distinguished from the other serotype strains. This grouping result corresponded to the distribution of some metabolism related genes. Furthermore, the analysis of genome backbone genes, specific genes, and the phylogenetic trees allowed us to discover the evolution laws of S. boydii and to find important clues for pathogenesis research, vaccination and therapeutic medicine development.

  7. Comparative genomics of Mycoplasma: analysis of conserved essential genes and diversity of the pan-genome.

    Directory of Open Access Journals (Sweden)

    Wei Liu

    Full Text Available Mycoplasma, the smallest self-replicating organism with a minimal metabolism and little genomic redundancy, is expected to be a close approximation to the minimal set of genes needed to sustain bacterial life. This study employs comparative evolutionary analysis of twenty Mycoplasma genomes to gain an improved understanding of essential genes. By analyzing the core genome of mycoplasmas, we revealed the conserved essential gene set for mycoplasma survival. Further analysis showed that the core genome set has many characteristics in common with experimentally identified essential genes. Several key genes, which are related to DNA replication and repair and can be disrupted in transposon mutagenesis studies, may be critical for bacterial survival, especially over long periods of natural selection. Phylogenomic reconstructions based on 3,355 homologous groups allowed robust estimation of phylogenetic relatedness among mycoplasma strains. To obtain deeper insight into the relative roles of molecular evolution in pathogen adaptation to their hosts, we also analyzed the positive selection pressures on particular sites and lineages. There appears to be an approximate correlation between the divergence of species and the level of positive selection detected in corresponding lineages.

  8. Hybridisation of short DNA molecules investigated with in situ atomic force microscopy

    DEFF Research Database (Denmark)

    Holmberg, Maria; Kuhle, A.; Garnaes, J.;

    2003-01-01

    By introducing the complementary DNA (cDNA) strand to a molecular layer of short single stranded DNA (ssDNA), immobilised on a gold surface, we have investigated hybridisation between the two DNA strands through the technique of in situ atomic force microscopy (AFM). Before introduction of c...... the two DNA strands has been studied. Introduction of the cDNA strand resulted in an increase in smoothness and thickness of the molecular layer. Both the increase in order and thickness of the molecular layer can be expected if hybridisation occurs, since double stranded DNA molecules have a more rigid...

  9. Genome-wide identification of the regulatory targets of a transcription factor using biochemical characterization and computational genomic analysis

    Directory of Open Access Journals (Sweden)

    Jolly Emmitt R

    2005-11-01

    Full Text Available Background: A major challenge in computational genomics is the development of methodologies that allow accurate genome-wide prediction of the regulatory targets of a transcription factor. We present a method for target identification that combines experimental characterization of binding requirements with computational genomic analysis. Results: Our method identified potential target genes of the transcription factor Ndt80, a key transcriptional regulator involved in yeast sporulation, using the combined information of binding affinity, positional distribution, and conservation of the binding sites across multiple species. We have also developed a mathematical approach to compute the false positive rate and the total number of targets in the genome based on the multiple selection criteria. Conclusion: We have shown that combining biochemical characterization and computational genomic analysis leads to accurate identification of the genome-wide targets of a transcription factor. The method can be extended to other transcription factors and can complement other genomic approaches to transcriptional regulation.

  10. Species specific cpDNA markers useful for studies on the hybridisation between Pinus mugo and P. sylvestris

    Directory of Open Access Journals (Sweden)

    Witold Wachowiak

    2014-01-01

    Full Text Available The PCR-RFLP technique has been used to detect species-specific mutations of organelle DNA for the closely related dwarf mountain pine (Pinus mugo) and Scots pine (P. sylvestris). Restriction fragment patterns of amplification products have been compared for trnL-trnF cpDNA and for the coxI and orf25 genes of mtDNA. A difference has been found in the DraI and HinfI restriction patterns of the amplification products for the trnL-trnF region of cpDNA, with two haplotypes detected. Haplotype M is characteristic of P. mugo and haplotype S of P. sylvestris. These markers may be useful for the analysis of the natural hybridisation and introgression between these species postulated for some sympatric populations on the basis of morphological analysis. No differences have been found in the studied mtDNA regions.

  11. Stacks: an analysis tool set for population genomics.

    Science.gov (United States)

    Catchen, Julian; Hohenlohe, Paul A; Bassham, Susan; Amores, Angel; Cresko, William A

    2013-06-01

    Massively parallel short-read sequencing technologies, coupled with powerful software platforms, are enabling investigators to analyse tens of thousands of genetic markers. This wealth of data is rapidly expanding and allowing biological questions to be addressed with unprecedented scope and precision. The sizes of the data sets are now posing significant data processing and analysis challenges. Here we describe an extension of the Stacks software package to efficiently use genotype-by-sequencing data for studies of populations of organisms. Stacks now produces core population genomic summary statistics and SNP-by-SNP statistical tests. These statistics can be analysed across a reference genome using a smoothed sliding window. Stacks also now provides several output formats for several commonly used downstream analysis packages. The expanded population genomics functions in Stacks will make it a useful tool to harness the newest generation of massively parallel genotyping data for ecological and evolutionary genetics.
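
    The idea of analysing a per-SNP statistic across a reference genome with a smoothed sliding window can be sketched as a kernel-weighted average. The example below uses Gaussian weights, which is one common choice and is not claimed to reproduce the Stacks implementation. The function name, the 150 kb bandwidth, and the example positions and values are all assumptions made for illustration.

```python
import math

def kernel_smooth(positions, values, centre, sigma=150_000):
    """Gaussian-weighted mean of `values` around `centre` (positions in bp).

    `sigma` is the kernel bandwidth in bp (an illustrative assumption).
    """
    weights = [math.exp(-((p - centre) ** 2) / (2 * sigma ** 2))
               for p in positions]
    weighted_sum = sum(w * v for w, v in zip(weights, values))
    return weighted_sum / sum(weights)

# Hypothetical per-SNP statistic (for example, a differentiation measure)
positions = [100_000, 200_000, 300_000, 400_000]
stat = [0.10, 0.10, 0.50, 0.50]
smoothed = [kernel_smooth(positions, stat, p) for p in positions]
print([round(s, 3) for s in smoothed])
```

    Smoothing trades resolution for stability: nearby SNPs contribute most to each window, so isolated outliers are damped while broad regional shifts remain visible.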

  12. RAPD-based screening of genomic libraries for positional cloning.

    Science.gov (United States)

    Dioh, W; Tharreau, D; Lebrun, M H

    1997-12-15

    RAPD markers are frequently used for positional cloning. However, RAPD markers often contain repeated sequences which prevent genomic library screening by hybridisation. We have developed a simple RAPD analysis of genomic libraries based on the identification of cosmid pools and clones amplifying the RAPD marker of interest. Our method does not require the cloning or characterisation of the RAPD marker as it relies on the analysis of cosmid pools or clones using a simple RAPD protocol. We applied this strategy using four RAPD markers composed of single copy or repeated sequences linked to avirulence genes of the rice blast fungus Magnaporthe grisea. Cosmids containing these RAPD markers were easily and rapidly identified allowing the construction of physical contigs at these loci.

  13. Functional genomic analysis of C. elegans molting.

    Directory of Open Access Journals (Sweden)

    Alison R Frand

    2005-10-01

    Full Text Available Although the molting cycle is a hallmark of insects and nematodes, neither the endocrine control of molting via size, stage, and nutritional inputs nor the enzymatic mechanism for synthesis and release of the exoskeleton is well understood. Here, we identify endocrine and enzymatic regulators of molting in C. elegans through a genome-wide RNA-interference screen. Products of the 159 genes discovered include annotated transcription factors, secreted peptides, transmembrane proteins, and extracellular matrix enzymes essential for molting. Fusions between several genes and green fluorescent protein show a pulse of expression before each molt in epithelial cells that synthesize the exoskeleton, indicating that the corresponding proteins are made in the correct time and place to regulate molting. We show further that inactivation of particular genes abrogates expression of the green fluorescent protein reporter genes, revealing regulatory networks that might couple the expression of genes essential for molting to endocrine cues. Many molting genes are conserved in parasitic nematodes responsible for human disease, and thus represent attractive targets for pesticide and pharmaceutical development.

  14. Dyneins across eukaryotes: a comparative genomic analysis.

    Science.gov (United States)

    Wickstead, Bill; Gull, Keith

    2007-12-01

    Dyneins are large minus-end-directed microtubule motors. Each dynein contains at least one dynein heavy chain (DHC) and a variable number of intermediate chains (IC), light intermediate chains (LIC) and light chains (LC). Here, we used genome sequence data from 24 diverse eukaryotes to assess the distribution of DHCs, ICs, LICs and LCs across Eukaryota. Phylogenetic inference identified nine DHC families (two cytoplasmic and seven axonemal) and six IC families (one cytoplasmic). We confirm that dyneins have been lost from higher plants and show that this is most likely because of a single loss of cytoplasmic dynein 1 from the ancestor of Rhodophyta and Viridiplantae, followed by lineage-specific losses of other families. Independent losses in Entamoeba mean that at least three extant eukaryotic lineages are entirely devoid of dyneins. Cytoplasmic dynein 2 is associated with intraflagellar transport (IFT), but in two chromalveolate organisms, we find an IFT footprint without the retrograde motor. The distribution of one family of outer-arm dyneins accounts for 2-headed or 3-headed outer-arm ultrastructures observed in different organisms. One diatom species builds motile axonemes without any inner-arm dyneins (IAD), and the unexpected conservation of IAD I1 in non-flagellate algae and LC8 (DYNLL1/2) in all lineages reveals a surprising fluidity to dynein function.

  15. Primer to analysis of genomic data using R

    CERN Document Server

    Gondro, Cedric

    2015-01-01

    Through this book, researchers and students will learn to use R for analysis of large-scale genomic data and how to create routines to automate analytical steps. The philosophy behind the book is to start with real world raw datasets and perform all the analytical steps needed to reach final results. Though theory plays an important role, this is a practical book for advanced undergraduate and graduate classes in bioinformatics, genomics and statistical genetics or for use in lab sessions. This book is also designed to be used by students in computer science and statistics who want to learn the practical aspects of genomic analysis without delving into algorithmic details. The datasets used throughout the book may be downloaded from the publisher’s website.  Chapters show how to handle and manage high-throughput genomic data, create automated workflows and speed up analyses in R. A wide range of R packages useful for working with genomic data are illustrated with practical examples. In recent years R has b...

  16. Castor bean organelle genome sequencing and worldwide genetic diversity analysis.

    Directory of Open Access Journals (Sweden)

    Maximo Rivarola

    Full Text Available Castor bean is an important oil-producing plant in the Euphorbiaceae family. Its high-quality oil contains up to 90% of the unusual fatty acid ricinoleate, which has many industrial and medical applications. Castor bean seeds also contain ricin, a highly toxic Type 2 ribosome-inactivating protein, which has gained relevance in recent years due to biosafety concerns. In order to gain knowledge on global genetic diversity in castor bean and to ultimately help the development of breeding and forensic tools, we carried out an extensive chloroplast sequence diversity analysis. Taking advantage of the recently published genome sequence of castor bean, we assembled the chloroplast and mitochondrion genomes extracting selected reads from the available whole genome shotgun reads. Using the chloroplast reference genome we used the methylation filtration technique to readily obtain draft genome sequences of 7 geographically and genetically diverse castor bean accessions. These sequence data were used to identify single nucleotide polymorphism markers and phylogenetic analysis resulted in the identification of two major clades that were not apparent in previous population genetic studies using genetic markers derived from nuclear DNA. Two distinct sub-clades could be defined within each major clade and large-scale genotyping of castor bean populations worldwide confirmed previously observed low levels of genetic diversity and showed a broad geographic distribution of each sub-clade.

  17. Castor bean organelle genome sequencing and worldwide genetic diversity analysis.

    Science.gov (United States)

    Rivarola, Maximo; Foster, Jeffrey T; Chan, Agnes P; Williams, Amber L; Rice, Danny W; Liu, Xinyue; Melake-Berhan, Admasu; Huot Creasy, Heather; Puiu, Daniela; Rosovitz, M J; Khouri, Hoda M; Beckstrom-Sternberg, Stephen M; Allan, Gerard J; Keim, Paul; Ravel, Jacques; Rabinowicz, Pablo D

    2011-01-01

    Castor bean is an important oil-producing plant in the Euphorbiaceae family. Its high-quality oil contains up to 90% of the unusual fatty acid ricinoleate, which has many industrial and medical applications. Castor bean seeds also contain ricin, a highly toxic Type 2 ribosome-inactivating protein, which has gained relevance in recent years due to biosafety concerns. In order to gain knowledge on global genetic diversity in castor bean and to ultimately help the development of breeding and forensic tools, we carried out an extensive chloroplast sequence diversity analysis. Taking advantage of the recently published genome sequence of castor bean, we assembled the chloroplast and mitochondrion genomes extracting selected reads from the available whole genome shotgun reads. Using the chloroplast reference genome we used the methylation filtration technique to readily obtain draft genome sequences of 7 geographically and genetically diverse castor bean accessions. These sequence data were used to identify single nucleotide polymorphism markers and phylogenetic analysis resulted in the identification of two major clades that were not apparent in previous population genetic studies using genetic markers derived from nuclear DNA. Two distinct sub-clades could be defined within each major clade and large-scale genotyping of castor bean populations worldwide confirmed previously observed low levels of genetic diversity and showed a broad geographic distribution of each sub-clade.

  18. Castor Bean Organelle Genome Sequencing and Worldwide Genetic Diversity Analysis

    Science.gov (United States)

    Chan, Agnes P.; Williams, Amber L.; Rice, Danny W.; Liu, Xinyue; Melake-Berhan, Admasu; Huot Creasy, Heather; Puiu, Daniela; Rosovitz, M. J.; Khouri, Hoda M.; Beckstrom-Sternberg, Stephen M.; Allan, Gerard J.; Keim, Paul; Ravel, Jacques; Rabinowicz, Pablo D.

    2011-01-01

    Castor bean is an important oil-producing plant in the Euphorbiaceae family. Its high-quality oil contains up to 90% of the unusual fatty acid ricinoleate, which has many industrial and medical applications. Castor bean seeds also contain ricin, a highly toxic Type 2 ribosome-inactivating protein, which has gained relevance in recent years due to biosafety concerns. In order to gain knowledge on global genetic diversity in castor bean and to ultimately help the development of breeding and forensic tools, we carried out an extensive chloroplast sequence diversity analysis. Taking advantage of the recently published genome sequence of castor bean, we assembled the chloroplast and mitochondrion genomes extracting selected reads from the available whole genome shotgun reads. Using the chloroplast reference genome we used the methylation filtration technique to readily obtain draft genome sequences of 7 geographically and genetically diverse castor bean accessions. These sequence data were used to identify single nucleotide polymorphism markers and phylogenetic analysis resulted in the identification of two major clades that were not apparent in previous population genetic studies using genetic markers derived from nuclear DNA. Two distinct sub-clades could be defined within each major clade and large-scale genotyping of castor bean populations worldwide confirmed previously observed low levels of genetic diversity and showed a broad geographic distribution of each sub-clade. PMID:21750729

  19. Sequencing and analysis of the giant panda genome

    Institute of Scientific and Technical Information of China (English)

    YANG HuanMing

    2010-01-01

    The giant panda (Ailuropoda melanoleuca) is loved all over the world and is considered a symbol of China, as illustrated by its being one of the mascots for the Beijing 2008 Olympic Games. It is also one of the world's most endangered animals and a flagship species for conservation. Using next-generation sequencing technology (Illumina Genome Analyzer) and our in-house assembly software, we have generated the first map of the giant panda genome sequence. This map will provide an unparalleled amount of information to aid in understanding the genetic and biological nature of this unique species and will contribute significantly to disease control and conservation efforts for this endangered species. In March 2008, the giant panda genome sequencing and analysis project was started at the Beijing Genomics Institute (BGI) in Shenzhen with collaborators from the Kunming Institute of Zoology and the Chengdu Research Base of Giant Panda Breeding. On 21 Jan. 2010, this collaboration resulted in the publication, as a cover story in the journal Nature, of the sequencing and analysis of the giant panda genome.

  20. Comparative analysis of methods for genome-wide nucleosome cartography.

    Science.gov (United States)

    Quintales, Luis; Vázquez, Enrique; Antequera, Francisco

    2015-07-01

    Nucleosomes contribute to compacting the genome into the nucleus and regulate the physical access of regulatory proteins to DNA either directly or through the epigenetic modifications of the histone tails. Precise mapping of nucleosome positioning across the genome is, therefore, essential to understanding the genome regulation. In recent years, several experimental protocols have been developed for this purpose that include the enzymatic digestion, chemical cleavage or immunoprecipitation of chromatin followed by next-generation sequencing of the resulting DNA fragments. Here, we compare the performance and resolution of these methods from the initial biochemical steps through the alignment of the millions of short-sequence reads to a reference genome to the final computational analysis to generate genome-wide maps of nucleosome occupancy. Because of the lack of a unified protocol to process data sets obtained through the different approaches, we have developed a new computational tool (NUCwave), which facilitates their analysis, comparison and assessment and will enable researchers to choose the most suitable method for any particular purpose. NUCwave is freely available at http://nucleosome.usal.es/nucwave along with a step-by-step protocol for its use. © The Author 2014. Published by Oxford University Press. For Permissions, please email: journals.permissions@oup.com.

  1. Genome-wide gene expression analysis of anguillid herpesvirus 1

    NARCIS (Netherlands)

    Beurden, van S.J.; Peeters, B.P.H.; Rottier, P.J.M.; Davison, A.A.; Engelsma, M.Y.

    2013-01-01

    Background Whereas temporal gene expression in mammalian herpesviruses has been studied extensively, little is known about gene expression in fish herpesviruses. Here we report a genome-wide transcription analysis of a fish herpesvirus, anguillid herpesvirus 1, in cell culture, studied during the

  2. Integrated translational genomics for analysis of complex traits in sorghum

    Science.gov (United States)

    We will report on the integration of sequencing and genotype data from natural variation (by whole genome resequencing [wgs] or genotype by sequencing [gbs]), transcriptome (RNA-seq) and mutant analysis (also by wgs) with the goal of identifying genes controlling important agronomic traits and tran...

  3. Genome-Wide Association Analysis in Primary Sclerosing Cholangitis

    NARCIS (Netherlands)

    T.H. Karlsen; A. Franke; E. Melum; A. Kaser; J.R. Hov; T. Balschun; B.A. Lie; A. Bergquist; C. Schramm; T.J. Weismüller; D. Gotthardt; C. Rust; E.E.R. Philipp; T. Fritz; L. Henckaerts; R. Weersma; P. Stokkers; C.Y. Ponsioen; C. Wijmenga; M. Sterneck; M. Nothnagel; J. Hampe; A. Teufel; H. Runz; P. Rosenstiel; A. Stiehl; S. Vermeire; U. Beuers; M. Manns; E. Schrumpf; K.M. Boberg; S. Schreiber

    2010-01-01

    BACKGROUND & AIMS: We aimed to characterize the genetic susceptibility to primary sclerosing cholangitis (PSC) by means of a genome-wide association analysis of single nucleotide polymorphism (SNP) markers. METHODS: A total of 443,816 SNPs on the Affymetrix SNP Array 5.0 (Affymetrix, Santa Clara, CA

  4. Comparative analysis of whole genome structure of Streptococcus suis using whole genome PCR scanning

    Institute of Scientific and Technical Information of China (English)

    2008-01-01

    An outbreak associated with Streptococcus suis infection in humans emerged in Sichuan province, China in 2005. The outbreak is atypical for the apparent large number of human cases, high fatality rate and geographical spread. To determine whether the bacterium has changed, we compared both human and animal isolates from the Sichuan outbreak with those collected previously within China and in other countries using whole genome PCR scanning (WGPScanning), comparative sequencing of several known virulence factor genes, and multilocus sequence typing (MLST) analysis. WGPScanning analysis showed that all primer pairs yielded PCR products of the expected sizes in all four strains tested. The nucleotide sequences of all the detected virulence factor genes are identical in the four strains, and MLST results showed that the four isolates studied and the reference strain all belonged to the ST1 complex. No new genetic changes were found in the genome structure of the isolates from this Sichuan outbreak.

  5. Comparative analysis of whole genome structure of Streptococcus suis using whole genome PCR scanning

    Institute of Scientific and Technical Information of China (English)

    2008-01-01

    An outbreak associated with Streptococcus suis infection in humans emerged in Sichuan province, China in 2005. The outbreak is atypical for the apparent large number of human cases, high fatality rate and geographical spread. To determine whether the bacterium has changed, we compared both human and animal isolates from the Sichuan outbreak with those collected previously within China and in other countries using whole genome PCR scanning (WGPScanning), comparative sequencing of several known virulence factor genes, and multilocus sequence typing (MLST) analysis. WGPScanning analysis showed that all primer pairs yielded PCR products of the expected sizes in all four strains tested. The nucleotide sequences of all the detected virulence factor genes are identical in the four strains, and MLST results showed that the four isolates studied and the reference strain all belonged to the ST1 complex. No new genetic changes were found in the genome structure of the isolates from this Sichuan outbreak.

  6. Genome bioinformatic analysis of nonsynonymous SNPs

    Directory of Open Access Journals (Sweden)

    Todd John A

    2007-08-01

    Full Text Available Abstract Background Genome-wide association studies of common diseases for common, low penetrance causal variants are underway. A proportion of these will alter protein sequences, the most common of which is the non-synonymous single nucleotide polymorphism (nsSNP). It would be an advantage if the functional effects of an nsSNP on protein structure and function could be predicted, both for the final identification process of a causal variant in a disease-associated chromosome region, and in further functional analyses of the nsSNP and its disease-associated protein. Results In the present report we have compared and contrasted structure- and sequence-based methods of prediction to over 5500 genes carrying nearly 24,000 nsSNPs, by employing an automatic comparative modelling procedure to build models for the genes. The nsSNP information came from two sources, the OMIM database which are rare (minor allele frequency, MAF, 0.05, have no known link to a disease. For over 40% of the nsSNPs, structure-based methods predicted which of these sequence changes are likely to either disrupt the structure of the protein or interfere with the function or interactions of the protein. For the remaining 60%, we generated sequence-based predictions. Conclusion We show that, in general, the prediction tools are able to distinguish disease-causing mutations from those mutations which are thought to have a neutral effect. We give examples of mutations in genes that are predicted to be deleterious and may have a role in disease. Contrary to previous reports, we also show that rare mutations are consistently predicted to be deleterious as often as commonly occurring nsSNPs.

  7. Avian polymavirus in wild birds: genome analysis of isolates from Falconiformes and Psittaciformes.

    Science.gov (United States)

    Johne, R; Müller, H

    1998-01-01

    Avian polyomavirus (APV) infections have been reported to cause fatal disease in a wide range of psittacine species. Here we demonstrate APV infections in buzzards (Buteo buteo) and in a falcon (Falco tinnunculus) found dead in Germany, and in lovebirds (Agapornis pullaria) with fatal disease, wild-caught in Moçambique. APV infection in buzzards was determined by PCR amplification of parts of the viral genome followed by Southern blot hybridisation. The genomes of the isolates obtained from the falcon and one of the lovebirds proved to be very closely related to those of Budgerigar Fledgling Disease Virus (BFDV)-1, BFDV-2 and BFDV-3, isolated from budgerigar, chicken, and parakeet, respectively. A consensus sequence was delineated from the known nucleotide sequences of APV isolates. The significance of some nucleotide changes is discussed. Infectivity of all of these isolates was neutralized by antibodies directed against BFDV-1. Data presented in this investigation show that the polyomavirus isolates obtained from different avian species so far all belong to one genotype and one serotype within the proposed subgenus Avipolyomavirus of the family Papovaviridae. The designation Budgerigar Fledgling Disease Virus (BFDV) is, therefore, misleading as this virus type infects different species of birds. The name Avian Polyomavirus and the abbreviation APV should be adopted for all of the isolates investigated in detail at present. The possible role of birds of passage in the epidemiology of APV infections is discussed.

  8. Sequencing and annotated analysis of an Estonian human genome.

    Science.gov (United States)

    Lilleoja, Rutt; Sarapik, Aili; Reimann, Ene; Reemann, Paula; Jaakma, Ülle; Vasar, Eero; Kõks, Sulev

    2012-02-01

    In the present study we describe the sequencing and annotated analysis of the genome of an Estonian individual. Using SOLID technology we generated 2,449,441,916 50-bp reads. Bioscope version 1.3 was used for mapping and pairing of reads to the NCBI human genome reference (build 36, hg18). Bioscope also enables the annotation of the results of variant (tertiary) analysis. The average mapping of reads was 75.5% with total coverage of 107.72 Gb, resulting in mean fold coverage of 34.6. We found 3,482,975 SNPs, of which 352,492 were novel. 21,222 SNPs were in coding regions: 10,649 were synonymous SNPs, 10,360 were nonsynonymous missense SNPs, 155 were nonsynonymous nonsense SNPs and 58 were nonsynonymous frameshifts. We identified 219 CNVs with total base pair coverage of 37,326,300 bp and 87,451 large insertion/deletion polymorphisms covering 10,152,256 bp of the genome. In addition, we found 285,864 small insertion/deletion polymorphisms, of which 133,969 were novel. Finally, we identified 53 inversions, 19 overlapped genes and 2 overlapped exons. Interestingly, we found a region in chromosome 6 to be enriched with coding SNPs and CNVs. This study confirms previous findings that our genomes are more complex and variable than previously thought. Therefore, sequencing of personal genomes followed by annotation would improve the analysis of heritability of phenotypes and our understanding of the functions of the genome.
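    The coverage and SNP figures in this abstract can be cross-checked with simple arithmetic. The sketch below is illustrative only: the abstract does not state the genome size used to derive the fold coverage, so the code backs out the size implied by the two reported numbers, and all variable and function names are hypothetical.

```python
# Minimal sketch: sanity-check the reported coverage and coding-SNP counts.
# Assumption: mean fold coverage = total mapped bases / genome size.

def fold_coverage(mapped_bases: float, genome_size: float) -> float:
    """Average number of times each genome position is covered."""
    return mapped_bases / genome_size

mapped_bases = 107.72e9   # total coverage reported in the abstract (107.72 Gb)
reported_fold = 34.6      # mean fold coverage reported in the abstract

# Genome size implied by the two reported figures (not stated in the abstract).
implied_genome_size = mapped_bases / reported_fold

# Coding-SNP categories as listed in the abstract.
coding_snps = {
    "synonymous": 10_649,
    "missense": 10_360,
    "nonsense": 155,
    "frameshift": 58,
}
coding_total = sum(coding_snps.values())

print(f"implied genome size: {implied_genome_size / 1e9:.2f} Gb")
print(f"coding SNP total: {coding_total}")
```

    The four coding categories sum to the 21,222 coding SNPs reported, and the implied genome size of roughly 3.1 Gb is in line with the size of a human reference assembly.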

  9. Sequencing and Analysis of Neanderthal Genomic DNA

    Energy Technology Data Exchange (ETDEWEB)

    Noonan, James P.; Coop, Graham; Kudaravalli, Sridhar; Smith, Doug; Krause, Johannes; Alessi, Joe; Chen, Feng; Platt, Darren; Paabo, Svante; Pritchard, Jonathan K.; Rubin, Edward M.

    2006-06-13

    Recovery and analysis of multiple Neanderthal autosomal sequences using a metagenomic approach reveals that modern humans and Neanderthals split ~400,000 years ago, without significant evidence of subsequent admixture.

  10. Whole-genome sequence-based analysis of thyroid function

    OpenAIRE

    Taylor, Peter N; Porcu, Eleonora; Chew, Shelby; Campbell, Purdey J.; Traglia, Michela; Brown, Suzanne J.; Mullin, Benjamin H; Shihab, Hashem A.; Min, Josine; Walter, Klaudia; Memari, Yasin; Huang, Jie; Barnes, Michael R.; Beilby, John P.; Charoen, Pimphen

    2015-01-01

    Normal thyroid function is essential for health, but its genetic architecture remains poorly understood. Here, for the heritable thyroid traits thyrotropin (TSH) and free thyroxine (FT4), we analyse whole-genome sequence data from the UK10K project (N=2,287). Using additional whole-genome sequence and deeply imputed data sets, we report meta-analysis results for common variants (MAF≥1%) associated with TSH and FT4 (N=16,335). For TSH, we identify a novel variant in SYN2 (MAF=23.5%, P=6.15 × 1...

  11. Large-scale genomic analysis of ovarian carcinomas.

    Science.gov (United States)

    Gorringe, Kylie L; Campbell, Ian G

    2009-04-01

    Epithelial ovarian cancers are typified by frequent genomic aberrations that have been difficult to unravel. Recently, high-resolution array technologies have provided the first glimpse of the remarkable complexity of these aberrations with some ovarian cancers containing hundreds of copy number breakpoints, micro-deletions and amplifications. Many of these alterations contain cancer-related genes suggesting that the majority is disease-associated and not just the product of random genomic instability. Future developments such as next-generation sequencing and integrated analysis of data from multiple array platforms on large numbers of samples are poised to revolutionize our understanding of this complex disease.

  12. Analysis of recent segmental duplications in the bovine genome

    Directory of Open Access Journals (Sweden)

    Li Congjun

    2009-12-01

    Full Text Available Abstract Background Duplicated sequences are an important source of gene innovation and structural variation within mammalian genomes. We performed the first systematic and genome-wide analysis of segmental duplications in the modern domesticated cattle (Bos taurus). Using two distinct computational analyses, we estimated that 3.1% (94.4 Mb) of the bovine genome consists of recently duplicated sequences (≥ 1 kb in length, ≥ 90% sequence identity). Similar to other mammalian draft assemblies, almost half (47% of 94.4 Mb) of these sequences have not been assigned to cattle chromosomes. Results In this study, we provide the first experimental validation of large duplications and briefly compare their distribution on two independent bovine genome assemblies using fluorescent in situ hybridization (FISH). Our analyses suggest that the majority (75-90%) of segmental duplications are organized into local tandem duplication clusters. Along with rodents and carnivores, these results now confidently establish tandem duplications as the most likely mammalian archetypical organization, in contrast to humans and great ape species which show a preponderance of interspersed duplications. A cross-species survey of duplicated genes and gene families indicated that duplication, positive selection and gene conversion have shaped primates, rodents, carnivores and ruminants to different degrees for their speciation and adaptation. We identified that bovine segmental duplications corresponding to genes are significantly enriched for specific biological functions such as immunity, digestion, lactation and reproduction. Conclusion Our results suggest that in most mammalian lineages segmental duplications are organized in a tandem configuration. Segmental duplications remain problematic for genome assembly and we highlight genic regions that require higher quality sequence characterization. This study provides insights into mammalian genome evolution and generates a valuable

  13. Genome Assembly and Computational Analysis Pipelines for Bacterial Pathogens

    KAUST Repository

    Rangkuti, Farania Gama Ardhina

    2011-06-01

    Pathogens lie behind the deadliest pandemics in history. To date, the AIDS pandemic has resulted in more than 25 million fatal cases, while tuberculosis and malaria annually claim more than 2 million lives. Comparative genomic analyses are needed to gain insights into the molecular mechanisms of pathogens, but the abundance of biological data dictates that such studies cannot be performed without the assistance of computational approaches. This explains the significant need for computational pipelines for genome assembly and analyses. The aim of this research is to develop such pipelines. This work utilizes various bioinformatics approaches to analyze the high-throughput genomic sequence data that has been obtained from several strains of bacterial pathogens. A pipeline has been compiled for quality control for sequencing and assembly, and several protocols have been developed to detect contaminations. Visualizations of genomic data have been generated in various formats, in addition to alignment, homology detection and sequence variant detection. We have also implemented a metaheuristic algorithm that significantly improves bacterial genome assemblies compared to other known methods. Experiments on Mycobacterium tuberculosis H37Rv data showed that our method resulted in improvement of N50 value of up to 9697% while consistently maintaining high accuracy, covering around 98% of the published reference genome. Other improvement efforts were also implemented, consisting of iterative local assemblies and iterative correction of contiguated bases. Our result expedites the genomic analysis of virulent genes up to single base pair resolution. It is also applicable to virtually every pathogenic microorganism, propelling further research in the control of and protection from pathogen-associated diseases.

  14. Genome analysis of the platypus reveals unique signatures of evolution.

    Science.gov (United States)

    Warren, Wesley C; Hillier, LaDeana W; Marshall Graves, Jennifer A; Birney, Ewan; Ponting, Chris P; Grützner, Frank; Belov, Katherine; Miller, Webb; Clarke, Laura; Chinwalla, Asif T; Yang, Shiaw-Pyng; Heger, Andreas; Locke, Devin P; Miethke, Pat; Waters, Paul D; Veyrunes, Frédéric; Fulton, Lucinda; Fulton, Bob; Graves, Tina; Wallis, John; Puente, Xose S; López-Otín, Carlos; Ordóñez, Gonzalo R; Eichler, Evan E; Chen, Lin; Cheng, Ze; Deakin, Janine E; Alsop, Amber; Thompson, Katherine; Kirby, Patrick; Papenfuss, Anthony T; Wakefield, Matthew J; Olender, Tsviya; Lancet, Doron; Huttley, Gavin A; Smit, Arian F A; Pask, Andrew; Temple-Smith, Peter; Batzer, Mark A; Walker, Jerilyn A; Konkel, Miriam K; Harris, Robert S; Whittington, Camilla M; Wong, Emily S W; Gemmell, Neil J; Buschiazzo, Emmanuel; Vargas Jentzsch, Iris M; Merkel, Angelika; Schmitz, Juergen; Zemann, Anja; Churakov, Gennady; Kriegs, Jan Ole; Brosius, Juergen; Murchison, Elizabeth P; Sachidanandam, Ravi; Smith, Carly; Hannon, Gregory J; Tsend-Ayush, Enkhjargal; McMillan, Daniel; Attenborough, Rosalind; Rens, Willem; Ferguson-Smith, Malcolm; Lefèvre, Christophe M; Sharp, Julie A; Nicholas, Kevin R; Ray, David A; Kube, Michael; Reinhardt, Richard; Pringle, Thomas H; Taylor, James; Jones, Russell C; Nixon, Brett; Dacheux, Jean-Louis; Niwa, Hitoshi; Sekita, Yoko; Huang, Xiaoqiu; Stark, Alexander; Kheradpour, Pouya; Kellis, Manolis; Flicek, Paul; Chen, Yuan; Webber, Caleb; Hardison, Ross; Nelson, Joanne; Hallsworth-Pepin, Kym; Delehaunty, Kim; Markovic, Chris; Minx, Pat; Feng, Yucheng; Kremitzki, Colin; Mitreva, Makedonka; Glasscock, Jarret; Wylie, Todd; Wohldmann, Patricia; Thiru, Prathapan; Nhan, Michael N; Pohl, Craig S; Smith, Scott M; Hou, Shunfeng; Nefedov, Mikhail; de Jong, Pieter J; Renfree, Marilyn B; Mardis, Elaine R; Wilson, Richard K

    2008-05-08

    We present a draft genome sequence of the platypus, Ornithorhynchus anatinus. This monotreme exhibits a fascinating combination of reptilian and mammalian characters. For example, platypuses have a coat of fur adapted to an aquatic lifestyle; platypus females lactate, yet lay eggs; and males are equipped with venom similar to that of reptiles. Analysis of the first monotreme genome aligned these features with genetic innovations. We find that reptile and platypus venom proteins have been co-opted independently from the same gene families; milk protein genes are conserved despite platypuses laying eggs; and immune gene family expansions are directly related to platypus biology. Expansions of protein, non-protein-coding RNA and microRNA families, as well as repeat elements, are identified. Sequencing of this genome now provides a valuable resource for deep mammalian comparative analyses, as well as for monotreme biology and conservation.

  15. Phylogeny and comparative genome analysis of a Basidiomycete fungi

    Energy Technology Data Exchange (ETDEWEB)

    Riley, Robert W.; Salamov, Asaf; Grigoriev, Igor; Hibbett, David

    2011-03-14

    Fungi of the phylum Basidiomycota make up some 37 percent of the described fungi, and are important from the perspectives of forestry, agriculture, medicine, and bioenergy. This diverse phylum includes the mushrooms, wood rots, plant pathogenic rusts and smuts, and some human pathogens. To better understand these important fungi, we have undertaken a comparative genomic analysis of the Basidiomycetes with available sequenced genomes. We report a phylogeny that sheds light on previously unclear evolutionary relationships among the Basidiomycetes. We also define a 'core proteome' based on protein families conserved in all Basidiomycetes. We identify key expansions and contractions in protein families that may be responsible for the degradation of plant biomass such as cellulose, hemicellulose, and lignin. Finally, we speculate as to the genomic changes that drove such expansions and contractions.

  16. Genome analysis of the platypus reveals unique signatures of evolution

    Science.gov (United States)

    Warren, Wesley C.; Hillier, LaDeana W.; Marshall Graves, Jennifer A.; Birney, Ewan; Ponting, Chris P.; Grützner, Frank; Belov, Katherine; Miller, Webb; Clarke, Laura; Chinwalla, Asif T.; Yang, Shiaw-Pyng; Heger, Andreas; Locke, Devin P.; Miethke, Pat; Waters, Paul D.; Veyrunes, Frédéric; Fulton, Lucinda; Fulton, Bob; Graves, Tina; Wallis, John; Puente, Xose S.; López-Otín, Carlos; Ordóñez, Gonzalo R.; Eichler, Evan E.; Chen, Lin; Cheng, Ze; Deakin, Janine E.; Alsop, Amber; Thompson, Katherine; Kirby, Patrick; Papenfuss, Anthony T.; Wakefield, Matthew J.; Olender, Tsviya; Lancet, Doron; Huttley, Gavin A.; Smit, Arian F. A.; Pask, Andrew; Temple-Smith, Peter; Batzer, Mark A.; Walker, Jerilyn A.; Konkel, Miriam K.; Harris, Robert S.; Whittington, Camilla M.; Wong, Emily S. W.; Gemmell, Neil J.; Buschiazzo, Emmanuel; Vargas Jentzsch, Iris M.; Merkel, Angelika; Schmitz, Juergen; Zemann, Anja; Churakov, Gennady; Kriegs, Jan Ole; Brosius, Juergen; Murchison, Elizabeth P.; Sachidanandam, Ravi; Smith, Carly; Hannon, Gregory J.; Tsend-Ayush, Enkhjargal; McMillan, Daniel; Attenborough, Rosalind; Rens, Willem; Ferguson-Smith, Malcolm; Lefèvre, Christophe M.; Sharp, Julie A.; Nicholas, Kevin R.; Ray, David A.; Kube, Michael; Reinhardt, Richard; Pringle, Thomas H.; Taylor, James; Jones, Russell C.; Nixon, Brett; Dacheux, Jean-Louis; Niwa, Hitoshi; Sekita, Yoko; Huang, Xiaoqiu; Stark, Alexander; Kheradpour, Pouya; Kellis, Manolis; Flicek, Paul; Chen, Yuan; Webber, Caleb; Hardison, Ross; Nelson, Joanne; Hallsworth-Pepin, Kym; Delehaunty, Kim; Markovic, Chris; Minx, Pat; Feng, Yucheng; Kremitzki, Colin; Mitreva, Makedonka; Glasscock, Jarret; Wylie, Todd; Wohldmann, Patricia; Thiru, Prathapan; Nhan, Michael N.; Pohl, Craig S.; Smith, Scott M.; Hou, Shunfeng; Renfree, Marilyn B.; Mardis, Elaine R.; Wilson, Richard K.

    2009-01-01

    We present a draft genome sequence of the platypus, Ornithorhynchus anatinus. This monotreme exhibits a fascinating combination of reptilian and mammalian characters. For example, platypuses have a coat of fur adapted to an aquatic lifestyle; platypus females lactate, yet lay eggs; and males are equipped with venom similar to that of reptiles. Analysis of the first monotreme genome aligned these features with genetic innovations. We find that reptile and platypus venom proteins have been co-opted independently from the same gene families; milk protein genes are conserved despite platypuses laying eggs; and immune gene family expansions are directly related to platypus biology. Expansions of protein, non-protein-coding RNA and microRNA families, as well as repeat elements, are identified. Sequencing of this genome now provides a valuable resource for deep mammalian comparative analyses, as well as for monotreme biology and conservation. PMID:18464734

  17. Climate-induced range shifts and possible hybridisation consequences in insects.

    Directory of Open Access Journals (Sweden)

    Rosa Ana Sánchez-Guillén

    Full Text Available Many ectotherms have altered their geographic ranges in response to rising global temperatures. Current range shifts will likely increase the sympatry and hybridisation between recently diverged species. Here we predict future sympatric distributions and risk of hybridisation in seven Mediterranean ischnurid damselfly species (I. elegans, I. fountaineae, I. genei, I. graellsii, I. pumilio, I. saharensis and I. senegalensis. We used a maximum entropy modelling technique to predict future potential distribution under four different Global Circulation Models and a realistic emissions scenario of climate change. We carried out a comprehensive data compilation of reproductive isolation (habitat, temporal, sexual, mechanical and gametic between the seven studied species. Combining the potential distribution and data of reproductive isolation at different instances (habitat, temporal, sexual, mechanical and gametic, we infer the risk of hybridisation in these insects. Our findings showed that all but I. graellsii will decrease in distributional extent and all species except I. senegalensis are predicted to have northern range shifts. Models of potential distribution predicted an increase of the likely overlapping ranges for 12 species combinations, out of a total of 42 combinations, 10 of which currently overlap. Moreover, the lack of complete reproductive isolation and the patterns of hybridisation detected between closely related ischnurids, could lead to local extinctions of native species if the hybrids or the introgressed colonising species become more successful.

  18. An enzyme-based in situ hybridisation method for the identification of Streptococcus suis - Brief report

    DEFF Research Database (Denmark)

    Madsen, L.W.; Boye, Mette; Jensen, Henrik E

    2001-01-01

    A method for enzyme-based in situ hybridisation of Streptococcus suis was developed. It enables the light microscopic localization of bacterial ribosomal RNA (rRNA) in formalin-fixed paraffin-embedded tissues. A unique sequence in the 16S rRNA of S. suis was targeted. Different pretreatment...

  19. Exploring accounting and sustainable development hybridisation in the UK public sector

    NARCIS (Netherlands)

    I. Thomson; S. Grubnic; G. Georgakopoulos; D. Owen

    2011-01-01

    This paper explores the relationship between accounting and sustainable development in two public sector contexts in the United Kingdom. By employing Miller et al.’s (2008) extended notion of hybridisation, the paper investigates transformations associated with practices, processes and expertises de

  20. Sequence analysis of the genome of carnation (Dianthus caryophyllus L.).

    Science.gov (United States)

    Yagi, Masafumi; Kosugi, Shunichi; Hirakawa, Hideki; Ohmiya, Akemi; Tanase, Koji; Harada, Taro; Kishimoto, Kyutaro; Nakayama, Masayoshi; Ichimura, Kazuo; Onozaki, Takashi; Yamaguchi, Hiroyasu; Sasaki, Nobuhiro; Miyahara, Taira; Nishizaki, Yuzo; Ozeki, Yoshihiro; Nakamura, Noriko; Suzuki, Takamasa; Tanaka, Yoshikazu; Sato, Shusei; Shirasawa, Kenta; Isobe, Sachiko; Miyamura, Yoshinori; Watanabe, Akiko; Nakayama, Shinobu; Kishida, Yoshie; Kohara, Mitsuyo; Tabata, Satoshi

    2014-06-01

    The whole-genome sequence of carnation (Dianthus caryophyllus L.) cv. 'Francesco' was determined using a combination of different new-generation multiplex sequencing platforms. The total length of the non-redundant sequences was 568,887,315 bp, consisting of 45,088 scaffolds, which covered 91% of the 622 Mb carnation genome estimated by k-mer analysis. The N50 values of contigs and scaffolds were 16,644 bp and 60,737 bp, respectively, and the longest scaffold was 1,287,144 bp. The average GC content of the contig sequences was 36%. A total of 1050, 13, 92 and 143 genes for tRNAs, rRNAs, snoRNA and miRNA, respectively, were identified in the assembled genomic sequences. For protein-encoding genes, 43,266 complete and partial gene structures excluding those in transposable elements were deduced. Gene coverage was ~98%, as deduced from the coverage of the core eukaryotic genes. Intensive characterization of the assigned carnation genes and comparison with those of other plant species revealed characteristic features of the carnation genome. The results of this study will serve as a valuable resource for fundamental and applied research of carnation, especially for breeding new carnation varieties. Further information on the genomic sequences is available at http://carnation.kazusa.or.jp.
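    The N50 statistic quoted in this abstract is a standard assembly-contiguity measure. As a minimal sketch, assuming the common definition (the largest length L such that sequences of length L or longer contain at least half of the assembled bases), it can be computed as follows. The toy scaffold lengths are hypothetical and are not taken from the carnation assembly; only the total length and estimated genome size come from the abstract.

```python
# Minimal sketch of N50, using the common "at least half of the total" definition.

def n50(lengths):
    """Return the length of the sequence at which the running total of
    lengths (sorted longest first) reaches half of the assembly size."""
    if not lengths:
        raise ValueError("need at least one sequence length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length

# Hypothetical scaffold lengths in bp, for illustration only.
toy_scaffolds = [100, 80, 60, 40, 20]
print(n50(toy_scaffolds))

# Fraction of the estimated genome covered by the assembly, from the abstract:
# 568,887,315 bp assembled versus a 622 Mb k-mer-based estimate.
assembled_fraction = 568_887_315 / 622_000_000
print(f"{assembled_fraction:.0%}")
```

    For the toy input the total is 300 bp, and the two longest scaffolds (100 + 80) already reach half of it, so the N50 is 80. The assembled fraction rounds to the 91% reported in the abstract.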

  1. A comprehensive analysis of bilaterian mitochondrial genomes and phylogeny.

    Science.gov (United States)

    Bernt, Matthias; Bleidorn, Christoph; Braband, Anke; Dambach, Johannes; Donath, Alexander; Fritzsch, Guido; Golombek, Anja; Hadrys, Heike; Jühling, Frank; Meusemann, Karen; Middendorf, Martin; Misof, Bernhard; Perseke, Marleen; Podsiadlowski, Lars; von Reumont, Björn; Schierwater, Bernd; Schlegel, Martin; Schrödl, Michael; Simon, Sabrina; Stadler, Peter F; Stöger, Isabella; Struck, Torsten H

    2013-11-01

    About 2800 mitochondrial genomes of Metazoa are present in NCBI RefSeq today, two thirds belonging to vertebrates. Metazoan phylogeny was recently challenged by large scale EST approaches (phylogenomics), stabilizing classical nodes while simultaneously supporting new sister group hypotheses. The use of mitochondrial data in deep phylogeny analyses was often criticized because of high substitution rates on nucleotides, large differences in amino acid substitution rate between taxa, and biases in nucleotide frequencies. Nevertheless, mitochondrial genome data might still be promising as it allows for a larger taxon sampling, while presenting a smaller amount of sequence information. We present the most comprehensive analysis of bilaterian relationships based on mitochondrial genome data. The analyzed data set comprises more than 650 mitochondrial genomes that have been chosen to represent a profound sample of the phylogenetic as well as sequence diversity. The results are based on high quality amino acid alignments obtained from a complete reannotation of the mitogenomic sequences from NCBI RefSeq database. However, the results failed to give support for many otherwise undisputed high-ranking taxa, like Mollusca, Hexapoda, Arthropoda, and suffer from extreme long branches of Nematoda, Platyhelminthes, and some other taxa. In order to identify the sources of misleading phylogenetic signals, we discuss several problems associated with mitochondrial genome data sets, e.g. the nucleotide and amino acid landscapes and a strong correlation of gene rearrangements with long branches.

  2. The Chlamydia psittaci genome: a comparative analysis of intracellular pathogens.

    Directory of Open Access Journals (Sweden)

    Anja Voigt

    Full Text Available BACKGROUND: Chlamydiaceae are a family of obligate intracellular pathogens causing a wide range of diseases in animals and humans, and facing unique evolutionary constraints not encountered by free-living prokaryotes. To investigate genomic aspects of infection, virulence and host preference we have sequenced Chlamydia psittaci, the pathogenic agent of ornithosis. RESULTS: A comparison of the genome of the avian Chlamydia psittaci isolate 6BC with the genomes of other chlamydial species, C. trachomatis, C. muridarum, C. pneumoniae, C. abortus, C. felis and C. caviae, revealed a high level of sequence conservation and synteny across taxa, with the major exception of the human pathogen C. trachomatis. Important differences manifest in the polymorphic membrane protein family specific for the Chlamydiae and in the highly variable chlamydial plasticity zone. We identified a number of psittaci-specific polymorphic membrane proteins of the G family that may be related to differences in host-range and/or virulence as compared to closely related Chlamydiaceae. We calculated non-synonymous to synonymous substitution rate ratios for pairs of orthologous genes to identify putative targets of adaptive evolution and predicted type III secreted effector proteins. CONCLUSIONS: This study is the first detailed analysis of the Chlamydia psittaci genome sequence. It provides insights into the genome architecture of C. psittaci and proposes a number of novel candidate genes, mostly of as yet unknown function, that may be important for pathogen-host interactions.

  3. Viral genome analysis and knowledge management.

    Science.gov (United States)

    Kuiken, Carla; Yoon, Hyejin; Abfalterer, Werner; Gaschen, Brian; Lo, Chienchi; Korber, Bette

    2013-01-01

    One of the challenges of genetic data analysis is to combine information from sources that are distributed around the world and accessible through a wide array of different methods and interfaces. The HIV database and its footsteps, the hepatitis C virus (HCV) and hemorrhagic fever virus (HFV) databases, have made it their mission to make different data types easily available to their users. This involves a large amount of behind-the-scenes processing, including quality control and analysis of the sequences and their annotation. Gene and protein sequences are distilled from the sequences that are stored in GenBank; to this end, both submitter annotation and script-generated sequences are used. Alignments of both nucleotide and amino acid sequences are generated, manually curated, distilled into an alignment model, and regenerated in an iterative cycle that results in ever better new alignments. Annotation of epidemiological and clinical information is parsed, checked, and added to the database. User interfaces are updated, and new interfaces are added based upon user requests. Vital for its success, the database staff are heavy users of the system, which enables them to fix bugs and find opportunities for improvement. In this chapter we describe some of the infrastructure that keeps these heavily used analysis platforms alive and vital after nearly 25 years of use. The database/analysis platforms described in this chapter can be accessed at http://hiv.lanl.gov http://hcv.lanl.gov http://hfv.lanl.gov.

  4. Sequence analysis reveals mosaic genome of Aichi virus

    Directory of Open Access Journals (Sweden)

    Han Xiaohong

    2011-08-01

    Full Text Available Aichi virus is a positive-sense, single-stranded RNA virus that has been shown to be associated with diarrhea in children. In the present study, phylogenetic and recombination analyses based on the complete Aichi virus genomes available in GenBank reveal a mosaic genome sequence [GenBank: FJ890523], in which the nt 261-852 region (nt positions are based on the aligned sequence file) shows a close relationship with AB010145/Japan, with 97.9% sequence identity, while the other genomic regions show a close relationship with AY747174/German, with 90.1% sequence identity. Our results will provide valuable hints for future research on Aichi virus diversity. Aichi virus is a member of the Kobuvirus genus of the Picornaviridae family [1, 2] and is a positive-sense, single-stranded RNA virus. Its presence in fecal specimens of children suffering from diarrhea has been demonstrated in several Asian countries [3-6], in Brazil and Germany [7], in France [8] and in Tunisia [9]. Some reports showed a high level of seroprevalence in adults [7, 10], suggesting widespread exposure to Aichi virus during childhood. The genome of Aichi virus contains 8,280 nucleotides and a poly(A) tail. The single large open reading frame (nt 713-8014 according to the strain AB010145) encodes a polyprotein of 2,432 amino acids that is cleaved into the typical picornavirus structural proteins VP0, VP3, VP1, and nonstructural proteins 2A, 2B, 2C, 3A, 3B, 3C and 3D [2, 11]. Based on the phylogenetic analysis of 519-bp sequences at the 3C-3D (3CD) junction, Aichi viruses can be divided into two genotypes, A and B, with approximately 90% sequence homology [12]. Although only six complete genomes of Aichi virus have been deposited in GenBank at present, mosaic genomes can be found in strains from different countries.
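
    The region-wise identity values in this record (for example, nt 261-852 against one reference and the rest of the genome against another) rest on comparing aligned sequences over a window of alignment columns. A minimal sketch of that calculation follows. The function name, the toy sequences and the choice to skip gapped columns are all assumptions made for illustration; the abstract does not say which tool or gap convention the authors used.

```python
def percent_identity(seq_a, seq_b, start, end):
    """Percent identity between two aligned sequences over alignment
    columns start..end (1-based, inclusive).

    Columns where either sequence has a gap ('-') are skipped; this is
    one common convention and is an assumption here.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    matches = 0
    compared = 0
    for a, b in zip(seq_a[start - 1:end], seq_b[start - 1:end]):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a.upper() == b.upper():
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns in the requested region")
    return 100.0 * matches / compared


# Hypothetical toy alignment (not Aichi virus data).
query = "ACGTACGTAC"
reference = "ACGTTCGTAC"

print(percent_identity(query, reference, 1, 10))
print(percent_identity(query, reference, 1, 4))
```

    Running the same function over different windows against different references is one simple way to see the kind of region-specific similarity pattern that suggests a mosaic genome, although dedicated recombination-detection tools do considerably more than this.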

  5. Genome-wide Analysis of Gene Regulation

    DEFF Research Database (Denmark)

    Chen, Yun

    IP-seq and small RNA-seq, we delineated the landscape of the promoters with bidirectional transcriptions that yield steady-state RNA in only one direction (Paper III). A subsequent motif analysis enabled us to uncover specific DNA signals – early polyA sites – that make RNA on the reverse strand sensitive...... they regulated or if the sites had global elevated usage rates by multiple TFs. Using RNA-seq, 5’end-seq in combination with depletion of 5’exonuclease as well as nonsense-mediated decay (NMD) factors, we systematically analyzed NMD substrates as well as their degradation intermediates in human cells (Paper V......). Gene enrichment analysis on the detected NMD substrates revealed an unappreciated NMD-based regulatory mechanism of the genes hosting multiple intronic snoRNAs, which can facilitate differential expression of individual snoRNAs from a single host gene locus. Finally, supported by RNA-seq and small RNA-seq...

  6. Sequencing and Analysis of Neanderthal Genomic DNA

    OpenAIRE

    Noonan, James P.; Coop, Graham; Kudaravalli, Sridhar; Smith, Doug; Krause, Johannes; Alessi, Joe; Chen, Feng; Platt, Darren; Paabo, Svante; Pritchard, Jonathan K; Rubin, Edward M.

    2006-01-01

    Our knowledge of Neanderthals is based on a limited number of remains and artifacts from which we must make inferences about their biology, behavior, and relationship to ourselves. Here, we describe the characterization of these extinct hominids from a new perspective, based on the development of a Neanderthal metagenomic library and its high-throughput sequencing and analysis. Several lines of evidence indicate that the 65,250 base pairs of hominid sequence so far identified in the library a...

  7. Integrated analysis of whole genome and transcriptome sequencing reveals diverse transcriptomic aberrations driven by somatic genomic changes in liver cancers.

    Directory of Open Access Journals (Sweden)

    Yuichi Shiraishi

    Full Text Available Recent studies applying high-throughput sequencing technologies have identified several recurrently mutated genes and pathways in multiple cancer genomes. However, transcriptional consequences from these genomic alterations in cancer genome remain unclear. In this study, we performed integrated and comparative analyses of whole genomes and transcriptomes of 22 hepatitis B virus (HBV)-related hepatocellular carcinomas (HCCs) and their matched controls. Comparison of whole genome sequence (WGS) and RNA-Seq revealed much evidence that various types of genomic mutations triggered diverse transcriptional changes. Not only splice-site mutations, but also silent mutations in coding regions, deep intronic mutations and structural changes caused splicing aberrations. HBV integrations generated diverse patterns of virus-human fusion transcripts depending on affected gene, such as TERT, CDK15, FN1 and MLL4. Structural variations could drive over-expression of genes such as WNT ligands, with/without creating gene fusions. Furthermore, by taking account of genomic mutations causing transcriptional aberrations, we could improve the sensitivity of deleterious mutation detection in known cancer driver genes (TP53, AXIN1, ARID2, RPS6KA3), and identified recurrent disruptions in putative cancer driver genes such as HNF4A, CPS1, TSC1 and THRAP3 in HCCs. These findings indicate genomic alterations in cancer genome have diverse transcriptomic effects, and integrated analysis of WGS and RNA-Seq can facilitate the interpretation of a large number of genomic alterations detected in cancer genome.

  8. Genome size determination in peronosporales (Oomycota) by Feulgen image analysis.

    Science.gov (United States)

    Voglmayr, H; Greilhuber, J

    1998-12-01

    Genome size was determined, by nuclear Feulgen staining and image analysis, in 46 accessions of 31 species of Peronosporales (Oomycota), including important plant pathogens such as Bremia lactucae, Plasmopara viticola, Pseudoperonospora cubensis, and Pseudoperonospora humuli. The 1C DNA contents ranged from 0.046 (45.6 Mb) to 0.163 pg (159.9 Mb). This is 0.041- to 0.144-fold that of Glycine max (soybean, 1C = 1.134 pg), which was used as an internal standard for genome size determination. The linearity of Feulgen absorbance photometry method over this range was demonstrated by calibration of Aspergillus species (1C = 31-38 Mb) against Glycine, which revealed differences of less than 6% compared to the published CHEF data. The low coefficients of variation (usually between 5 and 10%), repeatability of the results, and compatibility with CHEF data prove the resolution power of Feulgen image analysis. The applicability and limitations of Feulgen photometry are discussed in relation to other methods of genome size determination (CHEF gel electrophoresis, reassociation kinetics, genomic reconstruction) that have been previously applied to Oomycota. Copyright 1998 Academic Press.

  9. Natural selection on functional modules, a genome-wide analysis.

    Science.gov (United States)

    Serra, François; Arbiza, Leonardo; Dopazo, Joaquín; Dopazo, Hernán

    2011-03-01

    Classically, the functional consequences of natural selection over genomes have been analyzed as the compound effects of individual genes. The current paradigm for large-scale analysis of adaptation is based on the observed significant deviations of rates of individual genes from neutral evolutionary expectation. This approach, which assumed independence among genes, has not been able to identify biological functions significantly enriched in positively selected genes in individual species. Alternatively, pooling related species has enhanced the search for signatures of selection. However, grouping signatures does not allow testing for adaptive differences between species. Here we introduce the Gene-Set Selection Analysis (GSSA), a new genome-wide approach to test for evidences of natural selection on functional modules. GSSA is able to detect lineage specific evolutionary rate changes in a notable number of functional modules. For example, in nine mammal and Drosophilae genomes GSSA identifies hundreds of functional modules with significant associations to high and low rates of evolution. Many of the detected functional modules with high evolutionary rates have been previously identified as biological functions under positive selection. Notably, GSSA identifies conserved functional modules with many positively selected genes, which questions whether they are exclusively selected for fitting genomes to environmental changes. Our results agree with previous studies suggesting that adaptation requires positive selection, but not every mutation under positive selection contributes to the adaptive dynamical process of the evolution of species.

  10. Natural selection on functional modules, a genome-wide analysis.

    Directory of Open Access Journals (Sweden)

    François Serra

    2011-03-01

    Full Text Available Classically, the functional consequences of natural selection over genomes have been analyzed as the compound effects of individual genes. The current paradigm for large-scale analysis of adaptation is based on the observed significant deviations of rates of individual genes from neutral evolutionary expectation. This approach, which assumed independence among genes, has not been able to identify biological functions significantly enriched in positively selected genes in individual species. Alternatively, pooling related species has enhanced the search for signatures of selection. However, grouping signatures does not allow testing for adaptive differences between species. Here we introduce the Gene-Set Selection Analysis (GSSA), a new genome-wide approach to test for evidences of natural selection on functional modules. GSSA is able to detect lineage specific evolutionary rate changes in a notable number of functional modules. For example, in nine mammal and Drosophilae genomes GSSA identifies hundreds of functional modules with significant associations to high and low rates of evolution. Many of the detected functional modules with high evolutionary rates have been previously identified as biological functions under positive selection. Notably, GSSA identifies conserved functional modules with many positively selected genes, which questions whether they are exclusively selected for fitting genomes to environmental changes. Our results agree with previous studies suggesting that adaptation requires positive selection, but not every mutation under positive selection contributes to the adaptive dynamical process of the evolution of species.

  11. Genome-Wide Detection and Analysis of Multifunctional Genes

    Science.gov (United States)

    Pritykin, Yuri; Ghersi, Dario; Singh, Mona

    2015-01-01

    Many genes can play a role in multiple biological processes or molecular functions. Identifying multifunctional genes at the genome-wide level and studying their properties can shed light upon the complexity of molecular events that underpin cellular functioning, thereby leading to a better understanding of the functional landscape of the cell. However, to date, genome-wide analysis of multifunctional genes (and the proteins they encode) has been limited. Here we introduce a computational approach that uses known functional annotations to extract genes playing a role in at least two distinct biological processes. We leverage functional genomics data sets for three organisms—H. sapiens, D. melanogaster, and S. cerevisiae—and show that, as compared to other annotated genes, genes involved in multiple biological processes possess distinct physicochemical properties, are more broadly expressed, tend to be more central in protein interaction networks, tend to be more evolutionarily conserved, and are more likely to be essential. We also find that multifunctional genes are significantly more likely to be involved in human disorders. These same features also hold when multifunctionality is defined with respect to molecular functions instead of biological processes. Our analysis uncovers key features about multifunctional genes, and is a step towards a better genome-wide understanding of gene multifunctionality. PMID:26436655

  12. New genomic resources for switchgrass: a BAC library and comparative analysis of homoeologous genomic regions harboring bioenergy traits

    Directory of Open Access Journals (Sweden)

    Feltus Frank A

    2011-07-01

    Full Text Available Background: Switchgrass, a C4 species and a warm-season grass native to the prairies of North America, has been targeted for development into an herbaceous biomass fuel crop. Genetic improvement of switchgrass feedstock traits through marker-assisted breeding and biotechnology approaches calls for genomic tools development. Establishment of integrated physical and genetic maps for switchgrass will accelerate mapping of value added traits useful to breeding programs and to isolate important target genes using map based cloning. The reported polyploidy series in switchgrass ranges from diploid (2X = 18) to duodecaploid (12X = 108). Like in other large, repeat-rich plant genomes, this genomic complexity will hinder whole genome sequencing efforts. An extensive physical map providing enough information to resolve the homoeologous genomes would provide the necessary framework for accurate assembly of the switchgrass genome. Results: A switchgrass BAC library constructed by partial digestion of nuclear DNA with EcoRI contains 147,456 clones covering the effective genome approximately 10 times based on a genome size of 3.2 Gigabases (~1.6 Gb effective). Restriction digestion and PFGE analysis of 234 randomly chosen BACs indicated that 95% of the clones contained inserts, ranging from 60 to 180 kb with an average of 120 kb. Comparative sequence analysis of two homoeologous genomic regions harboring orthologs of the rice OsBRI1 locus, a low-copy gene encoding a putative protein kinase and associated with biomass, revealed that orthologous clones from homoeologous chromosomes can be unambiguously distinguished from each other and correctly assembled to respective fingerprint contigs. Thus, the data obtained not only provide genomic resources for further analysis of switchgrass genome, but also improve efforts for an accurate genome sequencing strategy. Conclusions: The construction of the first switchgrass BAC library and comparative analysis of
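
    The "approximately 10 times" coverage quoted for the switchgrass BAC library can be reproduced from the numbers in the abstract. The sketch below is a worked check, not the authors' calculation: whether they applied the 95% insert-containing fraction is an assumption, and the function names are hypothetical. The second function uses the standard Clarke-Carbon relation P = 1 - (1 - f)^N for the chance that a given locus is present in a library of N clones, where f is insert size divided by genome size.

```python
def library_coverage(n_clones, insert_fraction, mean_insert_bp, genome_bp):
    """Genome equivalents represented by a clone library."""
    return n_clones * insert_fraction * mean_insert_bp / genome_bp


def locus_recovery_probability(n_clones, insert_fraction,
                               mean_insert_bp, genome_bp):
    """Clarke-Carbon estimate of the probability that any given locus
    is present in at least one insert-containing clone."""
    f = mean_insert_bp / genome_bp
    effective_clones = n_clones * insert_fraction
    return 1.0 - (1.0 - f) ** effective_clones


# Values quoted in the abstract; applying the 95% insert fraction
# to the clone count is an assumption.
coverage = library_coverage(147_456, 0.95, 120_000, 1.6e9)
probability = locus_recovery_probability(147_456, 0.95, 120_000, 1.6e9)

print(f"{coverage:.1f}x coverage")
print(f"{probability:.5f} probability of recovering a given locus")
```

    Under these assumptions the library represents about 10.5 effective genome equivalents, consistent with the figure reported in the record.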

  13. Large-scale genetic variation of the symbiosis-required megaplasmid pSymA revealed by comparative genomic analysis of Sinorhizobium meliloti natural strains

    Directory of Open Access Journals (Sweden)

    Landry Christian R

    2005-11-01

    Full Text Available Background: Sinorhizobium meliloti is a soil bacterium that forms nitrogen-fixing nodules on the roots of leguminous plants such as alfalfa (Medicago sativa). This species occupies different ecological niches, being present as a free-living soil bacterium and as a symbiont of plant root nodules. The genome of the type strain Rm 1021 contains one chromosome and two megaplasmids for a total genome size of 6 Mb. We applied comparative genomic hybridisation (CGH) on oligonucleotide microarrays to estimate genetic variation at the genomic level in four natural strains, two isolated from Italian agricultural soil and two from desert soil in the Aral Sea region. Results: From 4.6 to 5.7 percent of the genes showed a pattern of hybridisation concordant with deletion, nucleotide divergence or ORF duplication when compared to the type strain Rm 1021. A large number of these polymorphisms were confirmed by sequencing and Southern blot. A statistically significant fraction of these variable genes was found on the pSymA megaplasmid and grouped in clusters. These variable genes were found to be mainly transposases or genes with unknown function. Conclusion: The results indicate that the symbiosis-required megaplasmid pSymA can be considered the major hot-spot for intra-specific differentiation in S. meliloti.

  14. Comparative Genome Analysis of Basidiomycete Fungi

    Energy Technology Data Exchange (ETDEWEB)

    Riley, Robert; Salamov, Asaf; Morin, Emmanuelle; Nagy, Laszlo; Manning, Gerard; Baker, Scott; Brown, Daren; Henrissat, Bernard; Levasseur, Anthony; Hibbett, David; Martin, Francis; Grigoriev, Igor

    2012-03-19

    Fungi of the phylum Basidiomycota (basidiomycetes) make up some 37 percent of the described fungi, and are important in forestry, agriculture, medicine, and bioenergy. This diverse phylum includes the mushrooms, wood rots, symbionts, and plant and animal pathogens. To better understand the diversity of phenotypes in basidiomycetes, we performed a comparative analysis of 35 basidiomycete fungi spanning the diversity of the phylum. Phylogenetic patterns of lignocellulose degrading genes suggest a continuum rather than a sharp dichotomy between the white rot and brown rot modes of wood decay. Patterns of secondary metabolic enzymes give additional insight into the broad array of phenotypes found in the basidiomycetes. We suggest that the profile of an organism in lignocellulose-targeting genes can be used to predict its nutritional mode, and predict Dacryopinax sp. as a brown rot; Botryobasidium botryosum and Jaapia argillacea as white rots.

  15. Integrative Genomic Analysis of Complex traits

    DEFF Research Database (Denmark)

    Ehsani, Ali Reza

    In the last decade rapid development in biotechnologies has made it possible to extract extensive information about practically all levels of biological organization. An ever-increasing number of studies are reporting multilayered datasets on the entire DNA sequence, transcription, protein...... expression, and metabolite abundance of more and more populations in a multitude of environments. However, a solid model for including all of this complex information in one analysis, to disentangle genetic variation and the underlying genetic architecture of complex traits and diseases, has not yet been...... proposed. This thesis introduced a novel way to integrate such huge data sets in an efficient and informative procedure to dissect the complexity of obesity related traits (e.g. body weight, body fat, feed intake, etc) and map the flow from DNA through RNA ending with individual phenotypes....

  16. Genomic analysis of mouse retinal development.

    Directory of Open Access Journals (Sweden)

    Seth Blackshaw

    2004-09-01

    Full Text Available The vertebrate retina is comprised of seven major cell types that are generated in overlapping but well-defined intervals. To identify genes that might regulate retinal development, gene expression in the developing retina was profiled at multiple time points using serial analysis of gene expression (SAGE). The expression patterns of 1,051 genes that showed developmentally dynamic expression by SAGE were investigated using in situ hybridization. A molecular atlas of gene expression in the developing and mature retina was thereby constructed, along with a taxonomic classification of developmental gene expression patterns. Genes were identified that label both temporal and spatial subsets of mitotic progenitor cells. For each developing and mature major retinal cell type, genes selectively expressed in that cell type were identified. The gene expression profiles of retinal Müller glia and mitotic progenitor cells were found to be highly similar, suggesting that Müller glia might serve to produce multiple retinal cell types under the right conditions. In addition, multiple transcripts that were evolutionarily conserved that did not appear to encode open reading frames of more than 100 amino acids in length ("noncoding RNAs") were found to be dynamically and specifically expressed in developing and mature retinal cell types. Finally, many photoreceptor-enriched genes that mapped to chromosomal intervals containing retinal disease genes were identified. These data serve as a starting point for functional investigations of the roles of these genes in retinal development and physiology.

  17. SIDEKICK: Genomic data driven analysis and decision-making framework

    Directory of Open Access Journals (Sweden)

    Yoon Kihoon

    2010-12-01

    Full Text Available Background: Scientists striving to unlock mysteries within complex biological systems face myriad barriers in effectively integrating available information to enhance their understanding. While experimental techniques and available data sources are rapidly evolving, useful information is dispersed across a variety of sources, and sources of the same information often do not use the same format or nomenclature. To harness these expanding resources, scientists need tools that bridge nomenclature differences and allow them to integrate, organize, and evaluate the quality of information without extensive computation. Results: Sidekick, a genomic data driven analysis and decision making framework, is a web-based tool that provides a user-friendly intuitive solution to the problem of information inaccessibility. Sidekick enables scientists without training in computation and data management to pursue answers to research questions like "What are the mechanisms for disease X" or "Does the set of genes associated with disease X also influence other diseases." Sidekick enables the process of combining heterogeneous data, finding and maintaining the most up-to-date data, evaluating data sources, quantifying confidence in results based on evidence, and managing the multi-step research tasks needed to answer these questions. We demonstrate Sidekick's effectiveness by showing how to accomplish a complex published analysis in a fraction of the original time with no computational effort using Sidekick. Conclusions: Sidekick is an easy-to-use web-based tool that organizes and facilitates complex genomic research, allowing scientists to explore genomic relationships and formulate hypotheses without computational effort. Possible analysis steps include gene list discovery, gene-pair list discovery, various enrichments for both types of lists, and convenient list manipulation. Further, Sidekick's ability to characterize pairs of genes offers new ways to

  18. BioMet Toolbox: genome-wide analysis of metabolism

    OpenAIRE

    Cvijovic, M.; R. Olivares-Hernandez; Agren, R.; Dahr, N.; Vongsangnak, W.; Nookaew, I.; K. R. Patil; Nielsen, J.

    2010-01-01

    The rapid progress of molecular biology tools for directed genetic modifications, accurate quantitative experimental approaches, high-throughput measurements, together with development of genome sequencing has made the foundation for a new area of metabolic engineering that is driven by metabolic models. Systematic analysis of biological processes by means of modelling and simulations has made the identification of metabolic networks and prediction of metabolic capabilities under different co...

  19. Ensemble analysis of adaptive compressed genome sequencing strategies

    Science.gov (United States)

    2014-01-01

    Background: Acquiring genomes at single-cell resolution has many applications, such as in the study of microbiota. However, deep sequencing and assembly of all of the millions of cells in a sample is prohibitively costly. A property that can come to the rescue is that deep sequencing of every cell should not be necessary to capture all distinct genomes, as the majority of cells are biological replicates. Biologically important samples are often sparse in that sense. In this paper, we propose an adaptive compressed method, also known as distilled sensing, to capture all distinct genomes in a sparse microbial community with reduced sequencing effort. As opposed to group testing, in which the number of distinct events is often constant and sparsity is equivalent to rarity of an event, sparsity in our case means scarcity of distinct events in comparison to the data size. Previously, we introduced the problem and proposed a distilled sensing solution based on the breadth first search strategy. We simulated the whole process, which constrained our ability to study the behavior of the algorithm for the entire ensemble due to its computational intensity. Results: In this paper, we modify our previous breadth first search strategy and introduce the depth first search strategy. Instead of simulating the entire process, which is intractable for a large number of experiments, we provide a dynamic programming algorithm to analyze the behavior of the method for the entire ensemble. The ensemble analysis algorithm recursively calculates the probability of capturing every distinct genome and also the expected total sequenced nucleotides for a given population profile. Our results suggest that the expected total sequenced nucleotides grows in proportion to the log of the number of cells and linearly with the number of distinct genomes. The probability of missing a genome depends on its abundance and the ratio of its size over the maximum genome size in the sample.
The modified resource

  20. Pan-Genome Analysis of Brazilian Lineage A Amoebal Mimiviruses

    Directory of Open Access Journals (Sweden)

    Felipe L. Assis

    2015-06-01

    Full Text Available Since the recent discovery of Samba virus, the first representative of the family Mimiviridae from Brazil, prospecting for mimiviruses has been conducted in different environmental conditions in Brazil. Recently, using Acanthamoeba sp., we isolated three new mimiviruses, all of lineage A of amoebal mimiviruses: Kroon virus from urban lake water; Amazonia virus from the Brazilian Amazon river; and Oyster virus from farmed oysters. The aims of this work were to sequence and analyze the genome of these new Brazilian mimiviruses (mimi-BR) and update the analysis of the Samba virus genome. The genomes of Samba virus, Amazonia virus and Oyster virus were 97%–99% similar, whereas Kroon virus had a low similarity (90%–91%) with other mimi-BR. A total of 3877 proteins encoded by mimi-BR were grouped into 974 orthologous clusters. In addition, we identified three new ORFans in the Kroon virus genome. Additional work is needed to expand our knowledge of the diversity of mimiviruses from Brazil, including if and why among amoebal mimiviruses those of lineage A predominate in the Brazilian environment.

  1. Analysis of the core genome and pangenome of Pseudomonas putida.

    Science.gov (United States)

    Udaondo, Zulema; Molina, Lázaro; Segura, Ana; Duque, Estrella; Ramos, Juan L

    2016-10-01

    Pseudomonas putida are strict aerobes that proliferate in a range of temperate niches and are of interest for environmental applications due to their capacity to degrade pollutants and ability to promote plant growth. Furthermore solvent-tolerant strains are useful for biosynthesis of added-value chemicals. We present a comprehensive comparative analysis of nine strains and the first characterization of the Pseudomonas putida pangenome. The core genome of P. putida comprises approximately 3386 genes. The most abundant genes within the core genome are those that encode nutrient transporters. Other conserved genes include those for central carbon metabolism through the Entner-Doudoroff pathway, the pentose phosphate cycle, arginine and proline metabolism, and pathways for degradation of aromatic chemicals. Genes that encode transporters, enzymes and regulators for amino acid metabolism (synthesis and degradation) are all part of the core genome, as well as various electron transporters, which enable aerobic metabolism under different oxygen regimes. Within the core genome are 30 genes for flagella biosynthesis and 12 key genes for biofilm formation. Pseudomonas putida strains share 85% of the coding regions with Pseudomonas aeruginosa; however, in P. putida, virulence factors such as exotoxins and type III secretion systems are absent.
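
    The core genome and pangenome described in this record are, at their simplest, set operations over the gene families found in each strain: the core genome is the intersection and the pangenome is the union. The sketch below illustrates only that idea. The strain names and gene-family identifiers are hypothetical, and real analyses first have to cluster genes into orthologous families, a step not shown here and not described in the abstract.

```python
def core_and_pangenome(gene_families_by_strain):
    """Split gene families into core (present in every strain),
    pangenome (present in any strain) and accessory (pan minus core)."""
    family_sets = [set(families)
                   for families in gene_families_by_strain.values()]
    if not family_sets:
        raise ValueError("at least one strain is required")
    core = set.intersection(*family_sets)
    pan = set.union(*family_sets)
    accessory = pan - core
    return core, pan, accessory


# Hypothetical strains and gene-family IDs, for illustration only.
toy_strains = {
    "strain_1": {"famA", "famB", "famC", "famD"},
    "strain_2": {"famA", "famB", "famC", "famE"},
    "strain_3": {"famA", "famB", "famF"},
}

core, pan, accessory = core_and_pangenome(toy_strains)
print(sorted(core))
print(len(pan), len(accessory))
```

    As more strains are added, the core can only shrink or stay the same while the pangenome can only grow or stay the same, which is why figures such as the roughly 3386 core genes reported here depend on the set of strains compared.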

  2. Benchmarking undedicated cloud computing providers for analysis of genomic datasets.

    Directory of Open Access Journals (Sweden)

    Seyhan Yazar

    Full Text Available A major bottleneck in biological discovery is now emerging at the computational level. Cloud computing offers a dynamic means whereby small and medium-sized laboratories can rapidly adjust their computational capacity. We benchmarked two established cloud computing services, Amazon Web Services Elastic MapReduce (EMR) on Amazon EC2 instances and Google Compute Engine (GCE), using publicly available genomic datasets (E.coli CC102 strain and a Han Chinese male genome) and a standard bioinformatic pipeline on a Hadoop-based platform. Wall-clock time for complete assembly differed by 52.9% (95% CI: 27.5-78.2) for E.coli and 53.5% (95% CI: 34.4-72.6) for human genome, with GCE being more efficient than EMR. The cost of running this experiment on EMR and GCE differed significantly, with the costs on EMR being 257.3% (95% CI: 211.5-303.1) and 173.9% (95% CI: 134.6-213.1) more expensive for E.coli and human assemblies respectively. Thus, GCE was found to outperform EMR both in terms of cost and wall-clock time. Our findings confirm that cloud computing is an efficient and potentially cost-effective alternative for analysis of large genomic datasets. In addition to releasing our cost-effectiveness comparison, we present available ready-to-use scripts for establishing Hadoop instances with Ganglia monitoring on EC2 or GCE.

  3. Sequencing and comparative analysis of the gorilla MHC genomic sequence.

    Science.gov (United States)

    Wilming, Laurens G; Hart, Elizabeth A; Coggill, Penny C; Horton, Roger; Gilbert, James G R; Clee, Chris; Jones, Matt; Lloyd, Christine; Palmer, Sophie; Sims, Sarah; Whitehead, Siobhan; Wiley, David; Beck, Stephan; Harrow, Jennifer L

    2013-01-01

    Major histocompatibility complex (MHC) genes play a critical role in vertebrate immune response and because the MHC is linked to a significant number of auto-immune and other diseases it is of great medical interest. Here we describe the clone-based sequencing and subsequent annotation of the MHC region of the gorilla genome. Because the MHC is subject to extensive variation, both structural and sequence-wise, it is not readily amenable to study in whole genome shotgun sequence such as the recently published gorilla genome. The variation of the MHC also makes it of evolutionary interest and therefore we analyse the sequence in the context of human and chimpanzee. In our comparisons with human and re-annotated chimpanzee MHC sequence we find that gorilla has a trimodular RCCX cluster, versus the reference human bimodular cluster, and additional copies of Class I (pseudo)genes between Gogo-K and Gogo-A (the orthologues of HLA-K and -A). We also find that Gogo-H (and Patr-H) is coding versus the HLA-H pseudogene and, conversely, there is a Gogo-DQB2 pseudogene versus the HLA-DQB2 coding gene. Our analysis, which is freely available through the VEGA genome browser, provides the research community with a comprehensive dataset for comparative and evolutionary research of the MHC.

  4. Benchmarking undedicated cloud computing providers for analysis of genomic datasets.

    Science.gov (United States)

    Yazar, Seyhan; Gooden, George E C; Mackey, David A; Hewitt, Alex W

    2014-01-01

    A major bottleneck in biological discovery is now emerging at the computational level. Cloud computing offers a dynamic means whereby small and medium-sized laboratories can rapidly adjust their computational capacity. We benchmarked two established cloud computing services, Amazon Web Services Elastic MapReduce (EMR) on Amazon EC2 instances and Google Compute Engine (GCE), using publicly available genomic datasets (E.coli CC102 strain and a Han Chinese male genome) and a standard bioinformatic pipeline on a Hadoop-based platform. Wall-clock time for complete assembly differed by 52.9% (95% CI: 27.5-78.2) for E.coli and 53.5% (95% CI: 34.4-72.6) for human genome, with GCE being more efficient than EMR. The cost of running this experiment on EMR and GCE differed significantly, with the costs on EMR being 257.3% (95% CI: 211.5-303.1) and 173.9% (95% CI: 134.6-213.1) more expensive for E.coli and human assemblies respectively. Thus, GCE was found to outperform EMR both in terms of cost and wall-clock time. Our findings confirm that cloud computing is an efficient and potentially cost-effective alternative for analysis of large genomic datasets. In addition to releasing our cost-effectiveness comparison, we present available ready-to-use scripts for establishing Hadoop instances with Ganglia monitoring on EC2 or GCE.

  5. A GeneTrek analysis of the maize genome.

    Science.gov (United States)

    Liu, Renyi; Vitte, Clémentine; Ma, Jianxin; Mahama, A Assibi; Dhliwayo, Thanda; Lee, Michael; Bennetzen, Jeffrey L

    2007-07-10

    Analysis of the sequences of 74 randomly selected BACs demonstrated that the maize nuclear genome contains approximately 37,000 candidate genes with homologues in other plant species. An additional approximately 5,500 predicted genes are severely truncated and probably pseudogenes. The distribution of genes is uneven, with approximately 30% of BACs containing no genes. BAC gene density varies from 0 to 7.9 per 100 kb, whereas most gene islands contain only one gene. The average number of genes per gene island is 1.7. Only 72% of these genes show collinearity with the rice genome. Particular LTR retrotransposon families (e.g., Gyma) are enriched on gene-free BACs, most of which do not come from pericentromeres or other large heterochromatic regions. Gene-containing BACs are relatively enriched in different families of LTR retrotransposons (e.g., Ji). Two major bursts of LTR retrotransposon activity in the last 2 million years are responsible for the large size of the maize genome, but only the more recent of these is well represented in gene-containing BACs, suggesting that LTR retrotransposons are more efficiently removed in these domains. The results demonstrate that sample sequencing and careful annotation of a few randomly selected BACs can provide a robust description of a complex plant genome.

  6. The sequence and analysis of a Chinese pig genome

    Directory of Open Access Journals (Sweden)

    Fang Xiaodong

    2012-11-01

    Full Text Available Abstract Background: The pig is an economically important food source, amounting to approximately 40% of all meat consumed worldwide. Pigs also serve as an important model organism because of their similarity to humans at the anatomical, physiological and genetic level, making them very useful for studying a variety of human diseases. A pig strain of particular interest is the miniature pig, specifically the Wuzhishan pig (WZSP), as it has been extensively inbred. Its high level of homozygosity offers increased ease for selective breeding for specific traits and a more straightforward understanding of the genetic changes that underlie its biological characteristics. WZSP also serves as a promising means for applications in surgery, tissue engineering, and xenotransplantation. Here, we report the sequencing and analysis of an inbreeding WZSP genome. Results: Our results reveal some unique genomic features, including a relatively high level of homozygosity in the diploid genome, an unusual distribution of heterozygosity, an over-representation of tRNA-derived transposable elements, a small amount of porcine endogenous retrovirus, and a lack of type C retroviruses. In addition, we carried out systematic research on gene evolution, together with a detailed investigation of the counterparts of human drug target genes. Conclusion: Our results provide the opportunity to more clearly define the genomic character of pig, which could enhance our ability to create more useful pig models.

  7. Genomic and gene expression signature of the pre-invasive testicular carcinoma in situ

    DEFF Research Database (Denmark)

    Almstrup, Kristian; Ottesen, Anne Marie; Sonne, Si Brask

    2005-01-01

    on the pre-invasive CIS and its possible fetal origin by reviewing recent data originating from DNA microarrays and comparative genomic hybridisations. A comparison of gene expression and genomic aberrations reveal chromosomal "hot spots" with mutual clustering of gene expression and genomic amplification...

  8. Survey sequencing and comparative analysis of the elephant shark (Callorhinchus milii) genome.

    Directory of Open Access Journals (Sweden)

    Byrappa Venkatesh

    2007-04-01

    Full Text Available Owing to their phylogenetic position, cartilaginous fishes (sharks, rays, skates, and chimaeras) provide a critical reference for our understanding of vertebrate genome evolution. The relatively small genome of the elephant shark, Callorhinchus milii, a chimaera, makes it an attractive model cartilaginous fish genome for whole-genome sequencing and comparative analysis. Here, the authors describe survey sequencing (1.4x coverage) and comparative analysis of the elephant shark genome, one of the first cartilaginous fish genomes to be sequenced to this depth. Repetitive sequences, represented mainly by a novel family of short interspersed element-like and long interspersed element-like sequences, account for about 28% of the elephant shark genome. Fragments of approximately 15,000 elephant shark genes reveal specific examples of genes that have been lost differentially during the evolution of tetrapod and teleost fish lineages. Interestingly, the degree of conserved synteny and conserved sequences between the human and elephant shark genomes is higher than that between human and teleost fish genomes. The elephant shark contains four putative Hox clusters, indicating that, unlike teleost fish genomes, the elephant shark genome has not experienced an additional whole-genome duplication. These findings underscore the importance of the elephant shark as a critical reference vertebrate genome for comparative analysis of the human and other vertebrate genomes. This study also demonstrates that a survey-sequencing approach can be applied productively for comparative analysis of distantly related vertebrate genomes.

  9. Survey Sequencing and Comparative Analysis of the Elephant Shark (Callorhinchus milii) Genome

    Science.gov (United States)

    Venkatesh, Byrappa; Kirkness, Ewen F; Loh, Yong-Hwee; Halpern, Aaron L; Lee, Alison P; Johnson, Justin; Dandona, Nidhi; Viswanathan, Lakshmi D; Tay, Alice; Venter, J. Craig; Strausberg, Robert L; Brenner, Sydney

    2007-01-01

    Owing to their phylogenetic position, cartilaginous fishes (sharks, rays, skates, and chimaeras) provide a critical reference for our understanding of vertebrate genome evolution. The relatively small genome of the elephant shark, Callorhinchus milii, a chimaera, makes it an attractive model cartilaginous fish genome for whole-genome sequencing and comparative analysis. Here, the authors describe survey sequencing (1.4× coverage) and comparative analysis of the elephant shark genome, one of the first cartilaginous fish genomes to be sequenced to this depth. Repetitive sequences, represented mainly by a novel family of short interspersed element–like and long interspersed element–like sequences, account for about 28% of the elephant shark genome. Fragments of approximately 15,000 elephant shark genes reveal specific examples of genes that have been lost differentially during the evolution of tetrapod and teleost fish lineages. Interestingly, the degree of conserved synteny and conserved sequences between the human and elephant shark genomes is higher than that between human and teleost fish genomes. The elephant shark contains four putative Hox clusters, indicating that, unlike teleost fish genomes, the elephant shark genome has not experienced an additional whole-genome duplication. These findings underscore the importance of the elephant shark as a critical reference vertebrate genome for comparative analysis of the human and other vertebrate genomes. This study also demonstrates that a survey-sequencing approach can be applied productively for comparative analysis of distantly related vertebrate genomes. PMID:17407382

  10. Structural characterization of genomes by large scale sequence-structure threading: application of reliability analysis in structural genomics

    Directory of Open Access Journals (Sweden)

    Brunham Robert C

    2004-07-01

    Full Text Available Abstract Background: We establish that the occurrence of protein folds among genomes can be accurately described with a Weibull function. Systems which exhibit Weibull character can be interpreted with reliability theory commonly used in engineering analysis. For instance, Weibull distributions are widely used in reliability, maintainability and safety work to model time-to-failure of mechanical devices, mechanisms, building constructions and equipment. Results: We have found that the Weibull function describes protein fold distribution within and among genomes more accurately than conventional power functions which have been used in a number of structural genomic studies reported to date. It has also been found that the Weibull reliability parameter β for protein fold distributions varies between genomes and may reflect differences in rates of gene duplication in evolutionary history of organisms. Conclusions: The results of this work demonstrate that reliability analysis can provide useful insights and testable predictions in the fields of comparative and structural genomics.
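For readers unfamiliar with the Weibull function mentioned in this record, the sketch below evaluates the standard two-parameter Weibull reliability (survival) function, R(x) = exp(-(x/η)^β), using only the Python standard library. This is the textbook form from reliability theory; the exact parameterisation and fitting procedure used in the study are not given in the abstract, so the function name `weibull_reliability`, the scale symbol η (`eta`), and the sample values are assumptions for illustration only.

```python
import math


def weibull_reliability(x, beta, eta):
    """Two-parameter Weibull reliability function R(x) = exp(-(x / eta) ** beta).

    beta: shape parameter (> 0); eta: scale parameter (> 0).
    Returns the probability that the modelled quantity exceeds x.
    """
    if beta <= 0 or eta <= 0:
        raise ValueError("beta and eta must be positive")
    if x <= 0:
        return 1.0
    return math.exp(-((x / eta) ** beta))


# Hypothetical values: compare how the shape parameter changes the curve
# at the same scale. With beta = 1 the Weibull reduces to an exponential.
for beta in (0.5, 1.0, 2.0):
    values = [round(weibull_reliability(x, beta, eta=10.0), 3) for x in (1, 10, 50)]
    print(f"beta={beta}: R(1), R(10), R(50) = {values}")
```

Whatever the shape parameter, R(η) equals exp(-1), about 0.368, which is why η is often called the characteristic value; the shape parameter β controls how quickly the curve falls away on either side of it.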

  11. Comparative analysis of Acinetobacters: three genomes for three lifestyles.

    Directory of Open Access Journals (Sweden)

    David Vallenet

    Full Text Available Acinetobacter baumannii is the source of numerous nosocomial infections in humans and therefore deserves close attention as multidrug or even pandrug resistant strains are increasingly being identified worldwide. Here we report the comparison of two newly sequenced genomes of A. baumannii. The human isolate A. baumannii AYE is multidrug resistant whereas strain SDF, which was isolated from body lice, is antibiotic susceptible. As reference for comparison in this analysis, the genome of the soil-living bacterium A. baylyi strain ADP1 was used. The most interesting dissimilarities we observed were that (i) whereas strain AYE and A. baylyi genomes harbored very few Insertion Sequence elements which could promote expression of downstream genes, strain SDF sequence contains several hundred of them that have played a crucial role in its genome reduction (gene disruptions and simple DNA loss); (ii) strain SDF has low catabolic capacities compared to strain AYE. Interestingly, the latter has even higher catabolic capacities than A. baylyi which has already been reported as a very nutritionally versatile organism. This metabolic performance could explain the persistence of A. baumannii nosocomial strains in environments where nutrients are scarce; (iii) several processes known to play a key role during host infection (biofilm formation, iron uptake, quorum sensing, virulence factors) were either different or absent, the best example of which is iron uptake. Indeed, strain AYE and A. baylyi use siderophore-based systems to scavenge iron from the environment whereas strain SDF uses an alternate system similar to the Haem Acquisition System (HAS). Taken together, all these observations suggest that the genome contents of the 3 Acinetobacters compared are partly shaped by life in distinct ecological niches: human (and more largely hospital environment), louse, soil.

  12. Genome sequencing and analysis of BCG vaccine strains.

    Directory of Open Access Journals (Sweden)

    Wen Zhang

    Full Text Available BACKGROUND: Although the Bacillus Calmette-Guérin (BCG) vaccine against tuberculosis (TB) has been available for more than 75 years, one third of the world's population is still infected with Mycobacterium tuberculosis and approximately 2 million people die of TB every year. To reduce this immense TB burden, a clearer understanding of the functional genes underlying the action of BCG and the development of new vaccines are urgently needed. METHODS AND FINDINGS: Comparative genomic analysis of 19 M. tuberculosis complex strains showed that BCG strains underwent repeated human manipulation, had higher region of deletion rates than those of natural M. tuberculosis strains, and lost several essential components such as T-cell epitopes. A total of 188 BCG strain T-cell epitopes were lost to various degrees. The non-virulent BCG Tokyo strain, which has the largest number of T-cell epitopes (359), lost 124. Here we propose that BCG strain protection variability results from different epitopes. This study is the first to present BCG as a model organism for genetics research. BCG strains have a very well-documented history and now detailed genome information. Genome comparison revealed the selection process of BCG strains under human manipulation (1908-1966). CONCLUSIONS: Our results revealed the cause of BCG vaccine strain protection variability at the genome level and supported the hypothesis that the restoration of lost BCG Tokyo epitopes is a useful future vaccine development strategy. Furthermore, these detailed BCG vaccine genome investigation results will be useful in microbial genetics, microbial engineering and other research fields.

  13. Clinical pertinence metric enables hypothesis-independent genome-phenome analysis for neurologic diagnosis.

    Science.gov (United States)

    Segal, Michael M; Abdellateef, Mostafa; El-Hattab, Ayman W; Hilbush, Brian S; De La Vega, Francisco M; Tromp, Gerard; Williams, Marc S; Betensky, Rebecca A; Gleeson, Joseph

    2015-06-01

    We describe an "integrated genome-phenome analysis" that combines both genomic sequence data and clinical information for genomic diagnosis. It is novel in that it uses robust diagnostic decision support and combines the clinical differential diagnosis and the genomic variants using a "pertinence" metric. This allows the analysis to be hypothesis-independent, not requiring assumptions about mode of inheritance, number of genes involved, or which clinical findings are most relevant. Using 20 genomic trios with neurologic disease, we find that pertinence scores averaging 99.9% identify the causative variant under conditions in which a genomic trio is analyzed and family-aware variant calling is done. The analysis takes seconds, and pertinence scores can be improved by clinicians adding more findings. The core conclusion is that automated genome-phenome analysis can be accurate, rapid, and efficient. We also conclude that an automated process offers a methodology for quality improvement of many components of genomic analysis.

  14. YersiniaBase: a genomic resource and analysis platform for comparative analysis of Yersinia.

    Science.gov (United States)

    Tan, Shi Yang; Dutta, Avirup; Jakubovics, Nicholas S; Ang, Mia Yang; Siow, Cheuk Chuen; Mutha, Naresh Vr; Heydari, Hamed; Wee, Wei Yee; Wong, Guat Jah; Choo, Siew Woh

    2015-01-16

    Yersinia is a genus of Gram-negative bacteria that includes serious pathogens such as Yersinia pestis, which causes plague, Yersinia pseudotuberculosis, and Yersinia enterocolitica. The remaining species are generally considered non-pathogenic to humans, although there is evidence that at least some of these species can cause occasional infections using distinct mechanisms from the more pathogenic species. With the advances in sequencing technologies, many genomes of Yersinia have been sequenced. However, there is currently no specialized platform to hold the rapidly-growing Yersinia genomic data and to provide analysis tools particularly for comparative analyses, which are required to provide improved insights into their biology, evolution and pathogenicity. To facilitate the ongoing and future research of Yersinia, especially those generally considered non-pathogenic species, a well-defined repository and analysis platform is needed to hold the Yersinia genomic data and analysis tools for the Yersinia research community. Hence, we have developed YersiniaBase, a robust and user-friendly Yersinia resource and analysis platform for the analysis of Yersinia genomic data. YersiniaBase has a total of twelve species and 232 genome sequences, of which the majority are Yersinia pestis. In order to smooth the process of searching genomic data in a large database, we implemented an Asynchronous JavaScript and XML (AJAX)-based real-time searching system in YersiniaBase. Besides incorporating existing tools, which include JavaScript-based genome browser (JBrowse) and Basic Local Alignment Search Tool (BLAST), YersiniaBase also has in-house developed tools: (1) Pairwise Genome Comparison tool (PGC) for comparing two user-selected genomes; (2) Pathogenomics Profiling Tool (PathoProT) for comparative pathogenomics analysis of Yersinia genomes; (3) YersiniaTree for constructing a phylogenetic tree of Yersinia. We ran analyses based on the tools and genomic data in YersiniaBase and the

  15. Genomic insight into the common carp (Cyprinus carpio) genome by sequencing analysis of BAC-end sequences

    Directory of Open Access Journals (Sweden)

    Wang Jintu

    2011-04-01

    Full Text Available Abstract Background: Common carp is one of the most important aquaculture teleost fish in the world. Common carp and other closely related Cyprinidae species provide over 30% of aquaculture production in the world. However, common carp genomic resources are still relatively underdeveloped. BAC end sequences (BES) are important resources for genome research on BAC-anchored genetic marker development, linkage map and physical map integration, and whole genome sequence assembling and scaffolding. Result: To develop such valuable resources in common carp (Cyprinus carpio), a total of 40,224 BAC clones were sequenced on both ends, generating 65,720 clean BES with an average read length of 647 bp after sequence processing, representing 42,522,168 bp or 2.5% of common carp genome. The first survey of common carp genome was conducted with various bioinformatics tools. The common carp genome contains over 17.3% of repetitive elements with GC content of 36.8% and 518 transposon ORFs. To identify and develop BAC-anchored microsatellite markers, a total of 13,581 microsatellites were detected from 10,355 BES. The coding regions of 7,127 genes were recognized from 9,443 BES on 7,453 BACs, with 1,990 BACs having genes on both ends. To evaluate the similarity to the genome of closely related zebrafish, BES of common carp were aligned against zebrafish genome. A total of 39,335 BES of common carp have conserved homologs on zebrafish genome, which demonstrated the high similarity between zebrafish and common carp genomes, indicating the feasibility of comparative mapping between zebrafish and common carp once we have physical map of common carp. Conclusion: BAC end sequences are great resources for the first genome wide survey of common carp. The repetitive DNA was estimated to be approximately 28% of common carp genome, indicating the higher complexity of the genome. Comparative analysis had mapped around 40,000 BES to zebrafish genome and established over 3
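The headline figures in this record can be cross-checked with a few lines of arithmetic. The sketch below, in Python, uses only numbers quoted in the abstract; the variable names are illustrative, and the implied genome size is a back-of-envelope inference from the rounded "2.5%" figure, not a value reported in the abstract.

```python
# Figures quoted in the abstract.
n_bes = 65_720            # clean BAC end sequences
total_bp = 42_522_168     # total bases in those BES
genome_fraction = 0.025   # "2.5% of common carp genome" (rounded in the source)
n_microsatellites = 13_581
n_bes_with_microsatellites = 10_355

# Mean read length should agree with the quoted average of 647 bp.
mean_read_length = total_bp / n_bes

# Genome size implied by the sampled fraction (approximate, because 2.5% is rounded).
implied_genome_size_bp = total_bp / genome_fraction

# Average number of microsatellites among BES that contain at least one.
microsatellites_per_bes = n_microsatellites / n_bes_with_microsatellites

print(f"mean read length: {mean_read_length:.1f} bp")
print(f"implied genome size: {implied_genome_size_bp / 1e9:.2f} Gb")
print(f"microsatellites per positive BES: {microsatellites_per_bes:.2f}")
```

The computed mean read length matches the reported 647 bp, which suggests the BES count and total base count in the abstract are mutually consistent.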

  16. SmashCell: A software framework for the analysis of single-cell amplified genome sequences

    DEFF Research Database (Denmark)

    Harrington, Eoghan D; Arumugam, Manimozhiyan; Raes, Jeroen;

    2010-01-01

    SUMMARY: Recent advances in single-cell manipulation technology, whole genome amplification and high-throughput sequencing have now made it possible to sequence the genome of an individual cell. The bioinformatic analysis of these genomes however is far more complicated than the analysis of those...

  17. St2-80: a new FISH marker for St genome and genome analysis in Triticeae.

    Science.gov (United States)

    Wang, Long; Shi, Qinghua; Su, Handong; Wang, Yi; Sha, Lina; Fan, Xing; Kang, Houyang; Zhang, Haiqin; Zhou, Yonghong

    2017-07-01

    The St genome is one of the most fundamental genomes in Triticeae. Repetitive sequences are widely used to distinguish different genomes or species. The primary objectives of this study were to (i) screen a new sequence that could easily distinguish the chromosome of the St genome from those of other genomes by fluorescence in situ hybridization (FISH) and (ii) investigate the genome constitution of some species that remain uncertain and controversial. We used degenerated oligonucleotide primer PCR (Dop-PCR), Dot-blot, and FISH to screen for a new marker of the St genome and to test the efficiency of this marker in the detection of the St chromosome at different ploidy levels. Signals produced by a new FISH marker (denoted St2-80) were present on the entire arm of chromosomes of the St genome, except in the centromeric region. On the contrary, St2-80 signals were present in the terminal region of chromosomes of the E, H, P, and Y genomes. No signal was detected in the A and B genomes, and only weak signals were detected in the terminal region of chromosomes of the D genome. St2-80 signals were obvious and stable in chromosomes of different genomes, whether diploid or polyploid. Therefore, St2-80 is a potential and useful FISH marker that can be used to distinguish the St genome from those of other genomes in Triticeae.

  18. Whole genome microarray analysis, from neonatal blood cards

    Directory of Open Access Journals (Sweden)

    Hogan Michael E

    2009-07-01

    Full Text Available Abstract Background: Neonatal blood, obtained from a heel stick and stored dry on paper cards, has been the standard for birth defects screening for 50 years. Such dried blood samples are used, primarily, for analysis of small-molecule analytes. More recently, the DNA complement of such dried blood cards has been used for targeted genetic testing, such as for single nucleotide polymorphism in cystic fibrosis. Expansion of such testing to include polygenic traits, and perhaps whole genome scanning, has been discussed as a formal possibility. However, until now the amount of DNA that might be obtained from such dried blood cards has been limiting, due to inefficient DNA recovery technology. Results: A new technology is employed for efficient DNA release from a standard neonatal blood card. Using standard Guthrie cards, stored an average of ten years post-collection, about 1/40th of the air-dried neonatal blood specimen (two 3 mm punches) was processed to obtain DNA that was sufficient in mass and quality for direct use in microarray-based whole genome scanning. Using that same DNA release technology, it is also shown that approximately 1/250th of the original purified DNA (about 1 ng) could be subjected to whole genome amplification, thus yielding an additional microgram of amplified DNA product. That amplified DNA product was then used in microarray analysis and yielded statistical concordance of 99% or greater to the primary, unamplified DNA sample. Conclusion: Together, these data suggest that DNA obtained from less than 10% of a standard neonatal blood specimen, stored dry for several years on a Guthrie card, can support a program of genome-wide neonatal genetic testing.

  19. Comparative genomic in situ hybridization analysis on the ...

    African Journals Online (AJOL)

    AJL

    2012-04-10

    Apr 10, 2012 ... different parents/ancestors/genomes in hybrid plants to be distinguished ... sequences in common between the two species. Therefore, cGISH ... genomic organization and genome evolution in plants. (Zoller et al., 2001).

  20. Comparative Genome Analysis Provides Insights into the Pathogenicity of Flavobacterium psychrophilum

    DEFF Research Database (Denmark)

    Castillo, Daniel; Christiansen, Rói Hammershaimb; Dalsgaard, Inger;

    2016-01-01

    . psychrophilum could hold at least 3373 genes, while the core genome contained 1743 genes. On average, 67 new genes were detected for every new genome added to the analysis, indicating that F. psychrophilum possesses an open pan genome. The putative virulence factors were equally distributed among isolates......, independent of geographic location, year of isolation and source of isolates. Only one prophage-related sequence was found which corresponded to the previously described prophage 6H, and appeared in 5 out of 11 isolates. CRISPR array analysis revealed two different loci with dissimilar spacer content, which...... to describe the F. psychrophilum pan-genome and to examine virulence factors, prophages, CRISPR arrays, and genomic islands present in the genomes. Analysis of the genomic DNA sequences were complemented with selected phenotypic characteristics of the strains. The pan genome analysis showed that F...

  1. FC vehicle hybridisation: an affordable solution for an energy-efficient FC powered drive train

    Science.gov (United States)

    Pede, G.; Iacobazzi, A.; Passerini, S.; Bobbio, A.; Botto, G.

    Fuel cells (FCs) have potential as clean and efficient energy sources for automotive applications without sacrifice in performance or driving range. However, the complete FC system must operate as efficiently as possible over the range of driving conditions that may be encountered while maintaining a low cost. To achieve this target, a storage unit can be introduced in the FC system to reduce the size of the fuel cell, which is the most expensive component. This "hybrid" concept would not only reduce the total cost of the drive train but also allow the recovery of braking energy and operation at the voltage-current point of maximum efficiency for the FC system. The pros and cons of the "full-power" versus the "hybrid" configuration are shown in this work. The "Hybridisation rate" or "Hybridisation degree", a parameter expressed by the relationship between two installed powers, the generation power and the traction power, is also introduced, and it is demonstrated that for each category of hybrid vehicles there is an optimal value of hybridisation degree. The storage systems considered are based on high power batteries or ultra capacitors (UCs) or a combination of them. A preliminary design of a sport utility vehicle (SUV) using a combined storage system and a FC energy source (called Triple Hybrid) is proposed. Finally, the experience of the Italian industry in this field is also reviewed.

  2. On hybridising lettuce seedlings with nanoparticles and the resultant effects on the organisms' electrical characteristics.

    Science.gov (United States)

    Gizzie, Nina; Mayne, Richard; Patton, David; Kendrick, Paul; Adamatzky, Andrew

    2016-09-01

    Lettuce seedlings are attracting interest in the computing world due to their capacity to become hybrid circuit components, more specifically, in the creation of living 'wires'. Previous studies have shown that seedlings can be hybridised with gold nanoparticles and withstand mild electrical currents. In this study, lettuce seedlings were hybridised with a variety of metallic and non-metallic nanomaterials: carbon nanotubes, graphene oxide, aluminium oxide and calcium phosphate. Toxic effects and the following electrical properties were monitored: mean potential, resistance and capacitance. Macroscopic observations revealed only slight deleterious health effects after administration of one variety of particle, aluminium oxide. Mean potential in calcium phosphate-hybridised seedlings showed a considerable increase when compared with the control, whereas those administered graphene oxide showed a small decrease; there were no notable variations across the remaining treatments. Electrical resistance decreased substantially in graphene oxide-treated seedlings, whereas slight increases were shown following calcium phosphate and carbon nanotube applications. Capacitance showed no considerable variation across treated seedlings. These results demonstrate that some nanomaterials, specifically graphene oxide and calcium phosphate, may be suitable for biohybridisation purposes, including the generation of living 'wires'.

  3. Integrative Genomics with Mediation Analysis in a Survival Context

    Directory of Open Access Journals (Sweden)

    Szilárd Nemes

    2013-01-01

    DNA copy number aberrations (DCNA) and subsequent altered gene expression profiles may have a major impact on tumor initiation, on development, and eventually on recurrence and cancer-specific mortality. However, most methods employed in integrative genomic analysis of the two biological levels, DNA and RNA, do not consider survival time. In the present note, we propose the adoption of a survival analysis-based framework for the integrative analysis of DCNA and mRNA levels to reveal their implication on patient clinical outcome with the prerequisite that the effect of DCNA on survival is mediated by mRNA levels. The specific aim of the paper is to offer a feasible framework to test the DCNA-mRNA-survival pathway. We provide statistical inference algorithms for mediation based on asymptotic results. Furthermore, we illustrate the applicability of the method in an integrative genomic analysis setting by using a breast cancer data set consisting of 141 invasive breast tumors. In addition, we provide implementation in R.

  4. Integrated genomic analysis of survival outliers in glioblastoma.

    Science.gov (United States)

    Peng, Sen; Dhruv, Harshil; Armstrong, Brock; Salhia, Bodour; Legendre, Christophe; Kiefer, Jeffrey; Parks, Julianna; Virk, Selene; Sloan, Andrew E; Ostrom, Quinn T; Barnholtz-Sloan, Jill S; Tran, Nhan L; Berens, Michael E

    2017-06-01

    To elucidate molecular features associated with disproportionate survival of glioblastoma (GB) patients, we conducted deep genomic comparative analysis of a cohort of patients receiving standard therapy (surgery plus concurrent radiation and temozolomide); "GB outliers" were identified: long-term survivor of 33 months (LTS; n = 8) versus short-term survivor of 7 months (STS; n = 10). We implemented exome, RNA, whole genome sequencing, and DNA methylation for collection of deep genomic data from STS and LTS GB patients. LTS GB showed frequent chromosomal gains in 4q12 (platelet derived growth factor receptor alpha and KIT) and 12q14.1 (cyclin-dependent kinase 4), and deletion in 19q13.33 (BAX, branched chain amino-acid transaminase 2, and cluster of differentiation 33). STS GB showed frequent deletion in 9p11.2 (forkhead box D4-like 2 and aquaporin 7 pseudogene 3) and 22q11.21 (Hypermethylated In Cancer 2). LTS GB showed 2-fold more frequent copy number deletions compared with STS GB. Gene expression differences showed the STS cohort with altered transcriptional regulators: activation of signal transducer and activator of transcription (STAT)5a/b, nuclear factor-kappaB (NF-κB), and interferon-gamma (IFNG), and inhibition of mitogen-activated protein kinase (MAPK1), extracellular signal-regulated kinase (ERK)1/2, and estrogen receptor (ESR)1. Expression-based biological concepts prominent in the STS cohort include metabolic processes, anaphase-promoting complex degradation, and immune processes associated with major histocompatibility complex class I antigen presentation; the LTS cohort features genes related to development, morphogenesis, and the mammalian target of rapamycin signaling pathway. Whole genome methylation analyses showed that a methylation signature of 89 probes distinctly separates LTS from STS GB tumors. We posit that genomic instability is associated with longer survival of GB (possibly with vulnerability to standard therapy); conversely, genomic

  5. Analysis of the complete Fischoederius elongatus (Paramphistomidae, Trematoda) mitochondrial genome.

    Science.gov (United States)

    Yang, Xin; Zhao, Yunyang; Wang, Lixia; Feng, Hanli; Tan, Li; Lei, Weiqiang; Zhao, Pengfei; Hu, Min; Fang, Rui

    2015-05-20

    Fischoederius elongatus is an important paramphistome trematode of ruminants. Animals infected with F. elongatus often show no obvious symptoms, so infection is easily overlooked. However, it can cause severe economic losses to the breeding industry. Knowledge of the mitochondrial genome of F. elongatus can be used for phylogenetic and epidemiological studies. The complete mt genome sequence of F. elongatus is 14,120 bp in length and contains 12 protein-coding genes, 22 tRNA genes, two rRNA genes and two non-coding regions (LNR and SNR). The gene arrangement of F. elongatus is the same as that of other trematodes, such as Fasciola hepatica and Paramphistomum cervi. Phylogenetic analyses of the concatenated amino acid sequences of the 12 protein-coding genes by the maximum-likelihood and neighbor-joining methods showed that F. elongatus is closely related to P. cervi. The complete mt genome sequence of F. elongatus should provide information for phylogenetic and epidemiological studies of F. elongatus and the family Paramphistomidae.

  6. Comparative Genomic Analysis of Meningitis- and Bacteremia-Causing Pneumococci Identifies a Common Core Genome.

    Science.gov (United States)

    Kulohoma, Benard W; Cornick, Jennifer E; Chaguza, Chrispin; Yalcin, Feyruz; Harris, Simon R; Gray, Katherine J; Kiran, Anmol M; Molyneux, Elizabeth; French, Neil; Parkhill, Julian; Faragher, Brian E; Everett, Dean B; Bentley, Stephen D; Heyderman, Robert S

    2015-10-01

    Streptococcus pneumoniae is a nasopharyngeal commensal that occasionally invades normally sterile sites to cause bloodstream infection and meningitis. Although the pneumococcal population structure and evolutionary genetics are well defined, it is not clear whether pneumococci that cause meningitis are genetically distinct from those that do not. Here, we used whole-genome sequencing of 140 isolates of S. pneumoniae recovered from bloodstream infection (n = 70) and meningitis (n = 70) to compare their genetic contents. By fitting a double-exponential decaying-function model, we show that these isolates share a core of 1,427 genes (95% confidence interval [CI], 1,425 to 1,435 genes) and that there is no difference in the core genome or accessory gene content from these disease manifestations. Gene presence/absence alone therefore does not explain the virulence behavior of pneumococci that reach the meninges. Our analysis, however, supports the requirement of a range of previously described virulence factors and vaccine candidates for both meningitis- and bacteremia-causing pneumococci. This high-resolution view suggests that, despite considerable competency for genetic exchange, all pneumococci are under considerable pressure to retain key components advantageous for colonization and transmission and that these components are essential for access to and survival in sterile sites.
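    The core/accessory distinction used in the record above can be illustrated with a minimal sketch. It assumes each isolate is already reduced to a set of gene identifiers; the isolate and gene names are hypothetical, and the sketch does not reproduce the double-exponential model the authors fitted to estimate core size.

```python
def core_and_accessory(gene_sets):
    """Split genes into core (present in every isolate) and accessory (all others).

    gene_sets maps a hypothetical isolate name to the set of genes found in it.
    """
    sets = [set(genes) for genes in gene_sets.values()]
    core = set.intersection(*sets)   # genes shared by all isolates
    pan = set.union(*sets)           # every gene seen in any isolate
    return core, pan - core


# Toy data: names are illustrative only, not taken from the study.
isolates = {
    "blood_1": {"geneA", "geneB", "geneC"},
    "blood_2": {"geneA", "geneB", "geneD"},
    "csf_1": {"geneA", "geneB", "geneC", "geneE"},
}
core, accessory = core_and_accessory(isolates)
print(sorted(core), sorted(accessory))
```

    In a real analysis the gene sets would come from an orthologue-clustering step, which is outside the scope of this sketch.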

  7. Comparative genome analysis of Bacillus cereus group genomes with Bacillus subtilis

    OpenAIRE

    Anderson, Iain; Sorokin, Alexei; Kapatral, Vinayak; Reznik, Gary; Bhattacharya, Anamitra; Mikhailova, Natalia; Burd, Henry; Joukov, Victor; Kaznadzey, Denis; Walunas, Theresa; D'Souza, Mark; Larsen, Niels; Pusch, Gordon; Liolios, Konstantinos; Grechkin, Yuri

    2005-01-01

    Genome features of the Bacillus cereus group genomes (representative strains of Bacillus cereus, Bacillus anthracis and Bacillus thuringiensis subsp. israelensis) were analyzed and compared with the Bacillus subtilis genome. A core set of 1,381 protein families among the four Bacillus genomes, with an additional set of 933 families common to the B. cereus group, was identified. Differences in signal transduction pathways, membrane transporters, cell surface structures, cell wall, and S-...

  8. Evolutionary insights from suffix array-based genome sequence analysis

    Indian Academy of Sciences (India)

    Anindya Poddar; Nagasuma Chandra; Madhavi Ganapathiraju; K Sekar; Judith Klein-Seetharaman; Raj Reddy; N Balakrishnan

    2007-08-01

    Gene and protein sequence analyses, central components of studies in modern biology, are easily amenable to string matching and pattern recognition algorithms. The growing need to analyse whole genome sequences more efficiently and thoroughly has led to the emergence of new computational methods. Suffix trees and suffix arrays are data structures that are well known in many other areas and are highly suited to sequence analysis too. Here we report an improvement to the design of construction of suffix arrays. Enhancement in versatility and scalability, enabled by this approach, is demonstrated through the use of real-life examples. The scalability of the algorithm to whole genomes renders it suitable to address many biologically interesting problems. One example is the evolutionary insight gained by analysing unigrams, bi-grams and higher n-grams, indicating that the genetic code has a direct influence on the overall composition of the genome. Further, different proteomes have been analysed for their coverage of the possible peptide space, which indicates that as much as a quarter of the total space at the tetra-peptide level is left un-sampled in prokaryotic organisms, although almost all tri-peptides can be seen in one protein or another in a proteome. In addition, distinct patterns begin to emerge for the counts of particular tetra- and higher peptides, indicative of a ‘meaning’ for tetra- and higher n-grams. The toolkit has also been used to demonstrate the usefulness of identifying repeats in whole proteomes efficiently. As an example, 16 members of one COG, coded by the genome of Mycobacterium tuberculosis H37Rv, have been found to contain a repeating sequence of 300 amino acids.
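    To make the data structure in the record above concrete, the following is a minimal sketch of a suffix array and of repeat detection with it. It uses a naive sort-based construction chosen for readability; it is not the improved construction the authors report, and it would not scale to whole genomes.

```python
def suffix_array(text):
    """Return the start positions of all suffixes of text in lexicographic order.

    Naive construction: sorting full suffixes is simple but slow for long inputs.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def longest_repeat(text):
    """Return the longest substring that occurs at least twice in text.

    Repeated substrings appear as shared prefixes of suffixes that are
    adjacent in the suffix array, so only neighbouring pairs are compared.
    """
    sa = suffix_array(text)
    best = ""
    for a, b in zip(sa, sa[1:]):
        n = 0
        while a + n < len(text) and b + n < len(text) and text[a + n] == text[b + n]:
            n += 1
        if n > len(best):
            best = text[a:a + n]
    return best


print(suffix_array("banana"))
print(longest_repeat("banana"))
```

    The same adjacency property is what makes suffix arrays convenient for counting n-grams: all occurrences of a given n-gram occupy one contiguous block of the array.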

  9. Statistical analysis of simple repeats in the human genome

    Science.gov (United States)

    Piazza, F.; Liò, P.

    2005-03-01

    The human genome contains repetitive DNA at different levels of sequence length, number and dispersion. Highly repetitive DNA is particularly rich in homo- and di-nucleotide repeats, while middle repetitive DNA is rich in families of interspersed, mobile elements hundreds of base pairs (bp) long, including the Alu families. A link between homo- and di-polymeric tracts and mobile elements has been recently highlighted. In particular, the mobility of Alu repeats, which form 10% of the human genome, has been correlated with the length of poly(A) tracts located at one end of the Alu. These tracts have a rigid and non-bendable structure and have an inhibitory effect on nucleosomes, which normally compact the DNA. We performed a statistical analysis of the genome-wide distribution of lengths and inter-tract separations of poly(X) and poly(XY) tracts in the human genome. Our study shows that in humans the length distributions of these sequences reflect the dynamics of their expansion and DNA replication. By means of general tools from linguistics, we show that the latter play the role of highly-significant content-bearing terms in the DNA text. Furthermore, we find that such tracts are positioned in a non-random fashion, with an apparent periodicity of 150 bases. This allows us to extend the link between repetitive, highly mobile elements such as Alus and low-complexity words in human DNA. More precisely, we show that Alus are sources of poly(X) tracts, which in turn affect in a subtle way the combination and diversification of gene expression and the fixation of multigene families.
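    The two quantities measured in the record above, tract lengths and inter-tract separations, can be sketched for poly(X) (single-base) tracts as follows. The minimum tract length and the toy sequence are assumptions made for illustration, and the sketch does not cover poly(XY) tracts or the authors' statistical tests.

```python
from collections import Counter
from itertools import groupby


def homopolymer_tracts(seq, min_len=2):
    """Return (base, start, length) for every run of one base of at least min_len."""
    tracts = []
    pos = 0
    for base, run in groupby(seq):
        length = sum(1 for _ in run)
        if length >= min_len:
            tracts.append((base, pos, length))
        pos += length
    return tracts


def inter_tract_gaps(tracts):
    """Return the number of bases separating each tract from the next one."""
    return [
        nxt_start - (start + length)
        for (_, start, length), (_, nxt_start, _) in zip(tracts, tracts[1:])
    ]


seq = "AAACGTTTTA"  # toy sequence, not real genomic data
tracts = homopolymer_tracts(seq, min_len=3)
lengths = Counter(length for _, _, length in tracts)
gaps = inter_tract_gaps(tracts)
print(tracts, dict(lengths), gaps)
```

    Histogramming the gaps over a whole chromosome is the kind of summary in which a positional periodicity would become visible.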

  10. Comparative Genomic Analysis of Mannheimia haemolytica from Bovine Sources.

    Directory of Open Access Journals (Sweden)

    Cassidy L Klima

    Bovine respiratory disease is a common health problem in beef production. The primary bacterial agent involved, Mannheimia haemolytica, is a target for antimicrobial therapy and at risk for associated antimicrobial resistance development. The role of M. haemolytica in pathogenesis is linked to serotype, with serotypes 1 (S1) and 6 (S6) isolated from pneumonic lesions and serotype 2 (S2) found in the upper respiratory tract of healthy animals. Here, we sequenced the genomes of 11 strains of M. haemolytica, representing all three serotypes, and performed comparative genomics analysis to identify genetic features that may contribute to pathogenesis. Possible virulence-associated genes were identified within 14 distinct prophages, including a periplasmic chaperone, a lipoprotein, peptidoglycan glycosyltransferase and a stress response protein. Prophage content ranged from 2 to 8 per genome, but was higher in S1 and S6 strains. A type I-C CRISPR-Cas system was identified in each strain, with spacer diversity and organization conserved among serotypes. The majority of spacers occur in S1 and S6 strains and originate from phage, suggesting that serotypes 1 and 6 may be more resistant to phage predation. However, two spacers complementary to the host chromosome, targeting a UDP-N-acetylglucosamine 2-epimerase and a glycosyl transferases group 1 gene, are present in S1 and S6 strains only, indicating that these serotypes may employ CRISPR-Cas to regulate gene expression to avoid host immune responses or enhance adhesion during infection. Integrative conjugative elements are present in nine of the eleven genomes. Three of these harbor extensive multi-drug resistance cassettes encoding resistance against the majority of drugs used to combat infection in beef cattle, including macrolides and tetracyclines used in human medicine. The findings here identify key features that are likely contributing to serotype related pathogenesis and specific targets for vaccine design

  11. Comparative analysis of whole-genome sequences of Streptococcus suis

    Institute of Scientific and Technical Information of China (English)

    LI Pengli; WEI Wu; LI Yixue; MA Yuanyuan; DING Guohui; LI Xiaoping; WANG Xiaojing; ZHANG Liwen; SUN Jingchun; WANG Yong; TU Kang; WANG Ningning; HAO Pei; WANG Chuan; CAO Zhiwei; SHI Tieliu

    2006-01-01

    The recent outbreak of Streptococcus suis in some districts of Sichuan Province in China has caused over 30 deaths and over 200 infections in human beings. In order to study the pathogenicity mechanism and to prevent the bacteria from spreading and infecting human beings and swine, we have annotated and analyzed the genomes of two strains, Streptococcus suis P1/7 and 89-1591. The whole genome of P1/7 is 2.007 Mb in length and has 1969 ORFs. In contrast, the partial genome sequence of 89-1591 is 1.98 Mb in length and exists in 177 contigs with 1918 ORFs. Analysis shows that the average lengths of CDSs in the two genomes are very close, and that the two strains share 1306 homologous ORFs. Most of the toxicity factors of the two strains are homologous, but there are still some significant differences between the two strains. For example, among the 11 genes (cps2A-cps2K) encoding the capsule in P1/7, 4 (cps2A, 2B, 2I, 2J) are not detected in strain 89-1591. Likewise, the genes encoding EF and haemolysin in P1/7 are not found in strain 89-1591. In addition, the genes related to DNA replication, repair and recombination differ significantly between the strains, and there are also certain differences among the surface proteins. These characteristics indicate that the two strains have evolved their own specific functions to adapt to different environments and that the pathogenesis of the two strains is different. We have accumulated comprehensive genomics information for future systematic studies of S. suis. Our results are helpful for disease prevention, vaccine development, and drug design for S. suis.

  12. Radiation induced genome instability: multiscale modelling and data analysis

    Science.gov (United States)

    Andreev, Sergey; Eidelman, Yuri

    2012-07-01

    Genome instability (GI) is thought to be an important step in cancer induction and progression. Radiation induced GI is usually defined as genome alterations in the progeny of irradiated cells. The aim of this report is to demonstrate an opportunity for integrative analysis of radiation induced GI on the basis of multiscale modelling. Integrative, systems level modelling is necessary to assess different pathways resulting in GI in which a variety of genetic and epigenetic processes are involved. The multilevel modelling includes Monte Carlo based simulation of several key processes involved in GI: generation of DNA double strand breaks (DSBs) in initially irradiated cells as well as in their descendants, and damage transmission through mitosis. Taking the cell-cycle-dependent generation of DNA/chromosome breakage into account gives an advantage in estimating the contribution of different DNA damage response pathways to GI, such as nonhomologous vs homologous recombination repair mechanisms, the role of DSBs at telomeres or interstitial chromosomal sites, etc. The preliminary estimates show that both telomeric and non-telomeric DSB interactions are involved in delayed effects of radiation, although differentially for different cell types. The computational experiments provide data on a wide spectrum of GI endpoints (dicentrics, micronuclei, nonclonal translocations, chromatid exchanges, chromosome fragments) similar to those obtained experimentally for various cell lines under various experimental conditions. The modelling based analysis of experimental data demonstrates that radiation induced GI may be viewed as processes of delayed DSB induction/interaction/transmission, which are key to the quantification of GI. On the other hand, this conclusion is not sufficient to understand GI as a whole because factors of DNA non-damaging origin can also induce GI. Additionally, new data on induced pluripotent stem cells reveal that GI is acquired in normal mature

  13. Using genomic DNA-based probe-selection to improve the sensitivity of high-density oligonucleotide arrays when applied to heterologous species

    Directory of Open Access Journals (Sweden)

    Townsend Henrik J

    2005-11-01

    High-density oligonucleotide (oligo) arrays are a powerful tool for transcript profiling. Arrays based on GeneChip® technology are amongst the most widely used, although GeneChip® arrays are currently available for only a small number of plant and animal species. Thus, we have developed a method to improve the sensitivity of high-density oligonucleotide arrays when applied to heterologous species and tested the method by analysing the transcriptome of Brassica oleracea L., a species for which no GeneChip® array is available, using a GeneChip® array designed for Arabidopsis thaliana (L.) Heynh. Genomic DNA from B. oleracea was labelled and hybridised to the ATH1-121501 GeneChip® array. Arabidopsis thaliana probe-pairs that hybridised to the B. oleracea genomic DNA on the basis of the perfect-match (PM) probe signal were then selected for subsequent B. oleracea transcriptome analysis using a .cel file parser script to generate probe mask files. The transcriptional response of B. oleracea to a mineral nutrient (phosphorus; P) stress was quantified using probe mask files generated for a wide range of gDNA hybridisation intensity thresholds. An example probe mask file generated with a gDNA hybridisation intensity threshold of 400 removed >68% of the available PM probes from the analysis but retained >96% of available A. thaliana probe-sets. Ninety-nine of these genes were then identified as significantly regulated under P stress in B. oleracea, including the homologues of P stress responsive genes in A. thaliana. Increasing the gDNA hybridisation intensity thresholds up to 500 for probe-selection increased the sensitivity of the GeneChip® array to detect regulation of gene expression in B. oleracea under P stress by up to 13-fold. Our open-source software to create probe mask files is freely available at http://affymetrix.arabidopsis.info/xspecies/ and may be used to facilitate transcriptomic analyses of a wide range of plant and animal

  14. Construction of an integrated database to support genomic sequence analysis

    Energy Technology Data Exchange (ETDEWEB)

    Gilbert, W.; Overbeek, R.

    1994-11-01

    The central goal of this project is to develop an integrated database to support comparative analysis of genomes including DNA sequence data, protein sequence data, gene expression data and metabolism data. In developing the logic-based system GenoBase, a broader integration of available data was achieved due to assistance from collaborators. Current goals are to easily include new forms of data as they become available and to easily navigate through the ensemble of objects described within the database. This report comments on progress made in these areas.

  15. Recombination analysis based on the complete genome of bocavirus

    Directory of Open Access Journals (Sweden)

    Chen Shengxia

    2011-04-01

    Bocaviruses include bovine parvovirus, minute virus of canines, porcine bocavirus, gorilla bocavirus, and human bocaviruses 1-4 (HBoVs). Although recent reports have shown that recombination occurs in bocaviruses, no systematic study has investigated it. The present study performed phylogenetic and recombination analyses of bocaviruses over the complete genomes available in GenBank. The results confirmed that recombination exists among bocaviruses, including likely inter-genotype recombination between HBoV1 and HBoV4, and intra-genotype recombination among HBoV2 variants. Moreover, this is the first report revealing recombination between minute viruses of canines.

  16. Comparative Genome Analysis Provides Insights into the Pathogenicity of Flavobacterium psychrophilum

    DEFF Research Database (Denmark)

    Castillo, Daniel; Christiansen, Rói Hammershaimb; Dalsgaard, Inger;

    2016-01-01

    to describe the F. psychrophilum pan-genome and to examine virulence factors, prophages, CRISPR arrays, and genomic islands present in the genomes. Analysis of the genomic DNA sequences was complemented with selected phenotypic characteristics of the strains. The pan genome analysis showed that F......, independent of geographic location, year of isolation and source of isolates. Only one prophage-related sequence was found which corresponded to the previously described prophage 6H, and appeared in 5 out of 11 isolates. CRISPR array analysis revealed two different loci with dissimilar spacer content, which...

  17. Genomic analysis of primordial dwarfism reveals novel disease genes.

    Science.gov (United States)

    Shaheen, Ranad; Faqeih, Eissa; Ansari, Shinu; Abdel-Salam, Ghada; Al-Hassnan, Zuhair N; Al-Shidi, Tarfa; Alomar, Rana; Sogaty, Sameera; Alkuraya, Fowzan S

    2014-02-01

    Primordial dwarfism (PD) is a disease in which severely impaired fetal growth persists throughout postnatal development and results in stunted adult size. The condition is highly heterogeneous clinically, but the use of certain phenotypic aspects such as head circumference and facial appearance has proven helpful in defining clinical subgroups. In this study, we present the results of clinical and genomic characterization of 16 new patients in whom a broad definition of PD was used (e.g., 3M syndrome was included). We report a novel PD syndrome with distinct facies in two unrelated patients, each with a different homozygous truncating mutation in CRIPT. Our analysis also reveals, in addition to mutations in known PD disease genes, the first instance of biallelic truncating BRCA2 mutation causing PD with normal bone marrow analysis. In addition, we have identified a novel locus for Seckel syndrome based on a consanguineous multiplex family and identified a homozygous truncating mutation in DNA2 as the likely cause. An additional novel PD disease candidate gene XRCC4 was identified by autozygome/exome analysis, and the knockout mouse phenotype is highly compatible with PD. Thus, we add a number of novel genes to the growing list of PD-linked genes, including one which we show to be linked to a novel PD syndrome with a distinct facial appearance. PD is extremely heterogeneous genetically and clinically, and genomic tools are often required to reach a molecular diagnosis.

  18. Genome sequence and analysis of the tuber crop potato

    DEFF Research Database (Denmark)

    Xu, X.; Pan, S.; Cheng, S.

    2011-01-01

    and assemble 86% of the 844-megabase genome. We predict 39,031 protein-coding genes and present evidence for at least two genome duplication events indicative of a palaeopolyploid origin. As the first genome sequence of an asterid, the potato genome reveals 2,642 genes specific to this large angiosperm clade...

  19. BioMet Toolbox: genome-wide analysis of metabolism

    DEFF Research Database (Denmark)

    Cvijovic, M.; Olivares Hernandez, Roberto; Agren, R.

    2010-01-01

    models. Systematic analysis of biological processes by means of modelling and simulations has made the identification of metabolic networks and prediction of metabolic capabilities under different conditions possible. For facilitating such systemic analysis, we have developed the BioMet Toolbox, a web......-based resource for stoichiometric analysis and for integration of transcriptome and interactome data, thereby exploiting the capabilities of genome-scale metabolic models. The BioMet Toolbox provides an effective user-friendly way to perform linear programming simulations towards maximized or minimized growth...... rates, substrate uptake rates and metabolic production rates by detecting relevant fluxes, simulate single and double gene deletions or detect metabolites around which major transcriptional changes are concentrated. These tools can be used for high-throughput in silico screening and allows fully...

  20. Comparative analysis of genomic signal processing for microarray data clustering.

    Science.gov (United States)

    Istepanian, Robert S H; Sungoor, Ala; Nebel, Jean-Christophe

    2011-12-01

    Genomic signal processing is a new area of research that combines advanced digital signal processing methodologies for enhanced genetic data analysis. It has many promising applications in bioinformatics and next generation of healthcare systems, in particular, in the field of microarray data clustering. In this paper we present a comparative performance analysis of enhanced digital spectral analysis methods for robust clustering of gene expression across multiple microarray data samples. Three digital signal processing methods: linear predictive coding, wavelet decomposition, and fractal dimension are studied to provide a comparative evaluation of the clustering performance of these methods on several microarray datasets. The results of this study show that the fractal approach provides the best clustering accuracy compared to other digital signal processing and well known statistical methods.

  1. Genome-wide identification of specific oligonucleotides using artificial neural network and computational genomic analysis

    Directory of Open Access Journals (Sweden)

    Chen Jiun-Ching

    2007-05-01

    Background: Genome-wide identification of specific oligonucleotides (oligos) is a computationally-intensive task and is a requirement for designing microarray probes, primers, and siRNAs. An artificial neural network (ANN) is a machine learning technique that can effectively process complex and high noise data. Here, ANNs are applied to process the unique subsequence distribution for prediction of specific oligos. Results: We present a novel and efficient algorithm, named the integration of ANN and BLAST (IAB) algorithm, to identify specific oligos. We establish the unique marker database for human and rat gene index databases using the hash table algorithm. We then create the input vectors, via the unique marker database, to train and test the ANN. The trained ANN predicted the specific oligos with high efficiency, and these oligos were subsequently verified by BLAST. To improve the prediction performance, the ANN over-fitting issue was avoided by early stopping with the best observed error and a k-fold validation was also applied. The IAB algorithm was about 5.2, 7.1, and 6.7 times faster than the BLAST search without ANN for experimental results of 70-mer, 50-mer, and 25-mer specific oligos, respectively. In addition, the results of polymerase chain reactions showed that the primers predicted by the IAB algorithm could specifically amplify the corresponding genes. The IAB algorithm has been integrated into a previously published comprehensive web server to support microarray analysis and genome-wide iterative enrichment analysis, through which users can identify a group of desired genes and then discover the specific oligos of these genes. Conclusion: The IAB algorithm has been developed to construct SpecificDB, a web server that provides a specific and valid oligo database of the probe, siRNA, and primer design for the human genome. We also demonstrate the ability of the IAB algorithm to predict specific oligos through

  2. The integrated microbial genomes (IMG) system in 2007: data content and analysis tool extensions

    Energy Technology Data Exchange (ETDEWEB)

    Markowitz, Victor M.; Szeto, Ernest; Palaniappan, Krishna; Grechkin, Yuri; Chu, Ken; Chen, I-Min A.; Dubchak, Inna; Anderson, Iain; Lykidis, Athanasios; Mavromatis, Konstantinos; Ivanova, Natalia N.; Kyrpides, Nikos C.

    2007-08-01

    The Integrated Microbial Genomes (IMG) system is a data management, analysis and annotation platform for all publicly available genomes. IMG contains both draft and complete JGI microbial genomes integrated with all other publicly available genomes from all three domains of life, together with a large number of plasmids and viruses. IMG provides tools and viewers for analyzing and annotating genomes, genes and functions, individually or in a comparative context. Since its first release in 2005, IMG's data content and analytical capabilities have been constantly expanded through quarterly releases. IMG is provided by the DOE-Joint Genome Institute (JGI) and is available from http://img.jgi.doe.gov.

  3. Comparative Genomics and Transcriptomic Analysis of Mycobacterium Kansasii

    KAUST Repository

    Alzahid, Yara

    2014-04-01

    The group of Mycobacteria is one of the most intensively studied bacterial taxa, as they cause the two historical and worldwide known diseases: leprosy and tuberculosis. Mycobacteria not identified as belonging to the tuberculosis or leprosy complex have been referred to as ‘environmental mycobacteria’ or ‘nontuberculous mycobacteria’ (NTM). Mycobacterium kansasii (M. kansasii) is one of the most frequent NTM pathogens, as it causes pulmonary disease in immunocompetent patients and pulmonary and disseminated disease in patients with various immunodeficiencies. Five subtypes of this bacterium have been documented by different molecular typing methods, showing that type I causes tuberculosis-like disease in healthy individuals, and type II in immunocompromised individuals. The remaining types are said to be environmental and thus not to cause disease. The aim of this project was to conduct a comparative genomic study of M. kansasii types I-V and to investigate the gene expression levels of those types. Various comparative genomic analyses provided genomic evidence on why M. kansasii type I is considered pathogenic, focusing on three key elements involved in the virulence of Mycobacteria: the ESX secretion system, phospholipase C (plcB) and the mammalian cell entry (Mce) operons. The results showed the lack of the espA operon in types II-V, which renders the ESX-1 operon dysfunctional, as espA is one of the key factors that control this secretion system. However, gene expression analysis showed this operon to be deleted in types II, III and IV. Furthermore, plcB was found to be truncated in types III and IV. Analysis of the Mce operons (1-4) shows that the mce-1 operon is duplicated, mce-2 is absent, and mce-3 and mce-4 are present in one copy in M. kansasii types I-V. Gene expression profiles of types I-IV showed that the secreted proteins of ESX-1 were slightly upregulated in types II-IV when compared to type I and the secreted forms of ESX-5 were highly down

  4. Dating the age of admixture via wavelet transform analysis of genome-wide data

    NARCIS (Netherlands)

    I. Pugach (Irina); R. Matveyev (Rostislav); A. Wollstein (Andreas); M.H. Kayser (Manfred); M. Stoneking (Mark)

    2011-01-01

    We describe a PCA-based genome scan approach to analyze genome-wide admixture structure, and introduce wavelet transform analysis as a method for estimating the time of admixture. We test the wavelet transform method with simulations and apply it to genome-wide SNP data from eight admixe

  5. IMG 4 version of the integrated microbial genomes comparative analysis system

    Energy Technology Data Exchange (ETDEWEB)

    Markowitz, Victor M. [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Biological Data Management and Technology Center. Computational Research Division]; Chen, I-Min A. [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Biological Data Management and Technology Center. Computational Research Division]; Palaniappan, Krishna [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Biological Data Management and Technology Center. Computational Research Division]; Chu, Ken [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Biological Data Management and Technology Center. Computational Research Division]; Szeto, Ernest [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Biological Data Management and Technology Center. Computational Research Division]; Pillay, Manoj [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Biological Data Management and Technology Center. Computational Research Division]; Ratner, Anna [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Biological Data Management and Technology Center. Computational Research Division]; Huang, Jinghua [Lawrence Berkeley National Lab. (LBNL), Berkeley, CA (United States). Biological Data Management and Technology Center. Computational Research Division]; Woyke, Tanja [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States). Microbial Genome and Metagenome Program]; Huntemann, Marcel [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States). Microbial Genome and Metagenome Program]; Anderson, Iain [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States). Microbial Genome and Metagenome Program]; Billis, Konstantinos [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States). Microbial Genome and Metagenome Program]; Varghese, Neha [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States). Microbial Genome and Metagenome Program]; Mavromatis, Konstantinos [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States). Microbial Genome and Metagenome Program]; Pati, Amrita [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States). Microbial Genome and Metagenome Program]; Ivanova, Natalia N. [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States). Microbial Genome and Metagenome Program]; Kyrpides, Nikos C. [USDOE Joint Genome Institute (JGI), Walnut Creek, CA (United States). Microbial Genome and Metagenome Program]

    2013-10-27

    The Integrated Microbial Genomes (IMG) data warehouse integrates genomes from all three domains of life, as well as plasmids, viruses and genome fragments. IMG provides tools for analyzing and reviewing the structural and functional annotations of genomes in a comparative context. IMG’s data content and analytical capabilities have increased continuously since its first version released in 2005. Since the last report published in the 2012 NAR Database Issue, IMG’s annotation and data integration pipelines have evolved while new tools have been added for recording and analyzing single cell genomes, RNA Seq and biosynthetic cluster data. Finally, different IMG datamarts provide support for the analysis of publicly available genomes (IMG/W: http://img.jgi.doe.gov/w), expert review of genome annotations (IMG/ER: http://img.jgi.doe.gov/er) and teaching and training in the area of microbial genome analysis (IMG/EDU: http://img.jgi.doe.gov/edu).

  6. IMG 4 version of the integrated microbial genomes comparative analysis system.

    Science.gov (United States)

    Markowitz, Victor M; Chen, I-Min A; Palaniappan, Krishna; Chu, Ken; Szeto, Ernest; Pillay, Manoj; Ratner, Anna; Huang, Jinghua; Woyke, Tanja; Huntemann, Marcel; Anderson, Iain; Billis, Konstantinos; Varghese, Neha; Mavromatis, Konstantinos; Pati, Amrita; Ivanova, Natalia N; Kyrpides, Nikos C

    2014-01-01

    The Integrated Microbial Genomes (IMG) data warehouse integrates genomes from all three domains of life, as well as plasmids, viruses and genome fragments. IMG provides tools for analyzing and reviewing the structural and functional annotations of genomes in a comparative context. IMG's data content and analytical capabilities have increased continuously since its first version released in 2005. Since the last report published in the 2012 NAR Database Issue, IMG's annotation and data integration pipelines have evolved while new tools have been added for recording and analyzing single cell genomes, RNA Seq and biosynthetic cluster data. Different IMG datamarts provide support for the analysis of publicly available genomes (IMG/W: http://img.jgi.doe.gov/w), expert review of genome annotations (IMG/ER: http://img.jgi.doe.gov/er) and teaching and training in the area of microbial genome analysis (IMG/EDU: http://img.jgi.doe.gov/edu).

  7. [The Mycobacterium leprae genome: from sequence analysis to therapeutic implications].

    Science.gov (United States)

    Honore, N

    2002-01-01

    The genome of Mycobacterium leprae, the causative agent of leprosy, was analyzed by rapid sequencing of cosmids and plasmids prepared from DNA isolated from one patient's strain. Results showed that the bacillus possesses a single circular chromosome that differs from other known mycobacterium chromosomes with regard to size (3.2 Mb) and G + C content (57.8%). Computer analysis demonstrated that only half of the sequence contains protein-coding genes. The other half contains pseudogenes and non-coding sequences. These findings indicate that M. leprae has undergone a major reductive evolution leaving a minimal set of functional genes for survival. Study of the coding region of the sequence provides evidence accounting for the particular pathogenic properties of M. leprae which is an obligate intracellular parasite. Disappearance of numerous enzymatic pathways in comparison with M. tuberculosis, an intracellular pathogen comparable to M. leprae, could explain the differences observed between the two organisms. Genomic analysis of the leprosy bacillus also provided insight into the molecular basis for resistance to various antibiotics and allowed identification of several potential targets for new drug treatments.

  8. The impact of Ty3-gypsy group LTR retrotransposons Fatima on B-genome specificity of polyploid wheats

    Directory of Open Access Journals (Sweden)

    Huneau Cecile

    2011-06-01

    Full Text Available Abstract Background: Transposable elements (TEs) are a rapidly evolving fraction of the eukaryotic genomes and the main contributors to genome plasticity and divergence. Recently, occupation of the A- and D-genomes of allopolyploid wheat by specific TE families was demonstrated. Here, we investigated the impact of the well-represented family of gypsy LTR-retrotransposons, Fatima, on B-genome divergence of allopolyploid wheat using the fluorescent in situ hybridisation (FISH) method and phylogenetic analysis. Results: FISH analysis of a BAC clone (BAC_2383A24) initially screened with Spelt1 repeats demonstrated its predominant localisation to chromosomes of the B-genome and its putative diploid progenitor Aegilops speltoides in hexaploid (genomic formula, BBAADD) and tetraploid (genomic formula, BBAA) wheats as well as their diploid progenitors. Analysis of the complete BAC_2383A24 nucleotide sequence (113 605 bp) demonstrated that it contains 55.6% TEs, 0.9% subtelomeric tandem repeats (Spelt1), and five genes. LTR retrotransposons are predominant, representing 50.7% of the total nucleotide sequence. Three elements of the gypsy LTR retrotransposon family Fatima make up 47.2% of all the LTR retrotransposons in this BAC. In situ hybridisation of the Fatima_2383A24-3 subclone suggests that individual representatives of the Fatima family contribute to the majority of the B-genome specific FISH pattern for BAC_2383A24. Phylogenetic analysis of various Fatima elements available from databases in combination with the data on their insertion dates demonstrated that the Fatima elements fall into several groups. One of these groups, containing Fatima_2383A24-3, is more specific to the B-genome and proliferated around 0.5-2.5 MYA, prior to allopolyploid wheat formation. Conclusion: The B-genome specificity of the gypsy-like Fatima, as determined by FISH, is explained to a great degree by the appearance of a genome-specific element within this family for Ae

  9. Analysis of dinucleotide signatures in HIV-1 subtype B genomes

    Indian Academy of Sciences (India)

    Aridaman Pandit; Jyothirmayi Vadlamudi; Somdatta Sinha

    2013-12-01

    Dinucleotide usage is known to vary in the genomes of organisms. The dinucleotide usage profiles, or genome signatures, are similar for sequence samples taken from the same genome, but differ between taxonomically distant species. This concept of genome signatures has been used to study several organisms, including viruses, to elucidate the signatures of evolutionary processes at the genome level. Genome signatures assume greater importance in the case of host–pathogen interactions, where molecular interactions between the two species take place continuously and can influence their genomic composition. In this study, whole genome sequences of HIV-1 subtype B, a retrovirus that caused the global AIDS pandemic, were analysed for variation in the genome signatures of the virus from 1983 to 2007. We show statistically significant temporal variations in some dinucleotide patterns, highlighting the selective evolution of the dinucleotide profiles of HIV-1 subtype B, possibly a consequence of host-specific selection.
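
    As an illustration of the genome signature idea above, the following minimal sketch computes a dinucleotide profile as the odds ratio rho(XY) = f(XY) / (f(X) f(Y)), a commonly used definition. The abstract does not state which measure the authors used, so this formula, the single-strand counting, and the function names dinucleotide_signature and signature_distance are assumptions for illustration only.

```python
from collections import Counter

BASES = "ACGT"


def dinucleotide_signature(seq):
    """Return {dinucleotide: rho}, where rho = f_xy / (f_x * f_y).

    Assumption: single-strand counts; ambiguous bases (e.g. N) are skipped.
    """
    seq = seq.upper()
    mono = Counter(b for b in seq if b in BASES)
    pairs = (seq[i:i + 2] for i in range(len(seq) - 1))
    di = Counter(p for p in pairs if p[0] in BASES and p[1] in BASES)
    n_mono = sum(mono.values())
    n_di = sum(di.values())
    if n_mono == 0 or n_di == 0:
        raise ValueError("sequence needs at least two adjacent unambiguous bases")
    signature = {}
    for x in BASES:
        for y in BASES:
            fx = mono[x] / n_mono
            fy = mono[y] / n_mono
            fxy = di[x + y] / n_di
            # rho is undefined when a base is absent; report 0.0 in that case
            signature[x + y] = fxy / (fx * fy) if fx and fy else 0.0
    return signature


def signature_distance(sig_a, sig_b):
    """Mean absolute difference between two signatures over all 16 dinucleotides."""
    return sum(abs(sig_a[k] - sig_b[k]) for k in sig_a) / len(sig_a)
```

    Comparing signatures of sequences sampled in different years with signature_distance would give one simple view of temporal variation; any statistical testing is outside this sketch.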

  10. A genome-wide 20 K citrus microarray for gene expression analysis

    OpenAIRE

    Gadea Jose; Forment Javier; Santiago Julia; Marques M Carmen; Juarez Jose; Mauri Nuria; Martinez-Godoy M Angeles

    2008-01-01

    Abstract Background Understanding of genetic elements that contribute to key aspects of citrus biology will impact future improvements in this economically important crop. Global gene expression analysis demands microarray platforms with a high genome coverage. In the last years, genome-wide EST collections have been generated in citrus, opening the possibility to create new tools for functional genomics in this crop plant. Results We have designed and constructed a publicly available genome-...

  11. A genome-wide 20 K citrus microarray for gene expression analysis

    OpenAIRE

    Martinez-Godoy, M Angeles; Mauri, Nuria; Juarez, Jose; Marques, M Carmen; Santiago, Julia; Forment, Javier; Gadea, Jose

    2008-01-01

    Background Understanding of genetic elements that contribute to key aspects of citrus biology will impact future improvements in this economically important crop. Global gene expression analysis demands microarray platforms with a high genome coverage. In the last years, genome-wide EST collections have been generated in citrus, opening the possibility to create new tools for functional genomics in this crop plant. Results We have designed and constructed a publicly available genome-wide cDNA...

  12. Establishing a framework for comparative analysis of genome sequences

    Energy Technology Data Exchange (ETDEWEB)

    Bansal, A.K.

    1995-06-01

    This paper describes a framework and a high-level language toolkit for comparative analysis of genome sequence alignments. The framework integrates the information derived from multiple sequence alignment and phylogenetic tree (hypothetical tree of evolution) to derive new properties about sequences. Multiple sequence alignments are treated as an abstract data type. Abstract operations have been described to manipulate a multiple sequence alignment and to derive mutation-related information from a phylogenetic tree by superimposing parsimonious analysis. The framework has been applied to protein alignments to derive constrained columns (in a multiple sequence alignment) that exhibit evolutionary pressure to preserve a common property in a column despite mutation. A Prolog toolkit based on the framework has been implemented and demonstrated on alignments containing 3000 sequences and 3904 columns.
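
    The notion of a constrained column can be sketched as follows: a column whose residues vary between sequences but all fall within one property class. This is a simplified Python illustration, not the paper's Prolog toolkit; it ignores the phylogenetic tree and parsimony step, and the property classes and the function name constrained_columns are hypothetical.

```python
# Hypothetical residue classes; the source does not list the properties it uses.
PROPERTY_CLASSES = {
    "hydrophobic": set("AVLIMFWC"),
    "charged": set("DEKR"),
}


def constrained_columns(alignment, classes=PROPERTY_CLASSES):
    """Return [(column_index, property_name)] for columns that vary in residue
    but whose residues all belong to a single property class."""
    if not alignment:
        return []
    width = len(alignment[0])
    if any(len(row) != width for row in alignment):
        raise ValueError("all aligned sequences must have the same length")
    hits = []
    for col in range(width):
        residues = {row[col] for row in alignment}
        if len(residues) < 2 or "-" in residues:
            continue  # skip invariant columns and columns containing gaps
        for name, members in classes.items():
            if residues <= members:  # every residue shares this property
                hits.append((col, name))
                break
    return hits
```

    For the toy alignment ["AKG", "VRG", "LKT"], the first column is reported as hydrophobic and the second as charged, while the third matches no class.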

  13. Analysis Of Segmental Duplications In The Pig Genome Based On Next-Generation Sequencing

    DEFF Research Database (Denmark)

    Fadista, João; Bendixen, Christian

    extensively studied in other organisms, its analysis in pig has been hampered by the lack of a complete pig genome assembly. By measuring the depth of coverage of Illumina whole-genome shotgun sequencing reads of the Tabasco animal aligned to the latest pig genome assembly (Sus scrofa 10 – based also...... on Tabasco), led us to the detection of a high-resolution map of segmental duplications in the pig genome. Comparing these segments with four other Duroc animals sequenced at our institute, supplied the resources needed to describe the first genome-wide and systematic analysis of segmental duplications...

  14. Genome-Wide Analysis Reveals Coating of the Mitochondrial Genome by TFAM

    OpenAIRE

    Wang, Yun E.; Marinov, Georgi K.; Wold, Barbara J.; Chan, David C.

    2013-01-01

    Mitochondria contain a 16.6 kb circular genome encoding 13 proteins as well as mitochondrial tRNAs and rRNAs. Copies of the genome are organized into nucleoids containing both DNA and proteins, including the machinery required for mtDNA replication and transcription. The transcription factor TFAM is critical for initiation of transcription and replication of the genome, and is also thought to perform a packaging function. Although specific binding sites required for initiation of transcriptio...

  15. Transcriptome, methylome and genomic variations analysis of ectopic thyroid glands.

    Directory of Open Access Journals (Sweden)

    Rasha Abu-Khudir

    Full Text Available BACKGROUND: Congenital hypothyroidism from thyroid dysgenesis (CHTD) is predominantly a sporadic disease characterized by defects in the differentiation, migration or growth of thyroid tissue. Of these defects, incomplete migration resulting in ectopic thyroid tissue is the most common (up to 80%). Germinal mutations in the thyroid-related transcription factors NKX2.1, FOXE1, PAX-8, and NKX2.5 have been identified in only 3% of patients with sporadic CHTD. Moreover, a survey of monozygotic twins yielded a discordance rate of 92%, suggesting that somatic events, genetic or epigenetic, probably play an important role in the etiology of CHTD. METHODOLOGY/PRINCIPAL FINDINGS: To assess the role of somatic genetic or epigenetic processes in CHTD, we analyzed gene expression, genome-wide methylation, and structural genome variations in normal versus ectopic thyroid tissue. In total, 1011 genes were more than two-fold induced or repressed. Expression array was validated by quantitative real-time RT-PCR for 100 genes. After correction for differences in thyroid activation state, 19 genes were exclusively associated with thyroid ectopy, among which genes involved in embryonic development (e.g. TXNIP) and in the Wnt pathway (e.g. SFRP2 and FRZB) were observed. None of the thyroid-related transcription factors (FOXE1, HHEX, NKX2.1, NKX2.5) showed decreased expression, whereas PAX8 expression was associated with thyroid activation state. Finally, the expression profile was independent of promoter and CpG island methylation and of structural genome variations. CONCLUSIONS/SIGNIFICANCE: This is the first integrative molecular analysis of ectopic thyroid tissue. Ectopic thyroids show differential gene expression compared to that of normal thyroids, although the molecular basis could not be defined. Replication of this pilot study on a larger cohort could lead to unraveling the elusive cause of defective thyroid migration during embryogenesis.
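
    The "more than two-fold induced or repressed" criterion corresponds to an absolute log2 expression ratio greater than 1. A minimal sketch of that filter is shown below, assuming positive, already normalized expression values; the function name two_fold_changed and the input layout are hypothetical and do not reproduce the study's actual pipeline.

```python
import math


def two_fold_changed(expr_normal, expr_ectopic):
    """Return {gene: log2(ectopic / normal)} for genes changed more than two-fold.

    Assumption: both inputs map gene names to positive, normalized intensities.
    """
    changed = {}
    for gene, normal in expr_normal.items():
        ectopic = expr_ectopic.get(gene)
        if ectopic is None or normal <= 0 or ectopic <= 0:
            continue  # a ratio cannot be formed for this gene
        log2_ratio = math.log2(ectopic / normal)
        if abs(log2_ratio) > 1.0:  # > 2-fold induced or < 0.5-fold repressed
            changed[gene] = log2_ratio
    return changed
```

    Positive values indicate induction in ectopic tissue and negative values repression; a real analysis would also require replicate-aware statistics.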

  16. Analysis of the Complete Mitochondrial Genome Sequence of the Diploid Cotton Gossypium raimondii by Comparative Genomics Approaches

    Directory of Open Access Journals (Sweden)

    Changwei Bi

    2016-01-01

    Full Text Available Cotton is one of the most important economic crops, the primary source of natural fiber, and an important protein source for animal feed. The complete nuclear and chloroplast (cp) genome sequences of G. raimondii are already available, but the mitochondrial genome sequence is not. Here, we assembled the complete mitochondrial (mt) DNA sequence of G. raimondii into a circular genome of 676,078 bp and performed comparative analyses with other higher plants. The genome contains 39 protein-coding genes, 6 rRNA genes, and 25 tRNA genes. We also identified four larger repeats (63.9 kb, 10.6 kb, 9.1 kb, and 2.5 kb) in this mt genome, which may be active in intramolecular recombination in the evolution of cotton. Strikingly, nearly all of the G. raimondii mt genome has been transferred to the nucleus on Chr1, and the transfer event must be very recent. Phylogenetic analysis reveals that G. raimondii, as a member of Malvaceae, is much closer to another cotton (G. barbadense) than to other rosids, and the clade formed by the two Gossypium species is sister to Brassicales. The G. raimondii mt genome may provide a crucial foundation for evolutionary analysis, molecular biology, and cytoplasmic male sterility in cotton and other higher plants.

  17. Biofilm biodiversity presented by fluorescent in situ hybridisation

    Directory of Open Access Journals (Sweden)

    Wolf Mirela

    2017-01-01

    Full Text Available Numerous microorganisms may be present in the water distribution system. This is associated with imperfect purification processes or with secondary water pollution. Not only does it result in the deterioration of water quality parameters, but it also increases the threat of epidemiological problems. Biologically unstable water creates ideal conditions for microorganisms to colonize the inner surface of pipelines, where they may form a biofilm. The key issue in preventing and controlling the impact of biofilm development is to assess the biodiversity of the microbiocenosis. To obtain a comprehensive characterization of microbial communities on a particular substrate, it is necessary to combine several techniques. Further analysis using molecular biology methods usually follows traditional methods of assessing the microbiological quality of water. Standard methods do not reflect the actual species composition, because they are targeted at the bacteria that can be isolated and cultured in the laboratory. Conventional methods are capable of detecting less than 10% of the organisms in the sample. To study the biodiversity of organisms inhabiting a biofilm, analyses of the diversity of nucleic acids should be used in addition to the conventional methods. The first method could be the polymerase chain reaction (PCR) and denaturing gradient gel electrophoresis (DGGE). Another may be fluorescence in situ hybridization, which allows a specific DNA sequence to be detected using specially labeled oligonucleotide probes. Visualization of the material is performed using a fluorescence microscope. The main purpose of this article is to present rapid and precise identification of groups of microorganisms in their natural biofilm habitat using the fluorescent in situ hybridization (FISH) method. The FISH method can be successfully used to visualize microorganisms that are difficult to culture, as well as to provide

  18. Possible interspecific origin of the B chromosome of Hypsiboas albopunctatus (Spix, 1824) (Anura, Hylidae), revealed by microdissection, chromosome painting, and reverse hybridisation

    Directory of Open Access Journals (Sweden)

    Simone Gruber

    2014-08-01

    Full Text Available The B chromosome in the hylid Hypsiboas albopunctatus (2n = 22 + B) is small, almost entirely composed of C-positive heterochromatin, and does not pair with any chromosome of the A complement. A B probe, obtained by microdissection and DOP-PCR amplification, was used to search for homology between the B and regular chromosomes of H. albopunctatus and of the related species H. raniceps (Cope, 1862). Reverse hybridisation was also carried out in the investigation. The B probe exclusively painted the supernumerary, not hybridising to any other chromosome in H. albopunctatus, but all H. raniceps chromosomes showed small labelling signals. This result might be an indication that differences exist between the repetitive sequences of A and B chromosomes of H. albopunctatus, and that the chromosomes of H. raniceps and the heterochromatin of the B chromosome of H. albopunctatus are enriched with the same type of repetitive DNA. In meiotic preparations, the B labelled about 30% of scored spermatids, revealing a non-Mendelian inheritance, and the painted B in micronucleus suggests that the supernumerary is eliminated from germ line cells. Although our results could suggest an interspecific origin of the B at first sight, further analysis of its repetitive sequences is still necessary. Nevertheless, the accumulation of repetitive sequences, detected in another species, even though closely related, remains an intriguing question.

  19. Comparative genomic analysis of Vibrio parahaemolyticus: serotype conversion and virulence

    Directory of Open Access Journals (Sweden)

    Gil Ana I

    2011-06-01

    Full Text Available Abstract Background: Vibrio parahaemolyticus is a common cause of foodborne disease. Beginning in 1996, a more virulent strain having serotype O3:K6 caused major outbreaks in India and other parts of the world, resulting in the emergence of a pandemic. Other serovariants of this strain emerged during its dissemination and together with the original O3:K6 were termed strains of the pandemic clone. Two genomes, one of this virulent strain and one of a pre-pandemic strain, have been sequenced. We sequenced four additional genomes of V. parahaemolyticus in this study that were isolated from different geographical regions and time points. Comparative genomic analyses of six strains of V. parahaemolyticus isolated from Asia and Peru were performed in order to advance knowledge concerning the evolution of V. parahaemolyticus; specifically, the genetic changes contributing to serotype conversion and virulence. Two pre-pandemic strains and three pandemic strains, isolated from different geographical regions, were serotype O3:K6 and had toxin profiles of either (tdh+, trh-) or (tdh-, trh+). The sixth pandemic strain sequenced in this study was serotype O4:K68. Results: Genomic analyses revealed that the trh+ and tdh+ strains had different types of pathogenicity islands and mobile elements as well as major structural differences between the tdh pathogenicity islands of the pre-pandemic and pandemic strains. In addition, the results of single nucleotide polymorphism (SNP) analysis showed that 94% of the SNPs between O3:K6 and O4:K68 pandemic isolates were within a 141 kb region surrounding the O- and K-antigen-encoding gene clusters. The "core" genes of V. parahaemolyticus were also compared to those of V. cholerae and V. vulnificus, in order to delineate differences between these three pathogenic species. Approximately one-half (49-59%) of each species' core genes were conserved in all three species, and 14-24% of the core genes were species-specific and in different

  20. Analysis of chimpanzee history based on genome sequence alignments.

    Directory of Open Access Journals (Sweden)

    Jennifer L Caswell

    2008-04-01

    Full Text Available Population geneticists often study small numbers of carefully chosen loci, but it has become possible to obtain orders of magnitude more data from overlaps of genome sequences. Here, we generate tens of millions of base pairs of multiple sequence alignments from combinations of three western chimpanzees, three central chimpanzees, an eastern chimpanzee, a bonobo, a human, an orangutan, and a macaque. Analysis provides a more precise understanding of demographic history than was previously available. We show that bonobos and common chimpanzees were separated approximately 1,290,000 years ago, western and other common chimpanzees approximately 510,000 years ago, and eastern and central chimpanzees at least 50,000 years ago. We infer that the central chimpanzee population size increased by at least a factor of 4 since its separation from western chimpanzees, while the western chimpanzee effective population size decreased. Surprisingly, in about one percent of the genome, the genetic relationships between humans, chimpanzees, and bonobos appear to be different from the species relationships. We used PCR-based resequencing to confirm 11 regions where chimpanzees and bonobos are not most closely related. Study of such loci should provide information about the period of time 5-7 million years ago when the ancestors of humans separated from those of the chimpanzees.

  1. Delineation of Steroid-Degrading Microorganisms through Comparative Genomic Analysis

    Directory of Open Access Journals (Sweden)

    Lee H. Bergstrand

    2016-03-01

    Full Text Available Steroids are ubiquitous in natural environments and are a significant growth substrate for microorganisms. Microbial steroid metabolism is also important for some pathogens and for biotechnical applications. This study delineated the distribution of aerobic steroid catabolism pathways among over 8,000 microorganisms whose genomes are available in the NCBI RefSeq database. Combined analysis of bacterial, archaeal, and fungal genomes with both hidden Markov models and reciprocal BLAST identified 265 putative steroid degraders within only Actinobacteria and Proteobacteria, which mainly originated from soil, eukaryotic host, and aquatic environments. These bacteria include members of 17 genera not previously known to contain steroid degraders. A pathway for cholesterol degradation was conserved in many actinobacterial genera, particularly in members of the Corynebacterineae, and a pathway for cholate degradation was conserved in members of the genus Rhodococcus. A pathway for testosterone and, sometimes, cholate degradation had a patchy distribution among Proteobacteria. The steroid degradation genes tended to occur within large gene clusters. Growth experiments confirmed bioinformatic predictions of steroid metabolism capacity in nine bacterial strains. The results indicate there was a single ancestral 9,10-seco-steroid degradation pathway. Gene duplication, likely in a progenitor of Rhodococcus, later gave rise to a cholate degradation pathway. Proteobacteria and additional Actinobacteria subsequently obtained a cholate degradation pathway via horizontal gene transfer, in some cases facilitated by plasmids. Catabolism of steroids appears to be an important component of the ecological niches of broad groups of Actinobacteria and individual species of Proteobacteria.
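
    The reciprocal BLAST step mentioned above is commonly implemented as a reciprocal best hit test: gene a in genome A and gene b in genome B are paired only if each is the other's top-scoring hit. The sketch below assumes hits are already parsed into (query, subject, bitscore) tuples; the function names and input layout are hypothetical, and the study's actual thresholds and HMM step are not reproduced.

```python
def best_hits(hits):
    """Map each query to its highest-scoring subject.

    hits: iterable of (query, subject, bitscore) tuples (assumed layout).
    """
    best = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs where a's best hit is b and b's best hit is a."""
    forward = best_hits(a_vs_b)
    reverse = best_hits(b_vs_a)
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)
```

    A pair that passes this test is a candidate ortholog; pairs where only one direction agrees are discarded.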

  2. Preliminary analysis of the mitochondrial genome evolutionary pattern in primates

    Institute of Scientific and Technical Information of China (English)

    Liang ZHAO; Xingtao ZHANG; Xingkui TAO; Weiwei WANG; Ming LI

    2012-01-01

    Since the birth of molecular evolutionary analysis, primates have been a central focus of study and mitochondrial DNA is well suited to these endeavors because of its unique features. Surprisingly, to date no comprehensive evaluation of the nucleotide substitution patterns has been conducted on the mitochondrial genome of primates. Here, we analyzed the evolutionary patterns and evaluated selection and recombination in the mitochondrial genomes of 44 Primates species downloaded from GenBank. The results revealed that a strong rate heterogeneity occurred among sites and genes in all comparisons. Likewise, an obvious decline in primate nucleotide diversity was noted in the subunit rRNAs and tRNAs as compared to the protein-coding genes. Within 13 protein-coding genes, the pattern of nonsynonymous divergence was similar to that of overall nucleotide divergence, while synonymous changes differed only for individual genes, indicating that the rate heterogeneity may result from the rate of change at nonsynonymous sites. Codon usage analysis revealed that there was intermediate codon usage bias in primate protein-coding genes, and supported the idea that GC mutation pressure might determine codon usage and that positive selection is not the driving force for the codon usage bias. Neutrality tests using site-specific positive selection from a Bayesian framework indicated no sites were under positive selection for any gene, consistent with near neutrality. Recombination tests based on the pairwise homoplasy test statistic supported complete linkage even for much older divergent primate species. Thus, with the exception of rate heterogeneity among mitochondrial genes, evaluating the validity assumed complete linkage and selective neutrality in primates prior to phylogenetic or phylogeographic analysis seems unnecessary.

  3. Preliminary analysis of the mitochondrial genome evolutionary pattern in primates.

    Science.gov (United States)

    Zhao, Liang; Zhang, Xingtao; Tao, Xingkui; Wang, Weiwei; Li, Ming

    2012-08-01

    Since the birth of molecular evolutionary analysis, primates have been a central focus of study and mitochondrial DNA is well suited to these endeavors because of its unique features. Surprisingly, to date no comprehensive evaluation of the nucleotide substitution patterns has been conducted on the mitochondrial genome of primates. Here, we analyzed the evolutionary patterns and evaluated selection and recombination in the mitochondrial genomes of 44 Primates species downloaded from GenBank. The results revealed that a strong rate heterogeneity occurred among sites and genes in all comparisons. Likewise, an obvious decline in primate nucleotide diversity was noted in the subunit rRNAs and tRNAs as compared to the protein-coding genes. Within 13 protein-coding genes, the pattern of nonsynonymous divergence was similar to that of overall nucleotide divergence, while synonymous changes differed only for individual genes, indicating that the rate heterogeneity may result from the rate of change at nonsynonymous sites. Codon usage analysis revealed that there was intermediate codon usage bias in primate protein-coding genes, and supported the idea that GC mutation pressure might determine codon usage and that positive selection is not the driving force for the codon usage bias. Neutrality tests using site-specific positive selection from a Bayesian framework indicated no sites were under positive selection for any gene, consistent with near neutrality. Recombination tests based on the pairwise homoplasy test statistic supported complete linkage even for much older divergent primate species. Thus, with the exception of rate heterogeneity among mitochondrial genes, evaluating the validity assumed complete linkage and selective neutrality in primates prior to phylogenetic or phylogeographic analysis seems unnecessary.

  4. Genomic analysis of stress response against arsenic in Caenorhabditis elegans.

    Directory of Open Access Journals (Sweden)

    Surasri N Sahu

    Full Text Available Arsenic, a known human carcinogen, is widely distributed around the world and found in particularly high concentrations in certain regions including Southwestern US, Eastern Europe, India, China, Taiwan and Mexico. Chronic arsenic poisoning affects millions of people worldwide and is associated with increased risk of many diseases including atherosclerosis, diabetes and cancer. In this study, we explored genome-level global responses to high and low levels of arsenic exposure in Caenorhabditis elegans using Affymetrix expression microarrays. This experimental design allows microarray analysis of dose-response relationships of global gene expression patterns. High dose (0.03%) exposure caused stronger global gene expression changes than low dose (0.003%) exposure, suggesting a positive dose-response correlation. Biological processes such as oxidative stress and iron metabolism, which were previously reported to be involved in arsenic toxicity in studies using cultured cells, experimental animals, and humans, were found to be affected in C. elegans. We performed genome-wide gene expression comparisons between our microarray data and publicly available C. elegans microarray datasets of cadmium exposure and of sediment exposure samples from the German rivers Rhine and Elbe. Bioinformatics analysis of arsenic-responsive regulatory networks was done using the FastMEDUSA program. FastMEDUSA analysis identified cancer-related genes, particularly genes associated with leukemia, such as dnj-11, which encodes a protein orthologous to the mammalian ZRF1/MIDA1/MPP11/DNAJC2 family of ribosome-associated molecular chaperones. We analyzed the protective functions of several of the identified genes using RNAi. Our study indicates that C. elegans could be a substitute model to study the mechanism of metal toxicity using high-throughput expression data and bioinformatics tools such as FastMEDUSA.

  5. 13C metabolic flux analysis at a genome-scale.

    Science.gov (United States)

    Gopalakrishnan, Saratram; Maranas, Costas D

    2015-11-01

    Metabolic models used in 13C metabolic flux analysis generally include a limited number of reactions primarily from central metabolism. They typically omit degradation pathways, complete cofactor balances, and atom transition contributions for reactions outside central metabolism. This study addresses the impact on prediction fidelity of scaling-up mapping models to a genome-scale. The core mapping model employed in this study accounts for 75 reactions and 65 metabolites, primarily from central metabolism. The genome-scale metabolic mapping model (GSMM) (697 reactions and 595 metabolites) is constructed using as a basis the iAF1260 model upon eliminating reactions guaranteed not to carry flux based on growth and fermentation data for a minimal glucose growth medium. Labeling data for 17 amino acid fragments obtained from cells fed with glucose labeled at the second carbon was used to obtain fluxes and ranges. Metabolic fluxes and confidence intervals are estimated, for both core and genome-scale mapping models, by minimizing the sum of squared differences between predicted and experimentally measured labeling patterns using the EMU decomposition algorithm. Overall, we find that both topology and estimated values of the metabolic fluxes remain largely consistent between the core and GSM models. Stepping up to a genome-scale mapping model leads to wider flux inference ranges for 20 key reactions present in the core model. The glycolysis flux range doubles due to the possibility of active gluconeogenesis, the TCA flux range expands by 80% due to the availability of a bypass through arginine consistent with labeling data, and the transhydrogenase reaction flux is essentially unresolved due to the presence of as many as five routes for the inter-conversion of NADPH to NADH afforded by the genome-scale model. By globally accounting for ATP demands in the GSMM model the unused ATP decreased drastically with the lower bound matching the maintenance ATP requirement. A non

  6. Comparative genomic analysis and phylogenetic position of Theileria equi

    Directory of Open Access Journals (Sweden)

    Kappmeyer Lowell S

    2012-11-01

    Full Text Available Abstract Background: Transmission of arthropod-borne apicomplexan parasites that cause disease and result in death or persistent infection represents a major challenge to global human and animal health. First described in 1901 as Piroplasma equi, this re-emergent apicomplexan parasite was renamed Babesia equi and subsequently Theileria equi, reflecting an uncertain taxonomy. Understanding mechanisms by which apicomplexan parasites evade immune or chemotherapeutic elimination is required for development of effective vaccines or chemotherapeutics. The continued risk of transmission of T. equi from clinically silent, persistently infected equids impedes the goal of returning the U.S. to non-endemic status. Therefore comparative genomic analysis of T. equi was undertaken to: 1) identify genes contributing to immune evasion and persistence in equid hosts, 2) identify genes involved in PBMC infection biology and 3) define the phylogenetic position of T. equi relative to sequenced apicomplexan parasites. Results: The known immunodominant proteins, EMA1, 2 and 3, were discovered to belong to a ten member gene family with a mean amino acid identity, in pairwise comparisons, of 39%. Importantly, the amino acid diversity of EMAs is distributed throughout the length of the proteins. Eight of the EMA genes were simultaneously transcribed. As the agents that cause bovine theileriosis infect and transform host cell PBMCs, we confirmed that T. equi infects equine PBMCs; however, there is no evidence of host cell transformation. Indeed, a number of genes identified as potential manipulators of the host cell phenotype are absent from the T. equi genome. Phylogenetic positioning of T. equi relative to seven apicomplexan parasites, using deduced amino acid sequences from 150 genes, placed it as a sister taxon to Theileria spp. Conclusions: The EMA family does not fit the paradigm for classical antigenic variation, and we propose a

  7. Complete genome sequence and comparative genomic analysis of an emerging human pathogen, serotype V Streptococcus agalactiae

    NARCIS (Netherlands)

    Tettelin, H; Masignani; Cieslewicz, MJ; Eisen, JA; Peterson, S; Paulsen, IT; Nelson, KE; Margarit; Read, TD; Madoff, LC; Beanan, MJ; Brinkac, LM; Daugherty, SC; DeBoy, RT; Durkin, AS; Kolonay, JF; Madupu, R; Lewis, MR; Radune, D; Fedorova, NB; Scanlan, D; Khouri, H; Mulligan, S; Carty, HA; Cline, RT; Van Aken, SE; Gill, J; Scarselli, M; Mora, M; Iacobini, ET; Brettoni, C; Galli, G; Mariani, M; Vegni, F; Maione, D; Rinaudo, D; Rappuoli, R; Telford, JL; Kasper, DL; Grandi, G; Fraser, CM

    2002-01-01

    The 2,160,267 bp genome sequence of Streptococcus agalactiae, the leading cause of bacterial sepsis, pneumonia, and meningitis in neonates in the U.S. and Europe, is predicted to encode 2,175 genes. Genome comparisons among S. agalactiae, Streptococcus pneumoniae, Streptococcus pyogenes, and the oth

  8. Genome analysis of multiple pathogenic isolates of Streptococcus agalactiae : Implications for the microbial "pan-genome"

    NARCIS (Netherlands)

    Tettelin, H; Masignani; Cieslewicz, MJ; Donati, C; Medini, D; Ward, NL; Angiuoli, SV; Crabtree, J; Jones, AL; Durkin, AS; DeBoy, RT; Davidsen, TM; Mora, M; Scarselli, M; Ros, IMY; Peterson, JD; Hauser, CR; Sundaram, JP; Nelson, WC; Madupu, R; Brinkac, LM; Dodson, RJ; Rosovitz, MJ; Sullivan, SA; Daugherty, SC; Haft, DH; Selengut, J; Gwinn, ML; Zhou, LW; Zafar, N; Khouri, H; Radune, D; Dimitrov, G; Watkins, K; O'Connor, KJB; Smith, S; Utterback, TR; White, O; Rubens, CE; Grandi, G; Madoff, LC; Kasper, DL; Telford, JL; Wessels, MR; Rappuoli, R; Fraser, CM

    2005-01-01

    The development of efficient and inexpensive genome sequencing methods has revolutionized the study of human bacterial pathogens and improved vaccine design. Unfortunately, the sequence of a single genome does not reflect how genetic variability drives pathogenesis within a bacterial species and als

  10. Complete genome sequence and comparative genomic analysis of an emerging human pathogen, serotype V Streptococcus agalactiae

    NARCIS (Netherlands)

    Tettelin, H; Masignani; Cieslewicz, MJ; Eisen, JA; Peterson, S; Paulsen, IT; Nelson, KE; Margarit; Read, TD; Madoff, LC; Beanan, MJ; Brinkac, LM; Daugherty, SC; DeBoy, RT; Durkin, AS; Kolonay, JF; Madupu, R; Lewis, MR; Radune, D; Fedorova, NB; Scanlan, D; Khouri, H; Mulligan, S; Carty, HA; Cline, RT; Van Aken, SE; Gill, J; Scarselli, M; Mora, M; Iacobini, ET; Brettoni, C; Galli, G; Mariani, M; Vegni, F; Maione, D; Rinaudo, D; Rappuoli, R; Telford, JL; Kasper, DL; Grandi, G; Fraser, CM

    2002-01-01

    The 2,160,267 bp genome sequence of Streptococcus agalactiae, the leading cause of bacterial sepsis, pneumonia, and meningitis in neonates in the U.S. and Europe, is predicted to encode 2,175 genes. Genome comparisons among S. agalactiae, Streptococcus pneumoniae, Streptococcus pyogenes, and the

  11. Genome Sizes of Nine Insect Species Determined by Flow Cytometry and k-mer Analysis

    Science.gov (United States)

    He, Kang; Lin, Kejian; Wang, Guirong; Li, Fei

    2016-01-01

    The flow cytometry method was used to estimate the genome sizes of nine agriculturally important insects, including two coleopterans, five hemipterans, and two hymenopterans. Among these, the coleopteran Lissorhoptrus oryzophilus (Kuschel) had the largest genome, at 981 Mb. The average genome size was 504 Mb, suggesting that insects have a moderate-size genome. Compared with the insects in other orders, hymenopterans had small genomes, averaging ~200 Mb. We found that the genome sizes of four insect species differed between males and females, showing the organismal complexity of insects. The largest difference occurred in the coconut leaf beetle Brontispa longissima (Gestro). Male coconut leaf beetles had a genome 111 Mb larger than that of females, which might be due to the chromosome number difference between the sexes. The results indicated that insect invasiveness was not related to genome size. We also determined the genome sizes of the small brown planthopper Laodelphax striatellus (Fallén) and the parasitic wasp Macrocentrus cingulum (Brischke) using k-mer analysis with Illumina Solexa sequencing data. There were slight differences in the results from the two methods. k-mer analysis indicated that the genome size of L. striatellus was 500–700 Mb and that of M. cingulum was ~150 Mb. In all, the genome size information presented here should be helpful for designing the genome sequencing strategy when necessary. PMID:27932995
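
The k-mer analysis mentioned in this record can be illustrated with a minimal sketch. It assumes the commonly used relation that genome size is roughly the total number of k-mers divided by the depth at the main peak of the k-mer frequency histogram. The function names, the `min_depth` error cutoff and the toy histogram are illustrative assumptions rather than the authors' pipeline, and reverse-complement (canonical) k-mers are ignored for brevity.

```python
from collections import Counter


def kmer_histogram(reads, k):
    """Return {depth: number of distinct k-mers observed exactly `depth` times}."""
    counts = Counter(
        read[i:i + k]
        for read in reads
        for i in range(len(read) - k + 1)
    )
    return Counter(counts.values())


def estimate_genome_size(histogram, min_depth=2):
    """Estimate genome size as (total k-mers) / (depth of the histogram peak).

    k-mers seen fewer than `min_depth` times are treated as likely
    sequencing errors and excluded (an assumed, adjustable cutoff).
    """
    usable = {depth: n for depth, n in histogram.items() if depth >= min_depth}
    if not usable:
        raise ValueError("no k-mers at or above min_depth")
    # Peak depth: the multiplicity shared by the largest number of distinct k-mers.
    peak_depth = max(usable, key=usable.get)
    total_kmers = sum(depth * n for depth, n in usable.items())
    return total_kmers / peak_depth


if __name__ == "__main__":
    # Toy histogram: 500 singleton (error-like) k-mers and a peak at depth 10.
    toy = {1: 500, 9: 10, 10: 100, 11: 10}
    print(estimate_genome_size(toy))  # 120.0
```

On real data the histogram would normally come from a dedicated k-mer counter, and heterozygosity or repeats can add secondary peaks that this single-peak sketch does not model.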

  12. Genome sequence analysis of the model grass Brachypodium distachyon: insights into grass genome evolution

    Energy Technology Data Exchange (ETDEWEB)

    Schulman, Al

    2009-08-09

    Three subfamilies of grasses, the Ehrhartoideae (rice), the Panicoideae (maize, sorghum, sugar cane and millet), and the Pooideae (wheat, barley and cool season forage grasses) provide the basis of human nutrition and are poised to become major sources of renewable energy. Here we describe the complete genome sequence of the wild grass Brachypodium distachyon (Brachypodium), the first member of the Pooideae subfamily to be completely sequenced. Comparison of the Brachypodium, rice and sorghum genomes reveals a precise sequence-based history of genome evolution across a broad diversity of the grass family and identifies nested insertions of whole chromosomes into centromeric regions as a predominant mechanism driving chromosome evolution in the grasses. The relatively compact genome of Brachypodium is maintained by a balance of retroelement replication and loss. The complete genome sequence of Brachypodium, coupled to its exceptional promise as a model system for grass research, will support the development of new energy and food crops.

  13. Examination of equine glandular stomach lesions for bacteria, including Helicobacter spp by fluorescence in situ hybridisation

    DEFF Research Database (Denmark)

    Husted, Louise; Jensen, Tim Kåre; Olsen, Susanne N.

    2010-01-01

    appearing mucosa were obtained from horses slaughtered for human consumption. All samples were tested for urease activity using the Pyloritek® assay, while mucosal bacterial content was evaluated using Fluorescence In Situ Hybridisation. In selected sub samples, bacteria characterisation was pursued further...... samples, clones with 99% similarities to Lactobacillus salivarius and Sarcina ventriculi were found. Escherichia like bacterium clones and Enterococcus clones were demonstrated in one focal erosion. Based on a phylogenetic tree these clones had 100% similarity to Escherichia fergusonii and Enterococcus...

  14. HER2 testing in the UK: recommendations for breast and gastric in-situ hybridisation methods

    LENUS (Irish Health Repository)

    Bartlett, J. M. S.

    2011-01-01

    These guidelines supplement existing guidelines on HER2 testing by immunohistochemistry and in-situ hybridisation (ISH) methods in the UK. They provide a specific focus on aspects of guidance relevant to HER2 ISH testing methods, both fluorescent and chromogenic. They are formulated to give advice on methodology, interpretation and quality control for ISH-based testing of HER2 status in common tumour types, including both breast and gastric tumours. The aim is to ensure that all ISH-based testing is accurate, reliable and timely.

  15. Electronic hybridisation implications for the damage-tolerance of thin film metallic glasses

    Science.gov (United States)

    Schnabel, Volker; Jaya, B. Nagamani; Köhler, Mathias; Music, Denis; Kirchlechner, Christoph; Dehm, Gerhard; Raabe, Dierk; Schneider, Jochen M.

    2016-01-01

    A paramount challenge in materials science is to design damage-tolerant glasses. Poisson’s ratio is commonly used as a criterion to gauge the brittle-ductile transition in glasses. However, our data, as well as results in the literature, are in conflict with the concept of Poisson’s ratio serving as a universal parameter for fracture energy. Here, we identify the electronic structure fingerprint associated with damage tolerance in thin film metallic glasses. Our correlative theoretical and experimental data reveal that the fraction of bonds stemming from hybridised states compared to the overall bonding can be associated with damage tolerance in thin film metallic glasses. PMID:27819318

  16. Research study on analysis/use technologies of genome information; Genome joho kaidoku riyo gijutsu no chosa kenkyu

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1997-03-01

    For wide use of genome information in the industrial field, the required R and D was surveyed from the standpoints of biology and information science. To clarify the present state and issues of the international research on genome analysis, the genome map as well as sequence and function information are first surveyed. The current analysis/use technologies of genome information are analyzed, and the following are summarized: prediction and identification of gene regions in genome sequences, techniques for searching and selecting useful genes, and techniques for predicting the expression of gene functions and the gene-product structure and functions. It is recommended that R and D and data collection/interpretation necessary to clarify inter-gene interactions and information networks should be promoted by integrating Japanese advanced know-how and technologies. As examples of the impact of the research results on industry and society, the present state and future expected effect are summarized for medicines, diagnosis/analysis instruments, chemicals, foods, agriculture, fishery, animal husbandry, electronics, environment and information. 278 refs., 42 figs., 5 tabs.

  17. Genomic analysis of smoothened inhibitor resistance in basal cell carcinoma.

    Science.gov (United States)

    Sharpe, Hayley J; Pau, Gregoire; Dijkgraaf, Gerrit J; Basset-Seguin, Nicole; Modrusan, Zora; Januario, Thomas; Tsui, Vickie; Durham, Alison B; Dlugosz, Andrzej A; Haverty, Peter M; Bourgon, Richard; Tang, Jean Y; Sarin, Kavita Y; Dirix, Luc; Fisher, David C; Rudin, Charles M; Sofen, Howard; Migden, Michael R; Yauch, Robert L; de Sauvage, Frederic J

    2015-03-09

    Smoothened (SMO) inhibitors are under clinical investigation for the treatment of several cancers. Vismodegib is approved for the treatment of locally advanced and metastatic basal cell carcinoma (BCC). Most BCC patients experience significant clinical benefit on vismodegib, but some develop resistance. Genomic analysis of tumor biopsies revealed that vismodegib resistance is associated with Hedgehog (Hh) pathway reactivation, predominantly through mutation of the drug target SMO and to a lesser extent through concurrent copy number changes in SUFU and GLI2. SMO mutations either directly impaired drug binding or activated SMO to varying levels. Furthermore, we found evidence for intra-tumor heterogeneity, suggesting that a combination of therapies targeting components at multiple levels of the Hh pathway is required to overcome resistance.

  18. Technology-Driven and Evidence-Based Genomic Analysis for Integrated Pediatric and Prenatal Genetics Evaluation

    Institute of Scientific and Technical Information of China (English)

    Yuan Wei; Fang Xu; Peining Li

    2013-01-01

    The first decade since the completion of the Human Genome Project has been marked with rapid development of genomic technologies and their immediate clinical applications. Genomic analysis using oligonucleotide array comparative genomic hybridization (aCGH) or single nucleotide polymorphism (SNP) chips has been applied to pediatric patients with developmental and intellectual disabilities (DD/ID), multiple congenital anomalies (MCA) and autistic spectrum disorders (ASD). Evaluation of analytical and clinical validities of aCGH showed >99% sensitivity and specificity and increased analytical resolution by higher density probe coverage. Reviews of case series, multi-center comparison and large patient-control studies demonstrated a diagnostic yield of 12%-20%; approximately 60% of these abnormalities were recurrent genomic disorders. This pediatric experience has been extended toward prenatal diagnosis. A series of reports indicated approximately 10% of pregnancies with ultrasound-detected structural anomalies and normal cytogenetic findings had genomic abnormalities, and 30% of these abnormalities were syndromic genomic disorders. Evidence-based practice guidelines and standards for implementing genomic analysis and web-delivered knowledge resources for interpreting genomic findings have been established. The progress from this technology-driven and evidence-based genomic analysis provides not only opportunities to dissect disease-causing mechanisms and develop rational therapeutic interventions but also important lessons for integrating genomic sequencing into pediatric and prenatal genetic evaluation.

  19. Comparative Genomics Analysis of Streptomyces Species Reveals Their Adaptation to the Marine Environment and Their Diversity at the Genomic Level

    Science.gov (United States)

    Tian, Xinpeng; Zhang, Zhewen; Yang, Tingting; Chen, Meili; Li, Jie; Chen, Fei; Yang, Jin; Li, Wenjie; Zhang, Bing; Zhang, Zhang; Wu, Jiayan; Zhang, Changsheng; Long, Lijuan; Xiao, Jingfa

    2016-01-01

    Over 200 genomes of streptomycete strains that were isolated from various environments are available from the NCBI. However, little is known about the characteristics that are linked to marine adaptation in marine-derived streptomycetes. The particularity and complexity of the marine environment suggest that marine streptomycetes are genetically diverse. Here, we sequenced nine strains from the Streptomyces genus that were isolated from different longitudes, latitudes, and depths of the South China Sea. Then we compared these strains to 22 NCBI downloaded streptomycete strains. Thirty-one streptomycete strains are clearly grouped into a marine-derived subgroup and multiple source subgroup-based phylogenetic tree. The phylogenetic analyses have revealed the dynamic process underlying streptomycete genome evolution, and lateral gene transfer is an important driving force during the process. Pan-genomics analyses have revealed that streptomycetes have an open pan-genome, which reflects the diversity of these streptomycetes and guarantees the species a quick and economical response to diverse environments. Functional and comparative genomics analyses indicate that the marine-derived streptomycetes subgroup possesses some common characteristics of marine adaptation. Our findings have expanded our knowledge of how ocean isolates of streptomycete strains adapt to marine environments. The availability of streptomycete genomes from the South China Sea will be beneficial for further analysis on marine streptomycetes and will enrich the South China Sea’s genetic data sources. PMID:27446038

  20. Ethical considerations of research policy for personal genome analysis: the approach of the Genome Science Project in Japan.

    Science.gov (United States)

    Minari, Jusaku; Shirai, Tetsuya; Kato, Kazuto

    2014-12-01

    As evidenced by high-throughput sequencers, genomic technologies have recently undergone radical advances. These technologies enable comprehensive sequencing of personal genomes considerably more efficiently and less expensively than heretofore. These developments present a challenge to the conventional framework of biomedical ethics; under these changing circumstances, each research project has to develop a pragmatic research policy. Based on the experience with a new large-scale project, the Genome Science Project, this article presents a novel approach to conducting a specific policy for personal genome research in the Japanese context. In creating an original informed-consent form template for the project, we present a two-tiered process: making the draft of the template following an analysis of national and international policies, then refining the draft template in conjunction with genome project researchers for practical application. Through practical use of the template, we have gained valuable experience in addressing challenges in the ethical review process, such as the importance of sharing details of the latest developments in genomics with members of research ethics committees. We discuss certain limitations of the conventional concept of informed consent and its governance system and suggest the potential of an alternative process using information technology.

  1. Comparative genomics analysis of rice and pineapple contributes to understand the chromosome number reduction and genomic changes in grasses

    Directory of Open Access Journals (Sweden)

    Jinpeng Wang

    2016-10-01

    Full Text Available Rice is one of the most researched model plants, and has a genome structure most resembling that of the grass common ancestor after a grass-common tetraploidization ~100 million years ago. There has been a standing controversy over whether there were 5 or 7 basic chromosomes before the tetraploidization; the question has been tackled but could not be resolved because no sequenced and assembled outgroup plant with a conservative genome structure was available. Recently, the availability of the pineapple genome, which has not been subjected to the grass-common tetraploidization, provides a precious opportunity to resolve this controversy and to investigate genome changes in rice and other grasses. Here, we performed a comparative genomics analysis of pineapple and rice, and found solid evidence that the grass-common ancestor had 2n = 2x = 14 basic chromosomes before the tetraploidization, duplicated to 2n = 4x = 28 after the event. Moreover, we propose that the large number of genes missing from duplicated regions in rice is better explained by an allotetraploid produced by prominently divergent parental lines than by gene losses after their divergence. This means that genome fractionation might have occurred before the formation of the allotetraploid grass ancestor.

  2. Comparative Genomics Analysis of Streptomyces Species Reveals Their Adaptation to the Marine Environment and Their Diversity at the Genomic Level.

    Science.gov (United States)

    Tian, Xinpeng; Zhang, Zhewen; Yang, Tingting; Chen, Meili; Li, Jie; Chen, Fei; Yang, Jin; Li, Wenjie; Zhang, Bing; Zhang, Zhang; Wu, Jiayan; Zhang, Changsheng; Long, Lijuan; Xiao, Jingfa

    2016-01-01

    Over 200 genomes of streptomycete strains that were isolated from various environments are available from the NCBI. However, little is known about the characteristics that are linked to marine adaptation in marine-derived streptomycetes. The particularity and complexity of the marine environment suggest that marine streptomycetes are genetically diverse. Here, we sequenced nine strains from the Streptomyces genus that were isolated from different longitudes, latitudes, and depths of the South China Sea. Then we compared these strains to 22 NCBI downloaded streptomycete strains. Thirty-one streptomycete strains are clearly grouped into a marine-derived subgroup and multiple source subgroup-based phylogenetic tree. The phylogenetic analyses have revealed the dynamic process underlying streptomycete genome evolution, and lateral gene transfer is an important driving force during the process. Pan-genomics analyses have revealed that streptomycetes have an open pan-genome, which reflects the diversity of these streptomycetes and guarantees the species a quick and economical response to diverse environments. Functional and comparative genomics analyses indicate that the marine-derived streptomycetes subgroup possesses some common characteristics of marine adaptation. Our findings have expanded our knowledge of how ocean isolates of streptomycete strains adapt to marine environments. The availability of streptomycete genomes from the South China Sea will be beneficial for further analysis on marine streptomycetes and will enrich the South China Sea's genetic data sources.

  3. Exploring a Nonmodel Teleost Genome Through RAD Sequencing-Linkage Mapping in Common Pandora, Pagellus erythrinus and Comparative Genomic Analysis.

    Science.gov (United States)

    Manousaki, Tereza; Tsakogiannis, Alexandros; Taggart, John B; Palaiokostas, Christos; Tsaparis, Dimitris; Lagnel, Jacques; Chatziplis, Dimitrios; Magoulas, Antonios; Papandroulakis, Nikos; Mylonas, Constantinos C; Tsigenopoulos, Costas S

    2015-12-29

    Common pandora (Pagellus erythrinus) is a benthopelagic marine fish belonging to the teleost family Sparidae, and a newly recruited species in Mediterranean aquaculture. The paucity of genetic information relating to sparids, despite their growing economic value for aquaculture, provides the impetus for exploring the genomics of this fish group. Genomic tool development, such as genetic linkage maps provision, lays the groundwork for linking genotype to phenotype, allowing fine-mapping of loci responsible for beneficial traits. In this study, we applied ddRAD methodology to identify polymorphic markers in a full-sib family of common pandora. Employing the Illumina MiSeq platform, we sampled and sequenced a size-selected genomic fraction of 99 individuals, which led to the identification of 920 polymorphic loci. Downstream mapping analysis resulted in the construction of 24 robust linkage groups, corresponding to the karyotype of the species. The common pandora linkage map showed varying degrees of conserved synteny with four other teleost genomes, namely the European seabass (Dicentrarchus labrax), Nile tilapia (Oreochromis niloticus), stickleback (Gasterosteus aculeatus), and medaka (Oryzias latipes), suggesting a conserved genomic evolution in Sparidae. Our work exploits the possibilities of genotyping by sequencing to gain novel insights into genome structure and evolution. Such information will boost the study of cultured species and will set the foundation for a deeper understanding of the complex evolutionary history of teleosts.

  4. Exploring a Nonmodel Teleost Genome Through RAD Sequencing—Linkage Mapping in Common Pandora, Pagellus erythrinus and Comparative Genomic Analysis

    Directory of Open Access Journals (Sweden)

    Tereza Manousaki

    2016-03-01

    Full Text Available Common pandora (Pagellus erythrinus) is a benthopelagic marine fish belonging to the teleost family Sparidae, and a newly recruited species in Mediterranean aquaculture. The paucity of genetic information relating to sparids, despite their growing economic value for aquaculture, provides the impetus for exploring the genomics of this fish group. Genomic tool development, such as genetic linkage maps provision, lays the groundwork for linking genotype to phenotype, allowing fine-mapping of loci responsible for beneficial traits. In this study, we applied ddRAD methodology to identify polymorphic markers in a full-sib family of common pandora. Employing the Illumina MiSeq platform, we sampled and sequenced a size-selected genomic fraction of 99 individuals, which led to the identification of 920 polymorphic loci. Downstream mapping analysis resulted in the construction of 24 robust linkage groups, corresponding to the karyotype of the species. The common pandora linkage map showed varying degrees of conserved synteny with four other teleost genomes, namely the European seabass (Dicentrarchus labrax), Nile tilapia (Oreochromis niloticus), stickleback (Gasterosteus aculeatus), and medaka (Oryzias latipes), suggesting a conserved genomic evolution in Sparidae. Our work exploits the possibilities of genotyping by sequencing to gain novel insights into genome structure and evolution. Such information will boost the study of cultured species and will set the foundation for a deeper understanding of the complex evolutionary history of teleosts.

  5. Exploring a Nonmodel Teleost Genome Through RAD Sequencing—Linkage Mapping in Common Pandora, Pagellus erythrinus and Comparative Genomic Analysis

    Science.gov (United States)

    Manousaki, Tereza; Tsakogiannis, Alexandros; Taggart, John B.; Palaiokostas, Christos; Tsaparis, Dimitris; Lagnel, Jacques; Chatziplis, Dimitrios; Magoulas, Antonios; Papandroulakis, Nikos; Mylonas, Constantinos C.; Tsigenopoulos, Costas S.

    2015-01-01

    Common pandora (Pagellus erythrinus) is a benthopelagic marine fish belonging to the teleost family Sparidae, and a newly recruited species in Mediterranean aquaculture. The paucity of genetic information relating to sparids, despite their growing economic value for aquaculture, provides the impetus for exploring the genomics of this fish group. Genomic tool development, such as genetic linkage maps provision, lays the groundwork for linking genotype to phenotype, allowing fine-mapping of loci responsible for beneficial traits. In this study, we applied ddRAD methodology to identify polymorphic markers in a full-sib family of common pandora. Employing the Illumina MiSeq platform, we sampled and sequenced a size-selected genomic fraction of 99 individuals, which led to the identification of 920 polymorphic loci. Downstream mapping analysis resulted in the construction of 24 robust linkage groups, corresponding to the karyotype of the species. The common pandora linkage map showed varying degrees of conserved synteny with four other teleost genomes, namely the European seabass (Dicentrarchus labrax), Nile tilapia (Oreochromis niloticus), stickleback (Gasterosteus aculeatus), and medaka (Oryzias latipes), suggesting a conserved genomic evolution in Sparidae. Our work exploits the possibilities of genotyping by sequencing to gain novel insights into genome structure and evolution. Such information will boost the study of cultured species and will set the foundation for a deeper understanding of the complex evolutionary history of teleosts. PMID:26715088

  6. Functional Analysis of Shewanella, a cross genome comparison.

    Energy Technology Data Exchange (ETDEWEB)

    Serres, Margrethe H.

    2009-05-15

    The bacterial genus Shewanella includes a group of highly versatile organisms that have successfully adapted to life in many environments ranging from aquatic (fresh and marine) to sedimentary (lake and marine sediments, subsurface sediments, sea vent). A unique respiratory capability of the Shewanellas, initially observed for Shewanella oneidensis MR-1, is the ability to use metals and metalloids, including radioactive compounds, as electron acceptors. Members of the Shewanella genus have also been shown to degrade environmental pollutants, e.g. halogenated compounds, making this group highly applicable for the DOE mission. S. oneidensis MR-1 has in addition been found to utilize a diverse set of nutrients and to have a large set of genes dedicated to regulation and to sensing of the environment. The sequencing of the S. oneidensis MR-1 genome facilitated experimental and bioinformatics analyses by a group of collaborating researchers, the Shewanella Federation. Through the joint effort and with support from the Department of Energy, S. oneidensis MR-1 has become a model organism of study. Our work has been a functional analysis of S. oneidensis MR-1, both by itself and as part of a comparative study. We have improved the annotation of gene products, assigned metabolic functions, and analyzed protein families present in S. oneidensis MR-1. The data has been applied to analysis of experimental data (e.g. gene expression, proteome) generated for S. oneidensis MR-1. Further, this work has formed the basis for a comparative study of over 20 members of the Shewanella genus. The species and strains selected for genome sequencing represented an evolutionary gradient of DNA relatedness, ranging from close to intermediate, and to distant. The organisms selected have also adapted to a variety of ecological niches. Through our work we have been able to detect and interpret genome similarities and differences between members of the genus. We have in this way contributed to the

  7. Identification of conserved regulatory elements by comparative genome analysis

    Directory of Open Access Journals (Sweden)

    Jareborg Niclas

    2003-05-01

    Full Text Available Abstract Background For genes that have been successfully delineated within the human genome sequence, most regulatory sequences remain to be elucidated. The annotation and interpretation process requires additional data resources and significant improvements in computational methods for the detection of regulatory regions. One approach of growing popularity is based on the preferential conservation of functional sequences over the course of evolution by selective pressure, termed 'phylogenetic footprinting'. Mutations are more likely to be disruptive if they appear in functional sites, resulting in a measurable difference in evolution rates between functional and non-functional genomic segments. Results We have devised a flexible suite of methods for the identification and visualization of conserved transcription-factor-binding sites. The system reports those putative transcription-factor-binding sites that are both situated in conserved regions and located as pairs of sites in equivalent positions in alignments between two orthologous sequences. An underlying collection of metazoan transcription-factor-binding profiles was assembled to facilitate the study. This approach results in a significant improvement in the detection of transcription-factor-binding sites because of an increased signal-to-noise ratio, as demonstrated with two sets of promoter sequences. The method is implemented as a graphical web application, ConSite, which is at the disposal of the scientific community at http://www.phylofoot.org/. Conclusions Phylogenetic footprinting dramatically improves the predictive selectivity of bioinformatic approaches to the analysis of promoter sequences. ConSite delivers unparalleled performance using a novel database of high-quality binding models for metazoan transcription factors. With a dynamic interface, this bioinformatics tool provides broad access to promoter analysis with phylogenetic footprinting.
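
    The filtering idea in the abstract above (report only those putative sites that lie in conserved regions and occur at equivalent positions in both orthologous sequences) can be sketched as follows. This is an illustrative sketch only, not ConSite's implementation: the function names, the input format (sets of factor name and alignment column), the window size and the identity cutoff are all assumptions made for the example.

```python
def column_identity(aln_a, aln_b):
    """Per-column match flags (1/0) for two gapped, equal-length alignment rows."""
    if len(aln_a) != len(aln_b):
        raise ValueError("alignment rows must have equal length")
    return [1 if a == b and a != "-" else 0 for a, b in zip(aln_a, aln_b)]


def window_identity(flags, center, window):
    """Fraction of identical columns in a window centred on `center`."""
    half = window // 2
    lo = max(0, center - half)
    hi = min(len(flags), center + half + 1)
    return sum(flags[lo:hi]) / (hi - lo)


def conserved_site_pairs(aln_a, aln_b, sites_a, sites_b, window=10, cutoff=0.7):
    """Keep sites predicted for the same factor at the same alignment column
    in both sequences, and only where local identity reaches `cutoff`.

    sites_a / sites_b: sets of (factor_name, alignment_column) tuples
    (a hypothetical input format chosen for this sketch).
    """
    flags = column_identity(aln_a, aln_b)
    shared = sites_a & sites_b  # equivalent positions in both orthologues
    return sorted(
        (name, col)
        for name, col in shared
        if window_identity(flags, col, window) >= cutoff
    )


# Toy alignment: the 5' half is well conserved, the 3' half is not.
aln_a = "ACGTTGACGTAA--TTGCATGCCA"
aln_b = "ACGTTGACGTAAGGTAGGTTACCA"
sites_a = {("SP1", 5), ("MEF2", 17), ("TBP", 20)}
sites_b = {("SP1", 5), ("MEF2", 17)}

kept = conserved_site_pairs(aln_a, aln_b, sites_a, sites_b)
print(kept)  # only the site in the conserved half survives
```

    In this toy input, the site predicted in only one sequence is dropped by the pairing step, and the paired site in the poorly conserved half is dropped by the identity cutoff, which is the source of the improved signal-to-noise ratio the abstract describes.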

  8. SIGMA2: A system for the integrative genomic multi-dimensional analysis of cancer genomes, epigenomes, and transcriptomes

    Directory of Open Access Journals (Sweden)

    MacAulay Calum

    2008-10-01

    Full Text Available Abstract Background High throughput microarray technologies have afforded the investigation of genomes, epigenomes, and transcriptomes at unprecedented resolution. However, software packages to handle, analyze, and visualize data from these multiple 'omics disciplines have not been adequately developed. Results Here, we present SIGMA2, a system for the integrative genomic multi-dimensional analysis of cancer genomes, epigenomes, and transcriptomes. Multi-dimensional datasets can be simultaneously visualized and analyzed with respect to each dimension, allowing combinatorial integration of the different assays belonging to the different 'omics. Conclusion The identification of genes altered at multiple levels such as copy number, loss of heterozygosity (LOH), DNA methylation and the detection of consequential changes in gene expression can be concertedly performed, establishing SIGMA2 as a novel tool to facilitate the high throughput systems biology analysis of cancer.

  9. Genomic rearrangements at the FRA2H common fragile site frequently involve non-homologous recombination events across LTR and L1(LINE) repeats.

    Science.gov (United States)

    Brueckner, Lena M; Sagulenko, Evgeny; Hess, Elisa M; Zheglo, Diana; Blumrich, Anne; Schwab, Manfred; Savelyeva, Larissa

    2012-08-01

    Common fragile sites (cFSs) are non-random chromosomal regions that are prone to breakage under conditions of replication stress. DNA damage and chromosomal alterations at cFSs appear to be critical events in the development of various human diseases, especially carcinogenesis. Despite the growing interest in understanding the nature of cFS instability, only a few cFSs have been molecularly characterised. In this study, we fine-mapped the location of FRA2H using six-colour fluorescence in situ hybridisation and showed that it is one of the most active cFSs in the human genome. FRA2H encompasses approximately 530 kb of a gene-poor region containing a novel large intergenic non-coding RNA gene (AC097500.2). Using custom-designed array comparative genomic hybridisation, we detected gross and submicroscopic chromosomal rearrangements involving FRA2H in a panel of 54 neuroblastoma, colon and breast cancer cell lines. The genomic alterations frequently involved different classes of long terminal repeats and long interspersed nuclear elements. An analysis of breakpoint junction sequence motifs predominantly revealed signatures of microhomology-mediated non-homologous recombination events. Our data provide insight into the molecular structure of cFSs and sequence motifs affected by their activation in cancer. Identifying cFS sequences will accelerate the search for DNA biomarkers and targets for individualised therapies.

  10. Endemic North African Quercus afares Pomel originates from hybridisation between two genetically very distant oak species (Q. suber L. and Q. canariensis Willd.): evidence from nuclear and cytoplasmic markers.

    Science.gov (United States)

    Mir, C; Toumi, L; Jarne, P; Sarda, V; Di Giusto, F; Lumaret, R

    2006-02-01

    Hybridisation is a potent force in plant evolution, although there are few reported examples of stabilised species that have been created through homoploid hybridisation. We focus here on Quercus afares, an endemic North African species that combines morphological, physiological and ecological traits of both Q. suber and Q. canariensis, two phylogenetically distant species. These two species are sympatric with Q. afares over most of its distribution. We studied two Q. afares populations (one from Algeria and one from Tunisia), as well as several populations of both Q. suber and Q. canariensis sampled both within and outside areas where these species overlap with Q. afares. A genetic analysis was conducted using both nuclear (allozymes) and chloroplastic markers, which shows that Q. afares originates from a Q. suber x Q. canariensis hybridisation. At most loci, Q. afares predominantly possesses alleles from Q. suber, suggesting that the initial cross between Q. suber and Q. canariensis was followed by backcrossing with Q. suber. Other hypotheses that can account for this result, including genetic drift, gene silencing, gene conversion and selection, are discussed. A single Q. suber chlorotype was detected, and all Q. afares individuals displayed this chlorotype, indicating that Q. suber was the maternal parent. Q. afares is genetically, morphologically and ecologically differentiated from its parental species, and can therefore be considered as a stabilised hybrid species.

  11. Sequencing and comparative genome analysis of two pathogenic Streptococcus gallolyticus subspecies: genome plasticity, adaptation and virulence.

    Directory of Open Access Journals (Sweden)

    I-Hsuan Lin

    Full Text Available Streptococcus gallolyticus infections in humans are often associated with bacteremia, infective endocarditis and colon cancers. The disease manifestations are different depending on the subspecies of S. gallolyticus causing the infection. Here, we present the complete genomes of S. gallolyticus ATCC 43143 (biotype I) and S. pasteurianus ATCC 43144 (biotype II.2). The genomic differences between the two biotypes were characterized with comparative genomic analyses. The chromosomes of ATCC 43143 and ATCC 43144 are 2.36 and 2.10 Mb in length and encode 2246 and 1869 CDS respectively. The organization and genomic contents of both genomes were most similar to the recently published S. gallolyticus UCN34, where 2073 (92%) and 1607 (86%) of the ATCC 43143 and ATCC 43144 CDS were conserved in UCN34 respectively. There are around 600 CDS conserved in all Streptococcus genomes, indicating the Streptococcus genus has a small core-genome (constituting around 30% of total CDS) and substantial evolutionary plasticity. We identified eight and five regions of genome plasticity in ATCC 43143 and ATCC 43144 respectively. Within these regions, several proteins were recognized to contribute to the fitness and virulence of each of the two subspecies. We have also predicted putative cell-surface associated proteins that could play a role in adherence to host tissues, leading to persistent infections causing sub-acute and chronic diseases in humans. This study showed evidence that S. gallolyticus still possesses genes making it suitable in a rumen environment, whereas the ability of S. pasteurianus to live in the rumen is reduced. The genome heterogeneity and genetic diversity among the two biotypes, especially in membrane proteins and lipoproteins, most likely contribute to the differences in the pathogenesis of the two S. gallolyticus biotypes and the type of disease an infected patient eventually develops.

  12. Whole genome sequence and comparative genomic sequence analysis of Helicoverpa armigera nucleopolyhedrovirus (HearNPV-L1) isolated from India.

    Science.gov (United States)

    Raghavendra, Ashika T; Jalali, Sushil K; Ojha, Rakshit; Shivalingaswamy, Timalapur M; Bhatnagar, Raj

    2017-03-01

    The whole genome of Helicoverpa armigera nucleopolyhedrovirus (HearNPV) from India, HearNPV-L1, was sequenced and analyzed, with a view to look for genes and/or nucleotide sequences that might be involved in the differences and virulence among other HearNPVs sequenced from other countries like SP1A (Spain), NNg1 (Kenya) and G4 (China). The entire nucleotide sequence of the HearNPV-L1 genome was 136,740 bp in length, had a GC content of 39.19% and contained 113 ORFs that could encode polypeptides with more than 50 amino acids (GenBank accession number KT013224). Two ORFs, viz., ORF 18 (300 bp) and ORF 19 (401 bp), were unique to the HearNPV-L1 genome. Most of the HearNPV-L1 ORFs showed high similarity to the NNg1, SP1A and G4 genomes. The HearNPV-L1 genome contains five homologous regions (hr1-hr5), which were 84-100% similar to the hr regions of the NNg1, SP1A and G4 genomes. A total of four bro genes were observed in the HearNPV-L1 genome. The bro-a gene was 12 and 351 bp longer than SP1A and G4 bro-a, respectively; bro-b was 15 bp longer than SP1A and NNg1 bro-b but 593 bp shorter than G4 bro-b; and bro-c was 12 bp shorter than NNg1 bro-c and absent from the G4 genome. HearNPV-L1 bro-d was 100% homologous to bro-d of the SP1A, NNg1 and G4 genomes. The comparative analysis of the HearNPV-L1 genome indicated that there are several other putative genes and nucleotide sequences that may be responsible for insecticidal activity in the HearNPV-L1 isolate; however, further functional analysis of the hypothetical (putative) genes may help in identifying the genes that are crucial for virulence and insecticidal activity.

  13. e-Fungi: a data resource for comparative analysis of fungal genomes

    Directory of Open Access Journals (Sweden)

    Hubbard Simon J

    2007-11-01

    Full Text Available Abstract Background The number of sequenced fungal genomes is ever increasing, with about 200 genomes already fully sequenced or in progress. Only a small percentage of those genomes have been comprehensively studied, for example using techniques from functional genomics. Comparative analysis has proven to be a useful strategy for enhancing our understanding of evolutionary biology and of the less well understood genomes. However, the data required for these analyses tends to be distributed in various heterogeneous data sources, making systematic comparative studies a cumbersome task. Furthermore, comparative analyses benefit from close integration of derived data sets that cluster genes or organisms in a way that eases the expression of requests that clarify points of similarity or difference between species. Description To support systematic comparative analyses of fungal genomes we have developed the e-Fungi database, which integrates a variety of data for more than 30 fungal genomes. Publicly available genome data, functional annotations, and pathway information has been integrated into a single data repository and complemented with results of comparative analyses, such as MCL and OrthoMCL cluster analysis, and predictions of signaling proteins and the sub-cellular localisation of proteins. To access the data, a library of analysis tasks is available through a web interface. The analysis tasks are motivated by recent comparative genomics studies, and aim to support the study of evolutionary biology as well as community efforts for improving the annotation of genomes. Web services for each query are also available, enabling the tasks to be incorporated into workflows. Conclusion The e-Fungi database provides fungal biologists with a resource for comparative studies of a large range of fungal genomes. Its analysis library supports the comparative study of genome data, functional annotation, and results of large scale analyses over all the

  14. Genome-wide analysis of alternative splicing in Chlamydomonas reinhardtii

    Directory of Open Access Journals (Sweden)

    Thomas Julie

    2010-02-01

    Full Text Available Abstract Background Genome-wide computational analysis of alternative splicing (AS) in several flowering plants has revealed that pre-mRNAs from about 30% of genes undergo AS. Chlamydomonas, a simple unicellular green alga, is part of the lineage that includes land plants. However, it diverged from land plants about one billion years ago. Hence, it serves as a good model system to study alternative splicing in early photosynthetic eukaryotes, to obtain insights into the evolution of this process in plants, and to compare splicing in simple unicellular photosynthetic and non-photosynthetic eukaryotes. We performed a global analysis of alternative splicing in Chlamydomonas reinhardtii using its recently completed genome sequence and all available ESTs and cDNAs. Results Our analysis of AS using BLAT and a modified version of the Sircah tool revealed AS of 498 transcriptional units with 611 events, representing about 3% of the total number of genes. As in land plants, intron retention is the most prevalent form of AS. Retained introns and skipped exons tend to be shorter than their counterparts in constitutively spliced genes. The splice site signals in all types of AS events are weaker than those in constitutively spliced genes. Furthermore, in alternatively spliced genes, the prevalent splice form has a stronger splice site signal than the non-prevalent form. Analysis of constitutively spliced introns revealed an over-abundance of motifs with simple repetitive elements in comparison to introns involved in intron retention. In almost all cases, AS results in a truncated ORF, leading to a coding sequence that is around 50% shorter than the prevalent splice form. Using RT-PCR we verified AS of two genes and show that they produce more isoforms than indicated by EST data. All cDNA/EST alignments and splice graphs are provided in a website at http://combi.cs.colostate.edu/as/chlamy. Conclusions The extent of AS in Chlamydomonas that we observed is much

  15. Pattern Analysis and Decision Support for Cancer through Clinico-Genomic Profiles

    Science.gov (United States)

    Exarchos, Themis P.; Giannakeas, Nikolaos; Goletsis, Yorgos; Papaloukas, Costas; Fotiadis, Dimitrios I.

    Advances in genome technology are playing a growing role in medicine and healthcare. With the development of new technologies and opportunities for large-scale analysis of the genome, genomic data have a clear impact on medicine. Cancer prognostics and therapeutics are among the first major test cases for genomic medicine, given that all types of cancer are related with genomic instability. In this paper we present a novel system for pattern analysis and decision support in cancer. The system integrates clinical data from electronic health records and genomic data. Pattern analysis and data mining methods are applied to these integrated data and the discovered knowledge is used for cancer decision support. Through this integration, conclusions can be drawn for early diagnosis, staging and cancer treatment.

  16. Analysis of pan-genome content and its application in microbial identification

    DEFF Research Database (Denmark)

    Lukjancenko, Oksana

    of genomic data and use this to answer important biological questions. More specifically, comparison of prokaryotic proteomes is used to determine possible sets of functions, essential to sustain microbial life; to extract and interpret similarities and variance in genomic content within different taxonomic...... analyses for the characterization of two Listeria monocytogenes strains. Chapter 4 describes the use of profile HMMs for comparative analysis, using sequence-based homology searches. Paper III introduces PanFunPro, a new, profile HMM-based method for pan-genome analysis. Paper IV illustrates...... the application of PanFunPro to a set of more than 2000 genomes; this paper aims to define a set of protein families that are conserved among all the genomes. Paper V demonstrates comparative genomics analysis of proteomes belonging to the Vibrio genus. In the last project, described in Chapter 5, both BLAST...
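
    The notion of protein families "conserved among all the genomes" can be illustrated with plain set operations. This is a generic sketch and not PanFunPro itself: the genome labels and family identifiers are hypothetical, and the step that assigns proteins to families (profile HMM searches in the thesis) is assumed to have been done already.

```python
def pan_and_core(genome_families):
    """Return (pan, core) family sets from a mapping genome -> iterable of family IDs.

    pan  = families found in at least one genome (union)
    core = families found in every genome (intersection)
    """
    sets = [set(families) for families in genome_families.values()]
    if not sets:
        raise ValueError("at least one genome is required")
    pan = set().union(*sets)
    core = set.intersection(*sets)
    return pan, core


# Hypothetical family assignments for three genomes.
genomes = {
    "genome_1": {"FAM1", "FAM2", "FAM3"},
    "genome_2": {"FAM1", "FAM3", "FAM4"},
    "genome_3": {"FAM1", "FAM3"},
}

pan, core = pan_and_core(genomes)
print(sorted(pan))
print(sorted(core))
```

    Adding genomes can only grow the pan set and shrink the core set, which is why the size of the core depends strongly on how many and how diverse the compared genomes are.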

  17. Decelerated genome evolution in modern vertebrates revealed by analysis of multiple lancelet genomes.

    Science.gov (United States)

    Huang, Shengfeng; Chen, Zelin; Yan, Xinyu; Yu, Ting; Huang, Guangrui; Yan, Qingyu; Pontarotti, Pierre Antoine; Zhao, Hongchen; Li, Jie; Yang, Ping; Wang, Ruihua; Li, Rui; Tao, Xin; Deng, Ting; Wang, Yiquan; Li, Guang; Zhang, Qiujin; Zhou, Sisi; You, Leiming; Yuan, Shaochun; Fu, Yonggui; Wu, Fenfang; Dong, Meiling; Chen, Shangwu; Xu, Anlong

    2014-12-19

    Vertebrates diverged from other chordates ~500 Myr ago and experienced successful innovations and adaptations, but the genomic basis underlying vertebrate origins is not fully understood. Here we suggest, through comparison with multiple lancelet (amphioxus) genomes, that ancient vertebrates experienced high rates of protein evolution, genome rearrangement and domain shuffling and that these rates greatly slowed down after the divergence of jawed and jawless vertebrates. Compared with lancelets, modern vertebrates retain, at least relatively, less protein diversity, fewer nucleotide polymorphisms, domain combinations and conserved non-coding elements (CNE). Modern vertebrates also lost substantial transposable element (TE) diversity, whereas lancelets preserve high TE diversity that includes even the long-sought RAG transposon. Lancelets also exhibit rapid gene turnover, pervasive transcription, fastest exon shuffling in metazoans and substantial TE methylation not observed in other invertebrates. These new lancelet genome sequences provide new insights into the chordate ancestral state and the vertebrate evolution.

  18. Volatile Organic Compound Emission from Quercus suber, Quercus canariensis, and its hybridisation product Quercus afares

    Science.gov (United States)

    Welter, S.; Bracho Nuñez, A.; Staudt, M.; Kesselmeier, J.

    2009-04-01

    Oaks represent one of the most important plant genera in the Northern hemisphere and include many intensively VOC emitting species. The major group constitutes the isoprene emitters, but also monoterpene emitters and non-emitters can be found. These variations in the oak species might partly be due to their propensity for inter- and intraspecific hybridisation. This study addresses the foliar VOC production of the former hybridisation product the deciduous Quercus afares and its parents, two very distant species: the evergreen monoterpene emitter Quercus suber and the deciduous isoprene emitter Quercus canariensis. The measurements were performed in Southern France, applying two different methods. Plants were investigated in situ in the field with a portable gas exchange measuring system as well as in the laboratory on cut branches with an adapted enclosure system. Quercus afares was found to be a monoterpene emitting species. However, the monoterpene emission was lower and the composition different to that of Quercus suber. Whereas Quercus suber trees belonged to the pinene type most individuals of Quercus afares were identified to represent a limonene type. Quercus canariensis emitted besides high amounts of isoprene also linalool and (Z)-3-hexenylacetate. Emissions from Quercus suber and Quercus afares were higher in the field measurements than in the laboratory on cut branches whereas Quercus canariensis exhibited lower isoprene emissions from cut branches. The results demonstrate the need of further emission studies on a plant species level.

  19. Genomic and single nucleotide polymorphism analysis of infectious bronchitis coronavirus.

    Science.gov (United States)

    Abolnik, Celia

    2015-06-01

    Infectious bronchitis virus (IBV) is a Gammacoronavirus that causes a highly contagious respiratory disease in chickens. A QX-like strain was analysed by high-throughput Illumina sequencing and genetic variation across the entire viral genome was explored at the sub-consensus level by single nucleotide polymorphism (SNP) analysis. Thirteen open reading frames (ORFs) in the order 5'-UTR-1a-1ab-S-3a-3b-E-M-4b-4c-5a-5b-N-6b-3'UTR were predicted. The relative frequencies of missense:silent SNPs were calculated to obtain a comparative measure of variability in specific genes. The most variable ORFs in descending order were E, 3b, 5'UTR, N, 1a, S, 1ab, M, 4c, 5a, 6b. The E and 3b protein products play key roles in coronavirus virulence, and RNA folding demonstrated that the mutations in the 5'UTR did not alter the predicted secondary structure. The frequency of SNPs in the Spike (S) protein ORF of 0.67% was below the genomic average of 0.76%. Only three SNPs were identified in the S1 subunit, none of which were located in hypervariable region (HVR) 1 or HVR2. The S2 subunit was considerably more variable containing 87% of the polymorphisms detected across the entire S protein. The S2 subunit also contained a previously unreported multi-A insertion site and a stretch of four consecutive mutated amino acids, which mapped to the stalk region of the spike protein. Template-based protein structure modelling produced the first theoretical model of the IBV spike monomer. Given the lack of diversity observed at the sub-consensus level, the tenet that the HVRs in the S1 subunit are very tolerant of amino acid changes produced by genetic drift is questioned.

  20. Cost analysis of whole genome sequencing in German clinical practice.

    Science.gov (United States)

    Plöthner, Marika; Frank, Martin; von der Schulenburg, J-Matthias Graf

    2017-06-01

    Whole genome sequencing (WGS) is an emerging tool in clinical diagnostics. However, little has been said about its procedure costs, owing to a dearth of related cost studies. This study helps fill this research gap by analyzing the execution costs of WGS within the setting of German clinical practice. First, to estimate costs, a sequencing process related to clinical practice was undertaken. Once relevant resources were identified, a quantification and monetary evaluation was conducted using data and information from expert interviews with clinical geneticists, and personnel at private enterprises and hospitals. This study focuses on identifying the costs associated with the standard sequencing process, and the procedure costs for a single WGS were analyzed on the basis of two sequencing platforms, namely HiSeq 2500 and HiSeq Xten, both by Illumina, Inc. In addition, sensitivity analyses were performed to assess the influence of various uses of sequencing platforms and various coverage values on a fixed-cost degression. In the base case scenario, which features 80 % utilization and 30-times coverage, the cost of a single WGS analysis with the HiSeq 2500 was estimated at €3858.06. The cost of sequencing materials was estimated at €2848.08; related personnel costs of €396.94 and acquisition/maintenance costs (€607.39) were also found. In comparison, the cost of sequencing that uses the latest technology (i.e., HiSeq Xten) was approximately 63 % cheaper, at €1411.20. The estimated costs of WGS currently exceed the prediction of a 'US$1000 per genome' by more than a factor of 3.8. In particular, the material costs in themselves exceed this predicted cost.
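
    The headline percentages in this abstract can be rechecked from the quoted figures alone. The short sketch below does only that arithmetic; the variable and function names are illustrative, and no values are used beyond those stated in the abstract.

```python
# Figures quoted in the abstract (EUR, base case: 80 % utilization, 30x coverage).
hiseq_2500_total = 3858.06
hiseq_xten_total = 1411.20
components_2500 = {
    "materials": 2848.08,
    "personnel": 396.94,
    "acquisition_maintenance": 607.39,
}


def relative_saving(old_cost, new_cost):
    """Fraction by which new_cost is cheaper than old_cost."""
    return (old_cost - new_cost) / old_cost


saving = relative_saving(hiseq_2500_total, hiseq_xten_total)
material_share = components_2500["materials"] / hiseq_2500_total
itemised_sum = sum(components_2500.values())

print(f"HiSeq Xten saving vs HiSeq 2500: {saving:.1%}")
print(f"Materials as share of HiSeq 2500 total: {material_share:.1%}")
print(f"Sum of the three itemised components: {itemised_sum:.2f}")
```

    The saving works out to about 63 %, matching the abstract, and materials make up roughly three quarters of the HiSeq 2500 total. The three itemised components sum to slightly less than the quoted total; the abstract does not say what accounts for the difference.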

  1. Genomic risk profiling of ischemic stroke: results of an international genome-wide association meta-analysis.

    Directory of Open Access Journals (Sweden)

    James F Meschia

    Full Text Available INTRODUCTION: Familial aggregation of ischemic stroke derives from shared genetic and environmental factors. We present a meta-analysis of genome-wide association scans (GWAS) from 3 cohorts to identify the contribution of common variants to ischemic stroke risk. METHODS: This study involved 1464 ischemic stroke cases and 1932 controls. Cases were genotyped using the Illumina 610 or 660 genotyping arrays; controls, with Illumina HumanHap 550Kv1 or 550Kv3 genotyping arrays. Imputation was performed with the 1000 Genomes European ancestry haplotypes (August 2010 release) as a reference. A total of 5,156,597 single-nucleotide polymorphisms (SNPs) were incorporated into the fixed effects meta-analysis. All SNPs associated with ischemic stroke (P<1×10^-5) were incorporated into a multivariate risk profile model. RESULTS: No SNP reached genome-wide significance for ischemic stroke (P<5×10^-8). Secondary analysis identified a significant cumulative effect for age at onset of stroke (first versus fifth quintile of cumulative profiles based on SNPs associated with late onset, β = 14.77 [10.85, 18.68], P = 5.5×10^-12), as well as a strong effect showing increased risk across samples with a high propensity for stroke among samples with enriched counts of suggestive risk alleles (P<5×10^-6). Risk profile scores based only on genomic information offered little incremental prediction. DISCUSSION: There is little evidence of a common genetic variant contributing to moderate risk of ischemic stroke. Quintiles based on genetic loading of alleles associated with a younger age at onset of ischemic stroke revealed a significant difference in age at onset between those in the upper and lower quintiles. Using common variants from GWAS and imputation, genomic profiling remains inferior to family history of stroke for defining risk. Inclusion of genomic (rare variant) information may be required to improve clinical risk profiling.
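
    For readers unfamiliar with the term, a fixed-effects meta-analysis of one SNP across cohorts is commonly computed by inverse-variance weighting. The sketch below shows that standard calculation under stated assumptions: the abstract does not say which software or exact weighting scheme was used, and the per-cohort effect sizes and standard errors here are invented for illustration.

```python
import math

GENOME_WIDE_THRESHOLD = 5e-8  # significance threshold quoted in the abstract


def fixed_effects_meta(betas, standard_errors):
    """Inverse-variance-weighted fixed-effects estimate for one variant.

    Returns (pooled_beta, pooled_se, z, two_sided_p).
    """
    if not betas or len(betas) != len(standard_errors):
        raise ValueError("need one standard error per effect estimate")
    weights = [1.0 / se ** 2 for se in standard_errors]
    total_weight = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)
    z = pooled_beta / pooled_se
    # Two-sided p-value under a standard normal null distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z, p


# Hypothetical log-odds estimates for one SNP in three cohorts.
betas = [0.10, 0.20, 0.15]
standard_errors = [0.05, 0.10, 0.05]

beta, se, z, p = fixed_effects_meta(betas, standard_errors)
print(f"pooled beta = {beta:.4f}, SE = {se:.4f}, z = {z:.2f}, p = {p:.2e}")
print("genome-wide significant:", p < GENOME_WIDE_THRESHOLD)
```

    More precise cohorts (smaller standard errors) receive more weight, and the pooled standard error shrinks as cohorts are added. In this invented example the pooled signal is suggestive but far from the genome-wide threshold.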

  2. Genome-wide analysis reveals coating of the mitochondrial genome by TFAM.

    Directory of Open Access Journals (Sweden)

    Yun E Wang

    Full Text Available Mitochondria contain a 16.6 kb circular genome encoding 13 proteins as well as mitochondrial tRNAs and rRNAs. Copies of the genome are organized into nucleoids containing both DNA and proteins, including the machinery required for mtDNA replication and transcription. The transcription factor TFAM is critical for initiation of transcription and replication of the genome, and is also thought to perform a packaging function. Although specific binding sites required for initiation of transcription have been identified in the D-loop, little is known about the characteristics of TFAM binding in its nonspecific packaging state. In addition, it is unclear whether TFAM also plays a role in the regulation of nuclear gene expression. Here we investigate these questions by using ChIP-seq to directly localize TFAM binding to DNA in human cells. Our results demonstrate that TFAM uniformly coats the whole mitochondrial genome, with no evidence of robust TFAM binding to the nuclear genome. Our study represents the first high-resolution assessment of TFAM binding on a genome-wide scale in human cells.

  3. [Clinical use of the ImmunoCyt/uCyt+ and fluorescence in situ hybridisation (FISH) tests for urothelial carcinomas].

    Science.gov (United States)

    Lodde, Michele; Mian, Christine

    2013-01-01

    In recent decades, we have witnessed the propagation and marketing of numerous diagnostic tests capable of detecting the presence of urothelial tumor markers in the urine of patients. None of the markers studied to date has been able to meet all the requirements of the ideal marker. We discuss below the results reported in the literature on two tests approved by the Food and Drug Administration, ImmunoCyt/uCyt+ and fluorescence in situ hybridisation (FISH), which have been commercially available for about 10 years.

  4. Genome-Wide Analysis of DNA Methylation in Human Amnion

    Directory of Open Access Journals (Sweden)

    Jinsil Kim

    2013-01-01

    Full Text Available The amnion is a specialized tissue in contact with the amniotic fluid, which is in a constantly changing state. To investigate the importance of epigenetic events in this tissue in the physiology and pathophysiology of pregnancy, we performed genome-wide DNA methylation profiling of human amnion from term (with and without labor) and preterm deliveries. Using the Illumina Infinium HumanMethylation27 BeadChip, we identified genes exhibiting differential methylation associated with normal labor and preterm birth. Functional analysis of the differentially methylated genes revealed biologically relevant enriched gene sets. Bisulfite sequencing analysis of the promoter region of the oxytocin receptor (OXTR) gene detected two CpG dinucleotides showing significant methylation differences among the three groups of samples. Hypermethylation of the CpG island of the solute carrier family 30 member 3 (SLC30A3) gene in preterm amnion was confirmed by methylation-specific PCR. This work provides preliminary evidence that DNA methylation changes in the amnion may be at least partially involved in the physiological process of labor and the etiology of preterm birth and suggests that DNA methylation profiles, in combination with other biological data, may provide valuable insight into the mechanisms underlying normal and pathological pregnancies.

  5. Rice–arsenate interactions in hydroponics: whole genome transcriptional analysis

    Science.gov (United States)

    Norton, Gareth J.; Lou-Hing, Daniel E.; Meharg, Andrew A.; Price, Adam H.

    2008-01-01

    Rice (Oryza sativa) varieties that are arsenate-tolerant (Bala) and -sensitive (Azucena) were used to conduct a transcriptome analysis of the response of rice seedlings to sodium arsenate (AsV) in hydroponic solution. RNA extracted from the roots of three replicate experiments of plants grown for 1 week in phosphate-free nutrient with or without 13.3 μM AsV was used to challenge the Affymetrix (52K) GeneChip Rice Genome array. A total of 576 probe sets were significantly up-regulated at least 2-fold in both varieties, whereas 622 were down-regulated. Ontological classification is presented. As expected, a large number of transcription factors, stress proteins, and transporters demonstrated differential expression. Striking is the lack of response of classic oxidative stress-responsive genes or phytochelatin synthases/synthatases. However, the large number of responses from genes involved in glutathione synthesis, metabolism, and transport suggests that glutathione conjugation and arsenate methylation may be important biochemical responses to arsenate challenge. In this report, no attempt is made to dissect differences in the response of the tolerant and sensitive variety, but analysis in a companion article will link gene expression to the known tolerance loci available in the Bala×Azucena mapping population. PMID:18453530

  6. Rice-arsenate interactions in hydroponics: whole genome transcriptional analysis.

    Science.gov (United States)

    Norton, Gareth J; Lou-Hing, Daniel E; Meharg, Andrew A; Price, Adam H

    2008-01-01

    Rice (Oryza sativa) varieties that are arsenate-tolerant (Bala) and -sensitive (Azucena) were used to conduct a transcriptome analysis of the response of rice seedlings to sodium arsenate (AsV) in hydroponic solution. RNA extracted from the roots of three replicate experiments of plants grown for 1 week in phosphate-free nutrient with or without 13.3 μM AsV was used to challenge the Affymetrix (52K) GeneChip Rice Genome array. A total of 576 probe sets were significantly up-regulated at least 2-fold in both varieties, whereas 622 were down-regulated. Ontological classification is presented. As expected, a large number of transcription factors, stress proteins, and transporters demonstrated differential expression. Striking is the lack of response of classic oxidative stress-responsive genes or phytochelatin synthases/synthatases. However, the large number of responses from genes involved in glutathione synthesis, metabolism, and transport suggests that glutathione conjugation and arsenate methylation may be important biochemical responses to arsenate challenge. In this report, no attempt is made to dissect differences in the response of the tolerant and sensitive variety, but analysis in a companion article will link gene expression to the known tolerance loci available in the Bala×Azucena mapping population.

  7. Genome-wide transcriptome analysis of 150 cell samples

    Science.gov (United States)

    Russom, Aman; Xiao, Wenzhong; Wilhelmy, Julie; Wang, Shenglong; Heath, Joe Don; Kurn, Nurith; Tompkins, Ronald G.; Davis, Ronald W.; Toner, Mehmet

    2013-01-01

    A major challenge in molecular biology is interrogating the human transcriptome on a genome wide scale when only a limited amount of biological sample is available for analysis. Current methodologies using microarray technologies for simultaneously monitoring mRNA transcription levels require nanogram amounts of total RNA. To overcome the sample size limitation of current technologies, we have developed a method to probe the global gene expression in biological samples as small as 150 cells, or the equivalent of approximately 300 pg total RNA. The new method employs microfluidic devices for the purification of total RNA from mammalian cells and ultra-sensitive whole transcriptome amplification techniques. We verified that the RNA integrity is preserved through the isolation process, accomplished highly reproducible whole transcriptome analysis, and established high correlation between repeated isolations of 150 cells and the same cell culture sample. We validated the technology by demonstrating that the combined microfluidic and amplification protocol is capable of identifying biological pathways perturbed by stimulation, which are consistent with the information recognized in bulk-isolated samples. PMID:20023796

  8. Genome-wide transcriptome analysis of 150 cell samples.

    Science.gov (United States)

    Irimia, Daniel; Mindrinos, Michael; Russom, Aman; Xiao, Wenzhong; Wilhelmy, Julie; Wang, Shenglong; Heath, Joe Don; Kurn, Nurith; Tompkins, Ronald G; Davis, Ronald W; Toner, Mehmet

    2009-01-01

    A major challenge in molecular biology is interrogating the human transcriptome on a genome wide scale when only a limited amount of biological sample is available for analysis. Current methodologies using microarray technologies for simultaneously monitoring mRNA transcription levels require nanogram amounts of total RNA. To overcome the sample size limitation of current technologies, we have developed a method to probe the global gene expression in biological samples as small as 150 cells, or the equivalent of approximately 300 pg total RNA. The new method employs microfluidic devices for the purification of total RNA from mammalian cells and ultra-sensitive whole transcriptome amplification techniques. We verified that the RNA integrity is preserved through the isolation process, accomplished highly reproducible whole transcriptome analysis, and established high correlation between repeated isolations of 150 cells and the same cell culture sample. We validated the technology by demonstrating that the combined microfluidic and amplification protocol is capable of identifying biological pathways perturbed by stimulation, which are consistent with the information recognized in bulk-isolated samples.

  9. Genome-Wide Analysis of DNA Methylation in Human Amnion

    Science.gov (United States)

    Kim, Jinsil; Pitlick, Mitchell M.; Christine, Paul J.; Schaefer, Amanda R.; Saleme, Cesar; Comas, Belén; Cosentino, Viviana; Gadow, Enrique; Murray, Jeffrey C.

    2013-01-01

    The amnion is a specialized tissue in contact with the amniotic fluid, which is in a constantly changing state. To investigate the importance of epigenetic events in this tissue in the physiology and pathophysiology of pregnancy, we performed genome-wide DNA methylation profiling of human amnion from term (with and without labor) and preterm deliveries. Using the Illumina Infinium HumanMethylation27 BeadChip, we identified genes exhibiting differential methylation associated with normal labor and preterm birth. Functional analysis of the differentially methylated genes revealed biologically relevant enriched gene sets. Bisulfite sequencing analysis of the promoter region of the oxytocin receptor (OXTR) gene detected two CpG dinucleotides showing significant methylation differences among the three groups of samples. Hypermethylation of the CpG island of the solute carrier family 30 member 3 (SLC30A3) gene in preterm amnion was confirmed by methylation-specific PCR. This work provides preliminary evidence that DNA methylation changes in the amnion may be at least partially involved in the physiological process of labor and the etiology of preterm birth and suggests that DNA methylation profiles, in combination with other biological data, may provide valuable insight into the mechanisms underlying normal and pathological pregnancies. PMID:23533356

  10. Improved statistics for genome-wide interaction analysis.

    Science.gov (United States)

    Ueki, Masao; Cordell, Heather J

    2012-01-01

    Recently, Wu and colleagues [1] proposed two novel statistics for genome-wide interaction analysis using case/control or case-only data. In computer simulations, their proposed case/control statistic outperformed competing approaches, including the fast-epistasis option in PLINK and logistic regression analysis under the correct model; however, reasons for its superior performance were not fully explored. Here we investigate the theoretical properties and performance of Wu et al.'s proposed statistics and explain why, in some circumstances, they outperform competing approaches. Unfortunately, we find minor errors in the formulae for their statistics, resulting in tests that have higher than nominal type 1 error. We also find minor errors in PLINK's fast-epistasis and case-only statistics, although theory and simulations suggest that these errors have only negligible effect on type 1 error. We propose adjusted versions of all four statistics that, both theoretically and in computer simulations, maintain correct type 1 error rates under the null hypothesis. We also investigate statistics based on correlation coefficients that maintain similar control of type 1 error. Although designed to test specifically for interaction, we show that some of these previously-proposed statistics can, in fact, be sensitive to main effects at one or both loci, particularly in the presence of linkage disequilibrium. We propose two new "joint effects" statistics that, provided the disease is rare, are sensitive only to genuine interaction effects. In computer simulations we find, in most situations considered, that highest power is achieved by analysis under the correct genetic model. Such an analysis is unachievable in practice, as we do not know this model. However, generally high power over a wide range of scenarios is exhibited by our joint effects and adjusted Wu statistics. We recommend use of these alternative or adjusted statistics and urge caution when using Wu et al

  11. Development and characterization of genomic and expressed SSRs in citrus by genome-wide analysis.

    Directory of Open Access Journals (Sweden)

    Sheng-Rui Liu

    Full Text Available Microsatellites or simple sequence repeats (SSRs) are one of the most popular sources of genetic markers and play a significant role in plant genetics and breeding. In this study, we identified citrus SSRs in the genome of Clementine mandarin and analyzed their frequency and distribution in different genomic regions. A total of 80,708 SSRs were detected in the genome with an overall density of 268 SSRs/Mb. While di-nucleotide repeats were the most frequent microsatellites in genomic DNA sequence, tetra-nucleotides, which had more repeat units than any other SSR types, had the highest cumulative sequence length. We identified 6,834 transcripts as containing 8,989 SSRs in 33,929 Clementine mandarin transcripts, among which, tri-nucleotide motifs (36.0%) were the most common, followed by di-nucleotide (26.9%) and hexa-nucleotide motifs (15.1%). The motif AG (16.7%) was most abundant among these SSRs, while motifs AAG (6.6%), AAT (5.0%), and TAG (2.2%) were most common among tri-nucleotides. Functional categorization of transcripts containing SSRs revealed that 5,879 (86.0%) of such transcripts had homology with known proteins; GO and KEGG annotation revealed that transcripts containing SSRs were those implicated in diverse biological processes in plants, including binding, development, transcription, and protein degradation. When 27 genomic and 78 randomly selected SSRs were tested on Clementine mandarin, 95 SSRs revealed polymorphism. These 95 SSRs were further deployed on 18 genotypes of three genera of Rutaceae for genetic diversity assessment; genomic SSRs generally show low transferability in comparison to SSRs developed from expressed sequences. These transcript-markers identified in our study may provide a valuable genetic and genomic tool for further genetic research and varietal development in citrus, such as diversity study, QTL mapping, molecular breeding, comparative mapping and other genetic analyses.

  12. Development and characterization of genomic and expressed SSRs in citrus by genome-wide analysis.

    Science.gov (United States)

    Liu, Sheng-Rui; Li, Wen-Yang; Long, Dang; Hu, Chun-Gen; Zhang, Jin-Zhi

    2013-01-01

    Microsatellites or simple sequence repeats (SSRs) are one of the most popular sources of genetic markers and play a significant role in plant genetics and breeding. In this study, we identified citrus SSRs in the genome of Clementine mandarin and analyzed their frequency and distribution in different genomic regions. A total of 80,708 SSRs were detected in the genome with an overall density of 268 SSRs/Mb. While di-nucleotide repeats were the most frequent microsatellites in genomic DNA sequence, tetra-nucleotides, which had more repeat units than any other SSR types, had the highest cumulative sequence length. We identified 6,834 transcripts as containing 8,989 SSRs in 33,929 Clementine mandarin transcripts, among which, tri-nucleotide motifs (36.0%) were the most common, followed by di-nucleotide (26.9%) and hexa-nucleotide motifs (15.1%). The motif AG (16.7%) was most abundant among these SSRs, while motifs AAG (6.6%), AAT (5.0%), and TAG (2.2%) were most common among tri-nucleotides. Functional categorization of transcripts containing SSRs revealed that 5,879 (86.0%) of such transcripts had homology with known proteins; GO and KEGG annotation revealed that transcripts containing SSRs were those implicated in diverse biological processes in plants, including binding, development, transcription, and protein degradation. When 27 genomic and 78 randomly selected SSRs were tested on Clementine mandarin, 95 SSRs revealed polymorphism. These 95 SSRs were further deployed on 18 genotypes of three genera of Rutaceae for genetic diversity assessment; genomic SSRs generally show low transferability in comparison to SSRs developed from expressed sequences. These transcript-markers identified in our study may provide a valuable genetic and genomic tool for further genetic research and varietal development in citrus, such as diversity study, QTL mapping, molecular breeding, comparative mapping and other genetic analyses.
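
The SSR survey described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names `find_ssrs` and `ssr_density` are hypothetical, and the minimum repeat counts per motif length are assumed thresholds chosen only for illustration (the abstract does not state the ones used). The sketch finds perfect repeats of 2 to 6 bp motifs with a backreference regex and reports a density in SSRs per Mb.

```python
import re

# Assumed thresholds (motif length -> minimum number of repeat units).
# These are illustrative only; the abstract does not give the real values.
DEFAULT_MIN_REPEATS = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}


def find_ssrs(seq, min_repeats=None):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    if min_repeats is None:
        min_repeats = DEFAULT_MIN_REPEATS
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # Group 2 is the motif; \2{n-1,} demands at least n copies in total.
        pattern = re.compile(r"(([ACGT]{%d})\2{%d,})" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(2)
            # Skip motifs that are just a shorter unit repeated (e.g. "AA", "AGAG").
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(1)) // k))
    return sorted(hits)


def ssr_density(ssr_count, length_bp):
    """SSRs per megabase of scanned sequence."""
    return ssr_count / (length_bp / 1_000_000)


if __name__ == "__main__":
    demo = "TTT" + "AG" * 7 + "CCGT" + "AAG" * 5 + "C"
    for start, motif, count in find_ssrs(demo):
        print(start, motif, count)
```

As a consistency check on the reported figures, 80,708 SSRs at 268 SSRs/Mb corresponds to roughly 301 Mb of scanned sequence.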

  13. Inverted Low-Copy Repeats and Genome Instability—A Genome-Wide Analysis

    Science.gov (United States)

    Dittwald, Piotr; Gambin, Tomasz; Gonzaga-Jauregui, Claudia; Carvalho, Claudia M.B.; Lupski, James R.; Stankiewicz, Paweł; Gambin, Anna

    2013-01-01

    Inverse paralogous low-copy repeats (IP-LCRs) can cause genome instability by nonallelic homologous recombination (NAHR)-mediated balanced inversions. When disrupting a dosage-sensitive gene(s), balanced inversions can lead to abnormal phenotypes. We delineated the genome-wide distribution of IP-LCRs >1 kB in size with >95% sequence identity and mapped the genes, potentially intersected by an inversion, that overlap at least one of the IP-LCRs. Remarkably, our results show that 12.0% of the human genome is potentially susceptible to such inversions and 942 genes, 99 of which are on the X chromosome, are predicted to be disrupted secondary to such an inversion! In addition, IP-LCRs larger than 800 bp with at least 98% sequence identity (duplication/triplication facilitating IP-LCRs, DTIP-LCRs) were recently implicated in the formation of complex genomic rearrangements with a duplication-inverted triplication–duplication (DUP-TRP/INV-DUP) structure by a replication-based mechanism involving a template switch between such inverted repeats. We identified 1,551 DTIP-LCRs that could facilitate DUP-TRP/INV-DUP formation. Remarkably, 1,445 disease-associated genes are at risk of undergoing copy-number gain as they map to genomic intervals susceptible to the formation of DUP-TRP/INV-DUP complex rearrangements. We implicate inverted LCRs as a human genome architectural feature that could potentially be responsible for genomic instability associated with many human disease traits. PMID:22965494

  14. Five Complete Chloroplast Genome Sequences from Diospyros: Genome Organization and Comparative Analysis.

    Directory of Open Access Journals (Sweden)

    Jianmin Fu

    Full Text Available Diospyros is the largest genus in Ebenaceae, comprising more than 500 species with remarkable economic value, especially Diospyros kaki Thunb., which has traditionally been an important food resource in China, Korea, and Japan. Complete chloroplast (cp) genomes from D. kaki, D. lotus L., D. oleifera Cheng., D. glaucifolia Metc., and Diospyros 'Jinzaoshi' were sequenced using Illumina sequencing technology. This is the first cp genome reported in Ebenaceae. The cp genome sequences of Diospyros ranged from 157,300 to 157,784 bp in length, presenting a typical quadripartite structure with two inverted repeats each separated by one large and one small single-copy region. For each cp genome, 134 genes were annotated, including 80 protein-coding, 31 tRNA, and 4 rRNA unique genes. In all, 179 repeats and 283 single sequence repeats were identified. Four hypervariable regions, namely, intergenic region of trnQ_rps16, trnV_ndhC, and psbD_trnT, and intron of ndhA, were identified in the Diospyros genomes. Phylogenetic analyses based on the whole cp genome, protein-coding, and intergenic and intron sequences indicated that D. oleifera is closely related to D. kaki and could be used as a model plant for future research on D. kaki; to our knowledge, this is proposed for the first time. Further, these analyses together with two large deletions (301 and 140 bp) in the cp genome of D. 'Jinzaoshi', support its placement as a new species in Diospyros. Both maximum parsimony and likelihood analyses for 19 taxa indicated the basal position of Ericales in asterids and suggested that Ebenaceae is monophyletic in Ericales.

  15. Complete genome sequence of Borrelia afzelii K78 and comparative genome analysis.

    Directory of Open Access Journals (Sweden)

    Wolfgang Schüler

    Full Text Available The main Borrelia species causing Lyme borreliosis in Europe and Asia are Borrelia afzelii, B. garinii, B. burgdorferi and B. bavariensis. This is in contrast to the United States, where infections are exclusively caused by B. burgdorferi. To date, the genome sequences of four B. afzelii strains, of which only two include the numerous plasmids, are available. In order to further assess the genetic diversity of B. afzelii, the most common species in Europe, responsible for the large variety of clinical manifestations of Lyme borreliosis, we have determined the full genome sequence of the B. afzelii strain K78, a clinical isolate from Austria. The K78 genome contains a linear chromosome (905,949 bp) and 13 plasmids (8 linear and 5 circular) together presenting 1,309 open reading frames of which 496 are located on plasmids. With the exception of lp28-8, all linear replicons in their full length including their telomeres have been sequenced. The comparison with the genomes of the four other B. afzelii strains, ACA-1, PKo, HLJ01 and Tom3107, as well as the one of B. burgdorferi strain B31, confirmed a high degree of conservation within the linear chromosome of B. afzelii, whereas plasmid encoded genes showed a much larger diversity. Since some plasmids present in B. burgdorferi are missing in the B. afzelii genomes, the corresponding virulence factors of B. burgdorferi are found in B. afzelii on other unrelated plasmids. In addition, we have identified a species-specific region in the circular plasmid, cp26, which could be used for species determination. Different non-coding RNAs have been located on the B. afzelii K78 genome, which have not previously been annotated in any of the published Borrelia genomes.

  16. Complete sequence of the mitochondrial genome of a diatom alga Synedra acus and comparative analysis of diatom mitochondrial genomes.

    Science.gov (United States)

    Ravin, Nikolai V; Galachyants, Yuri P; Mardanov, Andrey V; Beletsky, Alexey V; Petrova, Darya P; Sherbakova, Tatyana A; Zakharova, Yuliya R; Likhoshway, Yelena V; Skryabin, Konstantin G; Grachev, Mikhail A

    2010-06-01

    The first two mitochondrial genomes of marine diatoms were previously reported for the centric Thalassiosira pseudonana and the raphid pennate Phaeodactylum tricornutum. As part of a genomic project, we sequenced the complete mitochondrial genome of the freshwater araphid pennate diatom Synedra acus. This 46,657 bp mtDNA encodes 2 rRNAs, 24 tRNAs, and 33 proteins. The mtDNA of S. acus contains three group II introns, two inserted into the cox1 gene and containing ORFs, and one inserted into the rnl gene and lacking an ORF. The compact gene organization contrasts with the presence of a 4.9-kb-long intergenic region, which contains repeat sequences. Comparison of the three sequenced mtDNAs showed that these three genomes carry similar gene pools, but the positions of some genes are rearranged. Phylogenetic analysis performed with a fragment of the cox1 gene of diatoms and other heterokonts produced a tree that is similar to that derived from 18S RNA genes. The introns of mtDNA in the diatoms seem to be polyphyletic. This study demonstrates that pyrosequencing is an efficient method for complete sequencing of mitochondrial genomes from diatoms, and may soon give valuable information about the molecular phylogeny of this outstanding group of unicellular organisms.

  17. Analysis of the Complete Chloroplast Genome of a Medicinal Plant, Dianthus superbus var. longicalyncinus, from a Comparative Genomics Perspective.

    Directory of Open Access Journals (Sweden)

    Gurusamy Raman

    Full Text Available Dianthus superbus var. longicalycinus is an economically important traditional Chinese medicinal plant that is also used for ornamental purposes. In this study, D. superbus was compared to its closely related family of Caryophyllaceae chloroplast (cp) genomes such as Lychnis chalcedonica and Spinacia oleracea. D. superbus had the longest large single copy (LSC) region (82,805 bp), with some variations in the inverted repeat region A (IRA)/LSC regions. The IRs underwent both expansion and constriction during evolution of the Caryophyllaceae family; however, intense variations were not identified. The pseudogene ribosomal protein subunit S19 (rps19) was identified at the IRA/LSC junction, but was not present in the cp genome of other Caryophyllaceae family members. The translation initiation factor IF-1 (infA) and ribosomal protein subunit L23 (rpl23) genes were absent from the Dianthus cp genome. When the cp genome of Dianthus was compared with 31 other angiosperm lineages, the infA gene was found to have been lost in most members of rosids, solanales of asterids and Lychnis of Caryophyllales, whereas rpl23 gene loss or pseudogenization had occurred exclusively in Caryophyllales. Nevertheless, the cp genome of Dianthus and Spinacia has two introns in the proteolytic subunit of ATP-dependent protease (clpP) gene, but Lychnis has lost introns from the clpP gene. Furthermore, phylogenetic analysis of individual protein-coding genes infA and rpl23 revealed that gene loss or pseudogenization occurred independently in the cp genome of Dianthus. Molecular phylogenetic analysis also demonstrated a sister relationship between Dianthus and Lychnis based on 78 protein-coding sequences. The results presented herein will contribute to studies of the evolution, molecular biology and genetic engineering of the medicinal and ornamental plant, D. superbus var. longicalycinus.

  18. Analysis of the Complete Chloroplast Genome of a Medicinal Plant, Dianthus superbus var. longicalyncinus, from a Comparative Genomics Perspective.

    Science.gov (United States)

    Raman, Gurusamy; Park, SeonJoo

    2015-01-01

    Dianthus superbus var. longicalycinus is an economically important traditional Chinese medicinal plant that is also used for ornamental purposes. In this study, D. superbus was compared to its closely related family of Caryophyllaceae chloroplast (cp) genomes such as Lychnis chalcedonica and Spinacia oleracea. D. superbus had the longest large single copy (LSC) region (82,805 bp), with some variations in the inverted repeat region A (IRA)/LSC regions. The IRs underwent both expansion and constriction during evolution of the Caryophyllaceae family; however, intense variations were not identified. The pseudogene ribosomal protein subunit S19 (rps19) was identified at the IRA/LSC junction, but was not present in the cp genome of other Caryophyllaceae family members. The translation initiation factor IF-1 (infA) and ribosomal protein subunit L23 (rpl23) genes were absent from the Dianthus cp genome. When the cp genome of Dianthus was compared with 31 other angiosperm lineages, the infA gene was found to have been lost in most members of rosids, solanales of asterids and Lychnis of Caryophyllales, whereas rpl23 gene loss or pseudogenization had occurred exclusively in Caryophyllales. Nevertheless, the cp genome of Dianthus and Spinacia has two introns in the proteolytic subunit of ATP-dependent protease (clpP) gene, but Lychnis has lost introns from the clpP gene. Furthermore, phylogenetic analysis of individual protein-coding genes infA and rpl23 revealed that gene loss or pseudogenization occurred independently in the cp genome of Dianthus. Molecular phylogenetic analysis also demonstrated a sister relationship between Dianthus and Lychnis based on 78 protein-coding sequences. The results presented herein will contribute to studies of the evolution, molecular biology and genetic engineering of the medicinal and ornamental plant, D. superbus var. longicalycinus.

  19. Comparative genome analysis of Bacillus cereus group genomes with Bacillus subtilis

    Energy Technology Data Exchange (ETDEWEB)

    Anderson, Iain; Sorokin, Alexei; Kapatral, Vinayak; Reznik, Gary; Bhattacharya, Anamitra; Mikhailova, Natalia; Burd, Henry; Joukov, Victor; Kaznadzey, Denis; Walunas, Theresa; D'Souza, Mark; Larsen, Niels; Pusch, Gordon; Liolios, Konstantinos; Grechkin, Yuri; Lapidus, Alla; Goltsman, Eugene; Chu, Lien; Fonstein, Michael; Ehrlich, S. Dusko; Overbeek, Ross; Kyrpides, Nikos; Ivanova, Natalia

    2005-09-14

    Genome features of the Bacillus cereus group genomes (representative strains of Bacillus cereus, Bacillus anthracis and Bacillus thuringiensis subsp. israelensis) were analyzed and compared with the Bacillus subtilis genome. A core set of 1,381 protein families among the four Bacillus genomes, with an additional set of 933 families common to the B. cereus group, was identified. Differences in signal transduction pathways, membrane transporters, cell surface structures, cell wall, and S-layer proteins, suggesting differences in their phenotype, were identified. The B. cereus group has signal transduction systems including a tyrosine kinase related to two-component system histidine kinases from B. subtilis. A model for regulation of the stress-responsive sigma factor sigmaB in the B. cereus group, different from the well-studied regulation in B. subtilis, has been proposed. Despite a high degree of chromosomal synteny among these genomes, significant differences in cell wall and spore coat proteins that contribute to survival and adaptation in specific hosts have been identified.

  20. CoCoNUT: an efficient system for the comparison and analysis of genomes

    Directory of Open Access Journals (Sweden)

    Kurtz Stefan

    2008-11-01

    Full Text Available Abstract Background: Comparative genomics is the analysis and comparison of genomes from different species. This area of research is driven by the large number of sequenced genomes and heavily relies on efficient algorithms and software to perform pairwise and multiple genome comparisons. Results: Most of the software tools available are tailored for one specific task. In contrast, we have developed a novel system CoCoNUT (Computational Comparative geNomics Utility Toolkit) that allows solving several different tasks in a unified framework: (1) finding regions of high similarity among multiple genomic sequences and aligning them, (2) comparing two draft or multi-chromosomal genomes, (3) locating large segmental duplications in large genomic sequences, and (4) mapping cDNA/EST to genomic sequences. Conclusion: CoCoNUT is competitive with other software tools w.r.t. the quality of the results. The use of state of the art algorithms and data structures allows CoCoNUT to solve comparative genomics tasks more efficiently than previous tools. With the improved user interface (including an interactive visualization component), CoCoNUT provides a unified, versatile, and easy-to-use software tool for large scale studies in comparative genomics.

  1. Full-length genomic analysis of Korean porcine sapelovirus strains

    DEFF Research Database (Denmark)

    Son, Kyu-Yeol; Kim, Deok-Song; Kwon, Joseph

    2014-01-01

    the structural features of PSV genomes, the full-length nucleotide sequences of three Korean PSV strains were determined and analyzed using bioinformatic techniques in comparison with other known PSV strains. The Korean PSV genomes ranged from 7,542 to 7,566 nucleotides excluding the 3' poly(A) tail, and showed...

  2. Whole-genome sequence-based analysis of thyroid function

    DEFF Research Database (Denmark)

    Taylor, Peter N.; Porcu, Eleonora; Chew, Shelby

    2015-01-01

    Normal thyroid function is essential for health, but its genetic architecture remains poorly understood. Here, for the heritable thyroid traits thyrotropin (TSH) and free thyroxine (FT4), we analyse whole-genome sequence data from the UK10K project (N = 2,287). Using additional whole-genome seque...

  3. Sensitive non-isotopic DNA hybridisation assay or immediate-early antigen detection for rapid identification of human cytomegalovirus in urine.

    Science.gov (United States)

    Kimpton, C P; Morris, D J; Corbitt, G

    1991-04-01

    A sensitive non-radioactive DNA hybridisation assay employing digoxigenin-labelled probes was compared with immediate-early antigen detection and conventional virus isolation for the identification of human cytomegalovirus (HCMV) in 249 urine samples. Of 44 specimens yielding HCMV by virus isolation, more were positive by DNA hybridisation (32; 73%) than by immediate-early antigen detection (25; 52%) (P = 0.05). The specificity of the hybridisation assay in 45 apparently falsely positive specimens was supported by detection of HCMV DNA in 40 of these specimens using the polymerase chain reaction. Many urine specimens may thus contain large amounts of non-viable virus or free viral DNA. Evaluation of various protocols for the extraction and denaturation of virus DNA prior to hybridisation showed that proteinase K digestion with phenol/chloroform extraction was the most sensitive and reliable procedure. We conclude that the non-radioactive DNA hybridisation assay described is a potentially valuable routine diagnostic test.

  4. Cinteny: flexible analysis and visualization of synteny and genome rearrangements in multiple organisms

    Directory of Open Access Journals (Sweden)

    Meller Jaroslaw

    2007-03-01

    Full Text Available Abstract Background: Identifying syntenic regions, i.e., blocks of genes or other markers with evolutionary conserved order, and quantifying evolutionary relatedness between genomes in terms of chromosomal rearrangements is one of the central goals in comparative genomics. However, the analysis of synteny and the resulting assessment of genome rearrangements are sensitive to the choice of a number of arbitrary parameters that affect the detection of synteny blocks. In particular, the choice of a set of markers and the effect of different aggregation strategies, which enable coarse graining of synteny blocks and exclusion of micro-rearrangements, need to be assessed. Therefore, existing tools and resources that facilitate identification, visualization and analysis of synteny need to be further improved to provide a flexible platform for such analysis, especially in the context of multiple genomes. Results: We present a new tool, Cinteny, for fast identification and analysis of synteny with different sets of markers and various levels of coarse graining of syntenic blocks. Using the Hannenhalli-Pevzner approach and its extensions, Cinteny also enables interactive determination of evolutionary relationships between genomes in terms of the number of rearrangements (the reversal distance). In particular, Cinteny provides: (i) integration of synteny browsing with assessment of evolutionary distances for multiple genomes; (ii) flexibility to adjust the parameters and re-compute the results on-the-fly; (iii) ability to work with user provided data, such as orthologous genes, sequence tags or other conserved markers. In addition, Cinteny provides many annotated mammalian, invertebrate and fungal genomes that are pre-loaded and available for analysis at http://cinteny.cchmc.org. Conclusion: Cinteny allows one to automatically compare multiple genomes and perform sensitivity analysis for synteny block detection and for the subsequent computation of reversal distances
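
The notion of reversal distance mentioned in this abstract can be made concrete with a small Python sketch. This is not the Hannenhalli-Pevzner algorithm and not Cinteny's code: it only counts breakpoints in a signed permutation of markers and derives the standard lower bound on the number of reversals (each reversal removes at most two breakpoints). The function names `breakpoints`, `reversal_lower_bound`, and `apply_reversal` are hypothetical, and the input is assumed to be a signed permutation of 1..n with the reference genome as the identity order.

```python
def breakpoints(perm):
    """Count breakpoints of a signed permutation of 1..n against the identity.

    The permutation is framed with 0 and n+1; a consecutive pair (a, b)
    is an adjacency when b - a == 1, otherwise it is a breakpoint.
    """
    n = len(perm)
    ext = [0] + list(perm) + [n + 1]
    return sum(1 for a, b in zip(ext, ext[1:]) if b - a != 1)


def reversal_lower_bound(perm):
    """Lower bound on reversal distance: ceil(breakpoints / 2)."""
    return (breakpoints(perm) + 1) // 2


def apply_reversal(perm, i, j):
    """Reverse the segment perm[i..j] (inclusive) and flip its signs."""
    segment = [-x for x in reversed(perm[i:j + 1])]
    return list(perm[:i]) + segment + list(perm[j + 1:])


if __name__ == "__main__":
    genome = [1, -3, -2, 4]                 # markers 2..3 are inverted
    print(breakpoints(genome))              # breakpoints flanking the inversion
    print(reversal_lower_bound(genome))     # at least this many reversals
    print(apply_reversal(genome, 1, 2))     # undoing the inversion
```

The bound is exact for this toy example but can be smaller than the true reversal distance in general; the exact distance needs the breakpoint-graph machinery of the Hannenhalli-Pevzner theory.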

  5. Single cell genome analysis of an uncultured heterotrophic stramenopile

    Science.gov (United States)

    Roy, Rajat S.; Price, Dana C.; Schliep, Alexander; Cai, Guohong; Korobeynikov, Anton; Yoon, Hwan Su; Yang, Eun Chan; Bhattacharya, Debashish

    2014-04-01

    A broad swath of eukaryotic microbial biodiversity cannot be cultivated in the lab and is therefore inaccessible to conventional genome-wide comparative methods. One promising approach to study these lineages is single cell genomics (SCG), whereby an individual cell is captured from nature and genome data are produced from the amplified total DNA. Here we tested the efficacy of SCG to generate a draft genome assembly from a single sample, in this case a cell belonging to the broadly distributed MAST-4 uncultured marine stramenopiles. Using de novo gene prediction, we identified 6,996 protein-encoding genes in the MAST-4 genome. This genetic inventory was sufficient to place the cell within the ToL using multigene phylogenetics and provided preliminary insights into the complex evolutionary history of horizontal gene transfer (HGT) in the MAST-4 lineage.

  6. Genomic Islands Prediction and Analysis in Cyanobacteria by Bioinformatics

    Institute of Scientific and Technical Information of China (English)

    Yi Li; Ni-Ni Rao; Feng Yang; Han-Ming Liu

    2014-01-01

    Genomic islands (GIs) are among the most important components of cyanobacterial genomes. GIs encode many functions, such as symbiosis, pathogenesis, and adaptation. In this article, we predict and analyze the GIs in Synechocystis sp. PCC 6803 by bioinformatics. The results show that ISL1, ISL8, and ISL16 are homologous with sequences in many other bacteria, are involved in basic reactions, and are evolutionarily conserved. In contrast, ISL15 has a sequence and function unique to Synechocystis sp. PCC 6803. Most GIs play a role in genome rearrangement because they contain many transposase genes. Moreover, we find that recombination and horizontal transfer of GIs are important factors affecting the distribution of non-coding RNA. Our work contributes to a comprehensive understanding of genomic islands and their impact on cyanobacterial genomes.

  7. Comparative Genome Analysis Provides Insights into the Pathogenicity of Flavobacterium psychrophilum

    Science.gov (United States)

    Castillo, Daniel; Christiansen, Rói Hammershaimb; Dalsgaard, Inger; Madsen, Lone; Espejo, Romilio

    2016-01-01

    Flavobacterium psychrophilum is a fish pathogen in salmonid aquaculture worldwide that causes cold water disease (CWD) and rainbow trout fry syndrome (RTFS). Comparative genome analyses of 11 F. psychrophilum isolates representing temporally and geographically distant populations were used to describe the F. psychrophilum pan-genome and to examine virulence factors, prophages, CRISPR arrays, and genomic islands present in the genomes. Analysis of the genomic DNA sequences was complemented with selected phenotypic characteristics of the strains. The pan-genome analysis showed that F. psychrophilum could hold at least 3373 genes, while the core genome contained 1743 genes. On average, 67 new genes were detected for every new genome added to the analysis, indicating that F. psychrophilum possesses an open pan-genome. The putative virulence factors were equally distributed among isolates, independent of geographic location, year of isolation and source of isolates. Only one prophage-related sequence was found, which corresponded to the previously described prophage 6H and appeared in 5 out of 11 isolates. CRISPR array analysis revealed two different loci with dissimilar spacer content, which matched only one sequence in the database, the temperate bacteriophage 6H. Genomic Islands (GIs) were identified in F. psychrophilum isolates 950106-1/1 and CSF 259–93, associated with toxins and antibiotic resistance. Finally, phenotypic characterization revealed a high degree of similarity among the strains with respect to biofilm formation and secretion of extracellular enzymes. Global scale dispersion of virulence factors in the genomes and the abilities for biofilm formation, hemolytic activity and secretion of extracellular enzymes among the strains suggested that F. psychrophilum isolates have a similar mode of action on adhesion, colonization and destruction of fish tissues across large spatial and temporal scales of occurrence. Overall, the genomic characterization and
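
    The pan-genome, core genome and "new genes per added genome" figures quoted in this abstract follow from simple set operations once each isolate is reduced to a set of gene-family identifiers. The sketch below is a minimal illustration under that assumption; the isolate names and gene identifiers are invented, and real pipelines first cluster orthologous genes, which is not shown here.

```python
def pan_and_core(genomes):
    """Return (pan_genome, core_genome) for a dict of isolate -> gene families."""
    gene_sets = [set(genes) for genes in genomes.values()]
    if not gene_sets:
        raise ValueError("at least one genome is required")
    pan = set().union(*gene_sets)            # families seen in any isolate
    core = set.intersection(*gene_sets)      # families shared by all isolates
    return pan, core


def new_genes_per_genome(genomes):
    """Number of previously unseen gene families contributed by each genome,
    in insertion order; a count that stays above zero suggests an open pan-genome."""
    seen = set()
    counts = []
    for genes in genomes.values():
        genes = set(genes)
        counts.append(len(genes - seen))
        seen |= genes
    return counts


if __name__ == "__main__":
    # Hypothetical toy data, not taken from the study.
    toy = {
        "isolate_A": {"g1", "g2", "g3"},
        "isolate_B": {"g1", "g2", "g4"},
        "isolate_C": {"g1", "g5"},
    }
    pan, core = pan_and_core(toy)
    print(len(pan), sorted(core), new_genes_per_genome(toy))
```

    The per-genome counts depend on the order in which genomes are added, so published estimates usually average over many random orderings.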

  8. Genomic organization and sequence analysis of the vomeronasal receptor V2R genes in mouse genome

    Institute of Scientific and Technical Information of China (English)

    YANG Hui; Zhang YaPing

    2007-01-01

    Two multigene superfamilies, named V1R and V2R, encoding seven-transmembrane-domain G-protein coupled receptors (GPCRs) have been identified as pheromone receptors in mammals. Three V2R gene families have been described in mouse and rat. Here we screened the updated mouse genome sequence database and retrieved 63 putative functional V2R genes, including three newly identified genes that form an additional family. We describe the genomic organization of these genes and characterize the conservation of mouse V2R protein sequences. This genomic and sequence information is useful as part of the evidence for inferring the functional domains of V2Rs and should aid future functional studies.

  9. Genomic analysis by oligonucleotide array Comparative Genomic Hybridization utilizing formalin-fixed, paraffin-embedded tissues.

    Science.gov (United States)

    Savage, Stephanie J; Hostetter, Galen

    2011-01-01

    Formalin fixation has been used to preserve tissues for more than a hundred years, and there are currently more than 300 million archival samples in the United States alone. The application of genomic protocols such as high-density oligonucleotide array Comparative Genomic Hybridization (aCGH) to formalin-fixed, paraffin-embedded (FFPE) tissues, therefore, opens an untapped resource of available tissues for research and facilitates utilization of existing clinical data in a research sample set. However, formalin fixation results in cross-linking of proteins and DNA, typically leading to such a significant degradation of DNA template that little is available for use in molecular applications. Here, we describe a protocol to circumvent formalin fixation artifact by utilizing enzymatic reactions to obtain quality DNA from a wide range of FFPE tissues for successful genome-wide discovery of gene dosage alterations in archival clinical samples.

  10. In silico comparative genomic analysis of GABAA receptor transcriptional regulation

    Directory of Open Access Journals (Sweden)

    Joyce Christopher J

    2007-06-01

    Full Text Available Abstract Background: Subtypes of the GABAA receptor subunit exhibit diverse temporal and spatial expression patterns. In silico comparative analysis was used to predict transcriptional regulatory features in individual mammalian GABAA receptor subunit genes, and to identify potential transcriptional regulatory components involved in the coordinate regulation of the GABAA receptor gene clusters. Results: Previously unreported putative promoters were identified for the β2, γ1, γ3, ε, θ and π subunit genes. Putative core elements and proximal transcriptional factors were identified within these predicted promoters, and within the experimentally determined promoters of other subunit genes. Conserved intergenic regions of sequence in the mammalian GABAA receptor gene cluster comprising the α1, β2, γ2 and α6 subunits were identified as potential long range transcriptional regulatory components involved in the coordinate regulation of these genes. A region of predicted DNase I hypersensitive sites within the cluster may contain transcriptional regulatory features coordinating gene expression. A novel model is proposed for the coordinate control of the gene cluster and parallel expression of the α1 and β2 subunits, based upon the selective action of putative Scaffold/Matrix Attachment Regions (S/MARs). Conclusion: The putative regulatory features identified by genomic analysis of GABAA receptor genes were substantiated by cross-species comparative analysis and now require experimental verification. The proposed model for the coordinate regulation of genes in the cluster accounts for the head-to-head orientation and parallel expression of the α1 and β2 subunit genes, and for the disruption of transcription caused by insertion of a neomycin gene in the close vicinity of the α6 gene, which is proximal to a putative critical S/MAR.

  11. A structural genomics analysis of histidine kinase sensor domains

    Science.gov (United States)

    Cheung, Jonah

    2005-11-01

    Histidine kinase sensors are part of a two-component system of protein signaling in prokaryotes and lower eukaryotes that relays an external environmental signal to an adaptive internal cellular response. Signal transduction occurs via phosphotransfer between a sensor protein and a response regulator, which interact in tandem. The sensor is usually a transmembrane protein that contains a conserved cytoplasmic histidine kinase transmitter domain and a modular periplasmic sensor domain. The response regulator is a cytoplasmic protein that contains a receiver domain that interacts with the histidine kinase, and an output domain that interacts with regulators of transcription or chemotaxis. My work focuses on the X-ray structure determination of a variety of bacterial sensor domains, based on a structural genomics analysis of the entire sensor domain family. Structures of the NarX, DcuS, LisK, and DctB sensor domains have been solved to atomic resolution, some in both ligand-bound and ligand-free states. Two distinct structural folds have been revealed: all-alpha helical and mixed alpha-beta. An analysis of the structures reveals a possible mechanism of transmembrane signaling in histidine kinase sensors as a sliding-piston motion between transmembrane helices. Although there is great diversity in ligand binding, there appears to be a small number of distinct sensor domain folds, for two of which structural representatives have been solved. A final synthesis of the structural information with a comprehensive bioinformatics analysis of all histidine kinase sensor domain sequences allows fold prediction for over 400 sensor domains, in a step towards mapping the entire structural landscape of this protein family.

  12. A comprehensive 1000 Genomes-based genome-wide association meta-analysis of coronary artery disease

    Science.gov (United States)

    Kyriakou, Theodosios; Nelson, Christopher P; Hopewell, Jemma C; Webb, Thomas R; Zeng, Lingyao; Dehghan, Abbas; Alver, Maris; Armasu, Sebastian M; Auro, Kirsi; Bjonnes, Andrew; Chasman, Daniel I; Chen, Shufeng; Ford, Ian; Franceschini, Nora; Gieger, Christian; Grace, Christopher; Gustafsson, Stefan; Huang, Jie; Hwang, Shih-Jen; Kim, Yun Kyoung; Kleber, Marcus E; Lau, King Wai; Lu, Xiangfeng; Lu, Yingchang; Lyytikäinen, Leo-Pekka; Mihailov, Evelin; Morrison, Alanna C; Pervjakova, Natalia; Qu, Liming; Rose, Lynda M; Salfati, Elias; Saxena, Richa; Scholz, Markus; Smith, Albert V; Tikkanen, Emmi; Uitterlinden, Andre; Yang, Xueli; Zhang, Weihua; Zhao, Wei; de Andrade, Mariza; de Vries, Paul S; van Zuydam, Natalie R; Anand, Sonia S; Bertram, Lars; Beutner, Frank; Dedoussis, George; Frossard, Philippe; Gauguier, Dominique; Goodall, Alison H; Gottesman, Omri; Haber, Marc; Han, Bok-Ghee; Huang, Jianfeng; Jalilzadeh, Shapour; Kessler, Thorsten; König, Inke R; Lannfelt, Lars; Lieb, Wolfgang; Lind, Lars; Lindgren, Cecilia M; Lokki, Marja-Liisa; Magnusson, Patrik K; Mallick, Nadeem H; Mehra, Narinder; Meitinger, Thomas; Memon, Fazal-ur-Rehman; Morris, Andrew P; Nieminen, Markku S; Pedersen, Nancy L; Peters, Annette; Rallidis, Loukianos S; Rasheed, Asif; Samuel, Maria; Shah, Svati H; Sinisalo, Juha; Stirrups, Kathleen E; Trompet, Stella; Wang, Laiyuan; Zaman, Khan S; Ardissino, Diego; Boerwinkle, Eric; Borecki, Ingrid B; Bottinger, Erwin P; Buring, Julie E; Chambers, John C; Collins, Rory; Cupples, L Adrienne; Danesh, John; Demuth, Ilja; Elosua, Roberto; Epstein, Stephen E; Esko, Tõnu; Feitosa, Mary F; Franco, Oscar H; Franzosi, Maria Grazia; Granger, Christopher B; Gu, Dongfeng; Gudnason, Vilmundur; Hall, Alistair S; Hamsten, Anders; Harris, Tamara B; Hazen, Stanley L; Hengstenberg, Christian; Hofman, Albert; Ingelsson, Erik; Iribarren, Carlos; Jukema, J Wouter; Karhunen, Pekka J; Kim, Bong-Jo; Kooner, Jaspal S; Kullo, Iftikhar J; Lehtimäki, Terho; Loos, Ruth J F; Melander, Olle; Metspalu, Andres; März, Winfried; Palmer, Colin N; Perola, Markus; Quertermous, Thomas; Rader, Daniel J; Ridker, Paul M; Ripatti, Samuli; Roberts, Robert; Salomaa, Veikko; Sanghera, Dharambir K; Schwartz, Stephen M; Seedorf, Udo; Stewart, Alexandre F; Stott, David J; Thiery, Joachim; Zalloua, Pierre A; O’Donnell, Christopher J; Reilly, Muredach P; Assimes, Themistocles L; Thompson, John R; Erdmann, Jeanette; Clarke, Robert; Watkins, Hugh; Kathiresan, Sekar; McPherson, Ruth; Deloukas, Panos; Schunkert, Heribert; Samani, Nilesh J; Farrall, Martin

    2015-01-01

    Existing knowledge of genetic variants affecting risk of coronary artery disease (CAD) is largely based on genome-wide association studies (GWAS) analysis of common SNPs. Leveraging phased haplotypes from the 1000 Genomes Project, we report a GWAS meta-analysis of 185 thousand CAD cases and controls, interrogating 6.7 million common (MAF>0.05) as well as 2.7 million low frequency (0.005<MAF<0.05) [...] analysis provides a comprehensive survey of the fine genetic architecture of CAD showing that genetic susceptibility to this common disease is largely determined by common SNPs of small effect size. PMID:26343387

  13. Organization and comparative analysis of the mitochondrial genomes of bioluminescent Elateroidea (Coleoptera: Polyphaga).

    Science.gov (United States)

    Amaral, Danilo T; Mitani, Yasuo; Ohmiya, Yoshihiro; Viviani, Vadim R

    2016-07-25

    Mitochondrial genome organization in the Elateroidea superfamily (Coleoptera), which includes the main families of bioluminescent beetles, has been poorly studied, and information about the Phengodidae family is lacking. We sequenced the mitochondrial genomes of Neotropical Lampyridae (Bicellonycha lividipennis), Phengodidae (Brasilocerus sp.2 and Phrixothrix hirtus) and Elateridae (Pyrearinus termitilluminans, Hapsodrilus ignifer and Teslasena femoralis). All species had a typical insect mitochondrial genome except for the following: in the elaterid T. femoralis genome there is a non-coding region between NADH2 and tRNA-Trp; in the genomes of the phengodids Brasilocerus sp.2 and P. hirtus we did not find tRNA-Ile and tRNA-Gln. The P. hirtus genome showed a ~1.6 kb non-coding region, a rearrangement of tRNA-Tyr, a new tRNA-Leu copy, and several regions with higher AT content. Phylogenetic analysis using Bayesian and ML models indicated that the Phengodidae+Rhagophthalmidae are closely related to the Lampyridae family, and placed Drilus flavescens (Drilidae) as an internal clade within Elateridae. This is the first report that compares the mitochondrial genome organization of the three main families of bioluminescent Elateroidea, including the first Neotropical Lampyridae and Phengodidae. The losses of tRNAs, and the translocation and duplication events found in Phengodidae mt genomes, mainly in P. hirtus, may indicate different evolutionary rates in these mitochondrial genomes. The mitophylogenomic analysis indicates the monophyly of the three bioluminescent families and a closer relationship between Lampyridae and Phengodidae/Rhagophthalmidae, in contrast with previous molecular analysis.
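
    The "regions with higher AT content" reported above are the kind of feature that a sliding-window scan can expose. The following is a minimal sketch, not the authors' method: the window and step sizes, the function names and the toy sequence are all assumptions, and ambiguous bases are simply ignored.

```python
def at_content(seq):
    """Fraction of A/T among the unambiguous bases (A, C, G, T) of seq."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        return 0.0
    return sum(b in "AT" for b in bases) / len(bases)


def windowed_at(seq, window, step):
    """AT content for each full window, returned as (start, fraction) pairs."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    return [
        (start, at_content(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


if __name__ == "__main__":
    # Hypothetical sequence: an AT-rich stretch followed by a GC-rich one.
    toy = "AAAATTTTACGCGCGC"
    for start, frac in windowed_at(toy, window=8, step=4):
        print(start, round(frac, 2))
```

    Windows whose AT fraction stands well above the genome-wide value would be candidates for closer inspection, for example as putative control regions.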

  14. Genome-wide analysis of TCP family in tobacco.

    Science.gov (United States)

    Chen, L; Chen, Y Q; Ding, A M; Chen, H; Xia, F; Wang, W F; Sun, Y H

    2016-05-23

    The TCP family is a transcription factor family, members of which are extensively involved in plant growth and development as well as in signal transduction in the response to many physiological and biochemical stimuli. In the present study, 61 TCP genes were identified in the tobacco (Nicotiana tabacum) genome. Bioinformatic methods were employed for predicting and analyzing the gene structure, gene expression, phylogenetic relationships, and conserved domains of TCP proteins in tobacco. The 61 NtTCP genes were divided into three diverse groups, based on the division of TCP genes in tomato and Arabidopsis, and the results of the conserved domain and sequence analyses further confirmed the classification of the NtTCP genes. The expression pattern of NtTCP genes also demonstrated that the majority of these genes play important roles in all tissues, while some genes exercise their functions only in specific tissues. In brief, the comprehensive and thorough study of the TCP family in other plants provides sufficient resources for studying the structure and functions of TCPs in tobacco.

  15. Comparative Analysis of Fatty Acid Desaturases in Cyanobacterial Genomes

    Directory of Open Access Journals (Sweden)

    Xiaoyuan Chi

    2008-01-01

    Full Text Available Fatty acid desaturases are enzymes that introduce double bonds into the hydrocarbon chains of fatty acids. The fatty acid desaturases from 37 cyanobacterial genomes were identified and classified based upon their conserved histidine-rich motifs and phylogenetic analysis, which helps to determine the numbers and distributions of desaturases in cyanobacterial species. The filamentous or N2-fixing cyanobacteria usually possess more types of fatty acid desaturases than unicellular species. The pathway of acyl-lipid desaturation for the unicellular marine cyanobacteria Synechococcus and Prochlorococcus differs from that of other cyanobacteria, indicating that the phylogenetic histories of the two genera differ from those of other cyanobacteria isolated from freshwater, soil, or symbionts. Strain Gloeobacter violaceus PCC 7421 was isolated from calcareous rock and lacks thylakoid membranes. The types and numbers of desaturases of this strain are distinct from those of other cyanobacteria, reflecting its earliest divergence from the cyanobacterial line. Three thermophilic unicellular strains, Thermosynechococcus elongatus BP-1 and two Synechococcus Yellowstone species, lack highly unsaturated fatty acids in lipids and contain only one Δ9 desaturase, in contrast with mesophilic strains, which is probably due to their thermal habitats. Thus, the numbers and types of fatty acid desaturases vary among cyanobacterial species, which may result from adaptation to environments during evolution.
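
    Classification by conserved histidine-rich motifs, as described above, usually starts by locating candidate "histidine boxes" in each protein sequence. The sketch below assumes the commonly cited consensus patterns HX(3-4)H and HX(2-3)HH; the abstract does not state which patterns the authors used, so both regular expressions, the function name and the toy peptide are assumptions for illustration only.

```python
import re

# Assumed consensus patterns (not taken from the abstract). Each pattern is
# wrapped in a lookahead so that overlapping candidates are all reported.
HIS_BOX_PATTERNS = {
    "HX(3-4)H": re.compile(r"(?=(H.{3,4}H))"),
    "HX(2-3)HH": re.compile(r"(?=(H.{2,3}HH))"),
}


def find_histidine_boxes(protein):
    """Return {pattern name: [(0-based start, matched text), ...]}."""
    protein = protein.upper()
    return {
        name: [(m.start(), m.group(1)) for m in pattern.finditer(protein)]
        for name, pattern in HIS_BOX_PATTERNS.items()
    }


if __name__ == "__main__":
    # Hypothetical peptide, not a real desaturase sequence.
    toy = "MAHDLGHGSAAHRLHHK"
    for name, hits in find_histidine_boxes(toy).items():
        print(name, hits)
```

    A motif hit alone does not establish that a protein is a desaturase; in practice such scans are combined with homology searches and phylogenetic placement, as the abstract indicates.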

  16. Methylation Linear Discriminant Analysis (MLDA) for identifying differentially methylated CpG islands

    Directory of Open Access Journals (Sweden)

    Vass J Keith

    2008-08-01

    Full Text Available Abstract Background: Hypermethylation of promoter CpG islands is strongly correlated with transcriptional gene silencing and epigenetic maintenance of the silenced state. As well as its role in tumor development, CpG island methylation contributes to the acquisition of resistance to chemotherapy. Differential Methylation Hybridisation (DMH) is one technique used for genome-wide DNA methylation analysis. The study of such microarray data sets should ideally account for the specific biological features of DNA methylation and the non-symmetrical distribution of the ratios of unmethylated and methylated sequences hybridised on the array. We have therefore developed a novel algorithm tailored to this type of data, Methylation Linear Discriminant Analysis (MLDA). Results: MLDA was programmed in R (version 2.7.0) and the package is available at CRAN [1]. This approach utilizes linear regression models of non-normalised hybridisation data to define methylation status. Log-transformed signal intensities of unmethylated controls on the microarray are used as a reference. The signal intensities of DNA samples digested with methylation-sensitive restriction enzymes and mock digested are then transformed to the likelihood of a locus being methylated using this reference. We tested the ability of MLDA to identify loci differentially methylated, as analysed by DMH, between cisplatin-sensitive and cisplatin-resistant ovarian cancer cell lines. MLDA identified 115 differentially methylated loci, and 23 out of 26 of these loci have been independently validated by Methylation Specific PCR and/or bisulphite pyrosequencing. Conclusion: MLDA has advantages for analyzing methylation data from CpG island microarrays, since there is a clear rationale for the definition of methylation status, it uses DMH data without between-group normalisation, and it is less influenced by cross-hybridisation of loci. The MLDA algorithm successfully identified differentially methylated loci between two classes of

  17. The genome sequence of Blochmannia floridanus: Comparative analysis of reduced genomes

    NARCIS (Netherlands)

    Gil, R.; Silva, F.J.; Zientz, E.; Delmotte, F.; Gonzalez-Candelas, F.; Latorre, A.; Rausell, C.; Kamerbeek, J.; Gadau, J.; Hölldobler, B.; Ham, van R.C.H.J.; Gross, R.; Moya, A.

    2003-01-01

    Bacterial symbioses are widespread among insects, probably being one of the key factors of their evolutionary success. We present the complete genome sequence of Blochmannia floridanus, the primary endosymbiont of carpenter ants. Although these ants feed on a complex diet, this symbiosis very likely

  18. Meta-analysis of genome-wide association from genomic prediction models

    Science.gov (United States)

    A limitation of many genome-wide association studies (GWA) in animal breeding is that there are many loci with small effect sizes; thus, larger sample sizes (N) are required to guarantee suitable power of detection. To increase sample size, results from different GWA can be combined in a meta-analys...
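
    Combining per-study results to raise effective sample size is often done with an inverse-variance weighted fixed-effect model. The abstract above is truncated before it names its method, so the sketch below is the textbook approach rather than the one used in that study; the function name and the example effect sizes are assumptions.

```python
from math import sqrt


def fixed_effect_meta(betas, ses):
    """Inverse-variance weighted fixed-effect meta-analysis for one marker.

    betas: per-study effect estimates; ses: their standard errors.
    Returns (pooled effect, pooled standard error, z-score).
    """
    if not betas or len(betas) != len(ses):
        raise ValueError("betas and ses must be non-empty and of equal length")
    if any(se <= 0 for se in ses):
        raise ValueError("standard errors must be positive")
    weights = [1.0 / (se * se) for se in ses]   # w_i = 1 / se_i^2
    total = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total
    se = sqrt(1.0 / total)
    return beta, se, beta / se


if __name__ == "__main__":
    # Hypothetical effects of one SNP in two studies of equal precision.
    beta, se, z = fixed_effect_meta([0.1, 0.3], [0.1, 0.1])
    print(round(beta, 3), round(se, 4), round(z, 3))
```

    With equally precise studies the pooled effect is the plain average, and the pooled standard error shrinks by a factor of the square root of the number of studies, which is the gain in power the abstract refers to.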

  19. Optimising single cell activity assessment of Lactobacillus plantarum by fluorescent in situ hybridisation as affected by growth

    NARCIS (Netherlands)

    Vries, de M.C.; Vaughan, E.E.; Kleerebezem, M.; Vos, de W.M.

    2004-01-01

    Fluorescent in situ hybridisation (FISH) with a 16S ribosomal RNA (rRNA)-targeted oligonucleotide probe, Eub338, could be used to estimate the in situ activity of Lactobacillus plantarum WCFS1 in exponentially growing cells. However, L. plantarum is capable of growth to very high cell densities, and

  20. The conservation significance of natural hybridisation in Mediterranean plants: from a case study on Cyclamen (Primulaceae) to a general perspective.

    Science.gov (United States)

    Thompson, John D; Gauthier, Perrine; Papuga, Guillaume; Pons, Virginie; Debussche, Max; Farris, Emmanuele

    2017-06-23

    Hybridisation plays a prominent role in plant evolution due to its influence on genetic diversity, fitness and adaptive potential. We identify a case of on-going hybrid evolution of floral phenotypes in disjunct populations of Cyclamen balearicum and C. repandum subsp. repandum on Corsica and Sardinia. Hybrid populations on the two islands contain similar patterns of variation in flower colour and size but are probably at different stages in the evolutionary process of hybridisation, and differences in the frequency of floral types and flower size suggest hybrid vigour that may contribute to the dynamics and maintenance of hybrid forms. In a review of cases of hybridisation in Mediterranean plants, we found as many cases of contemporary mixed hybrid populations as cases of homoploid hybrid species differentiation. We argue for the development of a conservation strategy for Mediterranean plants that integrates the need to protect not just pure endemic species (some of hybrid origin) but also mixed populations where adaptive variation and new species are evolving due to contemporary hybridisation. This article is protected by copyright. All rights reserved.

  1. Genomic analysis of the basal lineage fungus Rhizopus oryzae reveals a whole-genome duplication.

    Directory of Open Access Journals (Sweden)

    Li-Jun Ma

    2009-07-01

    Full Text Available Rhizopus oryzae is the primary cause of mucormycosis, an emerging, life-threatening infection characterized by rapid angioinvasive growth with an overall mortality rate that exceeds 50%. As a representative of the paraphyletic basal group of the fungal kingdom called "zygomycetes," R. oryzae is also used as a model to study fungal evolution. Here we report the genome sequence of R. oryzae strain 99-880, isolated from a fatal case of mucormycosis. The highly repetitive 45.3 Mb genome assembly contains abundant transposable elements (TEs), comprising approximately 20% of the genome. We predicted 13,895 protein-coding genes not overlapping TEs, many of which are paralogous gene pairs. The order and genomic arrangement of the duplicated gene pairs and their common phylogenetic origin provide evidence for an ancestral whole-genome duplication (WGD) event. The WGD resulted in the duplication of nearly all subunits of the protein complexes associated with respiratory electron transport chains, the V-ATPase, and the ubiquitin-proteasome systems. The WGD, together with recent gene duplications, resulted in the expansion of multiple gene families related to cell growth and signal transduction, as well as secreted aspartic protease and subtilase protein families, which are known fungal virulence factors. The duplication of the ergosterol biosynthetic pathway, especially the major azole target, lanosterol 14alpha-demethylase (ERG11), could contribute to the variable responses of R. oryzae to different azole drugs, including voriconazole and posaconazole. Expanded families of cell-wall synthesis enzymes, essential for fungal cell integrity but absent in mammalian hosts, reveal potential targets for novel and R. oryzae-specific diagnostic and therapeutic treatments.

  2. Connecting Genomic Alterations to Cancer Biology with Proteomics: The NCI Clinical Proteomic Tumor Analysis Consortium

    Energy Technology Data Exchange (ETDEWEB)

    Ellis, Matthew; Gillette, Michael; Carr, Steven A.; Paulovich, Amanda G.; Smith, Richard D.; Rodland, Karin D.; Townsend, Reid; Kinsinger, Christopher; Mesri, Mehdi; Rodriguez, Henry; Liebler, Daniel

    2013-10-03

    The National Cancer Institute (NCI) Clinical Proteomic Tumor Analysis Consortium is applying the latest generation of proteomic technologies to genomically annotated tumors from The Cancer Genome Atlas (TCGA) program, a joint initiative of the NCI and the National Human Genome Research Institute. By providing a fully integrated accounting of DNA, RNA, and protein abnormalities in individual tumors, these datasets will illuminate the complex relationship between genomic abnormalities and cancer phenotypes, thus producing biologic insights as well as a wave of novel candidate biomarkers and therapeutic targets amenable to verification using targeted mass spectrometry methods.

  3. Genome-wide Association Analysis of Kernel Weight in Hard Winter Wheat

    Science.gov (United States)

    Wheat kernel weight is an important and heritable component of wheat grain yield and a key predictor of flour extraction. Genome-wide association analysis was conducted to identify genomic regions associated with kernel weight and kernel weight environmental response in 8 trials of 299 hard winter ...

  4. Meta-Analysis of Genome-Wide Association Studies of Attention-Deficit/Hyperactivity Disorder

    Science.gov (United States)

    Neale, Benjamin M.; Medland, Sarah E.; Ripke, Stephan; Asherson, Philip; Franke, Barbara; Lesch, Klaus-Peter; Faraone, Stephen V.; Nguyen, Thuy Trang; Schafer, Helmut; Holmans, Peter; Daly, Mark; Steinhausen, Hans-Christoph; Freitag, Christine; Reif, Andreas; Renner, Tobias J.; Romanos, Marcel; Romanos, Jasmin; Walitza, Susanne; Warnke, Andreas; Meyer, Jobst; Palmason, Haukur; Buitelaar, Jan; Vasquez, Alejandro Arias; Lambregts-Rommelse, Nanda; Gill, Michael; Anney, Richard J. L.; Langley, Kate; O'Donovan, Michael; Williams, Nigel; Owen, Michael; Thapar, Anita; Kent, Lindsey; Sergeant, Joseph; Roeyers, Herbert; Mick, Eric; Biederman, Joseph; Doyle, Alysa; Smalley, Susan; Loo, Sandra; Hakonarson, Hakon; Elia, Josephine; Todorov, Alexandre; Miranda, Ana; Mulas, Fernando; Ebstein, Richard P.; Rothenberger, Aribert; Banaschewski, Tobias; Oades, Robert D.; Sonuga-Barke, Edmund; McGough, James; Nisenbaum, Laura; Middleton, Frank; Hu, Xiaolan; Nelson, Stan

    2010-01-01

    Objective: Although twin and family studies have shown attention-deficit/hyperactivity disorder (ADHD) to be highly heritable, genetic variants influencing the trait at a genome-wide significant level have yet to be identified. As prior genome-wide association studies (GWAS) have not yielded significant results, we conducted a meta-analysis of…

  5. BGI-RIS: an integrated information resource and comparative analysis workbench for rice genomics

    DEFF Research Database (Denmark)

    Zhao, Wenming; Wang, Jing; He, Ximiao

    2004-01-01

    the application of the rice genomic information and to provide a foundation for functional and evolutionary studies of other important cereal crops, we implemented our Rice Information System (BGI-RIS), the most up-to-date integrated information resource as well as a workbench for comparative genomic analysis...

  6. Genome-Wide Association Study and Linkage Analysis of the Healthy Aging Index

    DEFF Research Database (Denmark)

    Minster, Ryan L; Sanders, Jason L; Singh, Jatinder;

    2015-01-01

    BACKGROUND: The Healthy Aging Index (HAI) is a tool for measuring the extent of health and disease across multiple systems. METHODS: We conducted a genome-wide association study and a genome-wide linkage analysis to map quantitative trait loci associated with the HAI and a modified HAI weighted...

  7. Genomic analysis of a nontoxigenic, invasive Corynebacterium diphtheriae strain from Brazil

    Directory of Open Access Journals (Sweden)

    Fernando Encinas

    2015-09-01

    Full Text Available We report the complete genome sequence and analysis of an invasive Corynebacterium diphtheriae strain that caused endocarditis in Rio de Janeiro, Brazil. It was selected for sequencing on the basis of the current relevance of nontoxigenic strains for public health. The genomic information was explored in the context of diversity, plasticity and genetic relatedness with other contemporary strains.

  8. Genomic analysis of a nontoxigenic, invasive Corynebacterium diphtheriae strain from Brazil.

    Science.gov (United States)

    Encinas, Fernando; Marin, Michel A; Ramos, Juliana N; Vieira, Verônica V; Mattos-Guaraldi, Ana Luiza; Vicente, Ana Carolina P

    2015-09-01

    We report the complete genome sequence and analysis of an invasive Corynebacterium diphtheriae strain that caused endocarditis in Rio de Janeiro, Brazil. It was selected for sequencing on the basis of the current relevance of nontoxigenic strains for public health. The genomic information was explored in the context of diversity, plasticity and genetic relatedness with other contemporary strains.

  10. Dissection of genomic correlation matrices of US Holsteins using multivariate factor analysis

    Science.gov (United States)

    The aim of the study was to compare correlation matrices between direct genomic predictions for 31 production, fitness and conformation traits at both the genomic and chromosomal level in US Holstein bulls. Multivariate factor analysis was used to quantify basic features of correlation matrices. Factor extr...

  11. Carotenoid biosynthetic genes in Brassica rapa: comparative genomic analysis, phylogenetic analysis, and expression profiling

    OpenAIRE

    Li, Peirong; Zhang, Shujiang; Zhang, Shifan; Li, Fei; Zhang, Hui; Cheng, Feng; Wu, Jian; Wang, Xiaowu; Sun, Rifei

    2015-01-01

    Background Carotenoids are isoprenoid compounds synthesized by all photosynthetic organisms. Despite much research on carotenoid biosynthesis in the model plant Arabidopsis thaliana, there is a lack of information on the carotenoid pathway in Brassica rapa. To better understand its carotenoid biosynthetic pathway, we performed a systematic analysis of carotenoid biosynthetic genes at the genome level in B. rapa. Results We identified 67 carotenoid biosynthetic genes in B. rapa, which were ort...

  12. Application of fluorescent in situ hybridisation for demonstration of Coxiella burnetii in placentas from ruminant abortions

    DEFF Research Database (Denmark)

    Jensen, Tim Kåre; Montgomery, Donald L.; Jaeger, Paula T.;

    2007-01-01

    A fluorescent in situ hybridisation (FISH) assay targeting 16S ribosomal RNA was developed for detection of the zoonotic bacterium Coxiella burnetii in formalin-fixed, paraffin-embedded tissue, and applied on placentas from ruminant abortions. The applicability of the FISH assay was compared to immunohistochemistry (IHC) using human positive control serum in 12 cases of C. burnetii-associated placentitis as well as 7 negative control tissue samples. In all 12 cases the bacterium was detected within trophoblasts as well as free in the placental debris by both FISH and IHC. Extensive and significant infection by C. burnetii was revealed in 10 of the cases, whereas a slighter and focal distribution of the bacterium was observed in two cases. Ninety aborted placentas from Danish ruminants were investigated by FISH. C. burnetii was detected in one bovine case only, representing the first confirmation of C. burnetii...

  13. Comparative analysis of catfish BAC end sequences with the zebrafish genome

    Directory of Open Access Journals (Sweden)

    Abernathy Jason

    2009-12-01

    Full Text Available Abstract Background Comparative mapping is a powerful tool to transfer genomic information from sequenced genomes to closely related species for which whole genome sequence data are not yet available. However, such an approach is still very limited in catfish, the most important aquaculture species in the United States. This project was initiated to generate additional BAC end sequences and demonstrate their applications in comparative mapping in catfish. Results We report the generation of 43,000 BAC end sequences and their applications for comparative genome analysis in catfish. Using these and the additional 20,000 existing BAC end sequences as a resource along with linkage mapping and the existing physical map, conserved syntenic regions were identified between the catfish and zebrafish genomes. A total of 10,943 catfish BAC end sequences (17.3%) had significant BLAST hits to the zebrafish genome (cutoff value ≤ e-5), of which 3,221 were unique gene hits, providing a platform for comparative mapping based on locations of these genes in catfish and zebrafish. Genetic linkage mapping of microsatellites associated with contigs allowed identification of large conserved genomic segments and construction of super scaffolds. Conclusion BAC end sequences and their associated polymorphic markers are great resources for comparative genome analysis in catfish. Highly conserved chromosomal regions were identified between catfish and zebrafish. However, it appears that the level of conservation at local genomic regions is high, while a high level of chromosomal shuffling and rearrangement exists between the catfish and zebrafish genomes. Orthologous regions established through comparative analysis should facilitate both structural and functional genome analysis in catfish.
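
    The significance cutoff described in this record can be illustrated with a minimal sketch. It assumes BLAST results in the standard 12-column tabular layout (e-value in the eleventh column); the sequence names, the rows and the query count are invented for illustration and are not data from the study.

```python
# Toy BLAST tabular rows (columns: qseqid sseqid pident length mismatch
# gapopen qstart qend sstart send evalue bitscore). All values are invented.
BLAST_ROWS = """\
bes_001\tzf_chr3\t88.2\t310\t30\t2\t1\t310\t1200\t1509\t2e-40\t170
bes_001\tzf_chr9\t75.0\t120\t28\t2\t5\t124\t900\t1019\t3e-04\t45
bes_002\tzf_chr1\t71.4\t98\t26\t2\t10\t107\t50\t147\t0.8\t30
bes_003\tzf_chr3\t90.1\t280\t25\t1\t1\t280\t5000\t5279\t1e-05\t150
"""


def queries_with_significant_hit(rows, max_evalue=1e-5):
    """Return the set of query IDs having at least one hit at or below the cutoff."""
    significant = set()
    for line in rows.splitlines():
        if not line.strip():
            continue
        fields = line.split("\t")
        if float(fields[10]) <= max_evalue:  # column 11 holds the e-value
            significant.add(fields[0])
    return significant


hits = queries_with_significant_hit(BLAST_ROWS)
total_queries = 4  # hypothetical number of BAC end sequences searched
percent_with_hit = 100 * len(hits) / total_queries
print(sorted(hits), f"{percent_with_hit:.1f}%")
```

    Counting queries rather than rows matters here, because one BAC end sequence can produce several hits and should contribute only once to the reported fraction.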

  14. Quantitative multiplex quantum dot in-situ hybridisation based gene expression profiling in tissue microarrays identifies prognostic genes in acute myeloid leukaemia

    Energy Technology Data Exchange (ETDEWEB)

    Tholouli, Eleni [Department of Haematology, Manchester Royal Infirmary, Oxford Road, Manchester, M13 9WL (United Kingdom); MacDermott, Sarah [The Medical School, The University of Manchester, Oxford Road, M13 9PT Manchester (United Kingdom); Hoyland, Judith [School of Biomedicine, Faculty of Medical and Human Sciences, The University of Manchester, Oxford Road, M13 9PT Manchester (United Kingdom); Yin, John Liu [Department of Haematology, Manchester Royal Infirmary, Oxford Road, Manchester, M13 9WL (United Kingdom); Byers, Richard, E-mail: richard.byers@cmft.nhs.uk [School of Cancer and Enabling Sciences, Faculty of Medical and Human Sciences, The University of Manchester, Stopford Building, Oxford Road, M13 9PT Manchester (United Kingdom)

    2012-08-24

    Highlights: Development of a quantitative high throughput in situ expression profiling method. Application to a tissue microarray of 242 AML bone marrow samples. Identification of HOXA4, HOXA9, Meis1 and DNMT3A as prognostic markers in AML. Abstract: Measurement and validation of microarray gene signatures in routine clinical samples is problematic and a rate limiting step in translational research. In order to facilitate measurement of microarray identified gene signatures in routine clinical tissue a novel method combining quantum dot based oligonucleotide in situ hybridisation (QD-ISH) and post-hybridisation spectral image analysis was used for multiplex in-situ transcript detection in archival bone marrow trephine samples from patients with acute myeloid leukaemia (AML). Tissue-microarrays were prepared into which white cell pellets were spiked as a standard. Tissue microarrays were made using routinely processed bone marrow trephines from 242 patients with AML. QD-ISH was performed for six candidate prognostic genes using triplex QD-ISH for DNMT1, DNMT3A, DNMT3B, and for HOXA4, HOXA9, Meis1. Scrambled oligonucleotides were used to correct for background staining followed by normalisation of expression against the expression values for the white cell pellet standard. Survival analysis demonstrated that low expression of HOXA4 was associated with poorer overall survival (p = 0.009), whilst high expression of HOXA9 (p < 0.0001), Meis1 (p = 0.005) and DNMT3A (p = 0.04) were associated with early treatment failure. These results demonstrate application of a standardised, quantitative multiplex QD-ISH method for identification of prognostic markers in formalin-fixed paraffin-embedded clinical samples, facilitating measurement of gene expression signatures in routine clinical samples.

  15. The Integrated Microbial Genomes (IMG) System: An Expanding Comparative Analysis Resource

    Energy Technology Data Exchange (ETDEWEB)

    Markowitz, Victor M.; Chen, I-Min A.; Palaniappan, Krishna; Chu, Ken; Szeto, Ernest; Grechkin, Yuri; Ratner, Anna; Anderson, Iain; Lykidis, Athanasios; Mavromatis, Konstantinos; Ivanova, Natalia N.; Kyrpides, Nikos C.

    2009-09-13

    The integrated microbial genomes (IMG) system serves as a community resource for comparative analysis of publicly available genomes in a comprehensive integrated context. IMG contains both draft and complete microbial genomes integrated with other publicly available genomes from all three domains of life, together with a large number of plasmids and viruses. IMG provides tools and viewers for analyzing and reviewing the annotations of genes and genomes in a comparative context. Since its first release in 2005, IMG's data content and analytical capabilities have been constantly expanded through regular releases. Several companion IMG systems have been set up in order to serve domain specific needs, such as expert review of genome annotations. IMG is available at .

  16. Comparative bacterial proteomics: analysis of the core genome concept.

    Directory of Open Access Journals (Sweden)

    Stephen J Callister

    Full Text Available While comparative bacterial genomic studies commonly predict a set of genes indicative of common ancestry, experimental validation of the existence of this core genome requires extensive measurement and is typically not undertaken. Enabled by an extensive proteome database developed over six years, we have experimentally verified the expression of proteins predicted from genomic ortholog comparisons among 17 environmental and pathogenic bacteria. More exclusive relationships were observed among the expressed protein content of phenotypically related bacteria, which is indicative of the specific lifestyles associated with these organisms. Although genomic studies can establish relative orthologous relationships among a set of bacteria and propose a set of ancestral genes, our proteomics study establishes expressed lifestyle differences among conserved genes and proposes a set of expressed ancestral traits.

  17. Comparative Bacterial Proteomics: Analysis of the Core Genome Concept

    Energy Technology Data Exchange (ETDEWEB)

    Callister, Stephen J.; McCue, Lee Ann; Turse, Josh E.; Monroe, Matthew E.; Auberry, Kenneth J.; Smith, Richard D.; Adkins, Joshua N.; Lipton, Mary S.

    2008-02-06

    Comparative bacterial genomic studies commonly predict a set of genes indicative of common ancestry. Experimental validation of the existence of this core genome requires extensive measurement and is not typically undertaken. Enabled by an extensive proteome database development over a six year period, we experimentally verified the expression of proteins predicted from genomic ortholog comparisons among 17 environmental and pathogenic bacteria. More exclusive relationships were observed among the expressed protein content of phenotypically related bacteria, which is indicative of the specific lifestyles associated with these organisms. While genomic studies establish relative orthologous relationships among a set of bacteria and propose a set of ancestral genes, our proteomics study establishes expressed lifestyle differences among conserved genes and proposes a set of expressed ancestral traits.

  18. Comparative genomics and phylogenetic analysis of S. dysenteriae subgroup

    Institute of Scientific and Technical Information of China (English)

    YANG, E; BIN, Wen; PENG, Junping; ZHANG, Xiaobing; WANG, Ji

    2005-01-01

    Genomic compositions of representatives of thirteen S. dysenteriae serotypes were investigated by performing comparative genomic hybridization (CGH) with a microarray containing all the genomic open reading frames (ORFs) of E. coli K12 strain MG1655 and specific ORFs of S. dysenteriae A1 strain Sd51197. The CGH results indicated that the genomes of the serotypes contain 2654 conserved ORFs originating from E. coli. However, 219 intrinsic genes of E. coli, including prophage genes, molecular chaperones, and genes for the synthesis of specific O antigen, were absent. Moreover, some specific genes, such as type II secretion system associated components and iron transport related genes, were acquired through horizontal transfer. According to phylogenetic trees based on genetic composition, A1, A2, A8 and A10 were distinct from the other S. dysenteriae serotypes. Our results may provide new insights into the physiological processes, pathogenicity and evolution of S. dysenteriae.

  19. Integrated proteomic and genomic analysis of colorectal cancer

    Science.gov (United States)

    Investigators who analyzed 95 human colorectal tumor samples have determined how gene alterations identified in previous analyses of the same samples are expressed at the protein level. The integration of proteomic and genomic data, or proteogenomics, pro

  20. Assigning protein functions by comparative genome analysis protein phylogenetic profiles

    Science.gov (United States)

    Pellegrini, Matteo; Marcotte, Edward M.; Thompson, Michael J.; Eisenberg, David; Grothe, Robert; Yeates, Todd O.

    2003-05-13

    A computational method, system, and computer program are provided for inferring functional links from genome sequences. One method is based on the observation that some pairs of proteins A' and B' have homologs in another organism fused into a single protein chain AB. A trans-genome comparison of sequences can reveal these AB sequences, which are Rosetta Stone sequences because they decipher an interaction between A' and B'. Another method compares the genomic sequences of two or more organisms to create a phylogenetic profile for each protein, indicating its presence or absence across all the genomes. The profile provides information regarding functional links between different families of proteins. In yet another method, a combination of the above two methods is used to predict functional links.
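
    The phylogenetic profile idea in this record can be sketched in a few lines. The example below is only an illustration under stated assumptions: the genome and protein names are hypothetical, homolog presence is taken as already known, and proteins are linked only when their profiles match exactly, which is a simplification rather than the procedure claimed in the record.

```python
from collections import defaultdict

# Hypothetical genomes and homolog presence calls (invented for illustration).
GENOMES = ["g1", "g2", "g3", "g4"]
HOMOLOG_PRESENCE = {
    "protA": {"g1", "g3"},
    "protB": {"g1", "g3"},
    "protC": {"g2", "g3", "g4"},
    "protD": {"g1", "g2", "g3", "g4"},
}


def phylogenetic_profile(present_in, genomes):
    """Encode presence (1) or absence (0) of a protein across the genomes, in order."""
    return tuple(1 if genome in present_in else 0 for genome in genomes)


def group_by_profile(presence, genomes):
    """Group proteins sharing an identical profile; keep groups with two or more members."""
    groups = defaultdict(list)
    for protein, present_in in presence.items():
        groups[phylogenetic_profile(present_in, genomes)].append(protein)
    return {profile: sorted(names) for profile, names in groups.items() if len(names) > 1}


linked = group_by_profile(HOMOLOG_PRESENCE, GENOMES)
print(linked)
```

    In this toy input, protA and protB share the profile (1, 0, 1, 0) and would be proposed as functionally linked, while protD, present everywhere, carries no discriminating signal on its own.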

  1. Determining protein function and interaction from genome analysis

    Science.gov (United States)

    Eisenberg, David; Marcotte, Edward M.; Thompson, Michael J.; Pellegrini, Matteo; Yeates, Todd O.

    2004-08-03

    A computational method, system, and computer program are provided for inferring functional links from genome sequences. One method is based on the observation that some pairs of proteins A' and B' have homologs in another organism fused into a single protein chain AB. A trans-genome comparison of sequences can reveal these AB sequences, which are Rosetta Stone sequences because they decipher an interaction between A' and B'. Another method compares the genomic sequences of two or more organisms to create a phylogenetic profile for each protein, indicating its presence or absence across all the genomes. The profile provides information regarding functional links between different families of proteins. In yet another method, a combination of the above two methods is used to predict functional links.

  2. Comparative Bacterial Proteomics: Analysis of the Core Genome Concept

    Science.gov (United States)

    Callister, Stephen J.; McCue, Lee Ann; Turse, Joshua E.; Monroe, Matthew E.; Auberry, Kenneth J.; Smith, Richard D.; Adkins, Joshua N.; Lipton, Mary S.

    2008-01-01

    While comparative bacterial genomic studies commonly predict a set of genes indicative of common ancestry, experimental validation of the existence of this core genome requires extensive measurement and is typically not undertaken. Enabled by an extensive proteome database developed over six years, we have experimentally verified the expression of proteins predicted from genomic ortholog comparisons among 17 environmental and pathogenic bacteria. More exclusive relationships were observed among the expressed protein content of phenotypically related bacteria, which is indicative of the specific lifestyles associated with these organisms. Although genomic studies can establish relative orthologous relationships among a set of bacteria and propose a set of ancestral genes, our proteomics study establishes expressed lifestyle differences among conserved genes and proposes a set of expressed ancestral traits. PMID:18253490

  3. An Alternative Methodological Approach for Cost-Effectiveness Analysis and Decision Making in Genomic Medicine.

    Science.gov (United States)

    Fragoulakis, Vasilios; Mitropoulou, Christina; van Schaik, Ron H; Maniadakis, Nikolaos; Patrinos, George P

    2016-05-01

    Genomic Medicine aims to improve therapeutic interventions and diagnostics, the quality of life of patients, but also to rationalize healthcare costs. To reach this goal, careful assessment and identification of evidence gaps for public health genomics priorities are required so that a more efficient healthcare environment is created. Here, we propose a public health genomics-driven approach to adjust the classical healthcare decision making process with an alternative methodological approach of cost-effectiveness analysis, which is particularly helpful for genomic medicine interventions. By combining classical cost-effectiveness analysis with budget constraints, social preferences, and patient ethics, we demonstrate the application of this model, the Genome Economics Model (GEM), based on a previously reported genome-guided intervention from a developing country environment. The model and the attendant rationale provide a practical guide by which all major healthcare stakeholders could ensure the sustainability of funding for genome-guided interventions, their adoption and coverage by health insurance funds, and prioritization of Genomic Medicine research, development, and innovation, given the restriction of budgets, particularly in developing countries and low-income healthcare settings in developed countries. The implications of the GEM for the policy makers interested in Genomic Medicine and new health technology and innovation assessment are also discussed.

  4. CGUG: in silico proteome and genome parsing tool for the determination of "core" and unique genes in the analysis of genomes up to ca. 1.9 Mb

    Directory of Open Access Journals (Sweden)

    Mahadevan Padmanabhan

    2009-08-01

    Full Text Available Abstract Background Viruses and small-genome bacteria (~2 megabases and smaller) comprise a considerable population in the biosphere and are of interest to many researchers. These genomes are now sequenced at an unprecedented rate and require complementary computational tools to analyze. "CoreGenesUniqueGenes" (CGUG) is an in silico genome data mining tool that determines a "core" set of genes from two to five organisms with genomes in this size range. Core and unique genes may reflect similar niches and needs, and may be used in classifying organisms. Findings CGUG is available at http://binf.gmu.edu/geneorder.html as a web-based on-the-fly tool that performs iterative BLASTP analyses using a reference genome and up to four query genomes to provide a table of genes common to these genomes. The result is an in silico display of genomes and their proteomes, allowing for further analysis. CGUG can be used for "genome annotation by homology", as demonstrated with Chlamydophila and Francisella genomes. Conclusion CGUG is used to reanalyze the ICTV-based classifications of bacteriophages, to reconfirm long-standing relationships and to explore new classifications. These genomes have been problematic in the past, due largely to horizontal gene transfers. CGUG is validated as a tool for reannotating small genome bacteria using more up-to-date annotations by similarity or homology. These serve as an entry point for wet-bench experiments to confirm the functions of these "hypothetical" and "unknown" proteins.
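
    The core/unique distinction described in this record can be sketched as simple set logic. This is not CGUG's code: it assumes the BLASTP matches between a reference genome and each query genome have already been computed, and every gene and genome name below is hypothetical.

```python
# Hypothetical reference genes and precomputed matches per query genome.
# HITS[query][reference_gene] = matching gene in that query genome.
REFERENCE_GENES = ["ref_g1", "ref_g2", "ref_g3", "ref_g4"]
HITS = {
    "queryA": {"ref_g1": "a_17", "ref_g2": "a_03", "ref_g4": "a_21"},
    "queryB": {"ref_g1": "b_09", "ref_g2": "b_44"},
}


def core_and_unique(reference_genes, hits):
    """Split reference genes into those matched in every query and those matched in none."""
    if not 1 <= len(hits) <= 4:
        raise ValueError("expected one to four query genomes")
    core = [g for g in reference_genes if all(g in matches for matches in hits.values())]
    unique = [g for g in reference_genes if not any(g in matches for matches in hits.values())]
    return core, unique


core, unique = core_and_unique(REFERENCE_GENES, HITS)
print("core:", core)
print("unique to reference:", unique)
```

    Genes matched in some but not all query genomes (ref_g4 here) fall into neither list, which is why a full table of per-genome matches is more informative than the two summary sets alone.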

  5. First fungal genome sequence from Africa: A preliminary analysis

    Directory of Open Access Journals (Sweden)

    Rene Sutherland

    2012-01-01

    Full Text Available Some of the most significant breakthroughs in the biological sciences this century will emerge from the development of next generation sequencing technologies. The ease of availability of DNA sequence made possible through these new technologies has given researchers opportunities to study organisms in a manner that was not possible with Sanger sequencing. Scientists will, therefore, need to embrace genomics, as well as develop and nurture the human capacity to sequence genomes and utilise the 'tsunami' of data that emerge from genome sequencing. In response to these challenges, we sequenced the genome of Fusarium circinatum, a fungal pathogen of pine that causes pitch canker, a disease of great concern to the South African forestry industry. The sequencing work was conducted in South Africa, making F. circinatum the first eukaryotic organism for which the complete genome has been sequenced locally. Here we report on the process that was followed to sequence, assemble and perform a preliminary characterisation of the genome. Furthermore, details of the computer annotation and manual curation of this genome are presented. The F. circinatum genome was found to be nearly 44 million bases in size, which is similar to that of four other Fusarium genomes that have been sequenced elsewhere. The genome contains just over 15 000 open reading frames, which is less than that of the related species, Fusarium oxysporum, but more than that for Fusarium verticillioides. Amongst the various putative gene clusters identified in F. circinatum, those encoding the secondary metabolites fumonisin and fusarin appeared to harbour evidence of gene translocation. It is anticipated that similar comparisons of other loci will provide insights into the genetic basis for pathogenicity of the pitch canker pathogen. Perhaps more importantly, this project has engaged a relatively large group of scientists

  6. Network Based Prediction Model for Genomics Data Analysis

    OpenAIRE

    Huang, Ying; Wang, Pei

    2012-01-01

    Biological networks, such as genetic regulatory networks and protein interaction networks, provide important information for studying gene/protein activities. In this paper, we propose a new method, NetBoosting, for incorporating a priori biological network information in analyzing high dimensional genomics data. Specifically, we are interested in constructing prediction models for disease phenotypes of interest based on genomics data, and at the same time identifying disease susceptible genes. ...

  7. Genome analysis of E. coli isolated from Crohn's disease patients.

    Science.gov (United States)

    Rakitina, Daria V; Manolov, Alexander I; Kanygina, Alexandra V; Garushyants, Sofya K; Baikova, Julia P; Alexeev, Dmitry G; Ladygina, Valentina G; Kostryukova, Elena S; Larin, Andrei K; Semashko, Tatiana A; Karpova, Irina Y; Babenko, Vladislav V; Ismagilova, Ruzilya K; Malanin, Sergei Y; Gelfand, Mikhail S; Ilina, Elena N; Gorodnichev, Roman B; Lisitsyna, Eugenia S; Aleshkin, Gennady I; Scherbakov, Petr L; Khalif, Igor L; Shapina, Marina V; Maev, Igor V; Andreev, Dmitry N; Govorun, Vadim M

    2017-07-19

    Escherichia coli (E. coli) has been increasingly implicated in the pathogenesis of Crohn's disease (CD). The phylogeny of E. coli isolated from Crohn's disease patients (CDEC) was controversial, and while genotyping results suggested heterogeneity, the sequenced strains of E. coli from CD patients were closely related. We performed the shotgun genome sequencing of 28 E. coli isolates from ten CD patients and compared genomes from these isolates with already published genomes of CD strains and other pathogenic and non-pathogenic strains. CDEC was shown to belong to A, B1, B2 and D phylogenetic groups. The plasmid and several operons from the reference CD-associated E. coli strain LF82 were demonstrated to be more often present in CDEC genomes belonging to different phylogenetic groups than in genomes of commensal strains. The operons include carbon-source induced invasion GimA island, prophage I, iron uptake operons I and II, capsular assembly pathogenetic island IV and propanediol and galactitol utilization operons. Our findings suggest that CDEC are phylogenetically diverse. However, some strains isolated from independent sources possess highly similar chromosome or plasmids. Though no CD-specific genes or functional domains were present in all CD-associated strains, some genes and operons are more often found in the genomes of CDEC than in commensal E. coli. They are principally linked to gut colonization and utilization of propanediol and other sugar alcohols.

  8. Analysis of Human Accelerated DNA Regions Using Archaic Hominin Genomes

    Science.gov (United States)

    Burbano, Hernán A.; Green, Richard E.; Maricic, Tomislav; Lalueza-Fox, Carles; de la Rasilla, Marco; Rosas, Antonio; Kelso, Janet; Pollard, Katherine S.; Lachmann, Michael; Pääbo, Svante

    2012-01-01

    Several previous comparisons of the human genome with other primate and vertebrate genomes identified genomic regions that are highly conserved in vertebrate evolution but fast-evolving on the human lineage. These human accelerated regions (HARs) may be regions of past adaptive evolution in humans. Alternatively, they may be the result of non-adaptive processes, such as biased gene conversion. We captured and sequenced DNA from a collection of previously published HARs using DNA from an Iberian Neandertal. Combining these new data with shotgun sequence from the Neandertal and Denisova draft genomes, we determine at least one archaic hominin allele for 84% of all positions within HARs. We find that 8% of HAR substitutions are not observed in the archaic hominins and are thus recent in the sense that the derived allele had not come to fixation in the common ancestor of modern humans and archaic hominins. Further, we find that recent substitutions in HARs tend to have come to fixation faster than substitutions elsewhere in the genome and that substitutions in HARs tend to cluster in time, consistent with an episodic rather than a clock-like process underlying HAR evolution. Our catalog of sequence changes in HARs will help prioritize them for functional studies of genomic elements potentially responsible for modern human adaptations. PMID:22412940

  9. Assembly, Annotation, and Analysis of Multiple Mycorrhizal Fungal Genomes

    Energy Technology Data Exchange (ETDEWEB)

    Mycorrhizal Genomics Initiative Consortium; Kuo, Alan; Grigoriev, Igor; Kohler, Annegret; Martin, Francis

    2013-03-08

    Mycorrhizal fungi play critical roles in host plant health, soil community structure and chemistry, and carbon and nutrient cycling, all areas of intense interest to the US Dept. of Energy (DOE) Joint Genome Institute (JGI). To this end we are building on our earlier sequencing of the Laccaria bicolor genome by partnering with INRA-Nancy and the mycorrhizal research community in the MGI to sequence and analyze dozens of mycorrhizal genomes of all Basidiomycota and Ascomycota orders and multiple ecological types (ericoid, orchid, and ectomycorrhizal). JGI has developed and deployed high-throughput sequencing techniques, and Assembly, RNASeq, and Annotation Pipelines. In 2012 alone we sequenced, assembled, and annotated 12 draft or improved genomes of mycorrhizae, and predicted ~232831 genes and ~15011 multigene families. All of these data are publicly available on JGI MycoCosm (http://jgi.doe.gov/fungi/), which provides access to both the genome data and tools with which to analyze the data. Preliminary comparisons of the current total of 14 public mycorrhizal genomes suggest that 1) short secreted proteins potentially involved in symbiosis are more enriched in some orders than in others amongst the mycorrhizal Agaricomycetes, 2) there are wide ranges of numbers of genes involved in certain functional categories, such as signal transduction and post-translational modification, and 3) novel gene families are specific to some ecological types.

  10. Analysis of high-identity segmental duplications in the grapevine genome

    Directory of Open Access Journals (Sweden)

    Carelli Francesco N

    2011-08-01

    Full Text Available Abstract Background Segmental duplications (SDs) are blocks of genomic sequence of 1-200 kb that map to different loci in a genome and share a sequence identity > 90%. SDs show at the sequence level the same characteristics as other regions of the human genome: they contain both high-copy repeats and gene sequences. SDs play an important role in genome plasticity by creating new genes and modeling genome structure. Although data are plentiful for mammals, not much was known about the representation of SDs in plant genomes. In this regard, we performed a genome-wide analysis of high-identity SDs on the sequenced grapevine (Vitis vinifera) genome (PN40024). Results We demonstrate that recent SDs (> 94% identity and >= 10 kb in size) are a relevant component of the grapevine genome (85 Mb, 17% of the genome sequence). We detected mitochondrial and plastid DNA and genes (10% of gene annotation) in segmentally duplicated regions of the nuclear genome. In particular, the nine highest copy number genes have a copy in either or both organelle genomes. Further, we showed that several duplicated genes take part in the biosynthesis of compounds involved in plant response to environmental stress. Conclusions These data show the great influence of SDs and organelle DNA transfers in modeling the Vitis vinifera nuclear DNA structure as well as the impact of SDs in contributing to the adaptive capacity of grapevine and the nutritional content of grape products through genome variation. This study represents a step forward in the full characterization of duplicated genes important for grapevine cultural needs and human health.
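
    The thresholds quoted in this record for recent SDs (> 94% identity and >= 10 kb) can be applied as a simple filter, followed by a merge of overlapping intervals so that duplicated bases are not counted twice. The sketch below is an assumption-laden illustration: the `Alignment` record, the coordinates and the identities are all invented, and the study's actual detection pipeline is not described in the abstract.

```python
from typing import NamedTuple


class Alignment(NamedTuple):
    """Hypothetical record for one duplicated block (half-open coordinates)."""
    chrom: str
    start: int
    end: int
    identity: float  # percent identity


def recent_sds(alignments, min_identity=94.0, min_length=10_000):
    """Keep blocks with identity above the cutoff and length at or above the cutoff."""
    return [a for a in alignments
            if a.identity > min_identity and (a.end - a.start) >= min_length]


def covered_bases(alignments):
    """Count non-redundant bases by merging overlapping intervals per chromosome."""
    by_chrom = {}
    for a in alignments:
        by_chrom.setdefault(a.chrom, []).append((a.start, a.end))
    total = 0
    for intervals in by_chrom.values():
        intervals.sort()
        cur_start, cur_end = intervals[0]
        for start, end in intervals[1:]:
            if start <= cur_end:          # overlaps or touches the current run
                cur_end = max(cur_end, end)
            else:                         # gap: close the current run
                total += cur_end - cur_start
                cur_start, cur_end = start, end
        total += cur_end - cur_start
    return total


TOY = [
    Alignment("chr1", 0, 12_000, 96.5),
    Alignment("chr1", 8_000, 20_000, 95.0),
    Alignment("chr1", 50_000, 55_000, 99.0),   # too short
    Alignment("chr2", 0, 15_000, 93.0),        # identity too low
    Alignment("chr2", 30_000, 40_000, 94.1),
]
kept = recent_sds(TOY)
print(len(kept), covered_bases(kept))
```

    Dividing the merged base count by the assembly length gives the genome fraction; the record's own figures (85 Mb, 17%) imply an assembly of roughly 500 Mb.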

  11. [Phylogenetic relationships and intraspecific variation of D-genome Aegilops L. as revealed by RAPD analysis].

    Science.gov (United States)

    Goriunova, S V; Kochieva, E Z; Chikida, N N; Pukhal'skiĭ, V A

    2004-05-01

    RAPD analysis was carried out to study the genetic variation and phylogenetic relationships of polyploid Aegilops species, which contain the D genome as a component of the alloploid genome, and diploid Aegilops tauschii, which is a putative donor of the D genome for common wheat. In total, 74 accessions of six D-genome Aegilops species were examined. The highest intraspecific variation (0.03-0.21) was observed for Ae. tauschii. Intraspecific distances between accessions ranged 0.007-0.067 in Ae. cylindrica, 0.017-0.047 in Ae. vavilovii, and 0.00-0.053 in Ae. juvenalis. Likewise, Ae. ventricosa and Ae. crassa showed low intraspecific polymorphism. The among-accession difference in alloploid Ae. ventricosa (genome DvNv) was similar to that of one parental species, Ae. uniaristata (N), and substantially lower than in the other parent, Ae. tauschii (D). The among-accession difference in Ae. cylindrica (CcDc) was considerably lower than in either parent, Ae. tauschii (D) or Ae. caudata (C). With the exception of Ae. cylindrica, all D-genome species--Ae. tauschii (D), Ae. ventricosa (DvNv), Ae. crassa (XcrDcr1 and XcrDcr1Dcr2), Ae. juvenalis (XjDjUj), and Ae. vavilovii (XvaDvaSva)--formed a single polymorphic cluster, which was distinct from clusters of other species. The only exception, Ae. cylindrica, did not group with the other D-genome species, but clustered with Ae. caudata (C), a donor of the C genome. The cluster of these two species was clearly distinct from the cluster of the other D-genome species and close to a cluster of Ae. umbellulata (genome U) and Ae. ovata (genome UgMg). Thus, RAPD analysis for the first time was used to estimate and to compare the interpopulation polymorphism and to establish the phylogenetic relationships of all diploid and alloploid D-genome Aegilops species.

  12. Genome Sequencing and Comparative Genomics Analysis Revealed Pathogenic Potential in Penicillium capsulatum as a Novel Fungal Pathogen Belonging to Eurotiales

    Science.gov (United States)

    Yang, Ying; Chen, Min; Li, Zongwei; Al-Hatmi, Abdullah M. S.; de Hoog, Sybren; Pan, Weihua; Ye, Qiang; Bo, Xiaochen; Li, Zhen; Wang, Shengqi; Wang, Junzhi; Chen, Huipeng; Liao, Wanqing

    2016-01-01

    Penicillium capsulatum is a rare Penicillium species used in paper manufacturing, but recently it has been reported to cause invasive infection. To investigate the pathogenicity of the clinical Penicillium strain, we sequenced the genomes and transcriptomes of the clinical and environmental strains of P. capsulatum. Comparative analyses of these two P. capsulatum strains and closely related strains belonging to Eurotiales were performed. The assembled genomes of P. capsulatum are approximately 34.4 Mbp in length and encode 11,080 predicted genes. The different isolates of P. capsulatum are highly similar, with the exception of several unique genes, INDELs or SNPs in the genes coding for glycosyl hydrolases, amino acid transporters and circumsporozoite protein. A phylogenomic analysis was performed based on the whole genome data of 38 strains belonging to Eurotiales. By comparing the whole genome sequences and the virulence-related genes from 20 important related species, including fungal pathogens and non-human pathogens belonging to Eurotiales, we found meaningful pathogenicity characteristics between P. capsulatum and its closely related species. Our research indicated that P. capsulatum may be a neglected opportunistic pathogen. This study is beneficial for mycologists, geneticists and epidemiologists to achieve a deeper understanding of the genetic basis of the role of P. capsulatum as a newly reported fungal pathogen. PMID:27761131

  13. BATCH-GE: Batch analysis of Next-Generation Sequencing data for genome editing assessment.

    Science.gov (United States)

    Boel, Annekatrien; Steyaert, Woutert; De Rocker, Nina; Menten, Björn; Callewaert, Bert; De Paepe, Anne; Coucke, Paul; Willaert, Andy

    2016-07-27

    Targeted mutagenesis by the CRISPR/Cas9 system is currently revolutionizing genetics. The ease of this technique has enabled genome engineering in-vitro and in a range of model organisms and has pushed experimental dimensions to unprecedented proportions. Due to its tremendous progress in terms of speed, read length, throughput and cost, Next-Generation Sequencing (NGS) has been increasingly used for the analysis of CRISPR/Cas9 genome editing experiments. However, the current tools for genome editing assessment lack flexibility and fall short in the analysis of large amounts of NGS data. Therefore, we designed BATCH-GE, an easy-to-use bioinformatics tool for batch analysis of NGS-generated genome editing data, available from https://github.com/WouterSteyaert/BATCH-GE.git. BATCH-GE detects and reports indel mutations and other precise genome editing events and calculates the corresponding mutagenesis efficiencies for a large number of samples in parallel. Furthermore, this new tool provides flexibility by allowing the user to adapt a number of input variables. The performance of BATCH-GE was evaluated in two genome editing experiments, aiming to generate knock-out and knock-in zebrafish mutants. This tool will not only contribute to the evaluation of CRISPR/Cas9-based experiments, but will be of use in any genome editing experiment and has the ability to analyze data from every organism with a sequenced genome.

  14. Genomic resources for sea lice: analysis of ESTs and mitochondrial genomes.

    Science.gov (United States)

    Yasuike, Motoshige; Leong, Jong; Jantzen, Stuart G; von Schalburg, Kristian R; Nilsen, Frank; Jones, Simon R M; Koop, Ben F

    2012-04-01

    Sea lice are common parasites of both farmed and wild salmon. Salmon farming constitutes an important economic market in North America, South America, and Northern Europe. Infections with sea lice can result in significant production losses. A compilation of genomic information on different genera of sea lice is an important resource for understanding their biology as well as for the study of population genetics and control strategies. We report on over 150,000 expressed sequence tags (ESTs) from five different species (Pacific Lepeophtheirus salmonis (49,672 new ESTs in addition to 14,994 previously reported ESTs), Atlantic L. salmonis (57,349 ESTs), Caligus clemensi (14,821 ESTs), Caligus rogercresseyi (32,135 ESTs), and Lernaeocera branchialis (16,441 ESTs)). For each species, ESTs were assembled into complete or partial genes and annotated by comparisons to known proteins in public databases. In addition, whole mitochondrial (mt) genome sequences of C. clemensi (13,440 bp) and C. rogercresseyi (13,468 bp) were determined and compared to L. salmonis. Both nuclear and mtDNA genes show very high levels of sequence divergence between these ectoparasitic copepods, suggesting that the different species of sea lice have been in existence for 37-113 million years and that parasitic association with salmonids is also quite ancient. Our ESTs and mtDNA data provide a novel resource for the study of sea louse biology, population genetics, and control strategies. This genomic information provides the material basis for the development of a 38K sea louse microarray that can be used in conjunction with our existing 44K salmon microarray to study host-parasite interactions at the molecular level. This report represents the largest genomic resource for any copepod species to date.

  15. Genome sequence of Cronobacter sakazakii BAA-894 and comparative genomic hybridization analysis with other Cronobacter species.

    Directory of Open Access Journals (Sweden)

    Eva Kucerova

    Full Text Available BACKGROUND: The genus Cronobacter (formerly called Enterobacter sakazakii) is composed of five species: C. sakazakii, C. malonaticus, C. turicensis, C. muytjensii, and C. dublinensis. The genus includes opportunistic human pathogens, and the first three species have been associated with neonatal infections. The most severe diseases are caused in neonates and include fatal necrotizing enterocolitis and meningitis. The genetic basis of the diversity within the genus is unknown, and few virulence traits have been identified. METHODOLOGY/PRINCIPAL FINDINGS: We report here the first sequence of a member of this genus, C. sakazakii strain BAA-894. The genome of Cronobacter sakazakii strain BAA-894 comprises a 4.4 Mb chromosome (57% GC content) and two plasmids: 31 kb (51% GC) and 131 kb (56% GC). The genome was used to construct a 387,000 probe oligonucleotide tiling DNA microarray covering the whole genome. Comparative genomic hybridization (CGH) was undertaken on five other C. sakazakii strains, and representatives of the four other Cronobacter species. Among 4,382 annotated genes inspected in this study, about 55% of genes were common to all C. sakazakii strains and 43% were common to all Cronobacter strains, with 10-17% absence of genes. CONCLUSIONS/SIGNIFICANCE: CGH highlighted 15 clusters of genes in C. sakazakii BAA-894 that were divergent or absent in more than half of the tested strains; six of these are of probable prophage origin. Putative virulence factors were identified in these prophage and in other variable regions. A number of genes unique to Cronobacter species associated with neonatal infections (C. sakazakii, C. malonaticus and C. turicensis) were identified. These included a copper and silver resistance system known to be linked to invasion of the blood-brain barrier by neonatal meningitic strains of Escherichia coli. In addition, genes encoding for multidrug efflux pumps and adhesins were identified that were unique to C. sakazakii
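    The "common to all strains" percentages reported in the record above can be pictured as a tally over per-gene presence/absence calls. The following is only a minimal sketch under stated assumptions: the gene names, strain labels and calls are hypothetical, and the way the study turned probe signals into present/absent calls is not described in the abstract and is not modelled here.

```python
def shared_fraction(calls, strains):
    """Return (fraction, gene list) for genes called present in every listed strain.

    calls maps gene -> {strain: bool}; all names here are hypothetical.
    """
    genes = list(calls)
    shared = [g for g in genes if all(calls[g][s] for s in strains)]
    return len(shared) / len(genes), shared


# Hypothetical presence/absence calls for four genes in three test strains.
example_calls = {
    "geneA": {"s1": True, "s2": True, "s3": True},
    "geneB": {"s1": True, "s2": False, "s3": True},
    "geneC": {"s1": True, "s2": True, "s3": False},
    "geneD": {"s1": True, "s2": True, "s3": True},
}

fraction_all, genes_all = shared_fraction(example_calls, ["s1", "s2", "s3"])
fraction_two, genes_two = shared_fraction(example_calls, ["s1", "s2"])
```

    Restricting the strain list to one species and then widening it to the whole genus gives the two kinds of figure quoted in the abstract (species-level and genus-level shared genes).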

  16. Analysis of the genome content of Lactococcus garvieae by genomic interspecies microarray hybridization

    Directory of Open Access Journals (Sweden)

    Gibello Alicia

    2010-03-01

    Full Text Available Abstract Background Lactococcus garvieae is a bacterial pathogen that affects different animal species in addition to humans. Despite the widespread distribution and emerging clinical significance of L. garvieae in both veterinary and human medicine, there is almost a complete lack of knowledge about the genetic content of this microorganism. In the present study, the genomic content of L. garvieae CECT 4531 was analysed using bioinformatics tools and microarray-based comparative genomic hybridization (CGH) experiments. Lactococcus lactis subsp. lactis IL1403 and Streptococcus pneumoniae TIGR4 were used as reference microorganisms. Results The combination and integration of in silico analyses and in vitro CGH experiments, performed in comparison with the reference microorganisms, allowed establishment of an inter-species hybridization framework with a detection threshold based on a sequence similarity of ≥ 70%. With this threshold value, 267 genes were identified as having an analogue in L. garvieae, most of which (n = 258) have been documented for the first time in this pathogen. Most of the genes are related to ribosomal, sugar metabolism or energy conversion systems. Some of the identified genes, such as als and mycA, could be involved in the pathogenesis of L. garvieae infections. Conclusions In this study, we identified 267 genes that were potentially present in L. garvieae CECT 4531. Some of the identified genes could be involved in the pathogenesis of L. garvieae infections. These results provide the first insight into the genome content of L. garvieae.

  17. Genome-association analysis of Korean Holstein milk traits using genomic estimated breeding value

    Directory of Open Access Journals (Sweden)

    Donghyun Shin

    2017-03-01

    Full Text Available Objective Holsteins are known as the world’s highest-milk producing dairy cattle. The purpose of this study was to identify genetic regions strongly associated with milk traits (milk production, fat, and protein) using Korean Holstein data. Methods This study was performed using single nucleotide polymorphism (SNP) chip data (Illumina BovineSNP50 Beadchip) of 911 Korean Holstein individuals. We inferred genomic estimated breeding values based on best linear unbiased prediction (BLUP) and ridge regression using BLUPF90 and R. We then performed a genome-wide association study and identified genetic regions related to milk traits. Results We identified 9, 6, and 17 significant genetic regions related to milk production, fat and protein, respectively. These genes are newly reported in the genetic association with milk traits of Holstein. Conclusion This study complements recent Holstein genome-wide association studies that identified other SNPs and genes as the most significant variants. These results will help to expand the knowledge of the polygenic nature of milk production in Holsteins.
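    As an illustration of the ridge-regression route to genomic estimated breeding values mentioned in the record above, the sketch below solves the ridge normal equations (Z'Z + lambda*I) beta = Z'y on centred genotypes and sums marker effects per animal. It is a toy under stated assumptions, not the BLUPF90 or R workflow used in the study: the function names, the genotype coding (0/1/2 allele counts), the phenotypes and the penalty values are all hypothetical.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda i: abs(M[i][c]))
        M[c], M[p] = M[p], M[c]
        for i in range(c + 1, n):
            f = M[i][c] / M[c][c]
            for j in range(c, n + 1):
                M[i][j] -= f * M[c][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def ridge_snp_effects(Z, y, lam):
    """Return (marker effects, per-animal GEBV) from ridge regression.

    Z: animals x markers allele counts; y: phenotypes; lam: ridge penalty.
    """
    n, m = len(Z), len(Z[0])
    # Centre each marker column and the phenotype so no intercept is needed.
    col_means = [sum(Z[i][j] for i in range(n)) / n for j in range(m)]
    Zc = [[Z[i][j] - col_means[j] for j in range(m)] for i in range(n)]
    y_mean = sum(y) / n
    yc = [v - y_mean for v in y]
    # Normal equations with the penalty added on the diagonal.
    A = [[sum(Zc[i][a] * Zc[i][b] for i in range(n)) + (lam if a == b else 0.0)
          for b in range(m)] for a in range(m)]
    rhs = [sum(Zc[i][a] * yc[i] for i in range(n)) for a in range(m)]
    beta = solve(A, rhs)
    # GEBV of an animal = sum of its centred genotypes times marker effects.
    gebv = [sum(Zc[i][j] * beta[j] for j in range(m)) for i in range(n)]
    return beta, gebv


# Hypothetical toy data: 5 animals, 3 SNPs, one milk-trait phenotype each.
Z_demo = [[0, 1, 2], [1, 1, 0], [2, 0, 1], [0, 2, 1], [1, 0, 2]]
y_demo = [10.0, 12.0, 15.0, 9.0, 13.0]
beta_demo, gebv_demo = ridge_snp_effects(Z_demo, y_demo, lam=1.0)
```

    Larger penalties shrink all marker effects towards zero, which is what keeps the fit stable when, as with a 50K SNP chip and 911 animals, there are far more markers than individuals.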

  18. Symbolic flux analysis for genome-scale metabolic networks

    Directory of Open Access Journals (Sweden)

    Peterson Pearu

    2011-05-01

    Full Text Available Abstract Background With the advent of genomic technology, the size of metabolic networks that are subject to analysis is growing. A common task when analyzing metabolic networks is to find all possible steady state regimes. There are several technical issues that have to be addressed when analyzing large metabolic networks including accumulation of numerical errors and presentation of the solution to the researcher. One way to resolve those technical issues is to analyze the network using symbolic methods. The aim of this paper is to develop a routine that symbolically finds the steady state solutions of large metabolic networks. Results A symbolic Gauss-Jordan elimination routine was developed for analyzing large metabolic networks. This routine was tested by finding the steady state solutions for a number of curated stoichiometric matrices with the largest having about 4000 reactions. The routine was able to find the solution with a computational time similar to the time used by a numerical singular value decomposition routine. As an advantage of symbolic solution, a set of independent fluxes can be suggested by the researcher leading to the formation of a desired flux basis describing the steady state solution of the network. These independent fluxes can be constrained using experimental data. We demonstrate the application of constraints by calculating a flux distribution for the central metabolic and amino acid biosynthesis pathways of yeast. Conclusions We were able to find symbolic solutions for the steady state flux distribution of large metabolic networks. The ability to choose a flux basis was found to be useful in the constraint process and provides a strong argument for using symbolic Gauss-Jordan elimination in place of singular value decomposition.

  19. Symbolic flux analysis for genome-scale metabolic networks.

    Science.gov (United States)

    Schryer, David W; Vendelin, Marko; Peterson, Pearu

    2011-05-23

    With the advent of genomic technology, the size of metabolic networks that are subject to analysis is growing. A common task when analyzing metabolic networks is to find all possible steady state regimes. There are several technical issues that have to be addressed when analyzing large metabolic networks including accumulation of numerical errors and presentation of the solution to the researcher. One way to resolve those technical issues is to analyze the network using symbolic methods. The aim of this paper is to develop a routine that symbolically finds the steady state solutions of large metabolic networks. A symbolic Gauss-Jordan elimination routine was developed for analyzing large metabolic networks. This routine was tested by finding the steady state solutions for a number of curated stoichiometric matrices with the largest having about 4000 reactions. The routine was able to find the solution with a computational time similar to the time used by a numerical singular value decomposition routine. As an advantage of symbolic solution, a set of independent fluxes can be suggested by the researcher leading to the formation of a desired flux basis describing the steady state solution of the network. These independent fluxes can be constrained using experimental data. We demonstrate the application of constraints by calculating a flux distribution for the central metabolic and amino acid biosynthesis pathways of yeast. We were able to find symbolic solutions for the steady state flux distribution of large metabolic networks. The ability to choose a flux basis was found to be useful in the constraint process and provides a strong argument for using symbolic Gauss-Jordan elimination in place of singular value decomposition.
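    The idea of eliminating a stoichiometric matrix without round-off and then reading dependent fluxes off in terms of a chosen set of independent fluxes can be sketched as follows. This is not the authors' routine: exact rational arithmetic (Python's fractions module) is swapped in for a true symbolic engine, and the four-reaction network, the flux names and the helper functions are hypothetical.

```python
from fractions import Fraction


def rref(matrix):
    """Gauss-Jordan elimination in exact rational arithmetic.

    Returns (reduced rows, list of pivot column indices).
    """
    rows = [[Fraction(x) for x in row] for row in matrix]
    pivots = []
    r = 0
    for c in range(len(rows[0])):
        # Find a row at or below r with a non-zero entry in column c.
        p = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if p is None:
            continue
        rows[r], rows[p] = rows[p], rows[r]
        pv = rows[r][c]
        rows[r] = [x / pv for x in rows[r]]
        # Clear column c in every other row.
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                f = rows[i][c]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
        if r == len(rows):
            break
    return rows, pivots


def steady_state(S, names):
    """Express dependent fluxes of S v = 0 in terms of the free fluxes."""
    R, pivots = rref(S)
    free = [j for j in range(len(names)) if j not in pivots]
    solution = {}
    for row, p in zip(R, pivots):
        solution[names[p]] = {names[j]: -row[j] for j in free if row[j] != 0}
    return solution, [names[j] for j in free]


# Hypothetical network: v1: -> A, v2: A -> B, v3: B ->, v4: A ->
S_demo = [
    [1, -1, 0, -1],  # metabolite A
    [0, 1, -1, 0],   # metabolite B
]
solution_demo, free_demo = steady_state(S_demo, ["v1", "v2", "v3", "v4"])
```

    Here the columns that end up without a pivot (v3 and v4) play the role of the independent fluxes; reordering the columns before elimination is one simple way to steer which fluxes are left free, in the spirit of the researcher-suggested flux basis described in the abstract.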

  20. Micro and nanofluidic structures for cell sorting and genomic analysis

    Science.gov (United States)

    Morton, Keith J.

    Microfluidic systems promise rapid analysis of small samples in a compact and inexpensive format. But direct scaling of lab bench protocols on-chip is challenging because laminar flows in typical microfluidic devices are characterized by non-mixing streamlines. Common microfluidic mixers and sorters work by diffusion, limiting application to objects that diffuse slowly such as cells and DNA. Recently Huang et al. developed a passive microfluidic element to continuously separate bio-particles deterministically. In Deterministic Lateral Displacement (DLD), objects are sorted by size as they transit an asymmetric array of microfabricated posts. This thesis further develops DLD arrays with applications in three broad new areas. First, the arrays are used not simply to sort particles, but to move streams of cells through functional flows for chemical treatment, such as on-chip immunofluorescent labeling of blood cells with washing, and on-chip E. coli cell lysis with simultaneous chromosome extraction. Secondly, modular tiling of the basic DLD element is used to construct complex particle handling modes that include beam steering for jets of cells and beads. Thirdly, nanostructured DLD arrays are built using Nanoimprint Lithography (NIL) and continuous-flow separation of 100 nm and 200 nm size particles is demonstrated. Finally, a number of ancillary nanofabrication techniques were developed in support of these overall goals, including methods to interface nanofluidic structures with standard microfluidic components such as inlet channels and reservoirs, precision etching of ultra-high aspect ratio (>50:1) silicon nanostructures, and fabrication of narrow (~35 nm) channels used to stretch genomic length DNA.

  1. Draft genome sequence and detailed analysis of Pantoea eucrina strain Russ and implication for opportunistic pathogenesis

    Directory of Open Access Journals (Sweden)

    Farzaneh Moghadam

    2016-12-01

    Full Text Available The genus Pantoea is a predominant member of host-associated microbiomes. We here report on the genomic analysis of Pantoea eucrina strain Russ that was isolated from a trashcan at Oklahoma State University, Stillwater, OK. The draft genome of Pantoea eucrina strain Russ consists of 3,939,877 bp of DNA with 3704 protein-coding genes and 134 RNA genes. This is the first report of a genome sequence of a member of Pantoea eucrina. Genomic analysis revealed metabolic versatility with genes involved in the metabolism and transport of all amino acids as well as glucose, fructose, mannose, xylose, arabinose and galactose, suggesting the organism is a versatile heterotroph. The genome also encodes an extensive secretory machinery including types I, II, III, IV, and Vb secretion systems, and several genes for pili production including the new usher/chaperone system (pfam05229). The implications of these systems for opportunistic pathogenesis are discussed.

  2. DEVELOPMENT OF NEW SEQUENCING TECHNOLOGIES AND THEIR APPLICATION IN GENOME ANALYSIS OF DOMESTIC ANIMALS

    Directory of Open Access Journals (Sweden)

    Kristina Gvozdanović

    2015-12-01

    Full Text Available Sequencing and detailed study of the genome of domestic animals began in the middle of the last century. It was primarily related to the development of first-generation sequencing methods, i.e. the Sanger sequencing method. Next-generation sequencing methods are currently the most common methods in the analysis of domestic animal genomes. The application of these methods has yielded up to 100 times more data in comparison with the Sanger method. Analyses including RNA sequencing, whole-genome genotyping, immunoprecipitation associated with DNA microarrays, detection of mutations and inherited diseases, sequencing of the mitochondrial genome and many others have been conducted with the development and application of new sequencing methods from 2005 until today. Application of new sequencing methods in the analysis of domestic animal genomes provides a better understanding of the genetic basis of important production traits, which could help in improving livestock production.

  3. The genome sequence of E. coli W (ATCC 9637): comparative genome analysis and an improved genome-scale reconstruction of E. coli

    Directory of Open Access Journals (Sweden)

    Lee Sang

    2011-01-01

    Full Text Available Abstract Background Escherichia coli is a model prokaryote, an important pathogen, and a key organism for industrial biotechnology. E. coli W (ATCC 9637), one of four strains designated as safe for laboratory purposes, has not been sequenced. E. coli W is a fast-growing strain and is the only safe strain that can utilize sucrose as a carbon source. Lifecycle analysis has demonstrated that sucrose from sugarcane is a preferred carbon source for industrial bioprocesses. Results We have sequenced and annotated the genome of E. coli W. The chromosome is 4,900,968 bp and encodes 4,764 ORFs. Two plasmids, pRK1 (102,536 bp) and pRK2 (5,360 bp), are also present. W has unique features relative to other sequenced laboratory strains (K-12, B and Crooks): it has a larger genome and belongs to phylogroup B1 rather than A. W also grows on a much broader range of carbon sources than does K-12. A genome-scale reconstruction was developed and validated in order to interrogate metabolic properties. Conclusions The genome of W is more similar to commensal and pathogenic B1 strains than phylogroup A strains, and therefore has greater utility for comparative analyses with these strains. W should therefore be the strain of choice, or 'type strain' for group B1 comparative analyses. The genome annotation and tools created here are expected to allow further utilization and development of E. coli W as an industrial organism for sucrose-based bioprocesses. Refinements in our E. coli metabolic reconstruction allow it to more accurately define E. coli metabolism relative to previous models.

  4. Genome sequencing and analysis of the first complete genome of Lactobacillus kunkeei strain MP2, an Apis mellifera gut isolate

    Directory of Open Access Journals (Sweden)

    Freddy Asenjo

    2016-04-01

    Full Text Available Background. The honey bee (Apis mellifera) is the most important pollinator in agriculture worldwide. However, the number of honey bees has fallen significantly since 2006, becoming a huge ecological problem nowadays. The principal cause is CCD, or Colony Collapse Disorder, characterized by the seemingly spontaneous abandonment of hives by their workers. One of the characteristics of CCD in honey bees is the alteration of the bacterial communities in their gastrointestinal tract, mainly due to the decrease of Firmicutes populations, such as the Lactobacilli. At this time, the causes of these alterations remain unknown. We recently isolated a strain of Lactobacillus kunkeei (L. kunkeei strain MP2) from the gut of Chilean honey bees. L. kunkeei is one of the most commonly isolated bacteria from the honey bee gut and is highly versatile in different ecological niches. In this study, we aimed to elucidate in detail the L. kunkeei genetic background and perform a comparative genome analysis with other Lactobacillus species. Methods. L. kunkeei MP2 was originally isolated from the guts of Chilean A. mellifera individuals. Genome sequencing was done using Pacific Biosciences single-molecule real-time sequencing technology. De novo assembly was performed using Celera assembler. The genome was annotated using Prokka, and functional information was added using the EggNOG 3.1 database. In addition, genomic islands were predicted using IslandViewer, and pro-phage sequences using PHAST. Comparisons between L. kunkeei MP2 and other L. kunkeei and Lactobacillus strains were done using Roary. Results. The complete genome of L. kunkeei MP2 comprises one circular chromosome of 1,614,522 nt with a GC content of 36.9%. Pangenome analysis with 16 L. kunkeei strains identified 113 unique genes, most of them related to phage insertions. A large and unique region of L. kunkeei MP2 genome contains several genes that encode for phage structural protein and

  5. Genome sequencing and analysis of the first complete genome of Lactobacillus kunkeei strain MP2, an Apis mellifera gut isolate.

    Science.gov (United States)

    Asenjo, Freddy; Olmos, Alejandro; Henríquez-Piskulich, Patricia; Polanco, Victor; Aldea, Patricia; Ugalde, Juan A; Trombert, Annette N

    2016-01-01

    Background. The honey bee (Apis mellifera) is the most important pollinator in agriculture worldwide. However, the number of honey bees has fallen significantly since 2006, becoming a huge ecological problem nowadays. The principal cause is CCD, or Colony Collapse Disorder, characterized by the seemingly spontaneous abandonment of hives by their workers. One of the characteristics of CCD in honey bees is the alteration of the bacterial communities in their gastrointestinal tract, mainly due to the decrease of Firmicutes populations, such as the Lactobacilli. At this time, the causes of these alterations remain unknown. We recently isolated a strain of Lactobacillus kunkeei (L. kunkeei strain MP2) from the gut of Chilean honey bees. L. kunkeei is one of the most commonly isolated bacteria from the honey bee gut and is highly versatile in different ecological niches. In this study, we aimed to elucidate in detail the L. kunkeei genetic background and perform a comparative genome analysis with other Lactobacillus species. Methods. L. kunkeei MP2 was originally isolated from the guts of Chilean A. mellifera individuals. Genome sequencing was done using Pacific Biosciences single-molecule real-time sequencing technology. De novo assembly was performed using Celera assembler. The genome was annotated using Prokka, and functional information was added using the EggNOG 3.1 database. In addition, genomic islands were predicted using IslandViewer, and pro-phage sequences using PHAST. Comparisons between L. kunkeei MP2 and other L. kunkeei and Lactobacillus strains were done using Roary. Results. The complete genome of L. kunkeei MP2 comprises one circular chromosome of 1,614,522 nt with a GC content of 36.9%. Pangenome analysis with 16 L. kunkeei strains identified 113 unique genes, most of them related to phage insertions. A large and unique region of L. kunkeei MP2 genome contains several genes that encode for phage structural protein and replication components

  6. Genome sequencing and analysis of the first complete genome of Lactobacillus kunkeei strain MP2, an Apis mellifera gut isolate

    Science.gov (United States)

    Asenjo, Freddy; Olmos, Alejandro; Henríquez-Piskulich, Patricia; Polanco, Victor; Aldea, Patricia

    2016-01-01

    Background. The honey bee (Apis mellifera) is the most important pollinator in agriculture worldwide. However, the number of honey bees has fallen significantly since 2006, becoming a huge ecological problem nowadays. The principal cause is CCD, or Colony Collapse Disorder, characterized by the seemingly spontaneous abandonment of hives by their workers. One of the characteristics of CCD in honey bees is the alteration of the bacterial communities in their gastrointestinal tract, mainly due to the decrease of Firmicutes populations, such as the Lactobacilli. At this time, the causes of these alterations remain unknown. We recently isolated a strain of Lactobacillus kunkeei (L. kunkeei strain MP2) from the gut of Chilean honey bees. L. kunkeei is one of the most commonly isolated bacteria from the honey bee gut and is highly versatile in different ecological niches. In this study, we aimed to elucidate in detail the L. kunkeei genetic background and perform a comparative genome analysis with other Lactobacillus species. Methods. L. kunkeei MP2 was originally isolated from the guts of Chilean A. mellifera individuals. Genome sequencing was done using Pacific Biosciences single-molecule real-time sequencing technology. De novo assembly was performed using Celera assembler. The genome was annotated using Prokka, and functional information was added using the EggNOG 3.1 database. In addition, genomic islands were predicted using IslandViewer, and pro-phage sequences using PHAST. Comparisons between L. kunkeei MP2 and other L. kunkeei and Lactobacillus strains were done using Roary. Results. The complete genome of L. kunkeei MP2 comprises one circular chromosome of 1,614,522 nt with a GC content of 36.9%. Pangenome analysis with 16 L. kunkeei strains identified 113 unique genes, most of them related to phage insertions. A large and unique region of L. kunkeei MP2 genome contains several genes that encode for phage structural protein and replication components

  7. When Anthropogenic River Disturbance Decreases Hybridisation between Non-Native and Endemic Cyprinids and Drives an Ecomorphological Displacement towards Juvenile State in Both Species.

    Directory of Open Access Journals (Sweden)

    Emmanuel Corse

    Full Text Available Understanding the impact of non-native species on native species is a major challenge in molecular ecology, particularly for genetically compatible fish species. Invasions are generally difficult to study because their effects may be confused with those of environmental or human disturbances. Colonized ecosystems are differently impacted by human activities, resulting in diverse responses and interactions between native and non-native species. We studied the dynamics between two cyprinid species (invasive Chondrostoma nasus and endemic Parachondrostoma toxostoma) and their hybrids in 16 populations (from allopatric to sympatric situations and from little to highly fragmented areas) corresponding to 2,256 specimens. Each specimen was assigned to a particular species or to a hybrid pool using molecular identification (cytochrome b and 41 microsatellites). We carried out an ecomorphological analysis based on size, age, body shape, and diet (gut vacuity and molecular fecal contents). Our results contradicted our initial assumptions on the pattern of invasion and the rate of introgression. There was no sign of underperformance for the endemic species in areas where hybridisation occurred. In the unfragmented zone, the introduced species was found mostly downstream, with body shapes similar to those in allopatric populations while both species were found to be more insectivorous than the reference populations. However, a high level of hybridisation was detected, suggesting interactions between the two species during spawning and/or the existence of a hybrid swarm. In the disturbed zone, introgression was less frequent and slender body shape was associated with diatomivorous behaviour, smaller size (juvenile characteristics) and greater gut vacuity. Results suggested that habitat degradation induced similar ecomorphological trait changes in the two species and their hybrids (i.e. a transition towards a pedomorphic state) where the invasive species is more

  8. Implications of hybridisation and cytotypic differentiation in speciation assessed by AFLP and plastid haplotypes - a case study of Potentilla alpicola La Soie

    Directory of Open Access Journals (Sweden)

    Paule Juraj

    2012-08-01

    Full Text Available Abstract Background Hybridisation is presumed to be an important mechanism in plant speciation and a creative evolutionary force often accompanied by polyploidisation and in some cases by apomixis. The Potentilla collina group constitutes a particularly suitable model system to study these phenomena as it is morphologically extensively variable, exclusively polyploid and expresses apomixis. In the present study, the alpine taxon Potentilla alpicola has been chosen in order to study its presumed hybrid origin, identify underlying evolutionary processes and infer the discreteness or taxonomic value of hybrid forms. Results Combined analysis of AFLP, cpDNA sequences and ploidy level variation revealed a hybrid origin of the P. alpicola populations from South Tyrol (Italy) resulting from crosses between P. pusilla and two cytotypes of P. argentea. Hybrids were locally sympatric with at least one of the parental forms. Three lineages of different evolutionary origin comprising two ploidy levels were identified within P. alpicola. The lineages differed in parentage and the complexity of the evolutionary process. A geographically wide-spread lineage thus contrasted with locally distributed lineages of different origins. Populations of P. collina studied in addition have been regarded rather as recent derivatives of the hexaploid P. argentea. The observation of clones within both P. alpicola and P. collina suggested a possible apomictic mode of reproduction. Conclusions Different hybridisation scenarios taking place on geographically small scales resulted in viable progeny presumably stabilised by apomixis. The case study of P. alpicola supports that these processes played a significant role in the creation of polymorphism in the genus Potentilla. However, multiple origin of hybrids and backcrossing are considered to produce a variety of evolutionary spontaneous forms existing aside of reproductively stabilised, established lineages.

  9. Flow cytometric analysis of oil palm: a preliminary analysis for cultivars and genomic DNA alteration

    Directory of Open Access Journals (Sweden)

    Warawut Chuthammathat

    2005-12-01

    Full Text Available DNA contents of oil palm (Elaeis guineensis Jacq.) cultivars were analyzed by flow cytometry using different external reference plant species. Analysis using corn (Zea mays line CE-777) as a reference plant gave the highest DNA content of oil palm (4.72±0.23 pg 2C^-1), whereas the DNA content was found to be lower when using soybean (Glycine max cv. Polanka) (3.77±0.09 pg 2C^-1) or tomato (Lycopersicon esculentum cv. Stupicke) (4.25±0.09 pg 2C^-1) as a reference. The nuclear DNA contents of Dura (D109), Pisifera (P168) and Tenera (T38) cultivars were 3.46±0.04, 3.24±0.03 and 3.76±0.04 pg 2C^-1 nuclei, respectively, using soybean as a reference. One haploid genome of oil palm therefore ranged from 1.56 to 1.81 × 10^9 base pairs. DNA contents from one-year-old calli and cell suspension of oil palm were found to be significantly different from those of seedlings. It thus should be noted that genomic DNA alteration occurred in these cultured tissues. We therefore confirm that flow cytometric analysis could verify cultivars, DNA content and genomic DNA alteration of oil palm using soybean as an external reference standard.
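    The step from 2C DNA content in picograms to haploid genome size in base pairs can be checked with a few lines of arithmetic. The sketch below makes two assumptions that are not stated in the abstract: a conversion factor of 0.965 × 10^9 bp per pg (chosen here because it reproduces the reported 1.56 to 1.81 × 10^9 bp range from the 3.24 and 3.76 pg values), and the usual external-standard ratio of fluorescence peak means for estimating a sample's 2C value. Function names are hypothetical.

```python
BP_PER_PG = 0.965e9  # assumed conversion factor; not given in the abstract


def sample_2c_pg(sample_peak, reference_peak, reference_2c_pg):
    """Estimate sample 2C DNA content from the ratio of G0/G1 peak means."""
    return reference_2c_pg * sample_peak / reference_peak


def haploid_genome_bp(two_c_pg, bp_per_pg=BP_PER_PG):
    """Convert a 2C DNA content in pg to a haploid (1C) genome size in bp."""
    return (two_c_pg / 2.0) * bp_per_pg


# 2C values for Pisifera (P168) and Tenera (T38) as reported in the abstract.
pisifera_bp = haploid_genome_bp(3.24)
tenera_bp = haploid_genome_bp(3.76)
```

    Because the estimate scales directly with the 2C value assigned to the reference plant, the choice of external standard shifts the result, which is consistent with the different oil palm values obtained with corn, soybean and tomato.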

  10. The Methanosarcina barkeri genome: comparative analysis with Methanosarcina acetivorans and Methanosarcina mazei reveals extensive rearrangement within methanosarcinal genomes

    Energy Technology Data Exchange (ETDEWEB)

    Maeder, Dennis L.; Anderson, Iain; Brettin, Thomas S.; Bruce, David C.; Gilna, Paul; Han, Cliff S.; Lapidus, Alla; Metcalf, William W.; Saunders, Elizabeth; Tapia, Roxanne; Sowers, Kevin R.

    2006-05-19

    We report here a comparative analysis of the genome sequence of Methanosarcina barkeri with those of Methanosarcina acetivorans and Methanosarcina mazei. All three genomes share a conserved double origin of replication and many gene clusters. M. barkeri is distinguished by having an organization that is well conserved with respect to the other Methanosarcinae in the region proximal to the origin of replication with interspecies gene similarities as high as 95%. However it is disordered and marked by increased transposase frequency and decreased gene synteny and gene density in the proximal semi-genome. Of the 3680 open reading frames in M. barkeri, 678 had paralogs with better than 80% similarity to both M. acetivorans and M. mazei while 128 nonhypothetical orfs were unique (non-paralogous) amongst these species including a complete formate dehydrogenase operon, two genes required for N-acetylmuramic acid synthesis, a 14 gene gas vesicle cluster and a bacterial P450-specific ferredoxin reductase cluster not previously observed or characterized in this genus. A cryptic 36 kbp plasmid sequence was detected in M. barkeri that contains an orc1 gene flanked by a presumptive origin of replication consisting of 38 tandem repeats of a 143 nt motif. Three-way comparison of these genomes reveals differing mechanisms for the accrual of changes. Elongation of the large M. acetivorans is the result of multiple gene-scale insertions and duplications uniformly distributed in that genome, while M. barkeri is characterized by localized inversions associated with the loss of gene content. In contrast, the relatively short M. mazei most closely approximates the ancestral organizational state.

  11. Complete genome sequence of Nitrobacter hamburgensis X14 and comparative genomic analysis of species within the genus Nitrobacter.

    Energy Technology Data Exchange (ETDEWEB)

    Starkenburg, Shawn R [Oregon State University; Larimer, Frank W [ORNL; Stein, Lisa Y [University of California, Riverside; Klotz, Martin G [University of Louisville, Louisville; Chain, Patrick S. G. [Lawrence Livermore National Laboratory (LLNL); Sayavedra-Soto, LA [Oregon State University; Poret-Peterson, Amisha T. [University of Louisville, Louisville; Gentry, ME [University of Louisville, Louisville; Arp, D J [Oregon State University; Ward, Bess B. [Princeton University; Bottomley, Peter J [Oregon State University

    2008-05-01

    The alphaproteobacterium Nitrobacter hamburgensis X14 is a gram-negative facultative chemolithoautotroph that conserves energy from the oxidation of nitrite to nitrate. Sequencing and analysis of the Nitrobacter hamburgensis X14 genome revealed four replicons comprising one chromosome (4.4 Mbp) and three plasmids (294, 188, and 121 kbp). Over 20% of the genome is composed of pseudogenes and paralogs. Whole-genome comparisons were conducted between N. hamburgensis and the finished and draft genome sequences of Nitrobacter winogradskyi and Nitrobacter sp. strain Nb-311A, respectively. Most of the plasmid-borne genes were unique to N. hamburgensis and encode a variety of functions (central metabolism, energy conservation, conjugation, and heavy metal resistance), yet approximately 21 kb of an approximately 28-kb "autotrophic" island on the largest plasmid was conserved in the chromosomes of Nitrobacter winogradskyi Nb-255 and Nitrobacter sp. strain Nb-311A. The N. hamburgensis chromosome also harbors many unique genes, including those for heme-copper oxidases, cytochrome b561, and putative pathways for the catabolism of aromatic, organic, and one-carbon compounds, which help verify and extend its mixotrophic potential. A Nitrobacter "subcore" genome was also constructed by removing homologs found in strains of the closest evolutionary relatives, Bradyrhizobium japonicum and Rhodopseudomonas palustris. Among the Nitrobacter subcore inventory (116 genes), copies of genes or gene clusters for nitrite oxidoreductase (NXR), cytochromes associated with a dissimilatory nitrite reductase (NirK), PII-like regulators, and polysaccharide formation were identified. Many of the subcore genes have diverged significantly from, or have origins outside, the alphaproteobacterial lineage and may indicate some of the unique genetic requirements for nitrite oxidation in Nitrobacter.

  12. Analysis of CR1 Repeats in the Zebra Finch Genome

    Directory of Open Access Journals (Sweden)

    George E. Liu

    2013-06-01

    Full Text Available Most bird species have smaller genomes and fewer repeats than mammals. The Chicken Repeat 1 (CR1) repeat is one of the most abundant families of repeats, ranging from ~133,000 to ~187,000 copies and accounting for ~50 to ~80% of the interspersed repeats in the zebra finch and chicken genomes, respectively. CR1 repeats are believed to have arisen from the retrotransposition of a small number of master elements, which gave rise to multiple CR1 subfamilies in the chicken. In this study, we performed a global assessment of the divergence distributions, phylogenies, and consensus sequences of CR1 repeats in the zebra finch genome. We identified and validated 34 CR1 subfamilies and further analyzed the correlation between these subfamilies. We also discovered 4 novel lineage-specific CR1 subfamilies in the zebra finch when compared to the chicken genome. We built various evolutionary trees of these subfamilies and concluded that CR1 repeats may play an important role in reshaping the structure of bird genomes.

  13. The first complete chloroplast genome sequences of Ulmus species by de novo sequencing: Genome comparative and taxonomic position analysis

    Science.gov (United States)

    Zhang, Shuang; Yu, Xiao-Yue; Ren, Ya-Chao; Yang, Min-Sheng; Wang, Jin-Mao

    2017-01-01

    further analysis of their nuclear genomes. This study is the first report on Ulmus chloroplast genomes, which has significance for understanding photosynthesis, evolution, and chloroplast transgenic engineering. PMID:28158318

  14. Analysis of pan-genome to identify the core genes and essential genes of Brucella spp.

    Science.gov (United States)

    Yang, Xiaowen; Li, Yajie; Zang, Juan; Li, Yexia; Bie, Pengfei; Lu, Yanli; Wu, Qingmin

    2016-04-01

    Brucella spp. are facultative intracellular pathogens that cause a contagious zoonotic disease, which can result in such outcomes as abortion or sterility in susceptible animal hosts and grave, debilitating illness in humans. To decipher the survival mechanism of Brucella spp. in vivo, 42 complete Brucella genomes from NCBI were analyzed for the pan-genome and core genome by identifying the composition and function of their genes. The results showed that the total of 132,143 protein-coding genes in these genomes were divided into 5369 clusters. Among these, 1710 clusters were associated with the core genome, 1182 clusters with strain-specific genes and 2477 clusters with dispensable genomes. COG analysis indicated that 44 % of the core genes were devoted to metabolism, mainly energy production and conversion (COG category C) and amino acid transport and metabolism (COG category E). Meanwhile, approximately 35 % of the core genes were under positive selection. In addition, 1252 potential essential genes were predicted in the core genome by comparison with a prokaryote database of essential genes. The results suggested that the core genes in Brucella genomes are relatively conserved, and that energy and amino acid metabolism play a more important role in the growth and reproduction of Brucella spp. This study might help us to better understand the mechanisms of Brucella persistent infection and provide some clues for further exploring the gene modules of intracellular survival in Brucella spp.
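
    The three-way split of gene clusters described above can be sketched on a presence/absence table. The abstract does not define its thresholds, so the rule used here is an assumption based on common pan-genome usage: a cluster found in every genome is core, a cluster found in exactly one genome is strain-specific, and anything in between is dispensable. The function name and the toy data are hypothetical.

```python
from collections import Counter


def classify_clusters(presence, n_genomes):
    """Count clusters per pan-genome category.

    presence maps a cluster id to the set of genome ids that contain it.
    Assumed rule: all genomes -> core, exactly one -> strain_specific,
    otherwise -> dispensable.
    """
    counts = Counter()
    for genomes in presence.values():
        k = len(genomes)
        if k == n_genomes:
            counts["core"] += 1
        elif k == 1:
            counts["strain_specific"] += 1
        else:
            counts["dispensable"] += 1
    return counts


# Toy example with three genomes (illustrative data only).
toy = {
    "c1": {"g1", "g2", "g3"},
    "c2": {"g1", "g2", "g3"},
    "c3": {"g1", "g3"},
    "c4": {"g2"},
}
toy_counts = classify_clusters(toy, n_genomes=3)
print(dict(toy_counts))

# Consistency check on the figures reported in the abstract.
reported = {"core": 1710, "strain_specific": 1182, "dispensable": 2477}
print(sum(reported.values()))  # 5369, the reported total number of clusters
```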

  15. Parallel WGA and WTA for Comparative Genome and Transcriptome NGS Analysis Using Tiny Cell Numbers.

    Science.gov (United States)

    Korfhage, Christian; Fricke, Evelyn; Meier, Andreas

    2015-07-01

    Genomic DNA determines how and when the transcriptome is changed by a trigger or environmental change and how cellular metabolism is influenced. Comparative genome and transcriptome analysis of the same cell sample links a defined genome with all changes in the bases, structure, or numbers of the transcriptome. However, comparative genome and transcriptome analysis using next-generation sequencing (NGS) or real-time PCR is often limited by the small amount of sample available. In mammals, the amount of DNA and RNA in a single cell is ∼10 picograms, but deep analysis of the genome and transcriptome currently requires several hundred nanograms of nucleic acids for library preparation for NGS sequencing. Consequently, accurate whole-genome amplification (WGA) and whole-transcriptome amplification (WTA) is required for such quantitative analysis. This unit describes how the genome and the transcriptome of a tiny number of cells can be amplified in a highly parallel and comparable process. Protocols for quality control of amplified DNA and application of amplified DNA for NGS are included.

  16. Genomic sequence around butterfly wing development genes: annotation and comparative analysis.

    Directory of Open Access Journals (Sweden)

    Inês C Conceição

    Full Text Available BACKGROUND: Analysis of genomic sequence allows characterization of genome content and organization, and access beyond gene-coding regions for identification of functional elements. BAC libraries, where relatively large genomic regions are made readily available, are especially useful for species without a fully sequenced genome and can increase genomic coverage of phylogenetic and biological diversity. For example, no butterfly genome is yet available despite the unique genetic and biological properties of this group, such as diversified wing color patterns. The evolution and development of these patterns is being studied in a few target species, including Bicyclus anynana, where a whole-genome BAC library allows targeted access to large genomic regions. METHODOLOGY/PRINCIPAL FINDINGS: We characterize ∼1.3 Mb of genomic sequence around 11 selected genes expressed in B. anynana developing wings. Extensive manual curation of in silico predictions, also making use of a large dataset of expressed genes for this species, identified repetitive elements and protein coding sequence, and highlighted an expansion of Alcohol dehydrogenase genes. Comparative analysis with orthologous regions of the lepidopteran reference genome allowed assessment of conservation of fine-scale synteny (with detection of new inversions and translocations) and of DNA sequence (with detection of high levels of conservation of non-coding regions around some, but not all, developmental genes). CONCLUSIONS: The general properties and organization of the available B. anynana genomic sequence are similar to the lepidopteran reference, despite the more than 140 MY divergence. Our results lay the groundwork for further studies of new interesting findings in relation to both coding and non-coding sequence: (1) the Alcohol dehydrogenase expansion with higher similarity between the five tandemly-repeated B. anynana paralogs than with the corresponding B. mori orthologs, and (2) the high

  17. GENOME SIZE DETERMINATION AND RAPD ANALYSIS OF FOUR EDIBLE AROIDS OF NORTH EAST INDIA

    Directory of Open Access Journals (Sweden)

    Jyoti P. Saikia, Bolin K. Konwar and Susmita Singh

    2010-10-01

    Full Text Available Four edible aroid species were selected for the study. The genomic DNA of the plants was isolated and quantified. A part of the genomic DNA was used for RAPD analysis using six different primers from Operon Technologies, USA. The genome sizes determined for the aroids are in the order Colocasia esculenta > Xanthosoma caracu > Xanthosoma sagittifolium > Amorphophallus paeonifolius. The Amorphophallus species was found to be 50% similar to both Xanthosoma caracu and Colocasia esculenta. The analysis will provide a basis for exploring the vast, diversified aroid population of the region.

  18. The Challenges of Genome Analysis in the Health Care Setting

    Directory of Open Access Journals (Sweden)

    Anneke Lucassen

    2014-07-01

    Full Text Available Genome sequencing is now a sufficiently mature and affordable technology for clinical use. Its application promises not only to transform clinicians’ diagnostic and predictive ability, but also to improve preventative therapies, surveillance regimes, and tailor patient treatment to an individual’s genetic make-up. However, as with any technological advance, there are associated fresh challenges. While some of the ethical, legal and social aspects resulting from the generation of data from genome sequencing are generic, several nuances are unique. Since the UK government recently announced plans to sequence the genomes of 100,000 Health Service patients, and similar initiatives are being considered elsewhere, a discussion of these nuances is timely and needs to go hand in hand with formulation of guidelines and public engagement activities around implementation of sequencing in clinical practice.

  19. Comparative Genome Analysis of Lolium-Festuca Complex Species

    DEFF Research Database (Denmark)

    Czaban, Adrian; Byrne, Stephen; Sharma, Sapna

    2015-01-01

    The Lolium-Festuca complex incorporates species from the Lolium genera and the broad leaf Fescues. Plants belonging to this complex exhibit significant phenotypic plasticity for agriculturally important traits, such as annuality/perenniality, establishment potential, growth speed, nutritional value......, winter hardiness, drought tolerance and resistance to grazing. In this study we have sequenced and assembled the low copy fraction of the genomes of Lolium westerwoldicum, Lolium multiflorum, Festuca pratensis and Lolium temulentum. We have also generated de-novo transcriptome assemblies for each species......, and these have aided in the annotation of the genomic sequence. Using this data we were able to generate annotated assemblies of the gene rich regions of the four species to complement the already sequenced Lolium perenne genome. Using these gene models we have identified orthologous genes between the species...

  20. Power analysis for genome-wide association studies

    Directory of Open Access Journals (Sweden)

    Klein Robert J

    2007-08-01

    Full Text Available Abstract Background Genome-wide association studies are a promising new tool for deciphering the genetics of complex diseases. To choose the proper sample size and genotyping platform for such studies, power calculations that take into account genetic model, tag SNP selection, and the population of interest are required. Results The power of genome-wide association studies can be computed using a set of tag SNPs and a large number of genotyped SNPs in a representative population, such as available through the HapMap project. As expected, power increases with increasing sample size and effect size. Power also depends on the tag SNPs selected. In some cases, more power is obtained by genotyping more individuals at fewer SNPs than fewer individuals at more SNPs. Conclusion Genome-wide association studies should be designed thoughtfully, with the choice of genotyping platform and sample size being determined from careful power calculations.
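
    The trade-off between sample size and number of SNPs can be illustrated with a much simpler calculation than the one the abstract describes. The sketch below is an assumption-laden stand-in: it uses a normal approximation to a two-sided allelic case-control test at a single SNP, treats the causal variant as directly genotyped (so it ignores tag SNP selection and linkage disequilibrium, which the abstract's method accounts for), and applies a Bonferroni threshold. All function names and parameter values are hypothetical.

```python
from statistics import NormalDist


def case_allele_freq(p_control, odds_ratio):
    """Risk-allele frequency in cases implied by an allelic odds ratio."""
    return odds_ratio * p_control / (1 - p_control + odds_ratio * p_control)


def allelic_test_power(p_control, odds_ratio, n_cases, n_controls, alpha):
    """Approximate power of a two-sided two-proportion z-test on allele counts."""
    nd = NormalDist()
    p_case = case_allele_freq(p_control, odds_ratio)
    a_case, a_ctrl = 2 * n_cases, 2 * n_controls  # two alleles per individual
    p_bar = (a_case * p_case + a_ctrl * p_control) / (a_case + a_ctrl)
    se_null = (p_bar * (1 - p_bar) * (1 / a_case + 1 / a_ctrl)) ** 0.5
    se_alt = (p_case * (1 - p_case) / a_case
              + p_control * (1 - p_control) / a_ctrl) ** 0.5
    z_crit = nd.inv_cdf(1 - alpha / 2)
    delta = abs(p_case - p_control)
    upper = 1 - nd.cdf((z_crit * se_null - delta) / se_alt)
    lower = nd.cdf((-z_crit * se_null - delta) / se_alt)
    return upper + lower


# Same genotyping budget (individuals x SNPs), split two ways; Bonferroni alpha.
few_people = allelic_test_power(0.3, 1.3, 1000, 1000, alpha=0.05 / 500_000)
more_people = allelic_test_power(0.3, 1.3, 2000, 2000, alpha=0.05 / 250_000)
print(f"1000 cases/controls at 500k SNPs: power = {few_people:.2f}")
print(f"2000 cases/controls at 250k SNPs: power = {more_people:.2f}")
```

    In this simplified setting, doubling the number of individuals while halving the number of tests raises power, which matches the abstract's remark only for SNPs that remain covered by the smaller panel; a variant no longer tagged would lose power instead.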

  1. Ten years of maintaining and expanding a microbial genome and metagenome analysis system.

    Science.gov (United States)

    Markowitz, Victor M; Chen, I-Min A; Chu, Ken; Pati, Amrita; Ivanova, Natalia N; Kyrpides, Nikos C

    2015-11-01

    Launched in March 2005, the Integrated Microbial Genomes (IMG) system is a comprehensive data management system that supports multidimensional comparative analysis of genomic data. At the core of the IMG system is a data warehouse that contains genome and metagenome datasets sequenced at the Joint Genome Institute or provided by scientific users, as well as public genome datasets available at the National Center for Biotechnology Information Genbank sequence data archive. Genomes and metagenome datasets are processed using IMG's microbial genome and metagenome sequence data processing pipelines and are integrated into the data warehouse using IMG's data integration toolkits. Microbial genome and metagenome application specific data marts and user interfaces provide access to different subsets of IMG's data and analysis toolkits. This review article revisits IMG's original aims, highlights key milestones reached by the system during the past 10 years, and discusses the main challenges faced by a rapidly expanding system, in particular the complexity of maintaining such a system in an academic setting with limited budgets and computing and data management infrastructure.

  2. Chloroplast genome analysis of Australian eucalypts--Eucalyptus, Corymbia, Angophora, Allosyncarpia and Stockwellia (Myrtaceae).

    Science.gov (United States)

    Bayly, Michael J; Rigault, Philippe; Spokevicius, Antanas; Ladiges, Pauline Y; Ades, Peter K; Anderson, Charlotte; Bossinger, Gerd; Merchant, Andrew; Udovicic, Frank; Woodrow, Ian E; Tibbits, Josquin

    2013-12-01

    We present a phylogenetic analysis and comparison of structural features of chloroplast genomes for 39 species of the eucalypt group (genera Eucalyptus, Corymbia, Angophora, and outgroups Allosyncarpia and Stockwellia). We use 41 complete chloroplast genome sequences, adding 39 finished-quality chloroplast genomes to two previously published genomes. Maximum parsimony and Bayesian analyses, based on >7000 variable nucleotide positions, produced one fully resolved phylogenetic tree (35 supported nodes, 27 with 100% bootstrap support). Eucalyptus and its sister lineage Angophora+Corymbia show a deep divergence. Within Eucalyptus, three lineages are resolved: the 'eudesmid', 'symphyomyrt' and 'monocalypt' groups. Corymbia is paraphyletic with respect to Angophora. Gene content and order do not vary among eucalypt chloroplasts; length mutations, especially frame shifts, are uncommon in protein-coding genes. Some non-synonymous mutations are highly incongruent with the overall phylogenetic signal, notably in rbcL, and may be adaptive. Application of custom informatics pipelines (GYDLE Inc.) enabled direct chloroplast genome assembly, resolving each genome to finished-quality with no need for PCR gap-filling or contig order resolution. Analysis of whole chloroplast genomes resolved major eucalypt clades and revealed variable regions of the genome that will be useful in lower-level genetic studies (including phylogeography and geneflow).

  3. The Genome of Nosema sp. Isolate YNPr: A Comparative Analysis of Genome Evolution within the Nosema/Vairimorpha Clade

    Science.gov (United States)

    Ma, Zhenggang; Li, Tian; Zhang, Xiaoyan; Debrunner-Vossbrinck, Bettina A.; Zhou, Zeyang; Vossbrinck, Charles R.

    2016-01-01

    The microsporidian parasite designated here as Nosema sp. Isolate YNPr was isolated from the cabbage butterfly Pieris rapae collected in Honghe Prefecture, Yunnan Province, China. The genome was sequenced by Illumina sequencing and compared to those of two related members of the Nosema/Vairimorpha clade, Nosema ceranae and Nosema apis. Based upon assembly statistics, the Nosema sp. YNPr genome is 3.36 × 10^6 bp with a G+C content of 23.18% and 2,075 protein coding sequences. An “ACCCTT” motif is present approximately 50 bp upstream of the start codon, as reported from other members of the clade and from Encephalitozoon cuniculi, a sister taxon. Comparative small subunit ribosomal DNA (SSU rDNA) analysis as well as genome-wide phylogenetic analysis confirms a closer relationship between N. ceranae and Nosema sp. YNPr than between the two honeybee parasites N. ceranae and N. apis. The more closely related N. ceranae and Nosema sp. YNPr show similarities in a number of structural characteristics such as gene synteny, gene length, gene number, transposon composition and gene reduction. Based on transposable element content of the assemblies, the transposon content of Nosema sp. YNPr is 4.8%, that of N. ceranae is 3.7%, and that of N. apis is 2.5%, with large differences in the types of transposons present among these 3 species. Gene function annotation indicates that the number of genes participating in most metabolic activities is similar in all three species. However, the number of genes in the transcription, general function, and cysteine protease categories is greater in N. apis than in the other two species. Our studies further characterize the evolution of the Nosema/Vairimorpha clade of microsporidia. These organisms maintain variable but very reduced genomes. We are interested in understanding the effects of genetic drift versus natural selection on genome size in the microsporidia and in developing a testable hypothesis for further studies on the genomic
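
    Two of the assembly-level quantities mentioned above, G+C content and the position of a motif upstream of a start codon, are simple to compute. The sketch below is illustrative only: the function names, the 100 bp search window, and the toy sequences are assumptions, and nothing here reflects the pipeline the authors actually used.

```python
def gc_content(sequences):
    """Percentage of G and C among unambiguous bases (A, C, G, T)."""
    gc = total = 0
    for seq in sequences:
        seq = seq.upper()
        gc += seq.count("G") + seq.count("C")
        total += sum(seq.count(base) for base in "ACGT")
    return 100.0 * gc / total


def upstream_motif_distances(genome, cds_start, motif="ACCCTT", window=100):
    """Distances (bp) from each motif start to the start codon at cds_start.

    Only the `window` bases immediately upstream of cds_start are searched;
    the window size is an arbitrary choice for this sketch.
    """
    region_start = max(0, cds_start - window)
    region = genome[region_start:cds_start].upper()
    distances = []
    i = region.find(motif)
    while i != -1:
        distances.append(cds_start - (region_start + i))
        i = region.find(motif, i + 1)
    return distances


# Toy sequence: the motif starts at index 10 and the start codon at index 60.
toy_genome = "T" * 10 + "ACCCTT" + "T" * 44 + "ATGAAA"
print(gc_content(["GGCC", "AATT"]))
print(upstream_motif_distances(toy_genome, cds_start=60))
```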

  4. Complete genome sequence and comparative genome analysis of a new special Yersinia enterocolitica.

    Science.gov (United States)

    Shi, Guoxiang; Su, Mingming; Liang, Junrong; Duan, Ran; Gu, Wenpeng; Xiao, Yuchun; Zhang, Zhewen; Qiu, Haiyan; Zhang, Zheng; Li, Yi; Zhang, Xiaohe; Ling, Yunchao; Song, Lai; Chen, Meili; Zhao, Yongbing; Wu, Jiayan; Jing, Huaiqi; Xiao, Jingfa; Wang, Xin

    2016-09-01

    Yersinia enterocolitica is the most diverse species in the genus Yersinia and shows more polymorphism, especially among the non-pathogenic strains. Individual non-pathogenic Y. enterocolitica strains are wrongly identified because of atypical phenotypes. In this study, we isolated an unusual Y. enterocolitica strain, LC20, from Rattus norvegicus. The strain did not utilize urea and could not be assigned to a biotype. API 20E identified it as Escherichia coli; however, it grew well at 25 °C, whereas E. coli grows well at 37 °C. We analyzed the genome of LC20 and found that the whole chromosome of LC20 was collinear with that of Y. enterocolitica 8081, and that the urease gene was absent from the genome, which is consistent with the API 20E result. Also, the 16S and 23S rRNA genes of LC20 lay on a branch of Y. enterocolitica. Furthermore, the core-based and pan-based phylogenetic trees showed that LC20 was classified into the Y. enterocolitica cluster. Two plasmids (80 and 50 k) from LC20 shared low genetic homology with pYV from the Yersinia genus; one was an ancestral Yersinia plasmid and the other was novel, encoding a number of transposases. Some pathogenic and non-pathogenic Y. enterocolitica-specific genes coexisted in LC20. Thus, although it could not be classified into any Y. enterocolitica biotype due to its special biochemical metabolism, we concluded that LC20 was a Y. enterocolitica strain because its genome was similar to those of other Y. enterocolitica, and it might be a strain with many mutations and combinations emerging in the course of its evolution.

  5. Genome-wide analysis reveals a complex pattern of genomic imprinting in mice.

    Directory of Open Access Journals (Sweden)

    Jason B Wolf

    2008-06-01

    Full Text Available Parent-of-origin-dependent gene expression resulting from genomic imprinting plays an important role in modulating complex traits ranging from developmental processes to cognitive abilities and associated disorders. However, while gene-targeting techniques have allowed for the identification of imprinted loci, very little is known about the contribution of imprinting to quantitative variation in complex traits. Most studies, furthermore, assume a simple pattern of imprinting, resulting in either paternal or maternal gene expression; yet, more complex patterns of effects also exist. As a result, the distribution and number of different imprinting patterns across the genome remain largely unexplored. We address these unresolved issues using a genome-wide scan for imprinted quantitative trait loci (iQTL) affecting body weight and growth in mice using a novel three-generation design. We identified ten iQTL that display much more complex and diverse effect patterns than previously assumed, including four loci with effects similar to the callipyge mutation found in sheep. Three loci display a new phenotypic pattern that we refer to as bipolar dominance, where the two heterozygotes are different from each other while the two homozygotes are identical to each other. Our study furthermore detected a paternally expressed iQTL on Chromosome 7 in a region containing a known imprinting cluster with many paternally expressed genes. Surprisingly, the effects of the iQTL were mostly restricted to traits expressed after weaning. Our results imply that the quantitative effects of an imprinted allele at a locus depend both on its parent of origin and the allele it is paired with. Our findings also show that the imprinting pattern of a locus can be variable over ontogenetic time and, in contrast to current views, may often be stronger at later stages in life.

  6. Analysis of Aspergillus nidulans metabolism at the genome-scale

    DEFF Research Database (Denmark)

    David, Helga; Ozcelik, İlknur Ş; Hofmann, Gerald

    2008-01-01

    Background: Aspergillus nidulans is a member of a diverse group of filamentous fungi, sharing many of the properties of its close relatives with significance in the fields of medicine, agriculture and industry. Furthermore, A. nidulans has been a classical model organism for studies of development...... biology and gene regulation, and thus it has become one of the best-characterized filamentous fungi. It was the first Aspergillus species to have its genome sequenced, and automated gene prediction tools predicted 9,451 open reading frames (ORFs) in the genome, of which less than 10% were assigned...

  7. BambooGDB: a bamboo genome database with functional annotation and an analysis platform.

    Science.gov (United States)

    Zhao, Hansheng; Peng, Zhenhua; Fei, Benhua; Li, Lubin; Hu, Tao; Gao, Zhimin; Jiang, Zehui

    2014-01-01

    Bamboo, as one of the most important non-timber forest products and fastest-growing plants in the world, represents the only major lineage of grasses that is native to forests. Recent success on the first high-quality draft genome sequence of moso bamboo (Phyllostachys edulis) provides new insights on bamboo genetics and evolution. To further extend our understanding on bamboo genome and facilitate future studies on the basis of previous achievements, here we have developed BambooGDB, a bamboo genome database with functional annotation and analysis platform. The de novo sequencing data, together with the full-length complementary DNA and RNA-seq data of moso bamboo composed the main contents of this database. Based on these sequence data, a comprehensively functional annotation for bamboo genome was made. Besides, an analytical platform composed of comparative genomic analysis, protein-protein interactions network, pathway analysis and visualization of genomic data was also constructed. As discovery tools to understand and identify biological mechanisms of bamboo, the platform can be used as a systematic framework for helping and designing experiments for further validation. Moreover, diverse and powerful search tools and a convenient browser were incorporated to facilitate the navigation of these data. As far as we know, this is the first genome database for bamboo. Through integrating high-throughput sequencing data, a full functional annotation and several analysis modules, BambooGDB aims to provide worldwide researchers with a central genomic resource and an extensible analysis platform for bamboo genome. BambooGDB is freely available at http://www.bamboogdb.org/. Database URL: http://www.bamboogdb.org.

  8. Analysis of anoxybacillus genomes from the aspects of lifestyle adaptations, prophage diversity, and carbohydrate metabolism.

    Directory of Open Access Journals (Sweden)

    Kian Mau Goh

    Full Text Available Species of Anoxybacillus are widespread in geothermal springs, manure, and milk-processing plants. The genus is composed of 22 species and two subspecies, but the relationship between its lifestyle and genome is little understood. In this study, two high-quality draft genomes were generated from Anoxybacillus spp. SK3-4 and DT3-1, isolated from Malaysian hot springs. De novo assembly and annotation were performed, followed by comparative genome analysis with the complete genome of Anoxybacillus flavithermus WK1 and two additional draft genomes, of A. flavithermus TNO-09.006 and A. kamchatkensis G10. The genomes of Anoxybacillus spp. are among the smaller of the family Bacillaceae. Despite having smaller genomes, their essential genes related to lifestyle adaptations at elevated temperature, extreme pH, and protection against ultraviolet are complete. Due to the presence of various competence proteins, Anoxybacillus spp. SK3-4 and DT3-1 are able to take up foreign DNA fragments, and some of these transferred genes are important for the survival of the cells. The analysis of intact putative prophage genomes shows that they are highly diversified. Based on the genome analysis using SEED, many of the annotated sequences are involved in carbohydrate metabolism. The presence of glycosyl hydrolases among the Anoxybacillus spp. was compared, and the potential applications of these unexplored enzymes are suggested here. This is the first study that compares Anoxybacillus genomes from the aspect of lifestyle adaptations, the capacity for horizontal gene transfer, and carbohydrate metabolism.

  9. Complete sequence of the mitochondrial genome of Odontamblyopus rubicundus (Perciformes: Gobiidae): genome characterization and phylogenetic analysis

    Indian Academy of Sciences (India)

    Tianxing Liu; Xiaoxiao Jin; Rixin Wang; Tianjun Xu

    2013-12-01

    Odontamblyopus rubicundus is a gobiid fish that inhabits muddy-bottomed coastal waters. In this paper, the first complete mitochondrial genome sequence of O. rubicundus is reported. The complete mitochondrial genome sequence is 17119 bp in length and contains 13 protein-coding genes, two rRNA genes, 22 tRNA genes, a control region and an L-strand origin, as in other teleosts. Most mitochondrial genes are encoded on the H-strand, except for ND6 and seven tRNA genes. Some overlaps, ranging from 1 to 7 bp, occur in protein-coding genes and tRNAs. The possibly nonfunctional L-strand origin folded into a typical stem-loop secondary structure, and a conserved motif (5′-GCCGG-3′) was found at the base of the stem within the tRNA-Cys gene. The TAS, CSB-2 and CSB-3 could be detected in the control region. However, in contrast to most other fishes, the central conserved sequence block domain and the CSB-1 could not be recognized in O. rubicundus, which is consistent with Acanthogobius hasta (Gobiidae). In addition, phylogenetic analyses based on different sequences of species of Gobiidae and different methods showed that the classification of O. rubicundus into Odontamblyopus on the basis of morphology is debatable.

  10. Complete sequence of the mitochondrial genome of Odontamblyopus rubicundus (Perciformes: Gobiidae): genome characterization and phylogenetic analysis.

    Science.gov (United States)

    Liu, Tianxing; Jin, Xiaoxiao; Wang, Rixin; Xu, Tianjun

    2013-12-01

    Odontamblyopus rubicundus is a gobiid fish that inhabits muddy-bottomed coastal waters. In this paper, the first complete mitochondrial genome sequence of O. rubicundus is reported. The complete mitochondrial genome sequence is 17119 bp in length and contains 13 protein-coding genes, two rRNA genes, 22 tRNA genes, a control region and an L-strand origin, as in other teleosts. Most mitochondrial genes are encoded on the H-strand, except for ND6 and seven tRNA genes. Some overlaps, ranging from 1 to 7 bp, occur in protein-coding genes and tRNAs. The possibly nonfunctional L-strand origin folded into a typical stem-loop secondary structure, and a conserved motif (5′-GCCGG-3′) was found at the base of the stem within the tRNA-Cys gene. The TAS, CSB-2 and CSB-3 could be detected in the control region. However, in contrast to most other fishes, the central conserved sequence block domain and the CSB-1 could not be recognized in O. rubicundus, which is consistent with Acanthogobius hasta (Gobiidae). In addition, phylogenetic analyses based on different sequences of species of Gobiidae and different methods showed that the classification of O. rubicundus into Odontamblyopus on the basis of morphology is debatable.

  11. In situ reverse transcriptase-nested polymerase chain reaction to identify intracellular nucleic acids without the necessity of DNAse pretreatment and hybridisation.

    Science.gov (United States)

    Menschikowski, M; Vogel, M; Eckey, R; Dinnebier, G; Jaross, W

    2001-01-01

    In the present study, a protocol for in situ reverse transcriptase-nested polymerase chain reaction (in situ RT-nested PCR) was examined, based on the following modifications. (i) To exclude false positive signals caused by "DNA repair mechanisms" and "endogenous priming", a two-step PCR was applied after reverse transcription. The first step was performed in the presence of extrinsic primers and unlabeled nucleotides, with the maximum number of PCR cycles possible without destroying the cell morphology. The second step consisted of only one annealing/elongation reaction, in which the target sequence was marked by the addition of digoxigenin-labeled nucleotides and intrinsic primers. (ii) To prevent amplification of genomic DNA, nested primer pairs that cross intron sequences were applied. (iii) To minimize the diffusion of PCR products in cells, the extrinsic primers were extended with complementary 5′-tails. This approach results in the generation of high molecular weight concatamers during PCR cycles. By applying this protocol, immunostainings specific for phospholipase A2 of type IIA mRNA were exclusively detectable in the cytoplasm of HepG2 hepatoma cells, which were used as a model system, whereas the nuclei were unstained. Multiple control experiments yielded completely negative results. These data suggest that in situ RT-nested PCR, which is simpler and less time-consuming than in situ RT-PCR-in situ-hybridisation, can be used as an alternative approach to identify intracellular nucleic acids.

  12. Analysis on n-gram statistics and linguistic features of whole genome protein sequences

    Institute of Scientific and Technical Information of China (English)

    DONG Qi-wen; WANG Xiao-long; LIN Lei

    2008-01-01

    To obtain a statistical sequence analysis of the large number of genomic and proteomic sequences available for different organisms, the n-grams of whole genome protein sequences from 20 organisms were extracted. Their linguistic features were analyzed by two tests, the Zipf power law and Shannon entropy, developed for the analysis of natural languages and symbolic sequences. The natural genome proteins and the artificial genome proteins were compared with each other, and some statistical features of n-grams were discovered. The results show that: the n-grams of whole genome protein sequences approximately follow the Zipf law when n is larger than 4; the Shannon n-gram entropy of natural genome proteins is lower than that of artificial proteins; a simple unigram model can distinguish different organisms; and there exist organism-specific usages of "phrases" in protein sequences. It is suggested that further detailed analysis of n-grams of whole genome protein sequences will result in a powerful model for mapping the relationship of protein sequence, structure and function.
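
    The two tests named in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the toy sequence, and the choice of a least-squares fit on the log-log rank-frequency curve as a Zipf check are all assumptions made for illustration.

```python
import math
from collections import Counter


def ngram_counts(sequence, n):
    """Count overlapping n-grams in a protein (or any symbol) sequence."""
    return Counter(sequence[i:i + n] for i in range(len(sequence) - n + 1))


def shannon_entropy(counts):
    """Shannon entropy (in bits) of the n-gram frequency distribution."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def zipf_slope(counts):
    """Least-squares slope of log(frequency) against log(rank).

    A slope near -1 is the classic Zipf signature. Needs at least two
    distinct n-grams, otherwise the fit is undefined.
    """
    freqs = sorted(counts.values(), reverse=True)
    xs = [math.log(rank) for rank in range(1, len(freqs) + 1)]
    ys = [math.log(f) for f in freqs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


# Hypothetical toy sequence, not taken from any of the 20 organisms.
toy = "MKVLAAGIVKVLAAGMKV"
bigrams = ngram_counts(toy, 2)
print(shannon_entropy(bigrams), zipf_slope(bigrams))
```

    On real proteomes the same functions would be applied per organism and per value of n; comparing the entropy of natural sequences with that of shuffled or randomly generated ones mirrors the natural-versus-artificial comparison described above.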

  13. Genomic analysis of the rainbow trout response to crowding

    Science.gov (United States)

    Genomic analyses have the potential to impact selective breeding programs by identifying markers as proxies for traits which are expensive or difficult to measure. One such set of traits is the physiological responses of rainbow trout to the stresses of the aquaculture environment. Typical stresso...

  14. Transcriptome and genome size analysis of the venus flytrap

    DEFF Research Database (Denmark)

    Jensen, Michael Krogh; Vogt, Josef Korbinian; Bressendorff, Simon

    2015-01-01

    The insectivorous Venus flytrap (Dionaea muscipula) is renowned from Darwin's studies of plant carnivory and the origins of species. To provide tools to analyze the evolution and functional genomics of D. muscipula, we sequenced a normalized cDNA library synthesized from mRNA isolated from D...

  15. Analysis of the hybrid genomes of brewing yeasts

    NARCIS (Netherlands)

    Bolat, I.

    2016-01-01

    One of the best guarded secrets of brewers is represented by the brewing yeast employed in beer fermentation, due to its profound impact upon the specific flavour profile of the final product. The current research tackles the genome diversity of lager brewing strains as well as their impact on

  16. Sequencing and analysis of an Irish human genome.

    LENUS (Irish Health Repository)

    Tong, Pin

    2010-01-01

    Recent studies generating complete human sequences from Asian, African and European subgroups have revealed population-specific variation and disease susceptibility loci. Here, choosing a DNA sample from a population of interest due to its relative geographical isolation and genetic impact on further populations, we extend the above studies through the generation of 11-fold coverage of the first Irish human genome sequence.

  17. Molecular cytogenetic applications in analysis of the cancer genome.

    Science.gov (United States)

    Rao, Pulivarthi H; Nandula, Subhadra V; Murty, Vundavalli V

    2007-01-01

    Cancer cells exhibit nonrandom and complex chromosome abnormalities. The role of genomic changes in cancer is well established. However, the identification of complex and cryptic chromosomal changes is beyond the resolution of conventional banding methods. The fluorescence microscopy afforded by recently developed imaging technologies facilitates a precise identification of these chromosome alterations in cancer. The three most commonly utilized molecular cytogenetic methods, comparative genomic hybridization, spectral karyotype, and fluorescence in situ hybridization, which have already become benchmark tools in cancer cytogenetics, are described in this chapter. Comparative genomic hybridization is a powerful tool for screening copy-number changes in tumor genomes without the need for preparation of metaphases from tumor cells. Multicolor spectral karyotype permits visualization of all chromosomes in one experiment, permitting identification of precise chromosomal changes on metaphases derived from tumor cells. The uses of fluorescence in situ hybridization are diverse, including mapping of alterations in single copy genes, chromosomal regions, or entire chromosomes. The opportunities to detect genetic alterations in cancer cells continue to evolve with the use of these methodologies both in diagnosis and research.

  18. Gene hunting : molecular analysis of the chicken genome

    NARCIS (Netherlands)

    Crooijmans, R.P.M.A.

    2000-01-01

    This dissertation describes the development of molecular tools to identify genes that are involved in production and health traits in poultry. To unravel the chicken genome, fluorescent molecular markers (microsatellite markers) were developed and optimized to perform high throughput screening of re

  19. Whole Genome Analysis of Epidemiologically Closely Related Staphylococcus aureus Isolates

    NARCIS (Netherlands)

    M. Schijffelen (Maarten); S.R. Konstantinov (Sergey); G. Lina (Gérard); I. Spiliopoulou (Iris); E. van Duijkeren (Engeline); E.C. Brouwer (Ellen); A.C. Fluit (Ad)

    2013-01-01

    The change of the bacteria from colonizers to pathogens is accompanied by a drastic change in expression profiles. These changes may be due to environmental signals or to mutational changes. We therefore compared the whole genome sequences of four sets of S. aureus isolates. Three sets

  20. QTL Analysis and Functional Genomics of Animal Model

    DEFF Research Database (Denmark)

    Farajzadeh, Leila

    In recent years, the use of functional genomics and next-generation sequencing technologies has increased the probability of success in studies of complex properties. The integration of large data sets from association studies, DNA resequencing, gene expression profiles and phenotypic data...

  1. Pan-genome analysis of Senegalese and Gambian strains of ...

    African Journals Online (AJOL)

    Mbaye

    2016-11-09

    Nov 9, 2016 ... this work, the pan-genome of B. anthracis was studied based on nine strains and using ... Humans can be infected by various routes: ingestion, inhalation ... associated with 96 other projects on an Applied Biosystems SOLiD.

  2. Analysis of the hybrid genomes of brewing yeasts

    NARCIS (Netherlands)

    Bolat, I.

    2016-01-01

    One of the best guarded secrets of brewers is represented by the brewing yeast employed in beer fermentation, due to its profound impact upon the specific flavour profile of the final product. The current research tackles the genome diversity of lager brewing strains as well as their impact on impor

  3. Genomic Analysis of Secondary Metabolite Production by Pseudomonas fluorescens

    Science.gov (United States)

    Pseudomonas fluorescens is a diverse bacterial species known for its ubiquity in natural habitats and its production of secondary metabolites. The high degree of ecological and metabolic diversity represented in P. fluorescens is reflected in the genomic diversity displayed among strains. Certain st...

  4. Genomic and metagenomic analysis of antibiotic resistance in dairy animals

    Science.gov (United States)

    The extent to which carriage of antibiotic resistant bacteria in food animals is responsible for the burden of antibiotic resistance in human infections is currently not well known. Thus, there is a need to further evaluate the genomic diversity of multidrug resistant (MDR) bacteria and the microbi...

  5. Comprehensive DNA methylation analysis of the Aedes aegypti genome

    Science.gov (United States)

    Falckenhayn, Cassandra; Carneiro, Vitor Coutinho; de Mendonça Amarante, Anderson; Schmid, Katharina; Hanna, Katharina; Kang, Seokyoung; Helm, Mark; Dimopoulos, George; Fantappié, Marcelo Rosado; Lyko, Frank

    2016-01-01

    Aedes aegypti mosquitoes are important vectors of viral diseases. Mosquito host factors play key roles in virus control and it has been suggested that dengue virus replication is regulated by Dnmt2-mediated DNA methylation. However, recent studies have shown that Dnmt2 is a tRNA methyltransferase and that Dnmt2-dependent methylomes lack defined DNA methylation patterns, thus necessitating a systematic re-evaluation of the mosquito genome methylation status. We have now searched the Ae. aegypti genome for candidate DNA modification enzymes. This failed to reveal any known (cytosine-5) DNA methyltransferases, but identified homologues for the Dnmt2 tRNA methyltransferase, the Mettl4 (adenine-6) DNA methyltransferase, and the Tet DNA demethylase. All genes were expressed at variable levels throughout mosquito development. Mass spectrometry demonstrated that DNA methylation levels were several orders of magnitude below the levels that are usually detected in organisms with DNA methylation-dependent epigenetic regulation. Furthermore, whole-genome bisulfite sequencing failed to reveal any evidence of defined DNA methylation patterns. These results suggest that the Ae. aegypti genome is unmethylated. Interestingly, additional RNA bisulfite sequencing provided first evidence for Dnmt2-mediated tRNA methylation in mosquitoes. These findings have important implications for understanding the mechanism of Dnmt2-dependent virus regulation. PMID:27805064

  6. Whole genome analysis of a schistosomiasis-transmitting freshwater snail

    DEFF Research Database (Denmark)

    Adema, Coen M; Hillier, Ladeana W; Jones, Catherine S

    2017-01-01

    Biomphalaria snails are instrumental in transmission of the human blood fluke Schistosoma mansoni. With the World Health Organization's goal to eliminate schistosomiasis as a global health problem by 2025, there is now renewed emphasis on snail control. Here, we characterize the genome of Biompha...

  7. On the Analysis of a Repeated Measure Design in Genome-Wide Association Analysis

    Directory of Open Access Journals (Sweden)

    Young Lee

    2014-11-01

    Full Text Available Longitudinal data enable detection of the effect of aging/time, and a repeated measures design is statistically more efficient than cross-sectional data if the correlations between repeated measurements are not large. In particular, when genotyping is more expensive than phenotyping, the collection of longitudinal data can be an efficient strategy for genetic association analysis. However, in spite of these advantages, genome-wide association studies (GWAS) with longitudinal data have rarely been analyzed taking this into account. In this report, we calculate the required sample size to achieve 80% power at the genome-wide significance level for both longitudinal and cross-sectional data, and compare their statistical efficiency. Furthermore, we analyzed the GWAS of eight phenotypes with three observations on each individual in the Korean Association Resource (KARE). A linear mixed model allowing for the correlations between observations for each individual was applied to analyze the longitudinal data, and linear regression was used to analyze the first observation on each individual as cross-sectional data. We found 12 novel genome-wide significant disease susceptibility loci that were then confirmed in the Health Examination cohort, as well as some significant interactions between age/sex and SNPs.

  8. Funding Opportunity: Genomic Data Centers

    Science.gov (United States)

    Funding Opportunity CCG, Funding Opportunity Center for Cancer Genomics, CCG, Center for Cancer Genomics, CCG RFA, Center for cancer genomics rfa, genomic data analysis network, genomic data analysis network centers,

  9. Comparative optical genome analysis of two pangolin species: Manis pentadactyla and Manis javanica.

    Science.gov (United States)

    Zhihai, Huang; Jiang, Xu; Shuiming, Xiao; Baosheng, Liao; Yuan, Gao; Chaochao, Zhai; Xiaohui, Qiu; Wen, Xu; Shilin, Chen

    2016-12-01

    The pangolin is a Pholidota mammal with large keratin scales protecting its skin. Two pangolin species (Manis pentadactyla and Manis javanica) have been recorded as critically endangered on the International Union for Conservation of Nature Red List of Threatened Species. Optical mapping constructs high-resolution restriction maps from single DNA molecules for genome analysis at the megabase scale and to assist genome assembly. Here, we constructed restriction maps of M. pentadactyla and M. javanica using optical mapping to assist with genome assembly and analysis of these species. Genomic DNA was nicked with Nt.BspQI and then labeled using fluorescently labeled bases that were detected by the Irys optical mapping system. In total, 3,313,734 DNA molecules (517.847 Gb) for M. pentadactyla and 3,439,885 DNA molecules (504.743 Gb) for M. javanica were obtained, which corresponded to approximately 178X and 177X genome coverage, respectively. Qualified molecules (≥150 kb with a label density of >6 sites per 100 kb) were analyzed using the de novo assembly program embedded in the IrysView pipeline. We obtained two maps that were 2.91 Gb and 2.85 Gb in size with N50s of 1.88 Mb and 1.97 Mb, respectively. Optical mapping reveals large-scale structural information that is especially important for non-model genomes that lack a good reference. The approach has the potential to guide de novo assembly of genomes sequenced using next-generation sequencing. Our data provide a resource for Manidae genome analysis and references for de novo assembly. This note also provides new insights into Manidae evolutionary analysis at the genome structure level.
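
    The coverage and N50 figures quoted in this abstract follow from simple arithmetic, sketched below in Python. The sketch assumes that coverage is the total molecule length divided by the assembled map size, which reproduces the reported 178X and 177X but is not stated explicitly in the abstract. The function names and the toy contig lengths are hypothetical.

```python
def fold_coverage(total_molecule_gb, genome_size_gb):
    """Fold coverage = total length of collected molecules / genome size."""
    return total_molecule_gb / genome_size_gb


def n50(lengths):
    """Length L such that maps of length >= L cover at least half the total."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("lengths must be non-empty")


# Values from the abstract; dividing by the map size is an assumption.
print(round(fold_coverage(517.847, 2.91)))  # M. pentadactyla
print(round(fold_coverage(504.743, 2.85)))  # M. javanica

# Hypothetical map lengths in Mb, only to show how N50 is computed.
print(n50([2, 3, 4, 5, 6]))
```

    N50 is preferred over the mean map length because it is less sensitive to a large number of very short maps.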

  10. Genome sequence, comparative analysis and haplotype structure of the domestic dog.

    Science.gov (United States)

    Lindblad-Toh, Kerstin; Wade, Claire M; Mikkelsen, Tarjei S; Karlsson, Elinor K; Jaffe, David B; Kamal, Michael; Clamp, Michele; Chang, Jean L; Kulbokas, Edward J; Zody, Michael C; Mauceli, Evan; Xie, Xiaohui; Breen, Matthew; Wayne, Robert K; Ostrander, Elaine A; Ponting, Chris P; Galibert, Francis; Smith, Douglas R; DeJong, Pieter J; Kirkness, Ewen; Alvarez, Pablo; Biagi, Tara; Brockman, William; Butler, Jonathan; Chin, Chee-Wye; Cook, April; Cuff, James; Daly, Mark J; DeCaprio, David; Gnerre, Sante; Grabherr, Manfred; Kellis, Manolis; Kleber, Michael; Bardeleben, Carolyne; Goodstadt, Leo; Heger, Andreas; Hitte, Christophe; Kim, Lisa; Koepfli, Klaus-Peter; Parker, Heidi G; Pollinger, John P; Searle, Stephen M J; Sutter, Nathan B; Thomas, Rachael; Webber, Caleb; Baldwin, Jennifer; Abebe, Adal; Abouelleil, Amr; Aftuck, Lynne; Ait-Zahra, Mostafa; Aldredge, Tyler; Allen, Nicole; An, Peter; Anderson, Scott; Antoine, Claudel; Arachchi, Harindra; Aslam, Ali; Ayotte, Laura; Bachantsang, Pasang; Barry, Andrew; Bayul, Tashi; Benamara, Mostafa; Berlin, Aaron; Bessette, Daniel; Blitshteyn, Berta; Bloom, Toby; Blye, Jason; Boguslavskiy, Leonid; Bonnet, Claude; Boukhgalter, Boris; Brown, Adam; Cahill, Patrick; Calixte, Nadia; Camarata, Jody; Cheshatsang, Yama; Chu, Jeffrey; Citroen, Mieke; Collymore, Alville; Cooke, Patrick; Dawoe, Tenzin; Daza, Riza; Decktor, Karin; DeGray, Stuart; Dhargay, Norbu; Dooley, Kimberly; Dooley, Kathleen; Dorje, Passang; Dorjee, Kunsang; Dorris, Lester; Duffey, Noah; Dupes, Alan; Egbiremolen, Osebhajajeme; Elong, Richard; Falk, Jill; Farina, Abderrahim; Faro, Susan; Ferguson, Diallo; Ferreira, Patricia; Fisher, Sheila; FitzGerald, Mike; Foley, Karen; Foley, Chelsea; Franke, Alicia; Friedrich, Dennis; Gage, Diane; Garber, Manuel; Gearin, Gary; Giannoukos, Georgia; Goode, Tina; Goyette, Audra; Graham, Joseph; Grandbois, Edward; Gyaltsen, Kunsang; Hafez, Nabil; Hagopian, Daniel; Hagos, Birhane; Hall, Jennifer; Healy, Claire; Hegarty, Ryan; Honan, 
Tracey; Horn, Andrea; Houde, Nathan; Hughes, Leanne; Hunnicutt, Leigh; Husby, M; Jester, Benjamin; Jones, Charlien; Kamat, Asha; Kanga, Ben; Kells, Cristyn; Khazanovich, Dmitry; Kieu, Alix Chinh; Kisner, Peter; Kumar, Mayank; Lance, Krista; Landers, Thomas; Lara, Marcia; Lee, William; Leger, Jean-Pierre; Lennon, Niall; Leuper, Lisa; LeVine, Sarah; Liu, Jinlei; Liu, Xiaohong; Lokyitsang, Yeshi; Lokyitsang, Tashi; Lui, Annie; Macdonald, Jan; Major, John; Marabella, Richard; Maru, Kebede; Matthews, Charles; McDonough, Susan; Mehta, Teena; Meldrim, James; Melnikov, Alexandre; Meneus, Louis; Mihalev, Atanas; Mihova, Tanya; Miller, Karen; Mittelman, Rachel; Mlenga, Valentine; Mulrain, Leonidas; Munson, Glen; Navidi, Adam; Naylor, Jerome; Nguyen, Tuyen; Nguyen, Nga; Nguyen, Cindy; Nguyen, Thu; Nicol, Robert; Norbu, Nyima; Norbu, Choe; Novod, Nathaniel; Nyima, Tenchoe; Olandt, Peter; O'Neill, Barry; O'Neill, Keith; Osman, Sahal; Oyono, Lucien; Patti, Christopher; Perrin, Danielle; Phunkhang, Pema; Pierre, Fritz; Priest, Margaret; Rachupka, Anthony; Raghuraman, Sujaa; Rameau, Rayale; Ray, Verneda; Raymond, Christina; Rege, Filip; Rise, Cecil; Rogers, Julie; Rogov, Peter; Sahalie, Julie; Settipalli, Sampath; Sharpe, Theodore; Shea, Terrance; Sheehan, Mechele; Sherpa, Ngawang; Shi, Jianying; Shih, Diana; Sloan, Jessie; Smith, Cherylyn; Sparrow, Todd; Stalker, John; Stange-Thomann, Nicole; Stavropoulos, Sharon; Stone, Catherine; Stone, Sabrina; Sykes, Sean; Tchuinga, Pierre; Tenzing, Pema; Tesfaye, Senait; Thoulutsang, Dawa; Thoulutsang, Yama; Topham, Kerri; Topping, Ira; Tsamla, Tsamla; Vassiliev, Helen; Venkataraman, Vijay; Vo, Andy; Wangchuk, Tsering; Wangdi, Tsering; Weiand, Michael; Wilkinson, Jane; Wilson, Adam; Yadav, Shailendra; Yang, Shuli; Yang, Xiaoping; Young, Geneva; Yu, Qing; Zainoun, Joanne; Zembek, Lisa; Zimmer, Andrew; Lander, Eric S

    2005-12-08

    Here we report a high-quality draft genome sequence of the domestic dog (Canis familiaris), together with a dense map of single nucleotide polymorphisms (SNPs) across breeds. The dog is of particular interest because it provides important evolutionary information and because existing breeds show great phenotypic diversity for morphological, physiological and behavioural traits. We use sequence comparison with the primate and rodent lineages to shed light on the structure and evolution of genomes and genes. Notably, the majority of the most highly conserved non-coding sequences in mammalian genomes are clustered near a small subset of genes with important roles in development. Analysis of SNPs reveals long-range haplotypes across the entire dog genome, and defines the nature of genetic diversity within and across breeds. The current SNP map now makes it possible for genome-wide association studies to identify genes responsible for diseases and traits, with important consequences for human and companion animal health.

  11. Complete genome sequencing and analysis of Saprospira grandis str. Lewin, a predatory marine bacterium.

    Science.gov (United States)

    Saw, Jimmy H W; Yuryev, Anton; Kanbe, Masaomi; Hou, Shaobin; Young, Aaron G; Aizawa, Shin-Ichi; Alam, Maqsudul

    2012-03-19

    Saprospira grandis is a coastal marine bacterium that can capture and prey upon other marine bacteria using a mechanism known as 'ixotrophy'. Here, we present the complete genome sequence of Saprospira grandis str. Lewin isolated from La Jolla beach in San Diego, California. The complete genome sequence comprises a chromosome of 4.35 Mbp and a plasmid of 54.9 Kbp. Genome analysis revealed incomplete pathways for the biosynthesis of nine essential amino acids but presence of a large number of peptidases. The genome encodes multiple copies of sensor globin-coupled rsbR genes thought to be essential for stress response and the presence of such sensor globins in Bacteroidetes is unprecedented. A total of 429 spacer sequences within the three CRISPR repeat regions were identified in the genome and this number is the largest among all the Bacteroidetes sequenced to date.

  12. Genome sequence and comparative analysis of Avibacterium paragallinarum

    Science.gov (United States)

    Requena, David; Chumbe, Ana; Torres, Michael; Alzamora, Ofelia; Ramirez, Manuel; Valdivia-Olarte, Hugo; Gutierrez, Andres Hazaet; Izquierdo-Lara, Ray; Saravia, Luis Enrique; Zavaleta, Milagros; Tataje-Lavanda, Luis; Best, Ivan; Fernández-Sánchez, Manolo; Icochea, Eliana; Zimic, Mirko; Fernández-Díaz, Manolo

    2013-01-01

    Background: Avibacterium paragallinarum is the causative agent of infectious coryza, a highly contagious acute respiratory disease of poultry that affects commercial chickens, laying hens and broilers worldwide. Methodology: In this study, we performed the whole genome sequencing, assembly and annotation of a Peruvian isolate of A. paragallinarum. The genome was sequenced on a 454 GS FLX Titanium system. De novo assembly was performed and annotation was completed with GS De Novo Assembler 2.6 using the H. influenzae str. F3031 gene model. Manual curation of the genome was performed with Artemis. Putative gene functions were predicted with Blast2GO. Virulence factors were identified by comparison with the Virulence Factor Database. Results: The genome obtained has a length of 2.47 Mb with 40.66% GC content. Seventy-five large contigs (>500 nt) were obtained, which comprised 1,204 predicted genes. All the contigs are available in GenBank [GenBank: PRJNA64665]. A total of 103 virulence factors reported in the Virulence Factor Database were found in A. paragallinarum. Forty-four of them are present in 7 species of Haemophilus and are related to pathogenesis, virulence and host immune system evasion. A tetracycline-resistance associated transposon (Tn10) was found in A. paragallinarum, possibly acting as a defense mechanism. Discussion and conclusion: The availability of the A. paragallinarum genome represents an important source of information for the development of diagnostic tests, genotyping, and novel antigens for potential vaccines against infectious coryza. Identification of virulence factors contributes to a better understanding of the pathogenesis, and to planning efforts for prevention and control of the disease. PMID:23861570

  13. MD-SeeGH: a platform for integrative analysis of multi-dimensional genomic data

    Directory of Open Access Journals (Sweden)

    Ng Raymond T

    2008-05-01

    Full Text Available Abstract Background: Recent advances in global genomic profiling methodologies have enabled multi-dimensional characterization of biological systems. Complete analysis of these genomic profiles requires an in-depth look at parallel profiles of segmental DNA copy number status, DNA methylation state, single nucleotide polymorphisms, as well as gene expression profiles. Due to the differences in data types, it is difficult to conduct parallel analysis of multiple datasets from diverse platforms. Results: To address this issue, we have developed an integrative genomic analysis platform, MD-SeeGH, a software tool that allows users to rapidly and directly analyze genomic datasets spanning multiple genomic experiments. With MD-SeeGH, users have the flexibility to easily update datasets in accordance with new genomic builds, make a quality assessment of data using the filtering features, and identify genetic alterations within single or across multiple experiments. Multiple sample analysis in MD-SeeGH allows users to compare profiles from many experiments alongside tracks containing detailed localized gene information, microRNA, CpG islands, and copy number variations. Conclusion: MD-SeeGH is a new platform for the integrative analysis of diverse microarray data, facilitating multiple profile analyses and group comparisons.

  14. Computational workflow for analysis of gain and loss of genes in distantly related genomes

    Directory of Open Access Journals (Sweden)

    Ptitsyn Andrey

    2012-09-01

    Full Text Available Abstract Background: Early evolution of animals led to profound changes in body plan organization, symmetry and the rise of tissue complexity, including formation of muscular and nervous systems. This process was associated with massive restructuring of animal genomes as well as deletion, acquisition and rapid differentiation of genes from a common metazoan ancestor. Here, we present a simple but efficient workflow for elucidation of gene gain and gene loss within major branches of the animal kingdom. Methods: We have designed a pipeline of sequence comparison, clustering and functional annotation using 12 major phyla as illustrative examples. Specifically, for the input we used sets of ab initio predicted gene models from the genomes of six bilaterians, three basal metazoans (Cnidaria, Placozoa, Porifera), two unicellular eukaryotes (Monosiga and Capsospora) and the green plant Arabidopsis as an out-group. Due to the large amounts of data, the software required a high-performance Linux cluster. The final results can be imported into standard spreadsheet analysis software and queried for the numbers and specific sets of genes absent in specific genomes, uniquely present or shared among different taxa. Results and conclusions: The developed software is open source and available free of charge on Open Source principles. It allows the user to address a number of specific questions regarding gene gain and gene loss in particular genomes, and user-defined groups of genomes can be formulated in a type of logical expression. For example, our analysis of 12 sequenced genomes indicated that these genomes possess at least 90,000 unique genes and gene families, suggesting enormous diversity of the genome repertoire in the animal kingdom. Approximately 9% of these gene families are shared universally (homologous among all genomes), 53% are unique to specific taxa, and the rest are shared between two or more distantly related genomes.
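
    The kind of presence/absence query described in this abstract (families shared universally, unique to one genome, or present in one user-defined group and absent from another) can be sketched with plain Python sets. This is only an illustration and not the published workflow: the family identifiers, genome labels and function names are hypothetical.

```python
# Hypothetical presence table: gene family -> genomes in which it occurs.
presence = {
    "fam_A": {"bilaterian_1", "cnidarian", "monosiga", "arabidopsis"},
    "fam_B": {"bilaterian_1"},
    "fam_C": {"bilaterian_1", "cnidarian"},
    "fam_D": {"arabidopsis"},
}
all_genomes = {"bilaterian_1", "cnidarian", "monosiga", "arabidopsis"}


def universal(presence, genomes):
    """Families present in every genome of the given set."""
    return {fam for fam, found in presence.items() if genomes <= found}


def unique_to(presence, genome):
    """Families found in exactly one genome and nowhere else."""
    return {fam for fam, found in presence.items() if found == {genome}}


def present_and_absent(presence, present_in, absent_from):
    """Families in all of `present_in` and in none of `absent_from`."""
    return {
        fam
        for fam, found in presence.items()
        if present_in <= found and not (found & absent_from)
    }


print(universal(presence, all_genomes))
print(unique_to(presence, "bilaterian_1"))
print(present_and_absent(presence, {"bilaterian_1", "cnidarian"}, {"arabidopsis"}))
```

    Expressing each question as a combination of subset and intersection tests corresponds to the "logical expression" over user-defined groups of genomes mentioned above.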

  15. The chickpea genomic web resource: visualization and analysis of the desi-type Cicer arietinum nuclear genome for comparative exploration of legumes.

    Science.gov (United States)

    Misra, Gopal; Priya, Piyush; Bandhiwal, Nitesh; Bareja, Neha; Jain, Mukesh; Bhatia, Sabhyata; Chattopadhyay, Debasis; Tyagi, Akhilesh K; Yadav, Gitanjali

    2014-12-18

    Availability of the draft nuclear genome sequences of small-seeded desi-type legume crop Cicer arietinum has provided an opportunity for investigating unique chickpea genomic features and evaluation of their biological significance. The increasing number of legume genome sequences also presents a challenge for developing reliable and information-driven bioinformatics applications suitable for comparative exploration of this important class of crop plants. The Chickpea Genomic Web Resource (CGWR) is an implementation of a suite of web-based applications dedicated to chickpea genome visualization and comparative analysis, based on next generation sequencing and assembly of Cicer arietinum desi-type genotype ICC4958. CGWR has been designed and configured for mapping, scanning and browsing the significant chickpea genomic features in view of the important existing and potential roles played by the various legume genome projects in mutant mapping and cloning. It also enables comparative informatics of ICC4958 DNA sequence analysis with other wild and cultivated genotypes of chickpea, various other leguminous species as well as several non-leguminous model plants, to enable investigations into evolutionary processes that shape legume genomes. CGWR is an online database offering a comprehensive visual and functional genomic analysis of the chickpea genome, along with customized maps and gene-clustering options. It is also the only plant based web resource supporting display and analysis of nucleosome positioning patterns in the genome. The usefulness of CGWR has been demonstrated with discoveries of biological significance made using this server. The CGWR is compatible with all available operating systems and browsers, and is available freely under the open source license at http://www.nipgr.res.in/CGWR/home.php.

  16. Comparative genomic analysis of bacteriophages specific to the channel catfish pathogen Edwardsiella ictaluri

    Directory of Open Access Journals (Sweden)

    Mead David A

    2011-01-01

    Full Text Available Abstract Background: The bacterial pathogen Edwardsiella ictaluri is a primary cause of mortality in channel catfish raised commercially in aquaculture farms. Additional treatment and diagnostic regimes are needed for this enteric pathogen, motivating the discovery and characterization of bacteriophages specific to E. ictaluri. Results: The genomes of three Edwardsiella ictaluri-specific bacteriophages isolated from geographically distant aquaculture ponds, at different times, were sequenced and analyzed. The genomes for phages eiAU, eiDWF, and eiMSLS are 42.80 kbp, 42.12 kbp, and 42.69 kbp, respectively, and are greater than 95% identical to each other at the nucleotide level. Nucleotide differences were mostly observed in non-coding regions and in structural proteins, with significant variability in the sequences of putative tail fiber proteins. The genome organization of these phages exhibits a pattern shared by other Siphoviridae. Conclusions: These E. ictaluri-specific phage genomes reveal considerable conservation of genomic architecture and sequence identity, even with considerable temporal and spatial divergence in their isolation. Their genomic homogeneity is similarly observed among E. ictaluri bacterial isolates. The genomic analysis of these phages supports the conclusion that these are virulent phages, lacking the capacity for lysogeny or expression of virulence genes. This study contributes to our knowledge of phage genomic diversity and facilitates studies on the diagnostic and therapeutic applications of these phages.

  17. Comparative genome analysis of Spiroplasma melliferum IPMB4A, a honeybee-associated bacterium

    Directory of Open Access Journals (Sweden)

    Lo Wen-Sui

    2013-01-01

    Full Text Available Abstract Background The genus Spiroplasma contains a group of helical, motile, and wall-less bacteria in the class Mollicutes. Similar to other members of this class, such as the animal-pathogenic Mycoplasma and the plant-pathogenic ‘Candidatus Phytoplasma’, all characterized Spiroplasma species were found to be associated with eukaryotic hosts. While most of the Spiroplasma species appeared to be harmless commensals of insects, a small number of species have evolved pathogenicity toward various arthropods and plants. In this study, we isolated a novel strain of honeybee-associated S. melliferum and investigated its genetic composition and evolutionary history by whole-genome shotgun sequencing and comparative analysis with other Mollicutes genomes. Results The whole-genome shotgun sequencing of S. melliferum IPMB4A produced a draft assembly that was ~1.1 Mb in size and covered ~80% of the chromosome. Similar to other Spiroplasma genomes that have been studied to date, we found that this genome contains abundant repetitive sequences that originated from plectrovirus insertions. These phage fragments represented a major obstacle in obtaining a complete genome sequence of Spiroplasma with the current sequencing technology. Comparative analysis of S. melliferum IPMB4A with other Spiroplasma genomes revealed that these phages may have facilitated extensive genome rearrangements in these bacteria and contributed to horizontal gene transfers that led to species-specific adaptation to different eukaryotic hosts. In addition, comparison of gene content with other Mollicutes suggested that the common ancestor of the SEM (Spiroplasma, Entomoplasma, and Mycoplasma) clade may have had a relatively large genome and flexible metabolic capacity; the extremely reduced genomes of present day Mycoplasma and ‘Candidatus Phytoplasma’ species are likely to be the result of independent gene losses in these lineages. Conclusions The findings in this study

  18. Quantitative analysis of polycomb response elements (PREs) at identical genomic locations distinguishes contributions of PRE sequence and genomic environment

    Directory of Open Access Journals (Sweden)

    Okulski Helena

    2011-03-01

    Full Text Available Abstract Background Polycomb/Trithorax response elements (PREs) are cis-regulatory elements essential for the regulation of several hundred developmentally important genes. However, the precise sequence requirements for PRE function are not fully understood, and it is also unclear whether these elements all function in a similar manner. Drosophila PRE reporter assays typically rely on random integration by P-element insertion, but PREs are extremely sensitive to genomic position. Results We adapted the ΦC31 site-specific integration tool to enable systematic quantitative comparison of PREs and sequence variants at identical genomic locations. In this adaptation, a miniwhite (mw) reporter in combination with eye-pigment analysis gives a quantitative readout of PRE function. We compared the Hox PRE Frontabdominal-7 (Fab-7) with a PRE from the vestigial (vg) gene at four landing sites. The analysis revealed that the Fab-7 and vg PREs have fundamentally different properties, both in terms of their interaction with the genomic environment at each site and their inherent silencing abilities. Furthermore, we used the ΦC31 tool to examine the effect of deletions and mutations in the vg PRE, identifying a 106 bp region containing a previously predicted motif (GTGT) that is essential for silencing. Conclusions This analysis showed that different PREs have quantifiably different properties, and that changes in as few as four base pairs have profound effects on PRE function, thus illustrating the power and sensitivity of ΦC31 site-specific integration as a tool for the rapid and quantitative dissection of elements of PRE design.

  19. Whole-genome single-nucleotide-polymorphism analysis for discrimination of Clostridium botulinum group I strains.

    Science.gov (United States)

    Gonzalez-Escalona, Narjol; Timme, Ruth; Raphael, Brian H; Zink, Donald; Sharma, Shashi K

    2014-04-01

    Clostridium botulinum is a genetically diverse Gram-positive bacterium producing extremely potent neurotoxins (botulinum neurotoxins A through G [BoNT/A-G]). The complete genome sequences of three strains harboring only the BoNT/A1 nucleotide sequence are publicly available. Although these strains contain a toxin cluster (HA(+) OrfX(-)) associated with hemagglutinin genes, little is known about the genomes of subtype A1 strains (termed HA(-) OrfX(+)) that lack hemagglutinin genes in the toxin gene cluster. We sequenced the genomes of three BoNT/A1-producing C. botulinum strains: two strains with the HA(+) OrfX(-) cluster (69A and 32A) and one strain with the HA(-) OrfX(+) cluster (CDC297). Whole-genome phylogenic single-nucleotide-polymorphism (SNP) analysis of these strains along with other publicly available C. botulinum group I strains revealed five distinct lineages. Strains 69A and 32A clustered with the C. botulinum type A1 Hall group, and strain CDC297 clustered with the C. botulinum type Ba4 strain 657. This study reports the use of whole-genome SNP sequence analysis for discrimination of C. botulinum group I strains and demonstrates the utility of this analysis in quickly differentiating C. botulinum strains harboring identical toxin gene subtypes. This analysis further supports previous work showing that strains CDC297 and 657 likely evolved from a common ancestor and independently acquired separate BoNT/A1 toxin gene clusters at distinct genomic locations.

  20. A Brief Review: The Z-curve Theory and its Application in Genome Analysis.

    Science.gov (United States)

    Zhang, Ren; Zhang, Chun-Ting

    2014-04-01

    In theoretical physics, there exist two basic mathematical approaches, algebraic and geometrical methods, which, in most cases, are complementary. In the area of genome sequence analysis, however, algebraic approaches have been widely used, while geometrical approaches have been less explored for a long time. The Z-curve theory is a geometrical approach to genome analysis. The Z-curve is a three-dimensional curve that represents a given DNA sequence in the sense that each can be uniquely reconstructed given the other. The Z-curve, therefore, contains all the information that the corresponding DNA sequence carries. The analysis of a DNA sequence can then be performed through studying the corresponding Z-curve. The Z-curve method has found applications in a wide range of areas in the past two decades, including the identification of protein-coding genes, replication origins, horizontally-transferred genomic islands, promoters, translational start sites and isochores, as well as studies on phylogenetics, genome visualization and comparative genomics. Here, we review the progress of Z-curve studies from aspects of both theory and applications in genome analysis.
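
    The one-to-one correspondence between a DNA sequence and its Z-curve can be illustrated with a minimal Python sketch. It assumes the commonly cited definition of the three coordinates (x contrasts purines with pyrimidines, y contrasts amino with keto bases, z contrasts weak with strong hydrogen bonding); the function names `z_curve` and `sequence_from_z_curve` are hypothetical and are not taken from the review.

```python
from typing import List, Tuple

Point = Tuple[int, int, int]

# Per-base step in (x, y, z), assuming the usual Z-curve convention:
#   x_n = (A_n + G_n) - (C_n + T_n)   purine vs pyrimidine
#   y_n = (A_n + C_n) - (G_n + T_n)   amino vs keto
#   z_n = (A_n + T_n) - (G_n + C_n)   weak vs strong hydrogen bonding
STEP = {
    "A": (1, 1, 1),
    "C": (-1, 1, -1),
    "G": (1, -1, -1),
    "T": (-1, -1, 1),
}
BASE_OF_STEP = {step: base for base, step in STEP.items()}


def z_curve(seq: str) -> List[Point]:
    """Return the cumulative points (x_n, y_n, z_n) for n = 1..len(seq)."""
    x = y = z = 0
    points = []
    for base in seq.upper():
        dx, dy, dz = STEP[base]  # raises KeyError on non-ACGT symbols
        x, y, z = x + dx, y + dy, z + dz
        points.append((x, y, z))
    return points


def sequence_from_z_curve(points: List[Point]) -> str:
    """Rebuild the DNA sequence from consecutive Z-curve points."""
    bases = []
    prev = (0, 0, 0)  # the curve starts at the origin
    for point in points:
        step = tuple(c - p for c, p in zip(point, prev))
        bases.append(BASE_OF_STEP[step])
        prev = point
    return "".join(bases)


if __name__ == "__main__":
    curve = z_curve("ATGCGC")
    print(curve)
    print(sequence_from_z_curve(curve))
```

    Because each base maps to a distinct step vector, the differences between consecutive points are enough to recover the sequence, which is the sense in which the curve carries all the information in the sequence.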

  1. Decoding the genome with an integrative analysis tool: combinatorial CRM Decoder.

    Science.gov (United States)

    Kang, Keunsoo; Kim, Joomyeong; Chung, Jae Hoon; Lee, Daeyoup

    2011-09-01

    The identification of genome-wide cis-regulatory modules (CRMs) and characterization of their associated epigenetic features are fundamental steps toward the understanding of gene regulatory networks. Although integrative analysis of available genome-wide information can provide new biological insights, the lack of novel methodologies has become a major bottleneck. Here, we present a comprehensive analysis tool called combinatorial CRM decoder (CCD), which utilizes the publicly available information to identify and characterize genome-wide CRMs in a species of interest. CCD first defines a set of the epigenetic features which is significantly associated with a set of known CRMs as a code called 'trace code', and subsequently uses the trace code to pinpoint putative CRMs throughout the genome. Using 61 genome-wide data sets obtained from 17 independent mouse studies, CCD successfully catalogued ∼12 600 CRMs (five distinct classes) including polycomb repressive complex 2 target sites as well as imprinting control regions. Interestingly, we discovered that ∼4% of the identified CRMs belong to at least two different classes named 'multi-functional CRM', suggesting their functional importance for regulating spatiotemporal gene expression. From these examples, we show that CCD can be applied to any potential genome-wide datasets and therefore will shed light on unveiling genome-wide CRMs in various species.

  2. Genome analysis of the Anaerobic Thermohalophilic bacterium Halothermothrix orenii

    Energy Technology Data Exchange (ETDEWEB)

    Mavromatis, Konstantinos; Ivanova, Natalia; Anderson, Iain; Lykidis, Athanasios; Hooper, Sean D.; Sun, Hui; Kunin, Victor; Lapidus, Alla; Hugenholtz, Philip; Patel, Bharat; Kyrpides, Nikos C.

    2008-11-03

    Halothermothrix orenii is a strictly anaerobic thermohalophilic bacterium isolated from sediment of a Tunisian salt lake. It belongs to the order Halanaerobiales in the phylum Firmicutes. The complete sequence revealed that the genome consists of one circular chromosome of 2578146 bps encoding 2451 predicted genes. This is the first genome sequence of an organism belonging to the Halanaerobiales. Features of both Gram positive and Gram negative bacteria were identified with the presence of both a sporulating mechanism typical of Firmicutes and a characteristic Gram negative lipopolysaccharide being the most prominent. Protein sequence analyses and metabolic reconstruction reveal a unique combination of strategies for thermophilic and halophilic adaptation. H. orenii can serve as a model organism for the study of the evolution of the Gram negative phenotype as well as the adaptation under thermohalophilic conditions and the development of biotechnological applications under conditions that require high temperatures and high salt concentrations.

  3. Analysis of genome rearrangement by block-interchanges.

    Science.gov (United States)

    Lu, Chin Lung; Lin, Ying Chih; Huang, Yen Lin; Tang, Chuan Yi

    2007-01-01

    Block-interchanges are a new kind of genome rearrangement that affects the gene order in a chromosome by swapping two nonintersecting blocks of genes of any length. More recently, the study of such rearrangements is becoming increasingly important because of its applications in molecular evolution. Usually, this kind of study requires solving a combinatorial problem, called the block-interchange distance problem, which is to find a minimum number of block-interchanges between two given gene orders of linear/circular chromosomes to transform one gene order into another. In this chapter, we shall introduce the basics of block-interchange rearrangements and permutation groups in algebra that are useful in analyses of genome rearrangements. In addition, we shall present a simple algorithm on the basis of permutation groups to efficiently solve the block-interchange distance problem, as well as ROBIN, a web server for the online analyses of block-interchange rearrangements.
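
    As a rough illustration of the block-interchange distance problem, the Python sketch below computes the distance from a linear, unsigned gene order to the identity order. It is not the permutation-group algorithm of the chapter and not ROBIN; it assumes the cycle-graph formula usually attributed to Christie (1996), d = (n + 1 - c) / 2, where c is the number of alternating cycles. The function name `block_interchange_distance` is hypothetical.

```python
from typing import Sequence


def block_interchange_distance(perm: Sequence[int]) -> int:
    """Minimum number of block-interchanges needed to sort `perm`.

    `perm` must be a permutation of 1..n describing a linear chromosome.
    Assumes the formula d = (n + 1 - c) / 2, with c the number of
    alternating cycles in the cycle graph of the permutation.
    """
    n = len(perm)
    if sorted(perm) != list(range(1, n + 1)):
        raise ValueError("perm must be a permutation of 1..n")

    # Frame the permutation with 0 on the left and n + 1 on the right.
    ext = [0] + list(perm) + [n + 1]
    pos = {value: index for index, value in enumerate(ext)}

    # Follow one gray edge (v -> v + 1) and then one black edge
    # (v + 1 -> the element immediately to its left in ext).
    succ = [ext[pos[v + 1] - 1] for v in range(n + 1)]

    # Count the cycles of this successor map on the values 0..n.
    seen = [False] * (n + 1)
    cycles = 0
    for start in range(n + 1):
        if seen[start]:
            continue
        cycles += 1
        v = start
        while not seen[v]:
            seen[v] = True
            v = succ[v]

    return (n + 1 - cycles) // 2


if __name__ == "__main__":
    for order in ([1, 2, 3], [2, 3, 1], [4, 3, 2, 1]):
        print(order, block_interchange_distance(order))
```

    For example, the order 2 3 1 is sorted by a single interchange of the blocks "2 3" and "1", while 4 3 2 1 needs two interchanges (4 with 1, then 3 with 2).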

  4. Complete genome analysis of Ketogulonigenium sp.WB0104

    Institute of Scientific and Technical Information of China (English)

    2006-01-01

    Ketogulonigenium sp. may convert L-sorbose into 2-keto-L-gulonic acid, the vitamin C precursor. The genome of Ketogulonigenium sp. WB0104 consists of a circular 2765030 bp chromosome with 61.69% G+C content and two circular plasmids of 267968 and 242707 bp. The genome contains 2727 open reading frames (ORFs). The systems of replication, transcription, translation, carbohydrate and energy metabolism are intact, but the repair system is incomplete. About 640 predicted ORFs have been found to encode transporter proteins, which account for about one fourth of the total predicted ORFs, noticeably higher than in other documented bacteria. This may be because WB0104 is adapted to the soil environment.

  5. Complete genome sequencing and analysis of Saprospira grandis str. Lewin, a predatory marine bacterium

    OpenAIRE

    Saw, Jimmy H. W.; Yuryev, Anton; Kanbe, Masaomi; Hou, Shaobin; Young, Aaron G; Aizawa, Shin-Ichi; Alam, Maqsudul

    2012-01-01

    Saprospira grandis is a coastal marine bacterium that can capture and prey upon other marine bacteria using a mechanism known as ‘ixotrophy’. Here, we present the complete genome sequence of Saprospira grandis str. Lewin isolated from La Jolla beach in San Diego, California. The complete genome sequence comprises a chromosome of 4.35 Mbp and a plasmid of 54.9 Kbp. Genome analysis revealed incomplete pathways for the biosynthesis of nine essential amino acids but presence of a large number of ...

  6. Comparative Genomic Analysis Reveals Organization, Function and Evolution of ars Genes in Pantoea spp.

    OpenAIRE

    Wang, Liying; Wang, Jin; Jing, Chuanyong

    2017-01-01

    Numerous genes are involved in various strategies to resist toxic arsenic (As). However, the As resistance strategy in genus Pantoea is poorly understood. In this study, a comparative genome analysis of 23 Pantoea genomes was conducted. Two vertical genetic arsC-like genes without any contribution to As resistance were found to exist in the 23 Pantoea strains. Besides the two arsC-like genes, As resistance gene clusters arsRBC or arsRBCH were found in 15 Pantoea genomes. These ars clusters we...

  7. Genomic analysis of the symbiotic marine crenarchaeon, Cenarchaeum symbiosum

    Energy Technology Data Exchange (ETDEWEB)

    Hallam, Steven J.; Konstantinidis, Konstantinos T.; Brochier, Celine; Putnam, Nik; Schleper, Christa; Watanabe, Yoh-ichi; Sugahara, Junichi; Preston, Christina; de la Torre, Jose; Richardson, Paul M.; DeLong, Edward F.

    2006-06-24

    Crenarchaea are ubiquitous and abundant microbial constituents of soils, sediments, lakes and ocean waters, yet relatively little is known about their fundamental evolutionary, ecological, and physiological properties. To better describe the ubiquitous nonthermophilic Crenarchaea, we analyzed the genome sequence of one representative, the uncultivated sponge symbiont, Cenarchaeum symbiosum. C. symbiosum genotypes coinhabiting the same host partitioned into two dominant populations, corresponding to previously described a- and b-type ribosomal RNA variants. Although syntenic, overlapping a- and b-type ribotypes harbored significant genetic variability. A single tiling path comprising the dominant a-type genotype was assembled, and used to explore the biological properties of C. symbiosum and its planktonic relatives. Out of a total of 2,066 predicted open reading frames, 36% were more highly conserved with other Archaea. The remainder partitioned between bacteria (18%), eukaryotes (1.5%) and viruses (0.1%). A total of 525 open reading frames were more highly conserved with sequences derived from marine environmental genomic surveys, most probably representing orthologous genes found in free-living planktonic Crenarchaea. The remaining genes partitioned between functional RNAs (2.4%), and hypotheticals (42%) with limited homology to known functional genes. The latter category likely contains genes specifically involved in mediated archaeal-sponge symbiosis. Phylogenetic analyses placed C. symbiosum as a basal crenarchaeon, sharing specific genomic features in common with either Crenarchaea, Euryarchaea, or both. The genome sequence of C. symbiosum reflects a unique and unusual evolutionary, physiological, and ecological history, one remarkably distinct from that of any other previously known microbial lineage.

  8. QTL Analysis and Functional Genomics of Animal Model

    DEFF Research Database (Denmark)

    Farajzadeh, Leila

    In recent years, the use of functional genomics and next-generation sequencing technologies has increased the probability of success in studies of complex properties. The integration of large data sets from association studies, DNA resequencing, gene expression profiles and phenotypic data......, for example, has enabled scientists to examine more complex interactions in connection with studies of properties and diseases. In her PhD project, Leila Farajzadeh integrated different organisational levels in biology, including genotype, phenotype, association studies, transcription profiles and genetic...

  9. STINGRAY: system for integrated genomic resources and analysis

    OpenAIRE

    Wagner, Glauber; Jardim, Rodrigo; Tschoeke, Diogo A; Loureiro, Daniel R.; Ocaña, Kary ACS; Ribeiro, Antonio CB; Vanessa E. Emmel; Probst, Christian M.; Pitaluga, André N; Grisard, Edmundo C; Cavalcanti, Maria C; Campos, Maria LM; Mattoso, Marta; Dávila, Alberto MR

    2014-01-01

    Background The STINGRAY system has been conceived to ease the tasks of integrating, analyzing, annotating and presenting genomic and expression data from Sanger and Next Generation Sequencing (NGS) platforms. Findings STINGRAY includes: (a) a complete and integrated workflow (more than 20 bioinformatics tools) ranging from functional annotation to phylogeny; (b) a MySQL database schema, suitable for data integration and user access control; and (c) a user-friendly graphical web-based interfac...

  10. Comparative Genomic Analysis of Human Fungal Pathogens Causing Paracoccidioidomycosis

    OpenAIRE

    Desjardins, Christopher A; Champion, Mia D.; Holder, Jason W.; Muszewska, Anna; Goldberg, Jonathan; Bailao, Alexandre M.; Brigido, Marcelo de Macedo; Silva Ferreira, Marcia Eliana da; Garcia, Ana Maria; Grynberg, Marcin; Gujja, Sharvari; Heiman, David I.; Henn, Matthew R.; Kodira, Chinnappa D.; Leon-Narvaez, Henry

    2011-01-01

    Paracoccidioides is a fungal pathogen and the cause of paracoccidioidomycosis, a health-threatening human systemic mycosis endemic to Latin America. Infection by Paracoccidioides, a dimorphic fungus in the order Onygenales, is coupled with a thermally regulated transition from a soil-dwelling filamentous form to a yeast-like pathogenic form. To better understand the genetic basis of growth and pathogenicity in Paracoccidioides, we sequenced the genomes of two strains of Paracoccidioides brasi...

  11. General metabolism of Laribacter hongkongensis: a genome-wide analysis

    Directory of Open Access Journals (Sweden)

    Curreem Shirly O

    2011-04-01

    Full Text Available Abstract Background Laribacter hongkongensis is associated with community-acquired gastroenteritis and traveler's diarrhea. In this study, we performed an in-depth annotation of the genes and pathways of the general metabolism of L. hongkongensis and correlated them with its phenotypic characteristics. Results The L. hongkongensis genome possesses the pentose phosphate and gluconeogenesis pathways and tricarboxylic acid and glyoxylate cycles, but incomplete Embden-Meyerhof-Parnas and Entner-Doudoroff pathways, in agreement with its asaccharolytic phenotype. It contains enzymes for biosynthesis and β-oxidation of saturated fatty acids, biosynthesis of all 20 universal amino acids and selenocysteine, the latter not observed in Neisseria gonorrhoeae, Neisseria meningitidis and Chromobacterium violaceum. The genome contains a variety of dehydrogenases, enabling it to utilize different substrates as electron donors. It encodes three terminal cytochrome oxidases for respiration using oxygen as the electron acceptor under aerobic and microaerophilic conditions and four reductases for respiration with alternative electron acceptors under anaerobic conditions. The presence of complete tetrathionate reductase operon may confer survival advantage in mammalian host in association with diarrhea. The genome contains CDSs for incorporating sulfur and nitrogen by sulfate assimilation, ammonia assimilation and nitrate reduction. The existence of both glutamate dehydrogenase and glutamine synthetase/glutamate synthase pathways suggests an importance of ammonia metabolism in the living environments that it may encounter. Conclusions The L. hongkongensis genome possesses a variety of genes and pathways for carbohydrate, amino acid and lipid metabolism, respiratory chain and sulfur and nitrogen metabolism. These allow the bacterium to utilize various substrates for energy production and survive in different environmental niches.

  12. Comparative genomic analysis of two-component regulatory proteins in Pseudomonas syringae

    DEFF Research Database (Denmark)

    Lavin, J.L.; Kiil, Kristoffer; Resano, O.

    2007-01-01

    important differences in TCS proteins among the three P. syringae pathovars. Conclusion: In this article we present a thorough analysis of the identification and distribution of TCS proteins among the sequenced genomes of P. syringae. We have identified differences in TCS proteins among the three P...... requires a complex array of TCS proteins to cope with diverse plant hosts, host responses, and environmental conditions. Results: Based on the genomic data, pattern searches with Hidden Markov Model (HMM) profiles have been used to identify putative HKs and RRs. The genomes of Psy B728a, Pto DC3000 and Pph...... 1448A were found to contain a large number of genes encoding TCS proteins, and a core of complete TCS proteins were shared between these genomes: 30 putative TCS clusters, 11 orphan HKs, 33 orphan RRs, and 16 hybrid HKs. A close analysis of the distribution of genes encoding TCS proteins revealed...

  13. Whole genome sequence analysis of unidentified genetically modified papaya for development of a specific detection method.

    Science.gov (United States)

    Nakamura, Kosuke; Kondo, Kazunari; Akiyama, Hiroshi; Ishigaki, Takumi; Noguchi, Akio; Katsumata, Hiroshi; Takasaki, Kazuto; Futo, Satoshi; Sakata, Kozue; Fukuda, Nozomi; Mano, Junichi; Kitta, Kazumi; Tanaka, Hidenori; Akashi, Ryo; Nishimaki-Mogami, Tomoko

    2016-08-15

    Identification of transgenic sequences in an unknown genetically modified (GM) papaya (Carica papaya L.) by whole genome sequence analysis was demonstrated. Whole genome sequence data were generated for a GM-positive fresh papaya fruit commodity detected in monitoring using real-time polymerase chain reaction (PCR). The sequences obtained were mapped against an open database for papaya genome sequence. Transgenic construct- and event-specific sequences were identified as a GM papaya developed to resist infection from a Papaya ringspot virus. Based on the transgenic sequences, a specific real-time PCR detection method for GM papaya applicable to various food commodities was developed. Whole genome sequence analysis enabled identifying unknown transgenic construct- and event-specific sequences in GM papaya and development of a reliable method for detecting them in papaya food commodities.

  14. Genetic Characterization and Comparative Genome Analysis of Brucella melitensis Isolates from India

    Directory of Open Access Journals (Sweden)

    Sarwar Azam

    2016-01-01

    Full Text Available Brucellosis is the most frequent zoonotic disease worldwide, with over 500,000 new human infections every year. Brucella melitensis, the most virulent species in humans, primarily affects goats and the zoonotic transmission occurs by ingestion of unpasteurized milk products or through direct contact with fetal tissues. Brucellosis is endemic in India but no information is available on population structure and genetic diversity of Brucella spp. in India. We performed multilocus sequence typing of four B. melitensis strains isolated from naturally infected goats from India. For more detailed genetic characterization, we carried out whole genome sequencing and comparative genome analysis of one of the B. melitensis isolates, Bm IND1. Genome analysis identified 141 unique SNPs, 78 VNTRs, 51 Indels, and 2 putative prophage integrations in the Bm IND1 genome. Our data may help to develop improved epidemiological typing tools and efficient preventive strategies to control brucellosis.

  15. Genetic Characterization and Comparative Genome Analysis of Brucella melitensis Isolates from India

    Science.gov (United States)

    Azam, Sarwar; Rao, Sashi Bhushan; Jakka, Padmaja; NarasimhaRao, Veera; Bhargavi, Bindu; Gupta, Vivek Kumar

    2016-01-01

    Brucellosis is the most frequent zoonotic disease worldwide, with over 500,000 new human infections every year. Brucella melitensis, the most virulent species in humans, primarily affects goats and the zoonotic transmission occurs by ingestion of unpasteurized milk products or through direct contact with fetal tissues. Brucellosis is endemic in India but no information is available on population structure and genetic diversity of Brucella spp. in India. We performed multilocus sequence typing of four B. melitensis strains isolated from naturally infected goats from India. For more detailed genetic characterization, we carried out whole genome sequencing and comparative genome analysis of one of the B. melitensis isolates, Bm IND1. Genome analysis identified 141 unique SNPs, 78 VNTRs, 51 Indels, and 2 putative prophage integrations in the Bm IND1 genome. Our data may help to develop improved epidemiological typing tools and efficient preventive strategies to control brucellosis. PMID:27525259

  16. Genetic Characterization and Comparative Genome Analysis of Brucella melitensis Isolates from India.

    Science.gov (United States)

    Azam, Sarwar; Rao, Sashi Bhushan; Jakka, Padmaja; NarasimhaRao, Veera; Bhargavi, Bindu; Gupta, Vivek Kumar; Radhakrishnan, Girish

    2016-01-01

    Brucellosis is the most frequent zoonotic disease worldwide, with over 500,000 new human infections every year. Brucella melitensis, the most virulent species in humans, primarily affects goats and the zoonotic transmission occurs by ingestion of unpasteurized milk products or through direct contact with fetal tissues. Brucellosis is endemic in India but no information is available on population structure and genetic diversity of Brucella spp. in India. We performed multilocus sequence typing of four B. melitensis strains isolated from naturally infected goats from India. For more detailed genetic characterization, we carried out whole genome sequencing and comparative genome analysis of one of the B. melitensis isolates, Bm IND1. Genome analysis identified 141 unique SNPs, 78 VNTRs, 51 Indels, and 2 putative prophage integrations in the Bm IND1 genome. Our data may help to develop improved epidemiological typing tools and efficient preventive strategies to control brucellosis.

  17. Bioinformatics tools and databases for whole genome sequence analysis of Mycobacterium tuberculosis.

    Science.gov (United States)

    Faksri, Kiatichai; Tan, Jun Hao; Chaiprasert, Angkana; Teo, Yik-Ying; Ong, Rick Twee-Hee

    2016-11-01

    Tuberculosis (TB) is an infectious disease of global public health importance caused by Mycobacterium tuberculosis complex (MTC) in which M. tuberculosis (Mtb) is the major causative agent. Recent advancements in genomic technologies such as next generation sequencing have enabled high throughput cost-effective generation of whole genome sequence information from Mtb clinical isolates, providing new insights into the evolution, genomic diversity and transmission of the Mtb bacteria, including molecular mechanisms of antibiotic resistance. The large volume of sequencing data generated however necessitated effective and efficient management, storage, analysis and visualization of the data and results through development of novel and customized bioinformatics software tools and databases. In this review, we aim to provide a comprehensive survey of the current freely available bioinformatics software tools and publicly accessible databases for genomic analysis of Mtb for identifying disease transmission in molecular epidemiology and in rapid determination of the antibiotic profiles of clinical isolates for prompt and optimal patient treatment.

  18. Transcriptome and genome size analysis of the Venus flytrap.

    Science.gov (United States)

    Jensen, Michael Krogh; Vogt, Josef Korbinian; Bressendorff, Simon; Seguin-Orlando, Andaine; Petersen, Morten; Sicheritz-Pontén, Thomas; Mundy, John

    2015-01-01

    The insectivorous Venus flytrap (Dionaea muscipula) is renowned from Darwin's studies of plant carnivory and the origins of species. To provide tools to analyze the evolution and functional genomics of D. muscipula, we sequenced a normalized cDNA library synthesized from mRNA isolated from D. muscipula flowers and traps. Using the Oases transcriptome assembler 79,165,657 quality trimmed reads were assembled into 80,806 cDNA contigs, with an average length of 679 bp and an N50 length of 1,051 bp. A total of 17,047 unique proteins were identified, and assigned to Gene Ontology (GO) and classified into functional categories. A total of 15,547 full-length cDNA sequences were identified, from which open reading frames were detected in 10,941. Comparative GO analyses revealed that D. muscipula is highly represented in molecular functions related to catalytic, antioxidant, and electron carrier activities. Also, using a single copy sequence PCR-based method, we estimated that the genome size of D. muscipula is approx. 3 Gb. Our genome size estimate and transcriptome analyses will contribute to future research on this fascinating, monotypic species and its heterotrophic adaptations.
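
    The assembly statistics quoted above (average contig length and N50) can be computed from a list of contig lengths. The Python sketch below is a minimal illustration under the common definition of N50 as the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the assembled bases; the function names and the toy lengths are hypothetical and are not data from the study.

```python
from typing import Sequence


def mean_length(lengths: Sequence[int]) -> float:
    """Average contig length in bp."""
    if not lengths:
        raise ValueError("no contigs given")
    return sum(lengths) / len(lengths)


def n50(lengths: Sequence[int]) -> int:
    """Length L such that contigs of length >= L hold at least half of all bases."""
    if not lengths:
        raise ValueError("no contigs given")
    total = sum(lengths)
    running = 0
    # Walk from the longest contig downwards until half the bases are covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable for non-empty input")


if __name__ == "__main__":
    toy_contigs = [200, 300, 400, 500, 600]  # made-up lengths, in bp
    print("mean:", mean_length(toy_contigs))
    print("N50:", n50(toy_contigs))
```

    Because N50 weights contigs by the bases they contain, it exceeds the mean whenever a few long contigs hold much of the assembly, which is consistent with the reported N50 (1,051 bp) being larger than the average length (679 bp).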

  19. Transcriptome and genome size analysis of the Venus flytrap.

    Directory of Open Access Journals (Sweden)

    Michael Krogh Jensen

    Full Text Available The insectivorous Venus flytrap (Dionaea muscipula) is renowned from Darwin's studies of plant carnivory and the origins of species. To provide tools to analyze the evolution and functional genomics of D. muscipula, we sequenced a normalized cDNA library synthesized from mRNA isolated from D. muscipula flowers and traps. Using the Oases transcriptome assembler 79,165,657 quality trimmed reads were assembled into 80,806 cDNA contigs, with an average length of 679 bp and an N50 length of 1,051 bp. A total of 17,047 unique proteins were identified, and assigned to Gene Ontology (GO) and classified into functional categories. A total of 15,547 full-length cDNA sequences were identified, from which open reading frames were detected in 10,941. Comparative GO analyses revealed that D. muscipula is highly represented in molecular functions related to catalytic, antioxidant, and electron carrier activities. Also, using a single copy sequence PCR-based method, we estimated that the genome size of D. muscipula is approx. 3 Gb. Our genome size estimate and transcriptome analyses will contribute to future research on this fascinating, monotypic species and its heterotrophic adaptations.

  20. MIPS: analysis and annotation of genome information in 2007.

    Science.gov (United States)

    Mewes, H W; Dietmann, S; Frishman, D; Gregory, R; Mannhaupt, G; Mayer, K F X; Münsterkötter, M; Ruepp, A; Spannagl, M; Stümpflen, V; Rattei, T

    2008-01-01

    The Munich Information Center for Protein Sequences (MIPS-GSF, Neuherberg, Germany) combines automatic processing of large amounts of sequences with manual annotation of selected model genomes. Due to the massive growth of the available data, the depth of annotation varies widely between independent databases. Also, the criteria for the transfer of information from known to orthologous sequences are diverse. Coping with the task of global in-depth genome annotation has become unfeasible. Therefore, our efforts are dedicated to three levels of annotation: (i) the curation of selected genomes, in particular from fungal and plant taxa (e.g. CYGD, MNCDB, MatDB), (ii) the comprehensive, consistent, automatic annotation employing exhaustive methods for the computation of sequence similarities and sequence-related attributes as well as the classification of individual sequences (SIMAP, PEDANT and FunCat) and (iii) the compilation of manually curated databases for protein interactions based on scrutinized information from the literature to serve as an accepted set of reliable annotated interaction data (MPACT, MPPI, CORUM). All databases and tools described as well as the detailed descriptions of our projects can be accessed through the MIPS web server (http://mips.gsf.de).

  1. Chromosome region-specific libraries for human genome analysis

    Energy Technology Data Exchange (ETDEWEB)

    Kao, Fa-Ten.

    1991-01-01

    We have made important progress since the beginning of the current grant year. We have further developed the microdissection and PCR-assisted microcloning techniques using the linker-adaptor method. We have critically evaluated the microdissection libraries constructed by this microtechnology and proved that they are of high quality. We further demonstrated that these microdissection clones are useful in identifying corresponding YAC clones for a thousand-fold expansion of the genomic coverage and for contig construction. We are also improving the technique of cloning the dissected fragments in a test tube by the TDT method. We are applying both of these PCR cloning techniques to human chromosomes 2 and 5 to construct region-specific libraries for physical mapping purposes of LLNL and LANL. Finally, we are exploring efficient procedures to use unique sequence microclones to isolate cDNA clones from defined chromosomal regions as valuable resources for identifying expressed gene sequences in the human genome. We believe that we are making important progress under the auspices of this DOE human genome program grant and we will continue to make significant contributions in the coming year. 4 refs., 4 figs.

  2. Quantitative analysis of genomic element interactions by molecular colony technique.

    Science.gov (United States)

    Gavrilov, Alexey A; Chetverina, Helena V; Chermnykh, Elina S; Razin, Sergey V; Chetverin, Alexander B

    2014-03-01

    Distant genomic elements were found to interact within the folded eukaryotic genome. However, the experimental approach used (chromosome conformation capture, 3C) enables neither determination of the percentage of cells in which the interactions occur nor demonstration of simultaneous interaction of >2 genomic elements. Each of the above can be done using in-gel replication of interacting DNA segments, the technique reported here. Chromatin fragments released from formaldehyde-cross-linked cells by sodium dodecyl sulfate extraction and sonication are distributed in a polyacrylamide gel layer followed by amplification of selected test regions directly in the gel by multiplex polymerase chain reaction. The fragments that have been cross-linked and separate fragments give rise to multi- and monocomponent molecular colonies, respectively, which can be distinguished and counted. Using in-gel replication of interacting DNA segments, we demonstrate that in the material from mouse erythroid cells, the majority of fragments containing the promoters of active β-globin genes and their remote enhancers do not form complexes stable enough to survive sodium dodecyl sulfate extraction and sonication. This indicates that either these elements do not interact directly in the majority of cells at a given moment, or the formed DNA-protein complex cannot be stabilized by formaldehyde cross-linking.

  3. [RAPD analysis of the intraspecific and interspecific variation and phylogenetic relationships of Aegilops L. species with the U genome].

    Science.gov (United States)

    Goriunova, S V; Chikida, N N; Kochieva, E Z

    2010-07-01

    RAPD analysis was used to study the genetic variation and phylogenetic relationships of polyploid Aegilops species with the U genome. In total, 115 DNA samples of eight polyploid species containing the U genome and the diploid species Ae. umbellulata (U) were examined. Substantial interspecific polymorphism was observed for the majority of the polyploid species with the U genome (interspecific differences, 0.01-0.2; proportion of polymorphic loci, 56.6-88.2%). Aegilops triuncialis was identified as the only alloploid species with low interspecific polymorphism (interspecific differences, 0-0.01, P = 50%) in the U-genome group. The U-genome Aegilops species proved to be separated from other species of the genus. The phylogenetic relationships were established for the U-genome species. The greatest separation within the U-genome group was observed for the US-genome species Ae. kotschyi and Ae. variabilis. The tetraploid species Ae. triaristata and Ae. columnaris, which had the UX genome, and the hexaploid species Ae. recta (UXN) were found to be related to each other and separate from the UM-genome species. A similarity was observed between the UM-genome species Ae. ovata and Ae. biuncialis and the ancestral diploid U-genome species Ae. umbellulata. The UC-genome species Ae. triuncialis was rather separate and slightly similar to the UX-genome species.

  4. Differentiation of Enterococcus faecium from Lactobacillus delbrueckii subsp. bulgaricus and Streptococcus thermophilus strains by PCR and dot-blot hybridisation.

    Science.gov (United States)

    Langa, S; Fernández, A; Martín, R; Reviriego, C; Marín, M L; Fernández, L; Rodríguez, J M

    2003-12-01

    Variations in length and sequence of the 16S/23S spacer region of Enterococcus faecium provided the basis for development of simple PCR and dot-blot hybridisation assays that enabled the differentiation of potentially probiotic Enterococcus faecium strains from Lactobacillus delbrueckii subsp. bulgaricus and Streptococcus thermophilus. Such assays may be useful for differentiation of yoghurt starter cultures and enterococcal strains when they are simultaneously present in probiotic food products.

  5. Hybridising Medicine: Illness, Healing and the Dynamics of Reciprocal Exchange on the Upper Guinea Coast (West Africa)

    OpenAIRE

    Philip J. Havik

    2016-01-01

    The present article seeks to fill a number of lacunae with regard to the study of the circulation and assimilation of different bodies of medical knowledge in an important cultural contact zone, that is the Upper Guinea Coast. Building upon ongoing research on trade and cultural brokerage in the area, it focuses upon shifting attitudes and practices with regard to health and healing as a result of cultural interaction and hybridisation against the background of growing intra-African and Afro-...

  6. SpeedSeq: Ultra-fast personal genome analysis and interpretation

    Science.gov (United States)

    Chiang, Colby; Layer, Ryan M.; Faust, Gregory G.; Lindberg, Michael R.; Rose, David B.; Garrison, Erik P.; Marth, Gabor T.; Quinlan, Aaron R.; Hall, Ira M.

    2015-01-01

    SpeedSeq is an open-source genome analysis platform that accomplishes alignment, variant detection and functional annotation of a 50× human genome in 13 hours on a low-cost server, alleviating a bioinformatics bottleneck that typically demands weeks of computation with extensive hands-on expert involvement. SpeedSeq offers competitive or superior performance to current methods for detecting germline and somatic single nucleotide variants, indels, and structural variants, and includes novel functionality for streamlined interpretation. PMID:26258291

  7. Genome-wide association study meta-analysis identifies seven new rheumatoid arthritis risk loci

    OpenAIRE

    Stahl, Eli A; Raychaudhuri, Soumya; Remmers, Elaine F.; Xie, Gang; Eyre, Stephen; Thomson, Brian P.; Li, Yonghong; Kurreeman, Fina A. S.; Zhernakova, Alexandra; Hinks, Anne; Guiducci, Candace; Chen, Robert; Alfredsson, Lars; Amos, Christopher I.; Ardlie, Kristin G.

    2010-01-01

    To identify novel genetic risk factors for rheumatoid arthritis (RA), we conducted a genome-wide association study (GWAS) meta-analysis of 5,539 autoantibody positive RA cases and 20,169 controls of European descent, followed by replication in an independent set of 6,768 RA cases and 8,806 controls. Of 34 SNPs selected for replication, 7 novel RA risk alleles were identified at genome-wide significance (P

  8. 'PACLIMS': A component LIM system for high-throughput functional genomic analysis

    OpenAIRE

    Farman Mark; Patel Gayatri; Orbach Marc J; Tucker Sara; Galadima Natalia; Mitchell Thomas; Floyd Anna; Nolin Shelly; Windham Donald; Diener Stephen; Brown Douglas; Rajagopalon Ravi; Donofrio Nicole; Pampanwar Vishal; Soderlund Cari

    2005-01-01

    Abstract Background Recent advances in sequencing techniques leading to cost reduction have resulted in the generation of a growing number of sequenced eukaryotic genomes. Computational tools greatly assist in defining open reading frames and assigning tentative annotations. However, gene functions cannot be asserted without biological support through, among other things, mutational analysis. In taking a genome-wide approach to functionally annotate an entire organism, in this application the...

  9. Structure-infectivity analysis of the human rhinovirus genomic RNA 3' non-coding region.

    OpenAIRE

    1996-01-01

    The specific recognition of genomic positive strand RNAs as templates for the synthesis of intermediate negative strands by the picornavirus replication machinery is presumably mediated by cis-acting sequences within the genomic RNA 3' non-coding region (NCR). A structure-infectivity analysis was conducted on the 44 nt human rhinovirus 14 (HRV14) 3' NCR to identify the primary sequence and/or secondary structure determinants required for viral replication. Using biochemical RNA secondary stru...

  10. A genome-wide 20 K citrus microarray for gene expression analysis

    OpenAIRE

    Martínez-Godoy, M. Ángeles; Mauri, Nuria; Juárez, José; Marqués, M. Carmen; Santiago, Julia; Forment, Javier; Gadea Vacas, José

    2008-01-01

    Background: Understanding of genetic elements that contribute to key aspects of citrus biology will impact future improvements in this economically important crop. Global gene expression analysis demands microarray platforms with a high genome coverage. In recent years, genome-wide EST collections have been generated in citrus, opening the possibility to create new tools for functional genomics in this crop plant. Results: We have designed and constructed a publicly available ...

  11. Targeted analysis of whole genome sequence data to diagnose genetic cardiomyopathy.

    Science.gov (United States)

    Golbus, Jessica R; Puckelwartz, Megan J; Dellefave-Castillo, Lisa; Fahrenbach, John P; Nelakuditi, Viswateja; Pesce, Lorenzo L; Pytel, Peter; McNally, Elizabeth M

    2014-12-01

    Cardiomyopathy is highly heritable but genetically diverse. At present, genetic testing for cardiomyopathy uses targeted sequencing to simultaneously assess the coding regions of >50 genes. New genes are routinely added to panels to improve the diagnostic yield. With the anticipated $1000 genome, it is expected that genetic testing will shift toward comprehensive genome sequencing accompanied by targeted gene analysis. Therefore, we assessed the reliability of whole genome sequencing and targeted analysis to identify cardiomyopathy variants in 11 subjects with cardiomyopathy. Whole genome sequencing with an average of 37× coverage was combined with targeted analysis focused on 204 genes linked to cardiomyopathy. Genetic variants were scored using multiple prediction algorithms combined with frequency data from public databases. This pipeline yielded 1 to 14 potentially pathogenic variants per individual. Variants were further analyzed using clinical criteria and segregation analysis, where available. Three of 3 previously identified primary mutations were detected by this analysis. In 6 subjects for whom the primary mutation was previously unknown, we identified mutations that segregated with disease, had clinical correlates, and had additional pathological correlation to provide evidence for causality. For 2 subjects with previously known primary mutations, we identified additional variants that may act as modifiers of disease severity. In total, we identified the likely pathological mutation in 9 of 11 (82%) subjects. These pilot data demonstrate that ≈30 to 40× coverage whole genome sequencing combined with targeted analysis is feasible and sensitive to identify rare variants in cardiomyopathy-associated genes. © 2014 American Heart Association, Inc.
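
    Average sequencing coverage, such as the 37× figure in this record, is commonly estimated as total sequenced bases divided by genome size. The sketch below works through that arithmetic. The 3.1 Gb human genome size and the helper names `mean_coverage` and `bases_needed` are assumptions for illustration, not values or methods from the study.

```python
# Assumed approximate haploid human genome size in bp (illustrative only).
ASSUMED_HUMAN_GENOME_BP = 3_100_000_000


def mean_coverage(total_sequenced_bases, genome_size_bp):
    """Average depth: total sequenced bases divided by genome size."""
    if genome_size_bp <= 0:
        raise ValueError("genome size must be positive")
    return total_sequenced_bases / genome_size_bp


def bases_needed(target_coverage, genome_size_bp):
    """Total bases that must be sequenced to reach a target mean depth."""
    if target_coverage < 0 or genome_size_bp <= 0:
        raise ValueError("coverage must be >= 0 and genome size positive")
    return target_coverage * genome_size_bp


total = bases_needed(37, ASSUMED_HUMAN_GENOME_BP)
print(total)                                           # 114700000000
print(mean_coverage(total, ASSUMED_HUMAN_GENOME_BP))   # 37.0
```

    Under these assumptions, 37× coverage corresponds to roughly 115 Gb of sequence. Real depth varies along the genome, so the mean is only a summary figure.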

  12. Examination of equine glandular stomach lesions for bacteria, including Helicobacter spp by fluorescence in situ hybridisation

    Directory of Open Access Journals (Sweden)

    Olsen Susanne N

    2010-03-01

    Full Text Available Abstract Background The equine glandular stomach is commonly affected by erosion and ulceration. The aim of this study was to assess whether bacteria, including Helicobacter, could be involved in the aetiology of gastric glandular lesions seen in horses. Results Stomach lesions, as well as normal appea