WorldWideScience

Sample records for full-length genomic characterization

  1. Full-length genomic characterization and molecular evolution of canine parvovirus in China.

    Science.gov (United States)

    Zhou, Ling; Tang, Qinghai; Shi, Lijun; Kong, Miaomiao; Liang, Lin; Mao, Qianqian; Bu, Bin; Yao, Lunguang; Zhao, Kai; Cui, Shangjin; Leal, Élcio

    2016-06-01

    Canine parvovirus type 2 (CPV-2) can cause acute haemorrhagic enteritis in dogs and myocarditis in puppies. This disease has become one of the most serious infectious diseases of dogs. During 2014 in China, there were many cases of acute infectious diarrhoea in dogs. Some faecal samples were negative for the CPV-2 antigen based on a colloidal gold test strip but were positive based on PCR, and a viral strain was isolated from one such sample. The cytopathic effect on susceptible cells and the results of the immunoperoxidase monolayer assay, PCR, and sequencing indicated that the pathogen was CPV-2. The strain was named CPV-NY-14, and the full-length genome was sequenced and analysed. A maximum likelihood tree was constructed using the full-length genome and all available CPV-2 genomes. New strains have replaced the original strain in Taiwan and Italy, although the CPV-2a strain is still predominant there. However, CPV-2a still causes many cases of acute infectious diarrhoea in dogs in China.

  2. Genetic characterization of human herpesvirus type 1: Full-length genome sequence of strain obtained from an encephalitis case from India

    Directory of Open Access Journals (Sweden)

    Vijay P Bondre

    2016-01-01

    Interpretation & conclusions: Our results showed that the full-length genome sequence generated from an Indian HSV-1 isolate shared a close genetic relationship with the American KOS and Chinese CR38 strains which belonged to the Asian genetic lineage. Recombination analysis of the Indian isolate demonstrated multiple recombination crossover points throughout the genome. This full-length genome sequence amplified from the Indian isolate would be helpful to study HSV evolution, the genetic basis of differential pathogenesis, host-virus interactions and viral factors contributing towards differential clinical outcome in human infections.

  3. Full-length genome sequences of porcine epidemic diarrhoea virus strain CV777; Use of NGS to analyse genomic and sub-genomic RNAs

    DEFF Research Database (Denmark)

    Rasmussen, Thomas Bruun; Boniotti, Maria Beatrice; Papetti, Alice

    2018-01-01

    Porcine epidemic diarrhoea virus, strain CV777, was initially characterized in 1978 as the causative agent of a disease first identified in the UK in 1971. This coronavirus has been widely distributed among laboratories and has been passaged both within pigs and in cell culture. To determine...... the variability between different stocks of the PEDV strain CV777, sequencing of the full-length genome (ca. 28 kb) has been performed in 6 different laboratories, using different protocols. Not surprisingly, each of the different full genome sequences was distinct from the others and from the reference sequence...... the analysis of sub-genomic mRNAs from infected cells. It is clearly important to know the features of the specific sample of CV777 being used for experimental studies....

  4. Salmo salar and Esox lucius full-length cDNA sequences reveal changes in evolutionary pressures on a post-tetraploidization genome

    Directory of Open Access Journals (Sweden)

    Holt Robert A

    2010-04-01

    Background: Salmonids are among the most intensely studied fish, in part due to their economic and environmental importance, and in part due to a recent whole genome duplication in the common ancestor of salmonids. This duplication greatly impacts species diversification, functional specialization, and adaptation. Extensive new genomic resources have recently become available for Atlantic salmon (Salmo salar), but documentation of allelic versus duplicate reference genes remains a major uncertainty in the complete characterization of its genome and its evolution. Results: From existing expressed sequence tag (EST) resources and three new full-length cDNA libraries, 9,057 reference quality full-length gene insert clones were identified for Atlantic salmon. A further 1,365 reference full-length clones were annotated from 29,221 northern pike (Esox lucius) ESTs. Pairwise dN/dS comparisons within each of 408 sets of duplicated salmon genes using northern pike as a diploid out-group show asymmetric relaxation of selection on salmon duplicates. Conclusions: 9,057 full-length reference genes were characterized in S. salar and can be used to identify alleles and gene family members. Comparisons of duplicated genes show that while purifying selection is the predominant force acting on both duplicates, consistent with retention of functionality in both copies, some relaxation of pressure on gene duplicates can be identified. In addition, there is evidence that evolution has acted asymmetrically on paralogs, allowing one of the pair to diverge at a faster rate.

  5. Characterization of full-length sequenced cDNA inserts (FLIcs) from Atlantic salmon (Salmo salar)

    Directory of Open Access Journals (Sweden)

    Lunner Sigbjørn

    2009-10-01

    Background: Sequencing of the Atlantic salmon genome is now being planned by an international research consortium. Full-length sequenced inserts from cDNAs (FLIcs) are an important tool for correct annotation and clustering of the genomic sequence in any species. The large amount of highly similar duplicate sequences caused by the relatively recent genome duplication in the salmonid ancestor represents a particular challenge for the genome project. FLIcs will therefore be an extremely useful resource for the Atlantic salmon sequencing project. In addition to helping distinguish between duplicate genome regions and determine correct gene structures, FLIcs are an important resource for functional genomic studies and for investigation of regulatory elements controlling gene expression. In contrast to the large number of ESTs available, including the ESTs from 23 developmental and tissue specific cDNA libraries contributed by the Salmon Genome Project (SGP), the number of sequences where the full length of the cDNA insert has been determined has been small. Results: High quality full-length insert sequences from 560 pre-smolt white muscle tissue specific cDNAs were generated, accession numbers [GenBank: BT043497 - BT044056]. Five hundred and ten (91%) of the transcripts were annotated using Gene Ontology (GO) terms and 440 of the FLIcs are likely to contain a complete coding sequence (cCDS). The sequence information was used to identify putative paralogs, characterize salmon Kozak motifs and polyadenylation signal variation, and identify motifs likely to be involved in the regulation of particular genes. Finally, conserved 7-mers in the 3'UTRs were identified, of which some were identical to miRNA target sequences. Conclusion: This paper describes the first Atlantic salmon FLIcs from a tissue and developmental stage specific cDNA library. We have demonstrated that many FLIcs contained a complete coding sequence (cCDS). This

  6. Characterization of full-length sequenced cDNA inserts (FLIcs) from Atlantic salmon (Salmo salar)

    Science.gov (United States)

    Andreassen, Rune; Lunner, Sigbjørn; Høyheim, Bjørn

    2009-01-01

    Background: Sequencing of the Atlantic salmon genome is now being planned by an international research consortium. Full-length sequenced inserts from cDNAs (FLIcs) are an important tool for correct annotation and clustering of the genomic sequence in any species. The large amount of highly similar duplicate sequences caused by the relatively recent genome duplication in the salmonid ancestor represents a particular challenge for the genome project. FLIcs will therefore be an extremely useful resource for the Atlantic salmon sequencing project. In addition to helping distinguish between duplicate genome regions and determine correct gene structures, FLIcs are an important resource for functional genomic studies and for investigation of regulatory elements controlling gene expression. In contrast to the large number of ESTs available, including the ESTs from 23 developmental and tissue specific cDNA libraries contributed by the Salmon Genome Project (SGP), the number of sequences where the full length of the cDNA insert has been determined has been small. Results: High quality full-length insert sequences from 560 pre-smolt white muscle tissue specific cDNAs were generated, accession numbers [GenBank: BT043497 - BT044056]. Five hundred and ten (91%) of the transcripts were annotated using Gene Ontology (GO) terms and 440 of the FLIcs are likely to contain a complete coding sequence (cCDS). The sequence information was used to identify putative paralogs, characterize salmon Kozak motifs and polyadenylation signal variation, and identify motifs likely to be involved in the regulation of particular genes. Finally, conserved 7-mers in the 3'UTRs were identified, of which some were identical to miRNA target sequences. Conclusion: This paper describes the first Atlantic salmon FLIcs from a tissue and developmental stage specific cDNA library. We have demonstrated that many FLIcs contained a complete coding sequence (cCDS). This suggests that the remaining c

  7. Characterization of a Full-Length Endogenous Beta-Retrovirus, EqERV-Beta1, in the Genome of the Horse (Equus caballus)

    Directory of Open Access Journals (Sweden)

    Antoinette C. van der Kuyl

    2011-06-01

    Information on endogenous retroviruses fixed in the horse (Equus caballus) genome is scarce. The recent availability of a draft sequence of the horse genome enables the detection of such integrated viruses by similarity search. Using translated nucleotide fragments from gamma-, beta-, and delta-retroviral genera for initial searches, a full-length beta-retrovirus genome was retrieved from a horse chromosome 5 contig. The provirus, tentatively named EqERV-beta1 (for the first equine endogenous beta-retrovirus), was 10434 nucleotides (nt) in length with the usual retroviral genome structure of 5'LTR-gag-pro-pol-env-3'LTR. The LTRs were 1361 nt long and differed by approximately 1% from each other, suggestive of a relatively recent integration. Coding sequences for gag, pro and pol were present in three different reading frames, as is common for beta-retroviruses, and the reading frames were completely open, except that the env gene was interrupted by a single stop codon. No reading frame was apparent downstream of the env gene, suggesting that EqERV-beta1 does not encode a superantigen like mouse mammary tumor virus (MMTV). A second proviral genome of EqERV-beta1, with no stop codon in env, is additionally integrated on chromosome 5 downstream of the first virus. Single EqERV-beta1 LTRs were abundantly present on all chromosomes except chromosome 24. Phylogenetically, EqERV-beta1 most closely resembles an unclassified retroviral sequence from cattle (Bos taurus), and the murine beta-retrovirus MMTV.
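    The link the abstract draws between roughly 1% LTR divergence and a recent integration can be sketched as follows. This is a minimal illustration, not the authors' method: it assumes the two LTRs were identical at integration and have since diverged independently at a constant neutral rate, so that age ≈ divergence / (2 × rate). The function names and the rate passed in the usage example are hypothetical placeholders; the abstract reports neither a substitution rate nor an age estimate.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Fraction of differing sites between two aligned, equal-length sequences.

    Columns where either sequence has a gap ('-') are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = [
        (a, b)
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != "-" and b != "-"
    ]
    if not compared:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in compared) / len(compared)


def insertion_age_years(divergence: float, subs_per_site_per_year: float) -> float:
    """Rough age from LTR-LTR divergence, assuming a constant neutral rate.

    Both LTRs accumulate substitutions independently, hence the factor of 2.
    The rate is an input the caller must justify; no default is offered.
    """
    if subs_per_site_per_year <= 0:
        raise ValueError("rate must be positive")
    return divergence / (2.0 * subs_per_site_per_year)


if __name__ == "__main__":
    # Toy 100-nt "LTRs" differing at one site (1% divergence); not real data.
    ltr5 = "ACGT" * 25
    ltr3 = "ACGT" * 24 + "ACGA"
    d = p_distance(ltr5, ltr3)
    # 1e-8 is a placeholder rate chosen only to exercise the function.
    print(d, insertion_age_years(d, 1e-8))
```

    The raw proportion of differing sites is used here for simplicity; at low divergence it is close to model-corrected distances, but a real analysis would choose a substitution model and a lineage-appropriate rate.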

  8. Full-length genome sequences of five hepatitis C virus isolates representing subtypes 3g, 3h, 3i and 3k, and a unique genotype 3 variant.

    Science.gov (United States)

    Lu, Ling; Li, Chunhua; Yuan, Jie; Lu, Teng; Okamoto, Hiroaki; Murphy, Donald G

    2013-03-01

    We characterized the full-length genomes of five distinct hepatitis C virus (HCV)-3 isolates. These represent the first complete genomes for subtypes 3g and 3h, the second such genomes for 3k and 3i, and of one novel variant presently not assigned to a subtype. Each genome was determined from 18-25 overlapping fragments. They had lengths of 9579-9660 nt and each contained a single ORF encoding 3020-3025 aa. They were isolated from five patients residing in Canada; four were of Asian origin and one was of Somali origin. Phylogenetic analysis using 64 partial NS5B sequences differentiated 10 assigned subtypes, 3a-3i and 3k, and two additional lineages within genotype 3. From the data of this study, HCV-3 full-length sequences are now available for six of the assigned subtypes and one unassigned. Our findings should add insights to HCV evolutionary studies and clinical applications.

  9. Characterization of near full-length genomes of HIV type 1 strains in Denmark: Basis for a universal therapeutic vaccine

    DEFF Research Database (Denmark)

    Andresen, Betina S.; Vinner, Lasse; Tang, Sheila Tuyet

    2007-01-01

    We report here the near full-length sequence characterization of 17 Danish clinical HIV-1 strains isolated from HLA-A02 patients not in need of ART, with relatively low viral loads and normal CD4 cell counts. Sequencing was performed directly on DNA extracted from short-term cocultures of PBMCs...... of a universal immunotherapeutic vaccine construct based on these epitopes....

  10. Evidence for a Complex Mosaic Genome Pattern in a Full-length Hepatitis C Virus Sequence

    Directory of Open Access Journals (Sweden)

    R.S. Ross

    2008-01-01

    The genome of the hepatitis C virus (HCV) exhibits a high genetic variability. This remarkable heterogeneity is mainly attributed to the gradual accumulation of mutational changes, whereas the contribution of recombination events to the evolution of HCV remains controversial. While performing phylogenetic analyses including a large number of sequences deposited in GenBank, we encountered a full-length HCV sequence (AY651061) that showed evidence for inter-subtype recombination and was, therefore, subjected to a detailed analysis of its molecular structure. The obtained results indicated that AY651061 does not represent a "simple" HCV 1c isolate, but a complex 1a/1c mosaic genome, showing five putative breakpoints in the core to NS3 regions. To our knowledge, this is the first report on a mosaic HCV full-length sequence with multiple breakpoints. The molecular structure of AY651061 is reminiscent of complex homologous recombinant variants occurring among other members of the Flaviviridae family, e.g. GB virus C, dengue virus, and Japanese encephalitis virus. Our finding of a mosaic HCV sequence may have important implications for many fields of current HCV research which merit careful consideration.

  11. Generation and Analysis of Full-length cDNA Sequences from Elephant Shark (Callorhinchus milii)

    KAUST Repository

    Kodzius, Rimantas

    2009-03-17

    Cartilaginous fishes are the oldest living group of jawed vertebrates and are therefore an important group for understanding the evolution of vertebrate genomes, including the human genome. Our laboratory has proposed the elephant shark (C. milii) as a model cartilaginous fish genome because of its relatively small genome size (910 Mb). The whole genome of C. milii is being sequenced (the first cartilaginous fish genome to be sequenced completely). To characterize the transcriptome of C. milii and to assist in annotating exon-intron boundaries, transcriptional start sites and alternatively spliced transcripts, we are generating full-length cDNA sequences from C. milii.

  12. Full Genomic Characterization of a Saffold Virus Isolated in Peru

    Directory of Open Access Journals (Sweden)

    Mariana Leguia

    2015-11-01

    While studying respiratory infections of unknown etiology, we detected Saffold virus in an oropharyngeal swab collected from a two-year-old female suffering from diarrhea and respiratory illness. The full viral genome recovered by deep sequencing showed 98% identity to a previously described Saffold strain isolated in Japan. Phylogenetic analysis confirmed the Peruvian Saffold strain belongs to genotype 3 and is most closely related to strains that have circulated in Asia. This is the first documented case report of Saffold virus in Peru and the only complete genomic characterization of a Saffold-3 isolate from the Americas.

  13. Near Full-Length Identification of a Novel HIV-1 CRF01_AE/B/C Recombinant in Northern Myanmar.

    Science.gov (United States)

    Zhou, Yan-Heng; Chen, Xin; Liang, Yue-Bo; Pang, Wei; Qin, Wei-Hong; Zhang, Chiyu; Zheng, Yong-Tang

    2015-08-01

    The Myanmar-China border appears to be the "hot spot" region for the occurrence of HIV-1 recombination. The majority of the previous analyses of HIV-1 recombination were based on partial genomic sequences, which obviously cannot reflect the reality of the genetic diversity of HIV-1 in this area well. Here, we present a near full-length characterization of a novel HIV-1 CRF01_AE/B/C recombinant isolated from a long-distance truck driver in Northern Myanmar. It is the first description of a near full-length genomic sequence in Myanmar since 2003, and might be one of the most complicated HIV-1 chimeras ever detected in Myanmar, containing four CRF01_AE, six B segments, and five C segments separated by 14 breakpoints throughout its genome. The discovery and characterization of this new CRF01_AE/B/C recombinant indicate that intersubtype recombination is ongoing in Myanmar, continuously generating new forms of HIV-1. More work based on near full-length sequence analyses is urgently needed to better understand the genetic diversity of HIV-1 in these regions.

  14. Molecular cloning and expression of full-length DNA copies of the genomic RNAs of cowpea mosaic virus

    NARCIS (Netherlands)

    Vos, P.

    1987-01-01

    The experiments described in this thesis were designed to unravel various aspects of the mechanism of gene expression of cowpea mosaic virus (CPMV). For this purpose full-length DNA copies of both genomic RNAs of CPMV were constructed. Using powerful in vitro

  15. Diversity, distribution and dynamics of full-length Copia and Gypsy LTR retroelements in Solanum lycopersicum.

    Science.gov (United States)

    Paz, Rosalía Cristina; Kozaczek, Melisa Eliana; Rosli, Hernán Guillermo; Andino, Natalia Pilar; Sanchez-Puerta, Maria Virginia

    2017-10-01

    Transposable elements are the most abundant components of plant genomes and can dramatically induce genetic changes and impact genome evolution. In the recently sequenced genome of tomato (Solanum lycopersicum), the estimated fraction of elements corresponding to retrotransposons is nearly 62%. Given that tomato is one of the most important vegetable crops cultivated and consumed worldwide, understanding retrotransposon dynamics can provide insight into its evolution and domestication processes. In this study, we performed a genome-wide in silico search of full-length LTR retroelements in the tomato nuclear genome and annotated 736 full-length Gypsy and Copia retroelements. The dispersion level across the 12 chromosomes and the diversity and tissue-specific expression of those elements were estimated. Phylogenetic analysis based on the retrotranscriptase region revealed the presence of 12 major lineages of LTR retroelements in the tomato genome. We identified 97 families, of which 77 and 20 belong to the superfamilies Copia and Gypsy, respectively. Each retroelement family was characterized according to its element size, relative frequency and insertion time. These analyses represent a valuable resource for comparative genomics within the Solanaceae, transposon-tagging and for the design of cultivar-specific molecular markers in tomato.

  16. Particle infectivity of HIV-1 full-length genome infectious molecular clones in a subtype C heterosexual transmission pair following high fidelity amplification and unbiased cloning

    Energy Technology Data Exchange (ETDEWEB)

    Deymier, Martin J., E-mail: mdeymie@emory.edu [Emory Vaccine Center, Yerkes National Primate Research Center, 954 Gatewood Road NE, Atlanta, GA 30329 (United States); Claiborne, Daniel T., E-mail: dclaibo@emory.edu [Emory Vaccine Center, Yerkes National Primate Research Center, 954 Gatewood Road NE, Atlanta, GA 30329 (United States); Ende, Zachary, E-mail: zende@emory.edu [Emory Vaccine Center, Yerkes National Primate Research Center, 954 Gatewood Road NE, Atlanta, GA 30329 (United States); Ratner, Hannah K., E-mail: hannah.ratner@emory.edu [Emory Vaccine Center, Yerkes National Primate Research Center, 954 Gatewood Road NE, Atlanta, GA 30329 (United States); Kilembe, William, E-mail: wkilembe@rzhrg-mail.org [Zambia-Emory HIV Research Project (ZEHRP), B22/737 Mwembelelo, Emmasdale Post Net 412, P/BagE891, Lusaka (Zambia); Allen, Susan, E-mail: sallen5@emory.edu [Zambia-Emory HIV Research Project (ZEHRP), B22/737 Mwembelelo, Emmasdale Post Net 412, P/BagE891, Lusaka (Zambia); Department of Pathology and Laboratory Medicine, Emory University, Atlanta, GA (United States); Hunter, Eric, E-mail: eric.hunter2@emory.edu [Emory Vaccine Center, Yerkes National Primate Research Center, 954 Gatewood Road NE, Atlanta, GA 30329 (United States); Department of Pathology and Laboratory Medicine, Emory University, Atlanta, GA (United States)

    2014-11-15

    The high genetic diversity of HIV-1 impedes high throughput, large-scale sequencing and full-length genome cloning by common restriction enzyme based methods. Applying novel methods that employ a high-fidelity polymerase for amplification and an unbiased fusion-based cloning strategy, we have generated several HIV-1 full-length genome infectious molecular clones from an epidemiologically linked transmission pair. These clones represent the transmitted/founder virus and phylogenetically diverse non-transmitted variants from the chronically infected individual's diverse quasispecies near the time of transmission. We demonstrate that, using this approach, PCR-induced mutations in full-length clones derived from their cognate single genome amplicons are rare. Furthermore, all eight non-transmitted genomes tested produced functional virus with a range of infectivities, belying the previous assumption that a majority of circulating viruses in chronic HIV-1 infection are defective. Thus, these methods provide important tools to update protocols in molecular biology that can be universally applied to the study of human viral pathogens. - Highlights: • Our novel methodology demonstrates accurate amplification and cloning of full-length HIV-1 genomes. • A majority of plasma derived HIV variants from a chronically infected individual are infectious. • The transmitted/founder was more infectious than the majority of the variants from the chronically infected donor.

  17. Particle infectivity of HIV-1 full-length genome infectious molecular clones in a subtype C heterosexual transmission pair following high fidelity amplification and unbiased cloning

    International Nuclear Information System (INIS)

    Deymier, Martin J.; Claiborne, Daniel T.; Ende, Zachary; Ratner, Hannah K.; Kilembe, William; Allen, Susan; Hunter, Eric

    2014-01-01

    The high genetic diversity of HIV-1 impedes high throughput, large-scale sequencing and full-length genome cloning by common restriction enzyme based methods. Applying novel methods that employ a high-fidelity polymerase for amplification and an unbiased fusion-based cloning strategy, we have generated several HIV-1 full-length genome infectious molecular clones from an epidemiologically linked transmission pair. These clones represent the transmitted/founder virus and phylogenetically diverse non-transmitted variants from the chronically infected individual's diverse quasispecies near the time of transmission. We demonstrate that, using this approach, PCR-induced mutations in full-length clones derived from their cognate single genome amplicons are rare. Furthermore, all eight non-transmitted genomes tested produced functional virus with a range of infectivities, belying the previous assumption that a majority of circulating viruses in chronic HIV-1 infection are defective. Thus, these methods provide important tools to update protocols in molecular biology that can be universally applied to the study of human viral pathogens. - Highlights: • Our novel methodology demonstrates accurate amplification and cloning of full-length HIV-1 genomes. • A majority of plasma derived HIV variants from a chronically infected individual are infectious. • The transmitted/founder was more infectious than the majority of the variants from the chronically infected donor

  18. A new set of ESTs and cDNA clones from full-length and normalized libraries for gene discovery and functional characterization in citrus

    Directory of Open Access Journals (Sweden)

    Alamar Santiago

    2009-09-01

    Background: Interpretation of ever-increasing raw sequence information generated by modern genome sequencing technologies faces multiple challenges, such as gene function analysis and genome annotation. Indeed, nearly 40% of genes in plants encode proteins of unknown function. Functional characterization of these genes is one of the main challenges in modern biology. In this regard, the availability of full-length cDNA clones may fill in the gap created between sequence information and biological knowledge. Full-length cDNA clones facilitate functional analysis of the corresponding genes enabling manipulation of their expression in heterologous systems and the generation of a variety of tagged versions of the native protein. In addition, the development of full-length cDNA sequences has the power to improve the quality of genome annotation. Results: We developed an integrated method to generate a new normalized EST collection enriched in full-length and rare transcripts of different citrus species from multiple tissues and developmental stages. We constructed a total of 15 cDNA libraries, from which we isolated 10,898 high-quality ESTs representing 6142 different genes. Percentages of redundancy and proportion of full-length clones range from 8 to 33, and 67 to 85, respectively, indicating good efficiency of the approach employed. The new EST collection adds 2113 new citrus ESTs, representing 1831 unigenes, to the collection of citrus genes available in the public databases. To facilitate functional analysis, cDNAs were introduced in a Gateway-based cloning vector for high-throughput functional analysis of genes in planta. Herein, we describe the technical methods used in the library construction, sequence analysis of clones and the overexpression of CitrSEP, a citrus homolog to the Arabidopsis SEP3 gene, in Arabidopsis as an example of a practical application of the engineered Gateway vector for functional analysis. Conclusion: The new

  19. Molecular Cloning and Characterization of Full-Length cDNA of Calmodulin Gene from Pacific Oyster Crassostrea gigas

    Directory of Open Access Journals (Sweden)

    Xing-Xia Li

    2016-01-01

    The shell of the pearl oyster (Pinctada fucata) mainly comprises aragonite whereas that of the Pacific oyster (Crassostrea gigas) is mainly calcite, suggesting different mechanisms of shell formation in the two mollusks. Calmodulin (CaM) is an important gene for regulating the uptake, transport, and secretion of calcium during the process of shell formation in pearl oyster. Characterizing CaM in oysters could facilitate the understanding of the different shell formation mechanisms among mollusks. We cloned the full-length cDNA of Pacific oyster CaM (cgCaM) and found that the cgCaM ORF encoded a peptide of 113 amino acids containing three EF-hand calcium-binding domains. Its expression level was highest in the mantle, hinting that the cgCaM gene is probably involved in shell formation of Pacific oyster, and the common ancestor of Gastropoda and Bivalvia may possess at least three CaM genes. We also found that the numbers of some EF-hand family members in highly calcified species were higher than those in lowly calcified species, and the numbers of these motifs in the oyster genome were the highest among the mollusk species with whole genome sequences, further hinting at a correlation between CaM and biomineralization.

  20. Calculation of evolutionary correlation between individual genes and full-length genome: a method useful for choosing phylogenetic markers for molecular epidemiology.

    Directory of Open Access Journals (Sweden)

    Shuai Wang

    Individual genes or regions are still commonly used to estimate the phylogenetic relationships among viral isolates. The genomic regions that can faithfully provide assessments consistent with those predicted with full-length genome sequences would be preferable candidates for phylogenetic markers in molecular epidemiological studies of many viruses. Here we employed a statistical method to evaluate the evolutionary relationships between individual viral genes and full-length genomes without tree construction as a way to determine which gene can match the genome well in phylogenetic analyses. This method was performed by calculation of linear correlations between the genetic distance matrices of aligned individual gene sequences and aligned genome sequences. We applied this method to the phylogenetic analyses of porcine circovirus 2 (PCV2), measles virus (MV), hepatitis E virus (HEV) and Japanese encephalitis virus (JEV). Phylogenetic trees were constructed for comparisons and the possible factors affecting the method accuracy were also discussed in the calculations. The results revealed that this method could produce results consistent with those of previous studies about the proper consensus sequences that could be successfully used as phylogenetic markers. Our results also suggested that these evolutionary correlations could provide useful information for identifying genes that could be used effectively to infer the genetic relationships.
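    The correlation step described in this abstract can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses uncorrected p-distances (the abstract does not say which distance measure was used), a plain Pearson correlation over the upper triangle of each distance matrix, and toy sequences invented for the example. All function names are hypothetical.

```python
from itertools import combinations
from math import sqrt


def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences (gap columns skipped)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    sites = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not sites:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in sites) / len(sites)


def distance_vector(alignment):
    """Upper triangle of the pairwise distance matrix, in a fixed isolate order."""
    names = sorted(alignment)
    return [p_distance(alignment[i], alignment[j]) for i, j in combinations(names, 2)]


def pearson(xs, ys):
    """Pearson linear correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for constant distances")
    return sxy / sqrt(sxx * syy)


def gene_genome_correlation(gene_alignment, genome_alignment):
    """Correlate gene-based and genome-based pairwise distances for the same isolates."""
    if set(gene_alignment) != set(genome_alignment):
        raise ValueError("both alignments must cover the same isolates")
    return pearson(distance_vector(gene_alignment), distance_vector(genome_alignment))


if __name__ == "__main__":
    # Toy alignments (invented): the "gene" is the first four genome columns.
    genome = {"A": "ACGTACGTAC", "B": "ACGTACGTAT", "C": "ACGAACGAAT"}
    gene = {name: seq[:4] for name, seq in genome.items()}
    print(round(gene_genome_correlation(gene, genome), 3))
```

    A gene whose coefficient is close to 1 reproduces the genome-wide distance pattern and would, on the abstract's reasoning, be a better candidate marker. Because pairwise distances are not independent observations, the coefficient is better read as a ranking score than as a basis for a standard significance test.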

  1. Characterization of partial and near full-length genomes of HIV-1 strains sampled from recently infected individuals in São Paulo, Brazil.

    Directory of Open Access Journals (Sweden)

    Sabri Saeed Sanabani

    BACKGROUND: Genetic variability is a major feature of human immunodeficiency virus type 1 (HIV-1) and is considered the key factor frustrating efforts to halt the HIV epidemic. A proper understanding of HIV-1 genomic diversity is a fundamental prerequisite for proper epidemiology, genetic diagnosis, and successful drug and vaccine design. Here, we report on the partial and near full-length genomic (NFLG) variability of HIV-1 isolates from a well-characterized cohort of recently infected patients in São Paulo, Brazil. METHODOLOGY: HIV-1 proviral DNA was extracted from the peripheral blood mononuclear cells of 113 participants. The NFLG and partial fragments were determined by overlapping nested PCR and direct sequencing. The data were phylogenetically analyzed. RESULTS: Of the 113 samples (90.3% male; median age 31 years; 79.6% homosexual men) studied, 77 (68.1%) NFLGs and 32 (29.3%) partial fragments were successfully subtyped. Of the successfully subtyped sequences, 88 (80.7%) were subtype B sequences, 12 (11%) BF1 recombinants, 3 (2.8%) subtype C sequences, 2 (1.8%) each BC recombinants and subclade F1, 1 (0.9%) CRF02_AG, and 1 (0.9%) CRF31_BC. Primary drug resistance mutations were observed in 14/101 (13.9%) of samples, with 5.9% being resistant to protease inhibitors and nucleoside reverse transcriptase inhibitors (NRTIs) and 4.9% resistant to non-NRTIs. Predictions of viral tropism were determined for 86 individuals. X4 or X4 dual or mixed-tropic viruses (X4/DM) were seen in 26 (30.2%) of subjects. Among homosexual men, X4 viruses were detected in 19/69 (27.5%). CONCLUSIONS: Our results confirm the existence of various HIV-1 subtypes circulating in São Paulo, and indicate that subtype B accounts for the majority of infections. Antiretroviral (ARV) drug resistance is relatively common among recently infected patients. The proportion of X4 viruses in homosexuals was significantly higher than the proportion seen in other study populations.

  2. A new set of ESTs and cDNA clones from full-length and normalized libraries for gene discovery and functional characterization in citrus

    Science.gov (United States)

    Marques, M Carmen; Alonso-Cantabrana, Hugo; Forment, Javier; Arribas, Raquel; Alamar, Santiago; Conejero, Vicente; Perez-Amador, Miguel A

    2009-01-01

    Background: Interpretation of ever-increasing raw sequence information generated by modern genome sequencing technologies faces multiple challenges, such as gene function analysis and genome annotation. Indeed, nearly 40% of genes in plants encode proteins of unknown function. Functional characterization of these genes is one of the main challenges in modern biology. In this regard, the availability of full-length cDNA clones may fill in the gap created between sequence information and biological knowledge. Full-length cDNA clones facilitate functional analysis of the corresponding genes enabling manipulation of their expression in heterologous systems and the generation of a variety of tagged versions of the native protein. In addition, the development of full-length cDNA sequences has the power to improve the quality of genome annotation. Results: We developed an integrated method to generate a new normalized EST collection enriched in full-length and rare transcripts of different citrus species from multiple tissues and developmental stages. We constructed a total of 15 cDNA libraries, from which we isolated 10,898 high-quality ESTs representing 6142 different genes. Percentages of redundancy and proportion of full-length clones range from 8 to 33, and 67 to 85, respectively, indicating good efficiency of the approach employed. The new EST collection adds 2113 new citrus ESTs, representing 1831 unigenes, to the collection of citrus genes available in the public databases. To facilitate functional analysis, cDNAs were introduced in a Gateway-based cloning vector for high-throughput functional analysis of genes in planta. Herein, we describe the technical methods used in the library construction, sequence analysis of clones and the overexpression of CitrSEP, a citrus homolog to the Arabidopsis SEP3 gene, in Arabidopsis as an example of a practical application of the engineered Gateway vector for functional analysis. Conclusion: The new EST collection denotes an

  3. Use of Dried Blood Spots to Elucidate Full-Length Transmitted/Founder HIV-1 Genomes

    Directory of Open Access Journals (Sweden)

    Jesus F. Salazar-Gonzalez

    2016-07-01

    Background: Identification of HIV-1 genomes responsible for establishing clinical infection in newly infected individuals is fundamental to prevention and pathogenesis research. Processing, storage, and transportation of the clinical samples required to perform these virologic assays in resource-limited settings requires challenging venipuncture and cold chain logistics. Here, we validate the use of dried-blood spots (DBS) as a simple and convenient alternative to collecting and storing frozen plasma. Methods: We performed parallel nucleic acid extraction, single genome amplification (SGA), next generation sequencing (NGS), and phylogenetic analyses on plasma and DBS. Results: We demonstrated the capacity to extract viral RNA from DBS and perform SGA to infer the complete nucleotide sequence of the transmitted/founder (TF) HIV-1 envelope gene and full-length genome in two acutely infected individuals. Using both SGA and NGS methodologies, we showed that sequences generated from DBS and plasma display comparable phylogenetic patterns in both acute and chronic infection. SGA was successful on samples with a range of plasma viremia, including samples as low as 1,700 copies/ml and an estimated ~50 viral copies per blood spot. Further, we demonstrated reproducible efficiency in gp160 env sequencing in DBS stored at ambient temperature for up to three weeks or at -20°C for up to five months. Conclusions: These findings support the use of DBS as a practical and cost-effective alternative to frozen plasma for clinical trials and translational research conducted in resource-limited settings.

  4. Improvement of genome assembly completeness and identification of novel full-length protein-coding genes by RNA-seq in the giant panda genome.

    Science.gov (United States)

    Chen, Meili; Hu, Yibo; Liu, Jingxing; Wu, Qi; Zhang, Chenglin; Yu, Jun; Xiao, Jingfa; Wei, Fuwen; Wu, Jiayan

    2015-12-11

    High-quality and complete gene models are the basis of whole genome analyses. The giant panda (Ailuropoda melanoleuca) genome was the first genome sequenced on the basis of solely short reads, but the genome annotation had lacked the support of transcriptomic evidence. In this study, we applied RNA-seq to globally improve the genome assembly completeness and to detect novel expressed transcripts in 12 tissues from giant pandas, by using a transcriptome reconstruction strategy that combined reference-based and de novo methods. Several aspects of genome assembly completeness in the transcribed regions were effectively improved by the de novo assembled transcripts, including genome scaffolding, the detection of small-size assembly errors, the extension of scaffold/contig boundaries, and gap closure. Through expression and homology validation, we detected three groups of novel full-length protein-coding genes. A total of 12.62% of the novel protein-coding genes were validated by proteomic data. GO annotation analysis showed that some of the novel protein-coding genes were involved in pigmentation, anatomical structure formation and reproduction, which might be related to the development and evolution of the black-white pelage, pseudo-thumb and delayed embryonic implantation of giant pandas. The updated genome annotation will help further giant panda studies from both structural and functional perspectives.

  5. Large-scale analysis of full-length cDNAs from the tomato (Solanum lycopersicum) cultivar Micro-Tom, a reference system for the Solanaceae genomics

    Directory of Open Access Journals (Sweden)

    Kikuchi Mari

    2010-03-01

    Background: The Solanaceae family includes several economically important vegetable crops. The tomato (Solanum lycopersicum) is regarded as a model plant of the Solanaceae family. Recently, a number of tomato resources have been developed in parallel with the ongoing tomato genome sequencing project. In particular, a miniature cultivar, Micro-Tom, is regarded as a model system in tomato genomics, and a number of genomics resources in the Micro-Tom-background, such as ESTs and mutagenized lines, have been established by an international alliance. Results: To accelerate the progress in tomato genomics, we developed a collection of fully-sequenced 13,227 Micro-Tom full-length cDNAs. By checking redundant sequences, coding sequences, and chimeric sequences, a set of 11,502 non-redundant full-length cDNAs (nrFLcDNAs) was generated. Analysis of untranslated regions demonstrated that tomato has longer 5'- and 3'-untranslated regions than most other plants but rice. Classification of functions of proteins predicted from the coding sequences demonstrated that nrFLcDNAs covered a broad range of functions. A comparison of nrFLcDNAs with genes of sixteen plants facilitated the identification of tomato genes that are not found in other plants, most of which did not have known protein domains. Mapping of the nrFLcDNAs onto currently available tomato genome sequences facilitated prediction of exon-intron structure. Introns of tomato genes were longer than those of Arabidopsis and rice. According to a comparison of exon sequences between the nrFLcDNAs and the tomato genome sequences, the frequency of nucleotide mismatch in exons between Micro-Tom and the genome-sequencing cultivar (Heinz 1706) was estimated to be 0.061%. Conclusion: The collection of Micro-Tom nrFLcDNAs generated in this study will serve as a valuable genomic tool for plant biologists to bridge the gap between basic and applied studies. The nrFLcDNA sequences will help annotation of the

  6. First full-length genome sequence of the polerovirus luffa aphid-borne yellows virus (LABYV) reveals the presence of at least two consensus sequences in an isolate from Thailand.

    Science.gov (United States)

    Knierim, Dennis; Maiss, Edgar; Kenyon, Lawrence; Winter, Stephan; Menzel, Wulf

    2015-10-01

    Luffa aphid-borne yellows virus (LABYV) was proposed as the name for a previously undescribed polerovirus based on partial genome sequences obtained from samples of cucurbit plants collected in Thailand between 2008 and 2013. In this study, we determined the first full-length genome sequence of LABYV. Based on phylogenetic analysis and genome properties, it is clear that this virus represents a distinct species in the genus Polerovirus. Analysis of sequences from sample TH24, which was collected in 2010 from a luffa plant in Thailand, reveals the presence of two different full-length genome consensus sequences.

  7. Molecular Cloning and Characterization of Full-Length cDNA of Calmodulin Gene from Pacific Oyster Crassostrea gigas.

    Science.gov (United States)

    Li, Xing-Xia; Yu, Wen-Chao; Cai, Zhong-Qiang; He, Cheng; Wei, Na; Wang, Xiao-Tong; Yue, Xi-Qing

    2016-01-01

    The shell of the pearl oyster (Pinctada fucata) mainly comprises aragonite whereas that of the Pacific oyster (Crassostrea gigas) is mainly calcite, suggesting different mechanisms of shell formation in the two mollusks. Calmodulin (CaM) is an important gene for regulating the uptake, transport, and secretion of calcium during the process of shell formation in pearl oyster. It is interesting to characterize the CaM in oysters, which could facilitate the understanding of the different shell formation mechanisms among mollusks. We cloned the full-length cDNA of Pacific oyster CaM (cgCaM) and found that the cgCaM ORF encoded a peptide of 113 amino acids containing three EF-hand calcium-binding domains. Its expression level was highest in the mantle, hinting that the cgCaM gene is probably involved in shell formation of Pacific oyster, and the common ancestor of Gastropoda and Bivalvia may have possessed at least three CaM genes. We also found that the numbers of some EF-hand family members in highly calcified species were higher than those in lowly calcified species and that the numbers of these motifs in the oyster genome were the highest among the mollusk species with whole genome sequences, further hinting at a correlation between CaM and biomineralization.

  8. Large-scale analysis of full-length cDNAs from the tomato (Solanum lycopersicum) cultivar Micro-Tom, a reference system for the Solanaceae genomics.

    Science.gov (United States)

    Aoki, Koh; Yano, Kentaro; Suzuki, Ayako; Kawamura, Shingo; Sakurai, Nozomu; Suda, Kunihiro; Kurabayashi, Atsushi; Suzuki, Tatsuya; Tsugane, Taneaki; Watanabe, Manabu; Ooga, Kazuhide; Torii, Maiko; Narita, Takanori; Shin-I, Tadasu; Kohara, Yuji; Yamamoto, Naoki; Takahashi, Hideki; Watanabe, Yuichiro; Egusa, Mayumi; Kodama, Motoichiro; Ichinose, Yuki; Kikuchi, Mari; Fukushima, Sumire; Okabe, Akiko; Arie, Tsutomu; Sato, Yuko; Yazawa, Katsumi; Satoh, Shinobu; Omura, Toshikazu; Ezura, Hiroshi; Shibata, Daisuke

    2010-03-30

    The Solanaceae family includes several economically important vegetable crops. The tomato (Solanum lycopersicum) is regarded as a model plant of the Solanaceae family. Recently, a number of tomato resources have been developed in parallel with the ongoing tomato genome sequencing project. In particular, a miniature cultivar, Micro-Tom, is regarded as a model system in tomato genomics, and a number of genomics resources in the Micro-Tom-background, such as ESTs and mutagenized lines, have been established by an international alliance. To accelerate the progress in tomato genomics, we developed a collection of fully-sequenced 13,227 Micro-Tom full-length cDNAs. By checking redundant sequences, coding sequences, and chimeric sequences, a set of 11,502 non-redundant full-length cDNAs (nrFLcDNAs) was generated. Analysis of untranslated regions demonstrated that tomato has longer 5'- and 3'-untranslated regions than most other plants but rice. Classification of functions of proteins predicted from the coding sequences demonstrated that nrFLcDNAs covered a broad range of functions. A comparison of nrFLcDNAs with genes of sixteen plants facilitated the identification of tomato genes that are not found in other plants, most of which did not have known protein domains. Mapping of the nrFLcDNAs onto currently available tomato genome sequences facilitated prediction of exon-intron structure. Introns of tomato genes were longer than those of Arabidopsis and rice. According to a comparison of exon sequences between the nrFLcDNAs and the tomato genome sequences, the frequency of nucleotide mismatch in exons between Micro-Tom and the genome-sequencing cultivar (Heinz 1706) was estimated to be 0.061%. The collection of Micro-Tom nrFLcDNAs generated in this study will serve as a valuable genomic tool for plant biologists to bridge the gap between basic and applied studies. The nrFLcDNA sequences will help annotation of the tomato whole-genome sequence and aid in tomato functional

  9. Full-length cDNA sequences from Rhesus monkey placenta tissue: analysis and utility for comparative mapping

    Directory of Open Access Journals (Sweden)

    Lee Sang-Rae

    2010-07-01

    Background: Rhesus monkeys (Macaca mulatta) are widely used as experimental animals in biomedical research and are closely related to other laboratory macaques, such as cynomolgus monkeys (Macaca fascicularis), and to humans, sharing a last common ancestor from about 25 million years ago. Although rhesus monkeys have been studied extensively under field and laboratory conditions, research has been limited by the lack of genetic resources. The present study generated placenta full-length cDNA libraries, characterized the resulting expressed sequence tags, and described their utility for comparative mapping with human RefSeq mRNA transcripts. Results: From rhesus monkey placenta full-length cDNA libraries, 2000 full-length cDNA sequences were determined and 1835 rhesus placenta cDNA sequences longer than 100 bp were collected. These sequences were annotated based on homology to human genes. Homology search against human RefSeq mRNAs revealed that our collection included the sequences of 1462 putative rhesus monkey genes. Moreover, we identified 207 genes containing exon alterations in the coding region and the untranslated region of rhesus monkey transcripts, despite the highly conserved structure of the coding regions. Approximately 10% (187) of all full-length cDNA sequences did not represent any public human RefSeq mRNAs. Intriguingly, two rhesus monkey specific exons derived from the transposable elements of AluYRa2 (SINE family) and MER11B (LTR family) were also identified. Conclusion: The 1835 rhesus monkey placenta full-length cDNA sequences described here could expand genomic resources and information of rhesus monkeys. This increased genomic information will greatly contribute to the development of evolutionary biology and biomedical research.

  10. VP1u phospholipase activity is critical for infectivity of full-length parvovirus B19 genomic clones

    OpenAIRE

    Filippone, Claudia; Zhi, Ning; Wong, Susan; Lu, Jun; Kajigaya, Sachiko; Gallinella, Giorgio; Kakkola, Laura; Söderlund-Venermo, Maria; Young, Neal S.; Brown, Kevin E.

    2008-01-01

    Three full-length genomic clones (pB19-M20, pB19-FL and pB19-HG1) of parvovirus B19 were produced in different laboratories. pB19-M20 was shown to produce infectious virus. To determine the differences in infectivity, all three plasmids were tested by transfection and infection assays. All three clones were similar in viral DNA replication, RNA transcription, and viral capsid protein production. However, only pB19-M20 and pB19-HG1 produced infectious virus. Comparison of viral sequences showe...

  11. Inconsistencies of genome annotations in apicomplexan parasites revealed by 5'-end-one-pass and full-length sequences of oligo-capped cDNAs

    Directory of Open Access Journals (Sweden)

    Sugano Sumio

    2009-07-01

    Background: Apicomplexan parasites are causative agents of various diseases including malaria and have been targets of extensive genomic sequencing. We generated 5'-EST collections for six apicomplexa parasites using our full-length oligo-capping cDNA library method. To improve upon the current genome annotations, as well as to validate the importance of physical cDNA clone resources, we generated a large-scale collection of full-length cDNAs for several apicomplexa parasites. Results: In this study, we used a total of 61,056 5'-end-single-pass cDNA sequences from Plasmodium falciparum, P. vivax, P. yoelii, P. berghei, Cryptosporidium parvum, and Toxoplasma gondii. We compared these partially sequenced cDNA sequences with the currently annotated gene models and observed significant inconsistencies between the two datasets. In particular, we found that on average 14% of the exons in the current gene models were not supported by any cDNA evidence, and that 16% of the current gene models may contain at least one mis-annotation and should be re-evaluated. We also identified a large number of transcripts that had been previously unidentified. For 732 cDNAs in T. gondii, the entire sequences were determined in order to evaluate the annotated gene models at the complete full-length transcript level. We found that 41% of the T. gondii gene models contained at least one inconsistency. We also identified and confirmed by RT-PCR 140 previously unidentified transcripts found in the intergenic regions of the current gene annotations. We show that the majority of these discrepancies are due to questionable predictions of one or two extra exons in the upstream or downstream regions of the genes. Conclusion: Our data indicate that the current gene models are likely to still be incomplete and have much room for improvement. Our unique full-length cDNA information is especially useful for further refinement of the annotations for the genomes of

  12. Rapid CRISPR/Cas9-Mediated Cloning of Full-Length Epstein-Barr Virus Genomes from Latently Infected Cells

    Directory of Open Access Journals (Sweden)

    Misako Yajima

    2018-04-01

    Herpesviruses have relatively large DNA genomes of more than 150 kb that are difficult to clone and sequence. Bacterial artificial chromosome (BAC) cloning of herpesvirus genomes is a powerful technique that greatly facilitates whole viral genome sequencing as well as functional characterization of reconstituted viruses. We describe recently invented technologies for rapid BAC cloning of herpesvirus genomes using CRISPR/Cas9-mediated homology-directed repair. We focus on recent BAC cloning techniques of Epstein-Barr virus (EBV) genomes and discuss the possible advantages of a CRISPR/Cas9-mediated strategy compared with preceding EBV-BAC cloning strategies. We also describe the design decisions of this technology as well as possible pitfalls and points to be improved in the future. The obtained EBV-BAC clones are subjected to long-read sequencing analysis to determine the complete EBV genome sequence, including repetitive regions. Rapid cloning and sequence determination of various EBV strains will greatly contribute to the understanding of their global geographical distribution. This technology can also be used to clone disease-associated EBV strains and test the hypothesis that they have special features that distinguish them from strains that infect asymptomatically.

  13. Molecular characterization of human T-cell lymphotropic virus type 1 full and partial genomes by Illumina massively parallel sequencing technology.

    Directory of Open Access Journals (Sweden)

    Rodrigo Pessôa

    BACKGROUND: Here, we report on the partial and full-length genomic (FLG) variability of HTLV-1 sequences from 90 well-characterized subjects, including 48 HTLV-1 asymptomatic carriers (ACs), 35 HTLV-1-associated myelopathy/tropical spastic paraparesis (HAM/TSP) and 7 adult T-cell leukemia/lymphoma (ATLL) patients, using an Illumina paired-end protocol. METHODS: Blood samples were collected from 90 individuals, and DNA was extracted from the PBMCs to measure the proviral load and to amplify the HTLV-1 FLG from two overlapping fragments. The amplified PCR products were subjected to deep sequencing. The sequencing data were assembled, aligned, and mapped against the HTLV-1 genome with sufficient genetic resemblance and utilized for further phylogenetic analysis. RESULTS: A high-throughput sequencing-by-synthesis instrument was used to obtain an average of 3210- and 5200-fold coverage of the partial (n = 14) and FLG (n = 76) data from the HTLV-1 strains, respectively. The results based on the phylogenetic trees of consensus sequences from partial and FLGs revealed that 86 (95.5%) individuals were infected with the transcontinental sub-subtypes of the cosmopolitan subtype (aA) and that 4 individuals (4.5%) were infected with the Japanese sub-subtypes (aB). A comparison of the nucleotide and amino acids of the FLG between the three clinical settings yielded no correlation between the sequenced genotype and clinical outcomes. The evolutionary relationships among the HTLV sequences were inferred from nucleotide sequence, and the results are consistent with the hypothesis that there were multiple introductions of the transcontinental subtype in Brazil. CONCLUSIONS: This study has increased the number of subtype aA full-length genomes from 8 to 81 and HTLV-1 aB from 2 to 5 sequences. The overall data confirmed that the cosmopolitan transcontinental sub-subtypes were the most prevalent in the Brazilian population. It is hoped that this valuable genomic data

  14. Sequencing and analysis of full-length cDNAs, 5'-ESTs and 3'-ESTs from a cartilaginous fish, the elephant shark (Callorhinchus milii).

    KAUST Repository

    Brenner, Sydney

    2012-10-08

    Cartilaginous fishes are the most ancient group of living jawed vertebrates (gnathostomes) and are, therefore, an important reference group for understanding the evolution of vertebrates. The elephant shark (Callorhinchus milii), a holocephalan cartilaginous fish, has been identified as a model cartilaginous fish genome because of its compact genome (∼910 Mb) and a genome project has been initiated to obtain its whole genome sequence. In this study, we have generated and sequenced full-length enriched cDNA libraries of the elephant shark using the 'oligo-capping' method and Sanger sequencing. A total of 6,778 full-length protein-coding cDNA and 10,701 full-length noncoding cDNA were sequenced from six tissues (gills, intestine, kidney, liver, spleen, and testis) of the elephant shark. Analysis of their polyadenylation signals showed that polyadenylation usage in elephant shark is similar to that in mammals. Furthermore, both coding and noncoding transcripts of the elephant shark use the same proportion of canonical polyadenylation sites. Besides BLASTX searches, protein-coding transcripts were annotated by Gene Ontology, InterPro domain, and KEGG pathway analyses. By comparing elephant shark genes to bony vertebrate genes, we identified several ancient genes present in elephant shark but differentially lost in tetrapods or teleosts. Only ∼6% of elephant shark noncoding cDNA showed similarity to known noncoding RNAs (ncRNAs). The rest are either highly divergent ncRNAs or novel ncRNAs. In addition to full-length transcripts, 30,375 5'-ESTs and 41,317 3'-ESTs were sequenced and annotated. The clones and transcripts generated in this study are valuable resources for annotating transcription start sites, exon-intron boundaries, and UTRs of genes in the elephant shark genome, and for the functional characterization of protein sequences. These resources will also be useful for annotating genes in other cartilaginous fishes whose genomes have been targeted for

  15. Sequencing and analysis of full-length cDNAs, 5'-ESTs and 3'-ESTs from a cartilaginous fish, the elephant shark (Callorhinchus milii).

    KAUST Repository

    Brenner, Sydney; Kodzius, Rimantas; Tan, Yue Ying; Tay, Alice; Tay, Boon-Hui; Venkatesh, Byrappa

    2012-01-01

    Cartilaginous fishes are the most ancient group of living jawed vertebrates (gnathostomes) and are, therefore, an important reference group for understanding the evolution of vertebrates. The elephant shark (Callorhinchus milii), a holocephalan cartilaginous fish, has been identified as a model cartilaginous fish genome because of its compact genome (∼910 Mb) and a genome project has been initiated to obtain its whole genome sequence. In this study, we have generated and sequenced full-length enriched cDNA libraries of the elephant shark using the 'oligo-capping' method and Sanger sequencing. A total of 6,778 full-length protein-coding cDNA and 10,701 full-length noncoding cDNA were sequenced from six tissues (gills, intestine, kidney, liver, spleen, and testis) of the elephant shark. Analysis of their polyadenylation signals showed that polyadenylation usage in elephant shark is similar to that in mammals. Furthermore, both coding and noncoding transcripts of the elephant shark use the same proportion of canonical polyadenylation sites. Besides BLASTX searches, protein-coding transcripts were annotated by Gene Ontology, InterPro domain, and KEGG pathway analyses. By comparing elephant shark genes to bony vertebrate genes, we identified several ancient genes present in elephant shark but differentially lost in tetrapods or teleosts. Only ∼6% of elephant shark noncoding cDNA showed similarity to known noncoding RNAs (ncRNAs). The rest are either highly divergent ncRNAs or novel ncRNAs. In addition to full-length transcripts, 30,375 5'-ESTs and 41,317 3'-ESTs were sequenced and annotated. The clones and transcripts generated in this study are valuable resources for annotating transcription start sites, exon-intron boundaries, and UTRs of genes in the elephant shark genome, and for the functional characterization of protein sequences. These resources will also be useful for annotating genes in other cartilaginous fishes whose genomes have been targeted for whole

  16. Meiotic gene-conversion rate and tract length variation in the human genome.

    Science.gov (United States)

    Padhukasahasram, Badri; Rannala, Bruce

    2013-02-27

    Meiotic recombination occurs in the form of two different mechanisms called crossing-over and gene-conversion and both processes have an important role in shaping genetic variation in populations. Although variation in crossing-over rates has been studied extensively using sperm-typing experiments, pedigree studies and population genetic approaches, our knowledge of variation in gene-conversion parameters (i.e., rates and mean tract lengths) remains far from complete. To explore variability in population gene-conversion rates and its relationship to crossing-over rate variation patterns, we have developed and validated using coalescent simulations a comprehensive Bayesian full-likelihood method that can jointly infer crossing-over and gene-conversion rates as well as tract lengths from population genomic data under general variable rate models with recombination hotspots. Here, we apply this new method to SNP data from multiple human populations and attempt to characterize for the first time the fine-scale variation in gene-conversion parameters along the human genome. We find that the estimated ratio of gene-conversion to crossing-over rates varies considerably across genomic regions as well as between populations. However, there is a great degree of uncertainty associated with such estimates. We also find substantial evidence for variation in the mean conversion tract length. The estimated tract lengths did not show any negative relationship with the local heterozygosity levels in our analysis. European Journal of Human Genetics advance online publication, 27 February 2013; doi:10.1038/ejhg.2013.30.

  17. Analysis of expressed sequence tags generated from full-length enriched cDNA libraries of melon

    Directory of Open Access Journals (Sweden)

    Bendahmane Abdelhafid

    2011-05-01

    Background: Melon (Cucumis melo), an economically important vegetable crop, belongs to the Cucurbitaceae family which includes several other important crops such as watermelon, cucumber, and pumpkin. It has served as a model system for sex determination and vascular biology studies. However, genomic resources currently available for melon are limited. Results: We constructed eleven full-length enriched and four standard cDNA libraries from fruits, flowers, leaves, roots, cotyledons, and calluses of four different melon genotypes, and generated 71,577 and 22,179 ESTs from full-length enriched and standard cDNA libraries, respectively. These ESTs, together with ~35,000 ESTs available in public domains, were assembled into 24,444 unigenes, which were extensively annotated by comparing their sequences to different protein and functional domain databases, assigning them Gene Ontology (GO) terms, and mapping them onto metabolic pathways. Comparative analysis of melon unigenes and other plant genomes revealed that 75% to 85% of melon unigenes had homologs in other dicot plants, while approximately 70% had homologs in monocot plants. The analysis also identified 6,972 gene families that were conserved across dicot and monocot plants, and 181, 1,192, and 220 gene families specific to fleshy fruit-bearing plants, the Cucurbitaceae family, and melon, respectively. Digital expression analysis identified a total of 175 tissue-specific genes, which provides a valuable gene sequence resource for future genomics and functional studies. Furthermore, we identified 4,068 simple sequence repeats (SSRs) and 3,073 single nucleotide polymorphisms (SNPs) in the melon EST collection. Finally, we obtained a total of 1,382 melon full-length transcripts through the analysis of full-length enriched cDNA clones that were sequenced from both ends. Analysis of these full-length transcripts indicated that sizes of melon 5' and 3' UTRs were similar to those of tomato, but

  18. Full Genome Sequence and sfRNA Interferon Antagonist Activity of Zika Virus from Recife, Brazil.

    Directory of Open Access Journals (Sweden)

    Claire L Donald

    2016-10-01

    The outbreak of Zika virus (ZIKV) in the Americas has transformed a previously obscure mosquito-transmitted arbovirus of the Flaviviridae family into a major public health concern. Little is currently known about the evolution and biology of ZIKV and the factors that contribute to the associated pathogenesis. Determining genomic sequences of clinical viral isolates and characterization of elements within these are an important prerequisite to advance our understanding of viral replicative processes and virus-host interactions. We obtained a ZIKV isolate from a patient who presented with classical ZIKV-associated symptoms, and used high throughput sequencing and other molecular biology approaches to determine its full genome sequence, including non-coding regions. Genome regions were characterized and compared to the sequences of other isolates where available. Furthermore, we identified a subgenomic flavivirus RNA (sfRNA) in ZIKV-infected cells that has antagonist activity against RIG-I induced type I interferon induction, with a lesser effect on MDA-5 mediated action. The full-length genome sequence including non-coding regions of a South American ZIKV isolate from a patient with classical symptoms will support efforts to develop genetic tools for this virus. Detection of sfRNA that counteracts interferon responses is likely to be important for further understanding of pathogenesis and virus-host interactions.

  19. Molecular characterization of human T-cell lymphotropic virus type 1 full and partial genomes by Illumina massively parallel sequencing technology.

    Science.gov (United States)

    Pessôa, Rodrigo; Watanabe, Jaqueline Tomoko; Nukui, Youko; Pereira, Juliana; Casseb, Jorge; Kasseb, Jorge; de Oliveira, Augusto César Penalva; Segurado, Aluisio Cotrim; Sanabani, Sabri Saeed

    2014-01-01

    Here, we report on the partial and full-length genomic (FLG) variability of HTLV-1 sequences from 90 well-characterized subjects, including 48 HTLV-1 asymptomatic carriers (ACs), 35 HTLV-1-associated myelopathy/tropical spastic paraparesis (HAM/TSP) and 7 adult T-cell leukemia/lymphoma (ATLL) patients, using an Illumina paired-end protocol. Blood samples were collected from 90 individuals, and DNA was extracted from the PBMCs to measure the proviral load and to amplify the HTLV-1 FLG from two overlapping fragments. The amplified PCR products were subjected to deep sequencing. The sequencing data were assembled, aligned, and mapped against the HTLV-1 genome with sufficient genetic resemblance and utilized for further phylogenetic analysis. A high-throughput sequencing-by-synthesis instrument was used to obtain an average of 3210- and 5200-fold coverage of the partial (n = 14) and FLG (n = 76) data from the HTLV-1 strains, respectively. The results based on the phylogenetic trees of consensus sequences from partial and FLGs revealed that 86 (95.5%) individuals were infected with the transcontinental sub-subtypes of the cosmopolitan subtype (aA) and that 4 individuals (4.5%) were infected with the Japanese sub-subtypes (aB). A comparison of the nucleotide and amino acids of the FLG between the three clinical settings yielded no correlation between the sequenced genotype and clinical outcomes. The evolutionary relationships among the HTLV sequences were inferred from nucleotide sequence, and the results are consistent with the hypothesis that there were multiple introductions of the transcontinental subtype in Brazil. This study has increased the number of subtype aA full-length genomes from 8 to 81 and HTLV-1 aB from 2 to 5 sequences. The overall data confirmed that the cosmopolitan transcontinental sub-subtypes were the most prevalent in the Brazilian population. It is hoped that this valuable genomic data will add to our current understanding of the

  20. Generation of recombinant pestiviruses using a full genome amplification strategy

    DEFF Research Database (Denmark)

    Rasmussen, Thomas Bruun; Reimann, Ilona; Uttenthal, Åse

    Aim Complete genome amplification of viral RNA provides a new tool for generation of modified pestiviruses. We have recently reported a full genome amplification strategy for direct recovery of infectious pestivirus (Rasmussen et al., 2008). This comprised rescue of BDV strain “Gifhorn” from a full......-length RT-PCR amplicon demonstrating that long RT-PCR can be used for direct generation of an infectious pestivirus. The strategy is not limited to amplification of BDV “Gifhorn”, but can be further utilized for amplification of a diverse selection of pestivirus strains and for the generation of modified...... was reverse transcribed to cDNA at 50C for 90 minutes using SuperScript III reverse transcriptase (Invitrogen). Full-length PCR amplification was performed using primers specific for the extreme 5’- and 3’-ends of the viral genomes. A T7 promoter was incorporated in the 5’-primers for direct in vitro...

  1. Full-length Dysferlin Transfer by the Hyperactive Sleeping Beauty Transposase Restores Dysferlin-deficient Muscle

    Directory of Open Access Journals (Sweden)

    Helena Escobar

    2016-01-01

    Full Text Available Dysferlin-deficient muscular dystrophy is a progressive disease characterized by muscle weakness and wasting for which there is no treatment. It is caused by mutations in DYSF, a large, multiexonic gene that forms a coding sequence of 6.2 kb. Sleeping Beauty (SB) transposon is a nonviral gene transfer vector, already used in clinical trials. The hyperactive SB system consists of a transposon DNA sequence and a transposase protein, SB100X, that can integrate DNA over 10 kb into the target genome. We constructed an SB transposon-based vector to deliver full-length human DYSF cDNA into dysferlin-deficient H2K A/J myoblasts. We demonstrate proper dysferlin expression as well as highly efficient engraftment (>1,100 donor-derived fibers) of the engineered myoblasts in the skeletal muscle of dysferlin- and immunodeficient B6.Cg-Dysfprmd Prkdcscid/J (Scid/BLA/J) mice. Nonviral gene delivery of full-length human dysferlin into muscle cells, along with a successful and efficient transplantation into skeletal muscle, are important advances towards successful gene therapy of dysferlin-deficient muscular dystrophy.

  2. Universal global imprints of genome growth and evolution--equivalent length and cumulative mutation density.

    Directory of Open Access Journals (Sweden)

    Hong-Da Chen

    Full Text Available BACKGROUND: Segmental duplication is widely held to be an important mode of genome growth and evolution. Yet how this would affect the global structure of genomes has been little discussed. METHODS/PRINCIPAL FINDINGS: Here, we show that equivalent length, or L(e), a quantity determined by the variance of the fluctuating part of the distribution of the k-mer frequencies in a genome, characterizes the latter's global structure. We computed the L(e)s of 865 complete chromosomes and found that they have nearly universal but (k-dependent) values. The differences among the L(e) of a chromosome and those of its coding and non-coding parts were found to be slight. CONCLUSIONS: We verified that these non-trivial results are natural consequences of a genome growth model characterized by random segmental duplication and random point mutation, but not of any model whose dominant growth mechanism is not segmental duplication. Our study also indicates that genomes have a nearly universal cumulative "point" mutation density of about 0.73 mutations per site that is compatible with the relatively low mutation rates of (1-5) x 10(-3)/site/Mya previously determined by sequence comparison for the human and E. coli genomes.
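
    The k-mer frequency statistics mentioned in this abstract can be illustrated with a minimal sketch. It does not compute L(e) itself, whose exact definition is not given in the abstract; it only tabulates the counts of all 4^k possible k-mers in a sequence and reports their mean and population variance. The function name kmer_count_stats is hypothetical and the choice of population variance is an assumption made for illustration.

```python
from collections import Counter
from itertools import product
from statistics import mean, pvariance


def kmer_count_stats(seq, k):
    """Return (mean, population variance) of counts over all 4**k DNA k-mers.

    Hypothetical helper for illustration only; this is NOT the L(e)
    quantity from the abstract, just the raw count statistics.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    seq = seq.upper()
    # Count every overlapping window of length k in the sequence.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Include k-mers that never occur (count 0); windows containing
    # characters other than A, C, G, T are ignored.
    all_kmers = ("".join(p) for p in product("ACGT", repeat=k))
    values = [counts.get(kmer, 0) for kmer in all_kmers]
    return mean(values), pvariance(values)


if __name__ == "__main__":
    print(kmer_count_stats("ACGTACGTTTGACA", 2))
```

    A sequence dominated by a few k-mers gives a large variance relative to the mean, while a sequence that uses all k-mers evenly gives a variance near zero.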

  3. Scalable production in human cells and biochemical characterization of full-length normal and mutant huntingtin.

    Directory of Open Access Journals (Sweden)

    Bin Huang

    Full Text Available Huntingtin (Htt) is a 350 kD intracellular protein, ubiquitously expressed and mainly localized in the cytoplasm. Huntington's disease (HD) is caused by a CAG triplet amplification in exon 1 of the corresponding gene resulting in a polyglutamine (polyQ) expansion at the N-terminus of Htt. Production of full-length Htt has been difficult in the past and so far a scalable system or process has not been established for recombinant production of Htt in human cells. The ability to produce Htt in milligram quantities would be a prerequisite for many biochemical and biophysical studies aiming at a better understanding of Htt function under physiological conditions and in case of mutation and disease. For scalable production of full-length normal (17Q) and mutant (46Q and 128Q) Htt we have established two different systems, the first based on doxycycline-inducible Htt expression in stable cell lines, the second on "gutless" adenovirus mediated gene transfer. Purified material has then been used for biochemical characterization of full-length Htt. Posttranslational modifications (PTMs) were determined and several new phosphorylation sites were identified. Nearly all PTMs in full-length Htt localized to areas outside of predicted alpha-solenoid protein regions. In all detected N-terminal peptides methionine as the first amino acid was missing and the second, alanine, was found to be acetylated. Differences in secondary structure between normal and mutant Htt, a helix-rich protein, were not observed in our study. Purified Htt tends to form dimers and higher order oligomers, thus resembling the situation observed with N-terminal fragments, although the mechanism of oligomer formation may be different.

  4. Construction and characterization of a full-length cDNA library for the wheat stripe rust pathogen (Puccinia striiformis f. sp. tritici)

    Directory of Open Access Journals (Sweden)

    Chen Xianming

    2007-06-01

    Full Text Available Abstract Background: Puccinia striiformis is a plant pathogenic fungus causing stripe rust, one of the most important diseases on cereal crops and grasses worldwide. However, little is known about its genome and genes involved in the biology and pathogenicity of the pathogen. We initiated the functional genomic research of the fungus by constructing a full-length cDNA library and determined functions of the first group of genes by sequence comparison of cDNA clones to genes reported in other fungi. Results: A full-length cDNA library, consisting of 42,240 clones with an average cDNA insert of 1.9 kb, was constructed using urediniospores of race PST-78 of P. striiformis f. sp. tritici. From 196 sequenced cDNA clones, we determined functions of 73 clones (37.2%). In addition, 36 clones (18.4%) had significant homology to hypothetical proteins, 37 clones (18.9%) had some homology to genes in other fungi, and the remaining 50 clones (25.5%) did not produce any hits. From the 73 clones with functions, we identified 51 different genes encoding protein products that are involved in amino acid metabolism, cell defense, cell cycle, cell signaling, cell structure and growth, energy cycle, lipid and nucleotide metabolism, protein modification, ribosomal protein complex, sugar metabolism, transcription factor, transport metabolism, and virulence/infection. Conclusion: The full-length cDNA library is useful in identifying functional genes of P. striiformis.

  5. dsRNA binding characterization of full length recombinant wild type and mutants Zaire ebolavirus VP35.

    Science.gov (United States)

    Zinzula, Luca; Esposito, Francesca; Pala, Daniela; Tramontano, Enzo

    2012-03-01

    The Ebola viruses (EBOVs) VP35 protein is a multifunctional major virulence factor involved in EBOVs replication and evasion of the host immune system. EBOV VP35 is an essential component of the viral RNA polymerase, it is a key participant of the nucleocapsid assembly and it inhibits the innate immune response by antagonizing RIG-I like receptors through its dsRNA binding function and, hence, by suppressing the host type I interferon (IFN) production. Insights into the VP35 dsRNA recognition have been recently revealed by structural and functional analysis performed on its C-terminus protein. We report the biochemical characterization of the Zaire ebolavirus (ZEBOV) full-length recombinant VP35 (rVP35)-dsRNA binding function. We established a novel in vitro magnetic dsRNA binding pull down assay, determined the rVP35 optimal dsRNA binding parameters, measured the rVP35 equilibrium dissociation constant for heterologous in vitro transcribed dsRNA of different length and short synthetic dsRNA of 8bp, and validated the assay for compound screening by assessing the inhibitory ability of aurintricarboxylic acid (IC(50) value of 50μg/mL). Furthermore, we compared the dsRNA binding properties of full length wt rVP35 with those of R305A, K309A and R312A rVP35 mutants, which were previously reported to be defective in dsRNA binding-mediated IFN inhibition, showing that the latter have measurably increased K(d) values for dsRNA binding and modified migration patterns in mobility shift assays with respect to wt rVP35. Overall, these results provide the first characterization of the full-length wt and mutants VP35-dsRNA binding functions. Copyright © 2012 Elsevier B.V. All rights reserved.

  6. Full-length characterization of A1/D intersubtype recombinant genomes from a therapy-induced HIV type 1 controller during acute infection and his noncontrolling partner

    DEFF Research Database (Denmark)

    Fomsgaard, A.; Vinner, L.; Therrien, D.

    2008-01-01

    To increase the understanding of mechanisms of HIV control we have genetically and immunologically characterized a full-length HIV-1 isolated from an acute infection in a rare case of undetectable viremia. The subject, a 43-year-old Danish white male (DK1), was diagnosed with acute HIV-1 infection...... and phylogenic trees were constructed and diversity and evolutionary distances were calculated. Intracellular IFN-gamma in CD8(+)CD3(+) T-lymphocyte reactions was investigated by intracellular flow cytometry (IC-FACS). Virus isolates from both patients were A1D intersubtype recombinants showing 98% sequence...

  7. Universal and idiosyncratic characteristic lengths in bacterial genomes

    Science.gov (United States)

    Junier, Ivan; Frémont, Paul; Rivoire, Olivier

    2018-05-01

    In condensed matter physics, simplified descriptions are obtained by coarse-graining the features of a system at a certain characteristic length, defined as the typical length beyond which some properties are no longer correlated. From a physics standpoint, in vitro DNA has thus a characteristic length of 300 base pairs (bp), the Kuhn length of the molecule beyond which correlations in its orientations are typically lost. From a biology standpoint, in vivo DNA has a characteristic length of 1000 bp, the typical length of genes. Since bacteria live in very different physico-chemical conditions and since their genomes lack translational invariance, whether larger, universal characteristic lengths exist is a non-trivial question. Here, we examine this problem by leveraging the large number of fully sequenced genomes available in public databases. By analyzing GC content correlations and the evolutionary conservation of gene contexts (synteny) in hundreds of bacterial chromosomes, we conclude that a fundamental characteristic length around 10–20 kb can be defined. This characteristic length reflects elementary structures involved in the coordination of gene expression, which are present all along the genome of nearly all bacteria. Technically, reaching this conclusion required us to implement methods that are insensitive to the presence of large idiosyncratic genomic features, which may co-exist along these fundamental universal structures.

  8. Full-Genome Characterization and Genetic Evolution of West African Isolates of Bagaza Virus

    Directory of Open Access Journals (Sweden)

    Martin Faye

    2018-04-01

    Full Text Available Bagaza virus is a mosquito-borne flavivirus, first isolated in 1966 in Central African Republic. It has currently been identified in mosquito pools collected in the field in West and Central Africa. Emergence in wild birds in Europe and serological evidence in encephalitis patients in India raise questions on its genetic evolution and the diversity of isolates circulating in Africa. To better understand genetic diversity and evolution of Bagaza virus, we describe the full-genome characterization of 11 West African isolates, sampled from 1988 to 2014. Parameters such as genetic distances, N-glycosylation patterns, recombination events, selective pressures, and its codon adaptation to human genes are assessed. Our study is noteworthy for the observation of N-glycosylation and recombination in Bagaza virus and provides insight into its Indian origin from the 13th century. Interestingly, evidence of Bagaza virus codon adaptation to human house-keeping genes is also observed to be higher than those of other flaviviruses well known in human infections. Genetic variations on genome of West African Bagaza virus could play an important role in generating diversity and may promote Bagaza virus adaptation to other vertebrates and become an important threat in human health.

  9. Generation of recombinant pestiviruses using a full-genome amplification strategy

    DEFF Research Database (Denmark)

    Rasmussen, Thomas Bruun; Reimann, I.; Uttenthal, Åse

    2010-01-01

    -Gifhorn genome was generated by long RTPCR and then RNA transcripts derived from this amplicon were used to rescue infectious virus. Here, we have now used this full-genome amplification strategy for efficient and robust amplification of three additional pestivirus strains: the vaccine strain C and the virulent...... Paderborn strain of Classical swine fever virus plus the CP7 strain of Bovine viral diarrhoea virus. The amplicons were cloned directly into a stable single-copy bacterial artificial chromosome generating full-length pestivirus DNAs from which infectious RNA transcripts could be also derived....

  10. An analysis of expressed sequence tags of developing castor endosperm using a full-length cDNA library

    Directory of Open Access Journals (Sweden)

    Wallis James G

    2007-07-01

    Full Text Available Abstract Background: Castor seeds are a major source for ricinoleate, an important industrial raw material. Genomics studies of castor plant will provide critical information for understanding seed metabolism, for effectively engineering ricinoleate production in transgenic oilseeds, or for genetically improving castor plants by eliminating toxic and allergic proteins in seeds. Results: Full-length cDNAs are useful resources in annotating genes and in providing functional analysis of genes and their products. We constructed a full-length cDNA library from developing castor endosperm, and obtained 4,720 ESTs from 5'-ends of the cDNA clones representing 1,908 unique sequences. The most abundant transcripts are genes encoding storage proteins, ricin, agglutinin and oleosins. Several other sequences are also very numerous, including two acidic triacylglycerol lipases, and the oleate hydroxylase (FAH12) gene that is responsible for ricinoleate biosynthesis. The role(s) of the lipases in developing castor seeds are not clear, and co-expression of a lipase and the FAH12 did not result in significant changes in hydroxy fatty acid accumulation in transgenic Arabidopsis seeds. Only one oleate desaturase (FAD2) gene was identified in our cDNA sequences. Sequence and functional analyses of the castor FAD2 were carried out since it had not been characterized previously. Overexpression of castor FAD2 in a FAH12-expressing Arabidopsis line resulted in decreased accumulation of hydroxy fatty acids in transgenic seeds. Conclusion: Our results suggest that transcriptional regulation of FAD2 and FAH12 genes may be one of the mechanisms that contribute to a high level of ricinoleate accumulation in castor endosperm. The full-length cDNA library will be used to search for additional genes that affect ricinoleate accumulation in seed oils. Our EST sequences will also be useful to annotate the castor genome, whose whole sequence is being generated by shotgun sequencing at

  11. Integrative annotation of 21,037 human genes validated by full-length cDNA clones.

    Directory of Open Access Journals (Sweden)

    Tadashi Imanishi

    2004-06-01

    Full Text Available The human genome sequence defines our inherent biological potential; the realization of the biology encoded therein requires knowledge of the function of each gene. Currently, our knowledge in this area is still limited. Several lines of investigation have been used to elucidate the structure and function of the genes in the human genome. Even so, gene prediction remains a difficult task, as the varieties of transcripts of a gene may vary to a great extent. We thus performed an exhaustive integrative characterization of 41,118 full-length cDNAs that capture the gene transcripts as complete functional cassettes, providing an unequivocal report of structural and functional diversity at the gene level. Our international collaboration has validated 21,037 human gene candidates by analysis of high-quality full-length cDNA clones through curation using unified criteria. This led to the identification of 5,155 new gene candidates. It also manifested the most reliable way to control the quality of the cDNA clones. We have developed a human gene database, called the H-Invitational Database (H-InvDB; http://www.h-invitational.jp/). It provides the following: integrative annotation of human genes, description of gene structures, details of novel alternative splicing isoforms, non-protein-coding RNAs, functional domains, subcellular localizations, metabolic pathways, predictions of protein three-dimensional structure, mapping of known single nucleotide polymorphisms (SNPs), identification of polymorphic microsatellite repeats within human genes, and comparative results with mouse full-length cDNAs. The H-InvDB analysis has shown that up to 4% of the human genome sequence (National Center for Biotechnology Information build 34 assembly) may contain misassembled or missing regions. We found that 6.5% of the human gene candidates (1,377 loci) did not have a good protein-coding open reading frame, of which 296 loci are strong candidates for non-protein-coding RNA

  12. Full-Length Characterization of Hepatitis C Virus Subtype 3a Reveals Novel Hypervariable Regions under Positive Selection during Acute Infection

    OpenAIRE

    Humphreys, Isla; Fleming, Vicki; Fabris, Paolo; Parker, Joe; Schulenberg, Bodo; Brown, Anthony; Demetriou, Charis; Gaudieri, Silvana; Pfafferott, Katja; Lucas, Michaela; Collier, Jane; Huang, Kuan-Hsiang Gary; Pybus, Oliver G.; Klenerman, Paul; Barnes, Eleanor

    2009-01-01

    Hepatitis C virus subtype 3a is a highly prevalent and globally distributed strain that is often associated with infection via injection drug use. This subtype exhibits particular phenotypic characteristics. In spite of this, detailed genetic analysis of this subtype has rarely been performed. We performed full-length viral sequence analysis in 18 patients with chronic HCV subtype 3a infection and assessed genomic viral variability in comparison to other HCV subtypes. Two novel regions of int...

  13. Full-length genomic analysis of korean porcine sapelovirus strains

    DEFF Research Database (Denmark)

    Son, Kyu-Yeol; Kim, Deok-Song; Kwon, Joseph

    2014-01-01

    the typical picornavirus genome organization; 5'untranslated region (UTR)-L-VP4-VP2-VP3-VP1-2A-2B-2C-3A-3B-3C-3D-3'UTR. Three distinct cis-active RNA elements, the internal ribosome entry site (IRES) in the 5'UTR, a cis-replication element (CRE) in the 2C coding region and 3'UTR were identified...... and their structures were predicted. Interestingly, the structural features of the CRE and 3'UTR were different between PSV strains. The availability of these first complete genome sequences for PSV strains will facilitate future investigations of the molecular pathogenesis and evolutionary characteristics of PSV....

  14. PCR-based isolation and identification of full-length low-molecular-weight glutenin subunit genes in bread wheat (Triticum aestivum L.).

    Science.gov (United States)

    Zhang, Xiaofei; Liu, Dongcheng; Jiang, Wei; Guo, Xiaoli; Yang, Wenlong; Sun, Jiazhu; Ling, Hongqing; Zhang, Aimin

    2011-12-01

    Low-molecular-weight glutenin subunits (LMW-GSs) are encoded by a multi-gene family and are essential for determining the quality of wheat flour products, such as bread and noodles. However, the exact role or contribution of individual LMW-GS genes to wheat quality remains unclear. This is, at least in part, due to the difficulty in characterizing complete sequences of all LMW-GS gene family members in bread wheat. To identify full-length LMW-GS genes, a polymerase chain reaction (PCR)-based method was established, consisting of newly designed conserved primers and the previously developed LMW-GS gene molecular marker system. Using the PCR-based method, 17 LMW-GS genes were identified and characterized in Xiaoyan 54, of which 12 contained full-length sequences. Sequence alignments showed that 13 LMW-GS genes were identical to those found in Xiaoyan 54 using the genomic DNA library screening, and the other four full-length LMW-GS genes were first isolated from Xiaoyan 54. In Chinese Spring, 16 unique LMW-GS genes were isolated, and 13 of them contained full-length coding sequences. Additionally, 16 and 17 LMW-GS genes in Dongnong 101 and Lvhan 328 (chosen from the micro-core collections of Chinese germplasm), respectively, were also identified. Sequence alignments revealed that at least 15 LMW-GS genes were common in the four wheat varieties, and allelic variants of each gene shared high sequence identities (>95%) but exhibited length polymorphism in repetitive regions. This study provides a PCR-based method for efficiently identifying LMW-GS genes in bread wheat, which will improve the characterization of complex members of the LMW-GS gene family and facilitate the understanding of their contributions to wheat quality.

  15. Employment of Near Full-Length Ribosome Gene TA-Cloning and Primer-Blast to Detect Multiple Species in a Natural Complex Microbial Community Using Species-Specific Primers Designed with Their Genome Sequences.

    Science.gov (United States)

    Zhang, Huimin; He, Hongkui; Yu, Xiujuan; Xu, Zhaohui; Zhang, Zhizhou

    2016-11-01

    It remains an unsolved problem to quantify a natural microbial community by rapidly and conveniently measuring multiple species with functional significance. Most widely used high throughput next-generation sequencing methods can only generate information mainly for genus-level taxonomic identification and quantification, and detection of multiple species in a complex microbial community is still heavily dependent on approaches based on near full-length ribosome RNA gene or genome sequence information. In this study, we used near full-length rRNA gene library sequencing plus Primer-Blast to design species-specific primers based on whole microbial genome sequences. The primers were intended to be specific at the species level within relevant microbial communities, i.e., a defined genomics background. The primers were tested with samples collected from the Daqu (also called fermentation starters) and pit mud of a traditional Chinese liquor production plant. Sixteen pairs of primers were found to be suitable for identification of individual species. Among them, seven pairs were chosen to measure the abundance of microbial species through quantitative PCR. The combination of near full-length ribosome RNA gene library sequencing and Primer-Blast may represent a broadly useful protocol to quantify multiple species in complex microbial population samples with species-specific primers.

  16. Assessment of adaptive evolution between wheat and rice as deduced from full-length common wheat cDNA sequence data and expression patterns

    Directory of Open Access Journals (Sweden)

    Hayashizaki Yoshihide

    2009-06-01

    Full Text Available Abstract Background: Wheat is an allopolyploid plant that harbors a huge, complex genome. Therefore, accumulation of expressed sequence tags (ESTs) for wheat is becoming particularly important for functional genomics and molecular breeding. We prepared a comprehensive collection of ESTs from the various tissues that develop during the wheat life cycle and from tissues subjected to stress. We also examined their expression profiles in silico. As full-length cDNAs are indispensable to certify the collected ESTs and annotate the genes in the wheat genome, we performed a systematic survey and sequencing of the full-length cDNA clones. This sequence information is a valuable genetic resource for functional genomics and will enable carrying out comparative genomics in cereals. Results: As part of the functional genomics and development of genomic wheat resources, we have generated a collection of full-length cDNAs from common wheat. By grouping the ESTs of recombinant clones randomly selected from the full-length cDNA library, we were able to sequence 6,162 independent clones with high accuracy. About 10% of the clones were wheat-unique genes, without any counterparts within the DNA database. Wheat clones that showed high homology to those of rice were selected in order to investigate their expression patterns in various tissues throughout the wheat life cycle and in response to abiotic-stress treatments. To assess the variability of genes that have evolved differently in wheat and rice, we calculated the substitution rate (Ka/Ks) of the counterparts in wheat and rice. Genes that were preferentially expressed in certain tissues or treatments had higher Ka/Ks values than those in other tissues and treatments, which suggests that the genes with the higher variability expressed in these tissues are under adaptive selection. Conclusion: We have generated a high-quality full-length cDNA resource for common wheat, which is essential for continuation of the

  17. Characterization of a new full length TMPRSS3 isoform and identification of mutant alleles responsible for nonsyndromic recessive deafness in Newfoundland and Pakistan

    Directory of Open Access Journals (Sweden)

    Shotland Lawrence I

    2004-09-01

    Full Text Available Abstract Background: Mutant alleles of TMPRSS3 are associated with nonsyndromic recessive deafness (DFNB8/B10). TMPRSS3 encodes a predicted secreted serine protease, although the deduced amino acid sequence has no signal peptide. In this study, we searched for mutant alleles of TMPRSS3 in families from Pakistan and Newfoundland with recessive deafness co-segregating with DFNB8/B10 linked haplotypes and also more thoroughly characterized the genomic structure of TMPRSS3. Methods: We enrolled families segregating recessive hearing loss from Pakistan and Newfoundland. Microsatellite markers flanking the TMPRSS3 locus were used for linkage analysis. DNA samples from participating individuals were sequenced for TMPRSS3. The structure of TMPRSS3 was characterized bioinformatically and experimentally by sequencing novel cDNA clones of TMPRSS3. Results: We identified mutations in TMPRSS3 in four Pakistani families with recessive, nonsyndromic congenital deafness. We also identified two recessive mutations, one of which is novel, of TMPRSS3 segregating in a six-generation extended family from Newfoundland. The spectrum of TMPRSS3 mutations is reviewed in the context of a genotype-phenotype correlation. Our study also revealed a longer isoform of TMPRSS3 with a hitherto unidentified exon encoding a signal peptide, which is expressed in several tissues. Conclusion: Mutations of TMPRSS3 contribute to hearing loss in many communities worldwide and account for 1.8% (8 of 449) of Pakistani families segregating congenital deafness as an autosomal recessive trait. The newly identified TMPRSS3 isoform e will be helpful in the functional characterization of the full length protein.

  18. Rapid CRISPR/Cas9-Mediated Cloning of Full-Length Epstein-Barr Virus Genomes from Latently Infected Cells.

    Science.gov (United States)

    Yajima, Misako; Ikuta, Kazufumi; Kanda, Teru

    2018-04-03

    Herpesviruses have relatively large DNA genomes of more than 150 kb that are difficult to clone and sequence. Bacterial artificial chromosome (BAC) cloning of herpesvirus genomes is a powerful technique that greatly facilitates whole viral genome sequencing as well as functional characterization of reconstituted viruses. We describe recently invented technologies for rapid BAC cloning of herpesvirus genomes using CRISPR/Cas9-mediated homology-directed repair. We focus on recent BAC cloning techniques of Epstein-Barr virus (EBV) genomes and discuss the possible advantages of a CRISPR/Cas9-mediated strategy comparatively with precedent EBV-BAC cloning strategies. We also describe the design decisions of this technology as well as possible pitfalls and points to be improved in the future. The obtained EBV-BAC clones are subjected to long-read sequencing analysis to determine complete EBV genome sequence including repetitive regions. Rapid cloning and sequence determination of various EBV strains will greatly contribute to the understanding of their global geographical distribution. This technology can also be used to clone disease-associated EBV strains and test the hypothesis that they have special features that distinguish them from strains that infect asymptomatically.

  19. Read length and repeat resolution: exploring prokaryote genomes using next-generation sequencing technologies.

    Directory of Open Access Journals (Sweden)

    Matt J Cahill

    Full Text Available BACKGROUND: There are a growing number of next-generation sequencing technologies. At present, the most cost-effective options also produce the shortest reads. However, even for prokaryotes, there is uncertainty concerning the utility of these technologies for the de novo assembly of complete genomes. This reflects an expectation that short reads will be unable to resolve small, but presumably abundant, repeats. METHODOLOGY/PRINCIPAL FINDINGS: Using a simple model of repeat assembly, we develop and test a technique that, for any read length, can estimate the occurrence of unresolvable repeats in a genome, and thus predict the number of gaps that would need to be closed to produce a complete sequence. We apply this technique to 818 prokaryote genome sequences. This provides a quantitative assessment of the relative performance of various lengths. Notably, unpaired reads of only 150nt can reconstruct approximately 50% of the analysed genomes with fewer than 96 repeat-induced gaps. Nonetheless, there is considerable variation amongst prokaryotes. Some genomes can be assembled to near contiguity using very short reads while others require much longer reads. CONCLUSIONS: Given the diversity of prokaryote genomes, a sequencing strategy should be tailored to the organism under study. Our results will provide researchers with a practical resource to guide the selection of the appropriate read length.
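
    The dependence of repeat resolution on read length can be illustrated with a crude proxy. The Python sketch below is not the authors' model: it assumes that a stretch is unresolvable when an identical substring at least as long as the read occurs more than once, and it counts such stretches for a given read length. The function name and the toy sequence are hypothetical.

```python
from collections import Counter


def count_unresolvable_repeats(genome: str, read_len: int) -> int:
    """Count maximal runs of positions whose read_len-mer occurs more than once.

    Assumption (illustrative only): a repeat cannot be resolved when an
    identical substring of length read_len appears at two or more positions.
    """
    if read_len <= 0 or read_len > len(genome):
        return 0
    kmers = [genome[i:i + read_len] for i in range(len(genome) - read_len + 1)]
    counts = Counter(kmers)

    runs = 0
    in_run = False
    for kmer in kmers:
        repeated = counts[kmer] > 1
        if repeated and not in_run:
            runs += 1  # a new repeated stretch starts here
        in_run = repeated
    return runs


if __name__ == "__main__":
    toy_genome = "AAACGTGGGACGTCCC"  # "ACGT" occurs twice
    for read_len in (3, 4, 5):
        print(read_len, count_unresolvable_repeats(toy_genome, read_len))
```

    In the toy sequence the duplicated "ACGT" stops being flagged once the read length exceeds the repeat length, which is the qualitative behaviour the abstract describes.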

  20. Molecular comparisons of full length metapneumovirus (MPV) genomes, including newly determined French AMPV-C and -D isolates, further supports possible subclassification within the MPV Genus.

    Directory of Open Access Journals (Sweden)

    Paul A Brown

    Full Text Available Four avian metapneumovirus (AMPV) subgroups (A-D) have been reported previously based on genetic and antigenic differences. However, until now full length sequences of the only known isolates of European subgroup C and subgroup D viruses (duck and turkey origin, respectively) have been unavailable. These full length sequences were determined and compared with other full length AMPV and human metapneumoviruses (HMPV) sequences reported previously, using phylogenetics, comparisons of nucleic and amino acid sequences and study of codon usage bias. Results confirmed that subgroup C viruses were more closely related to HMPV than they were to the other AMPV subgroups in the study. This was consistent with previous findings using partial genome sequences. Closer relationships between AMPV-A, B and D were also evident throughout the majority of results. Three metapneumovirus "clusters" HMPV, AMPV-C and AMPV-A, B and D were further supported by codon bias and phylogenetics. The data presented here together with those of previous studies describing antigenic relationships also between AMPV-A, B and D and between AMPV-C and HMPV may call for a subclassification of metapneumoviruses similar to that used for avian paramyxoviruses, grouping AMPV-A, B and D as type I metapneumoviruses and AMPV-C and HMPV as type II.

  1. Hibiscus latent Fort Pierce virus in Brazil and synthesis of its biologically active full-length cDNA clone.

    Science.gov (United States)

    Gao, Ruimin; Niu, Shengniao; Dai, Weifang; Kitajima, Elliot; Wong, Sek-Man

    2016-10-01

    A Brazilian isolate of Hibiscus latent Fort Pierce virus (HLFPV-BR) was first found in a hibiscus plant in Limeira, SP, Brazil. RACE PCR was carried out to obtain the full-length sequence of HLFPV-BR, which is 6453 nucleotides long and has more than 99.15% complete genomic RNA nucleotide sequence identity with that of the HLFPV Japanese isolate. The genomic structure of HLFPV-BR is similar to that of other tobamoviruses. It includes a 5' untranslated region (UTR), followed by open reading frames encoding a 128-kDa protein and a 188-kDa readthrough protein, a 38-kDa movement protein, an 18-kDa coat protein, and a 3' UTR. Interestingly, the unique feature of a poly(A) tract is also found within its 3'-UTR. Furthermore, from the total RNA extracted from the local lesions of HLFPV-BR-infected Chenopodium quinoa leaves, a biologically active, full-length cDNA clone encompassing the genome of HLFPV-BR was amplified and placed adjacent to a T7 RNA polymerase promoter. The capped in vitro transcripts from the cloned cDNA were infectious when mechanically inoculated into C. quinoa and Nicotiana benthamiana plants. This is the first report of the presence of an isolate of HLFPV in Brazil and the successful synthesis of a biologically active HLFPV-BR full-length cDNA clone.

  2. Full genome sequences and molecular characterization of tick-borne encephalitis virus strains isolated from human patients.

    Science.gov (United States)

    Formanová, Petra; Černý, Jiří; Bolfíková, Barbora Černá; Valdés, James J; Kozlova, Irina; Dzhioev, Yuri; Růžek, Daniel

    2015-02-01

    Tick-borne encephalitis virus (TBEV) causes tick-borne encephalitis (TBE), one of the most important human neuroinfections across Eurasia. To date, only three full genome sequences of human European TBEV isolates are available, mostly due to difficulties with isolation of the virus from human patients. Here we present full genome characterization of an additional five low-passage TBEV strains isolated from human patients with severe forms of TBE. These strains were isolated in 1953 within Central Bohemia in the former Czechoslovakia, and belong to the historically oldest human TBEV isolates in Europe. We demonstrate here that all analyzed isolates are distantly phylogenetically related, indicating that the emergence of TBE in Central Europe was not caused by one predominant strain, but rather a pool of distantly related TBEV strains. Nucleotide identity between individual sequenced TBEV strains ranged from 97.5% to 99.6% and all strains shared large deletions in the 3' non-coding region, which has been recently suggested to be an important determinant of virulence. The number of unique amino acid substitutions varied from 3 to 9 in individual isolates, but no characteristic amino acid substitution typical exclusively for all human TBEV isolates was identified when compared to the isolates from ticks. We did, however, correlate that the exploration of the TBEV envelope glycoprotein by specific antibodies were in close proximity to these unique amino acid substitutions. Taken together, we report here the largest number of patient-derived European TBEV full genome sequences to date and provide a platform for further studies on evolution of TBEV since the first emergence of human TBE in Europe. Copyright © 2014 Elsevier GmbH. All rights reserved.
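
    Pairwise nucleotide identity figures such as the 97.5% to 99.6% range quoted above are typically derived from aligned sequences. The Python sketch below shows one common way to compute such a value, under the assumption that the two sequences are already aligned to equal length and that gap columns are excluded from the denominator; the record does not state which convention the authors used, and the function name is hypothetical.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity between two pre-aligned sequences.

    Assumption (illustrative only): columns with a gap ('-') in either
    sequence are skipped rather than counted as mismatches.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")

    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # ignore gap columns
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        raise ValueError("no comparable (ungapped) columns")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy aligned fragments, not real TBEV sequence data.
    print(percent_identity("ACGTACGT", "ACGTACGA"))
```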

  3. Molecular Comparisons of Full Length Metapneumovirus (MPV) Genomes, Including Newly Determined French AMPV-C and –D Isolates, Further Supports Possible Subclassification within the MPV Genus

    Science.gov (United States)

    Brown, Paul A.; Lemaitre, Evelyne; Briand, François-Xavier; Courtillon, Céline; Guionie, Olivier; Allée, Chantal; Toquin, Didier; Bayon-Auboyer, Marie-Hélène; Jestin, Véronique; Eterradossi, Nicolas

    2014-01-01

    Four avian metapneumovirus (AMPV) subgroups (A–D) have been reported previously based on genetic and antigenic differences. However, until now full length sequences of the only known isolates of European subgroup C and subgroup D viruses (duck and turkey origin, respectively) have been unavailable. These full length sequences were determined and compared with other full length AMPV and human metapneumoviruses (HMPV) sequences reported previously, using phylogenetics, comparisons of nucleic and amino acid sequences and study of codon usage bias. Results confirmed that subgroup C viruses were more closely related to HMPV than they were to the other AMPV subgroups in the study. This was consistent with previous findings using partial genome sequences. Closer relationships between AMPV-A, B and D were also evident throughout the majority of results. Three metapneumovirus “clusters” HMPV, AMPV-C and AMPV-A, B and D were further supported by codon bias and phylogenetics. The data presented here together with those of previous studies describing antigenic relationships also between AMPV-A, B and D and between AMPV-C and HMPV may call for a subclassification of metapneumoviruses similar to that used for avian paramyxoviruses, grouping AMPV-A, B and D as type I metapneumoviruses and AMPV-C and HMPV as type II. PMID:25036224

  4. Sequencing, mapping, and analysis of 27,455 maize full-length cDNAs.

    Directory of Open Access Journals (Sweden)

    Carol Soderlund

    2009-11-01

    Full Text Available Full-length cDNA (FLcDNA) sequencing establishes the precise primary structure of individual gene transcripts. From two libraries representing 27 B73 tissues and abiotic stress treatments, 27,455 high-quality FLcDNAs were sequenced. The average transcript length was 1.44 kb including 218 bases and 321 bases of 5' and 3' UTR, respectively, with 8.6% of the FLcDNAs encoding predicted proteins of fewer than 100 amino acids. Approximately 94% of the FLcDNAs were stringently mapped to the maize genome. Although nearly two-thirds of this genome is composed of transposable elements (TEs), only 5.6% of the FLcDNAs contained TE sequences in coding or UTR regions. Approximately 7.2% of the FLcDNAs are putative transcription factors, suggesting that rare transcripts are well-enriched in our FLcDNA set. Protein similarity searching identified 1,737 maize transcripts not present in rice, sorghum, Arabidopsis, or poplar annotated genes. A strict FLcDNA assembly generated 24,467 non-redundant sequences, of which 88% have non-maize protein matches. The FLcDNAs were also assembled with 41,759 FLcDNAs in GenBank from other projects, where semi-strict parameters were used to identify 13,368 potentially unique non-redundant sequences from this project. The libraries, ESTs, and FLcDNA sequences produced from this project are publicly available. The annotated EST and FLcDNA assemblies are available through the maize FLcDNA web resource (www.maizecdna.org).

  5. Full-length genomic and molecular characterization of Canine parvovirus in dogs from North of Brazil.

    Science.gov (United States)

    Silva, S P; Silva, L N P P; Rodrigues, E D L; Cardoso, J F; Tavares, F N; Souza, W M; Santos, C M P; Martins, F M S; Jesus, I S; Brito, T C; Moura, T P C; Nunes, M R T; Casseb, L M N; Silva Filho, E; Casseb, A R

    2017-09-21

    With the objective of characterizing Canine parvovirus (CPV) in suspected fecal samples from dogs collected at the Veterinarian Hospital in Belém city, five positive samples were found by PCR assay, and an updated molecular characterization of CPV-2 circulation in Belém was provided. Through sequencing of the complete DNA sequences (NS1, NS2, VP1, and VP2 genes), the CPV-2 strain circulating in Belém was identified as CPV-2b (Asn426Asp). A CPV-2b strain with a different change at position Tyr324Leu was detected in all samples assessed and is thus reported for the first time to the scientific community. Phylogenetic analysis indicated that Belém CPV-2b and CPV-2a strains are related to a cluster with samples from after the 1990s, suggesting that CPV-2b in Belém originated from CPV-2a circulating in Brazil after the 1990s. Potential recombination events were analyzed using RDP4 and SplitsTree4; the results suggest that the CPV-2 sequences described here did not arise from recombination events. Continuous monitoring and molecular characterization of CPV-2 samples are needed not only to identify possible genetic and antigenic changes that may interfere with the effectiveness of vaccines but also to bring a better understanding of the mechanisms that drive the evolution of CPV-2 in Brazil.

  6. VP1u phospholipase activity is critical for infectivity of full-length parvovirus B19 genomic clones.

    Science.gov (United States)

    Filippone, Claudia; Zhi, Ning; Wong, Susan; Lu, Jun; Kajigaya, Sachiko; Gallinella, Giorgio; Kakkola, Laura; Söderlund-Venermo, Maria; Young, Neal S; Brown, Kevin E

    2008-05-10

    Three full-length genomic clones (pB19-M20, pB19-FL and pB19-HG1) of parvovirus B19 were produced in different laboratories. pB19-M20 was shown to produce infectious virus. To determine the differences in infectivity, all three plasmids were tested by transfection and infection assays. All three clones were similar in viral DNA replication, RNA transcription, and viral capsid protein production. However, only pB19-M20 and pB19-HG1 produced infectious virus. Comparison of viral sequences showed no significant differences in ITR or NS regions. In the capsid region, there was a nucleotide sequence difference conferring an amino acid substitution (E176K) in the phospholipase A2-like motif of the VP1-unique (VP1u) region. The recombinant VP1u with the E176K mutation had no catalytic activity as compared with the wild-type. When this mutation was introduced into pB19-M20, infectivity was significantly attenuated, confirming the critical role of this motif. Investigation of the original serum from which pB19-FL was cloned confirmed that the phospholipase mutation was present in the native B19 virus.

  7. VP1u phospholipase activity is critical for infectivity of full-length parvovirus B19 genomic clones

    Science.gov (United States)

    Filippone, Claudia; Zhi, Ning; Wong, Susan; Lu, Jun; Kajigaya, Sachiko; Gallinella, Giorgio; Kakkola, Laura; Venermo, Maria S Söderlund; Young, Neal S.; Brown, Kevin E.

    2008-01-01

    Three full-length genomic clones (pB19-M20, pB19-FL and pB19-HG1) of parvovirus B19 were produced in different laboratories. pB19-M20 was shown to produce infectious virus. To determine the differences in infectivity, all three plasmids were tested by transfection and infection assays. All three clones were similar in viral DNA replication, RNA transcription, and viral capsid protein production. However, only pB19-M20 and pB19-HG1 produced infectious virus. Comparison of viral sequences showed no significant differences in ITR or NS regions. In the capsid region, there was a nucleotide sequence difference conferring an amino acid substitution (E176K) in the phospholipase A2-like motif of the VP1-unique (VP1u) region. The recombinant VP1u with the E176K mutation had no catalytic activity as compared with the wild-type. When this mutation was introduced into pB19-M20, infectivity was significantly attenuated, confirming the critical role of this motif. Investigation of the original serum from which pB19-FL was cloned confirmed that the phospholipase mutation was present in the native B19 virus. PMID:18252260

  8. Functional characterization of a full length pregnane X receptor, expression in vivo, and identification of PXR alleles, in Zebrafish (Danio rerio)

    Energy Technology Data Exchange (ETDEWEB)

    Bainy, Afonso C.D. [Biology Department, Woods Hole Oceanographic Institution, Woods Hole, MA 02543 (United States); Departamento de Bioquímica, CCB, Universidade Federal de Santa Catarina, Florianópolis, SC 88040-900 (Brazil); Kubota, Akira; Goldstone, Jared V. [Biology Department, Woods Hole Oceanographic Institution, Woods Hole, MA 02543 (United States); Lille-Langøy, Roger [Department of Biology, University of Bergen, N-5020 Bergen (Norway); Karchner, Sibel I. [Biology Department, Woods Hole Oceanographic Institution, Woods Hole, MA 02543 (United States); Celander, Malin C. [Department of Biological and Environmental Sciences, University of Gothenburg, SE 405 30 Göteborg (Sweden); Hahn, Mark E. [Biology Department, Woods Hole Oceanographic Institution, Woods Hole, MA 02543 (United States); Goksøyr, Anders [Department of Biology, University of Bergen, N-5020 Bergen (Norway); Stegeman, John J., E-mail: jstegeman@whoi.edu [Biology Department, Woods Hole Oceanographic Institution, Woods Hole, MA 02543 (United States)

    2013-10-15

    Highlights: •Full-length pxr has been cloned from zebrafish. •Alleles of pxr were identified in zebrafish. •Full length Pxr was activated less strongly than ligand binding domain in cell-based reporter assays. •High levels of pxr expression were found in eye and brain as well as in liver. •TCPOBOP and PB did not significantly alter expression of pxr in liver. -- Abstract: The pregnane X receptor (PXR) (nuclear receptor NR1I2) is a ligand activated transcription factor, mediating responses to diverse xenobiotic and endogenous chemicals. The properties of PXR in fish are not fully understood. Here we report on cloning and characterization of full-length PXR of zebrafish, Danio rerio, and pxr expression in vivo. Initial efforts gave a cDNA encoding a 430 amino acid protein identified as zebrafish pxr by phylogenetic and synteny analysis. The sequence of the cloned Pxr DNA binding domain (DBD) was highly conserved, with 74% identity to human PXR-DBD, while the ligand-binding domain (LBD) of the cloned sequence was only 44% identical to human PXR-LBD. Sequence variation among clones in the initial effort prompted sequencing of multiple clones from a single fish. There were two prominent variants, one sequence with S183, Y218 and H383 and the other with I183, C218 and N383, which we designate as alleles pxr*1 (nr1i2*1) and pxr*2 (nr1i2*2), respectively. In COS-7 cells co-transfected with a PXR-responsive reporter gene, the full-length Pxr*1 (the more common variant) was activated by known PXR agonists clotrimazole and pregnenolone 16α-carbonitrile but to a lesser extent than the full-length human PXR. Activation of full-length Pxr*1 was only 10% of that with the Pxr*1 LBD. Quantitative real time PCR analysis showed prominent expression of pxr in liver and eye, as well as brain and intestine of adult zebrafish. The pxr was expressed in heart and kidney at levels similar to that in intestine. The expression of pxr in liver was weakly induced by ligands for

  9. The full-length E1^E4 protein of human papillomavirus type 18 modulates differentiation-dependent viral DNA amplification and late gene expression

    International Nuclear Information System (INIS)

    Wilson, Regina; Ryan, Gordon B.; Knight, Gillian L.; Laimins, Laimonis A.; Roberts, Sally

    2007-01-01

    Activation of the productive phase of the human papillomavirus (HPV) life cycle in differentiated keratinocytes is coincident with high-level expression of E1^E4 protein. To determine the role of E1^E4 in the HPV replication cycle, we constructed HPV18 mutant genomes in which expression of the full-length E1^E4 protein was abrogated. Undifferentiated keratinocytes containing mutant genomes showed enhanced proliferation when compared to cells containing wildtype genomes, but there were no differences in maintenance of viral episomes. Following differentiation, cells with mutant genomes exhibited reduced levels of viral DNA amplification and late gene expression, compared to wildtype genome-containing cells. This indicates that HPV18 E1^E4 plays an important role in regulating HPV late functions, and it may also function in the early phase of the replication cycle. Our finding that full-length HPV18 E1^E4 protein plays a significant role in promoting viral genome amplification concurs with a similar report with HPV31, but is in contrast to an HPV11 study where viral DNA amplification was not dependent on full-length E1^E4 expression, and to HPV16 where only C-terminal truncations in E1^E4 abrogated vegetative genome replication. This suggests that type-specific differences exist between various E1^E4 proteins.

  10. Telomere Length Dynamics and the Evolution of Cancer Genome Architecture

    Directory of Open Access Journals (Sweden)

    Kez Cleal

    2018-02-01

    Full Text Available Telomeres are progressively eroded during repeated rounds of cell division due to the end replication problem but also undergo additional more substantial stochastic shortening events. In most cases, shortened telomeres induce a cell-cycle arrest or trigger apoptosis, although for those cells that bypass such signals during tumour progression, a critical length threshold is reached at which telomere dysfunction may ensue. Dysfunction of the telomere nucleoprotein complex can expose free chromosome ends to the DNA double-strand break (DSB) repair machinery, leading to telomere fusion with both telomeric and non-telomeric loci. The consequences of telomere fusions in promoting genome instability have long been appreciated through the breakage–fusion–bridge (BFB) cycle mechanism, although recent studies using high-throughput sequencing technologies have uncovered evidence of involvement in a wider spectrum of genomic rearrangements including chromothripsis. A critical step in cancer progression is the transition of a clone to immortality, through the stabilisation of the telomere repeat array. This can be achieved via the reactivation of telomerase, or the induction of the alternative lengthening of telomeres (ALT) pathway. Whilst telomere dysfunction may promote genome instability and tumour progression, by limiting the replicative potential of a cell and enforcing senescence, telomere shortening can act as a tumour suppressor mechanism. However, the burden of senescent cells has also been implicated as a driver of ageing and age-related pathology, and in the promotion of cancer through inflammatory signalling. Considering the critical role of telomere length in governing cancer biology, we review questions related to the prognostic value of studying the dynamics of telomere shortening and fusion, and discuss mechanisms and consequences of telomere-induced genome rearrangements.

  11. Full-length genomic sequence of hepatitis B virus genotype C2 isolated from a native Brazilian patient

    Directory of Open Access Journals (Sweden)

    Mónica Viviana Alvarado-Mora

    2011-06-01

    Full Text Available The hepatitis B virus (HBV) is among the leading causes of chronic hepatitis, cirrhosis and hepatocellular carcinoma. In Brazil, genotype A is the most frequent, followed by genotypes D and F. Genotypes B and C are found in Brazil exclusively among Asian patients and their descendants. The aim of this study was to sequence the entire HBV genome of a Caucasian patient infected with HBV/C2 and to infer the origin of the virus based on sequencing analysis. The sequence of this Brazilian isolate was grouped with four other sequences described in China. The sequence of this patient is the first complete genome of HBV/C2 reported in Brazil.

  12. Recombination events and variability among full-length genomes of co-circulating molluscum contagiosum virus subtypes 1 and 2.

    Science.gov (United States)

    López-Bueno, Alberto; Parras-Moltó, Marcos; López-Barrantes, Olivia; Belda, Sylvia; Alejo, Alí

    2017-05-01

    Molluscum contagiosum virus (MCV) is the sole member of the Molluscipoxvirus genus and causes a highly prevalent human disease of the skin characterized by the formation of a variable number of lesions that can persist for prolonged periods of time. Two major genotypes, subtype 1 and subtype 2, are recognized, although currently only a single complete genomic sequence corresponding to MCV subtype 1 is available. Using next-generation sequencing techniques, we report the complete genomic sequence of four new MCV isolates, including the first one derived from a subtype 2. Comparisons suggest a relatively distant evolutionary split between both MCV subtypes. Further, our data illustrate concurrent circulation of distinct viruses within a population and reveal the existence of recombination events among them. These results help identify a set of MCV genes with potentially relevant roles in molluscum contagiosum epidemiology and pathogenesis.

  13. Construction and characterization of a full-length infectious cDNA clone of foot-and-mouth disease virus strain O/JPN/2010 isolated in Japan in 2010.

    Science.gov (United States)

    Nishi, Tatsuya; Onozato, Hiroyuki; Ohashi, Seiichi; Fukai, Katsuhiko; Yamada, Manabu; Morioka, Kazuki; Kanno, Toru

    2016-06-01

    A full-length infectious cDNA clone of the genome of a foot-and-mouth disease virus isolated from the 2010 epidemic in Japan was constructed and designated pSVL-f02. Transfection of Cos-7 or IBRS-2 cells with this clone allowed the recovery of infectious virus. The recovered virus had the same in vitro characterization as the parental virus with regard to antigenicity in neutralization and indirect immunofluorescence tests, plaque size and one-step growth. Pigs were experimentally infected with the parental virus or the recombinant virus recovered from pSVL-f02 transfected cells. There were no significant differences in clinical signs or antibody responses between the two groups, and virus isolation and viral RNA detection from clinical samples were similar. Virus recovered from transfected cells therefore retained the in vitro characteristics and the in vivo pathogenicity of their parental strain. This cDNA clone should be a valuable tool to analyze determinants of pathogenicity and mechanisms of virus replication, and to develop genetically engineered vaccines against foot-and-mouth disease virus. Copyright © 2016 Elsevier Ltd. All rights reserved.

  14. Full-length enriched multistage cDNA library construction covering ...

    African Journals Online (AJOL)

    DR TONUKARI NYEROVWO

    2012-04-10

    Apr 10, 2012 ... Full Length Research Paper. Full-length enriched ... complementary DNA; pfu, plaque-forming unit. ... Chinese-native tree species in Populus section Leuce ... the infected bacteria, 2 ml melted top agar was added, and the.

  15. Exact Solution of Mutator Model with Linear Fitness and Finite Genome Length

    Science.gov (United States)

    Saakian, David B.

    2017-08-01

    We considered the infinite population version of the mutator phenomenon in evolutionary dynamics, looking at the uni-directional mutations in the mutator-specific genes and linear selection. We solved exactly the model for the finite genome length case, looking at the quasispecies version of the phenomenon. We calculated the mutator probability both in the statics and dynamics. The exact solution is important for us because the mutator probability depends on the genome length in a highly non-trivial way.

  16. On the total number of genes and their length distribution in complete microbial genomes

    DEFF Research Database (Denmark)

    Skovgaard, M; Jensen, L J; Brunak, S

    2001-01-01

    In sequenced microbial genomes, some of the annotated genes are actually not protein-coding genes, but rather open reading frames that occur by chance. Therefore, the number of annotated genes is higher than the actual number of genes for most of these microbes. Comparison of the length distribution of the annotated genes with the length distribution of those matching a known protein reveals that too many short genes are annotated in many genomes. Here we estimate the true number of protein-coding genes for sequenced genomes. Although it is often claimed that Escherichia coli has about 4300 genes, we show that it probably has only approximately 3800 genes, and that a similar discrepancy exists for almost all published genomes.

  17. Genome survey sequencing and genetic background characterization of Gracilariopsis lemaneiformis (Rhodophyta) based on next-generation sequencing.

    Science.gov (United States)

    Zhou, Wei; Hu, Yiyi; Sui, Zhenghong; Fu, Feng; Wang, Jinguo; Chang, Lianpeng; Guo, Weihua; Li, Binbin

    2013-01-01

    Gracilariopsis lemaneiformis has a high economic value and is one of the most important aquaculture species in China. Despite its economic importance, it has remained largely unstudied at the genomic level. In this study, we conducted a genome survey of Gp. lemaneiformis using next-generation sequencing (NGS) technologies. In total, 18.70 Gb of high-quality sequence data with an estimated genome size of 97 Mb were obtained by HiSeq 2000 sequencing for Gp. lemaneiformis. These reads were assembled into 160,390 contigs with an N50 length of 3.64 kb, which were further assembled into 125,685 scaffolds with a total length of 81.17 Mb. Genome analysis predicted 3490 genes and a GC content of 48%. The identified genes have an average transcript length of 1,429 bp, an average coding sequence size of 1,369 bp, 1.36 exons per gene, exon length of 1,008 bp, and intron length of 191 bp. From the initial assembled scaffold, transposable elements constituted 54.64% (44.35 Mb) of the genome, and 7737 simple sequence repeats (SSRs) were identified. Among these SSRs, the trinucleotide repeat type was the most abundant (up to 73.20% of total SSRs), followed by the di- (17.41%), tetra- (5.49%), hexa- (2.90%), and penta- (1.00%) nucleotide repeat type. These characteristics suggest that Gp. lemaneiformis is a model organism for genetic study. This is the first report of genome-wide characterization within this taxon.
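
    The N50 statistic quoted for the contigs has a standard definition: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the total assembly length. A minimal Python sketch follows; the function name and the toy lengths are illustrative and are not taken from the study.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50 of a collection of contig (or scaffold) lengths."""
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)

    running = 0
    for length in ordered:
        running += length
        # Stop at the first contig where the cumulative sum reaches half the total.
        if 2 * running >= total:
            return length
    raise ValueError("no contig lengths given")


if __name__ == "__main__":
    toy_contigs = [2, 2, 2, 3, 3, 4, 8, 8]  # toy values, not real assembly data
    print(n50(toy_contigs))
```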

  18. Genome Survey Sequencing and Genetic Background Characterization of Gracilariopsis lemaneiformis (Rhodophyta) Based on Next-Generation Sequencing

    Science.gov (United States)

    Sui, Zhenghong; Fu, Feng; Wang, Jinguo; Chang, Lianpeng; Guo, Weihua; Li, Binbin

    2013-01-01

    Gracilariopsis lemaneiformis has a high economic value and is one of the most important aquaculture species in China. Despite its economic importance, it has remained largely unstudied at the genomic level. In this study, we conducted a genome survey of Gp. lemaneiformis using next-generation sequencing (NGS) technologies. In total, 18.70 Gb of high-quality sequence data with an estimated genome size of 97 Mb were obtained by HiSeq 2000 sequencing for Gp. lemaneiformis. These reads were assembled into 160,390 contigs with an N50 length of 3.64 kb, which were further assembled into 125,685 scaffolds with a total length of 81.17 Mb. Genome analysis predicted 3490 genes and a GC content of 48%. The identified genes have an average transcript length of 1,429 bp, an average coding sequence size of 1,369 bp, 1.36 exons per gene, exon length of 1,008 bp, and intron length of 191 bp. From the initial assembled scaffold, transposable elements constituted 54.64% (44.35 Mb) of the genome, and 7737 simple sequence repeats (SSRs) were identified. Among these SSRs, the trinucleotide repeat type was the most abundant (up to 73.20% of total SSRs), followed by the di- (17.41%), tetra- (5.49%), hexa- (2.90%), and penta- (1.00%) nucleotide repeat type. These characteristics suggest that Gp. lemaneiformis is a model organism for genetic study. This is the first report of genome-wide characterization within this taxon. PMID:23875008
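
    Simple sequence repeats of the kind tallied above (di- to hexanucleotide motifs) can be located with a regular-expression scan. The Python sketch below is a simplified illustration, not the tool used in the study: the minimum repeat count `min_repeats` is an assumed threshold, the function names are hypothetical, and only non-overlapping matches per motif length are reported.

```python
import re
from typing import List, Tuple


def is_primitive(motif: str) -> bool:
    """True if the motif is not itself a repetition of a shorter unit."""
    n = len(motif)
    return not any(
        n % d == 0 and motif == motif[:d] * (n // d) for d in range(1, n)
    )


def find_ssrs(seq: str, min_repeats: int = 4) -> List[Tuple[int, str, int]]:
    """Return (start, motif, repeat_count) for tandem repeats of 2-6 nt motifs.

    Assumption (illustrative only): a motif must occur at least `min_repeats`
    times in tandem to be reported.
    """
    seq = seq.upper()
    hits = []
    for k in range(2, 7):  # di- to hexanucleotide motifs
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip e.g. "ATAT" or "AAA", which belong to a shorter motif class.
            if is_primitive(motif):
                hits.append((match.start(), motif, len(match.group(0)) // k))
    return sorted(hits)


if __name__ == "__main__":
    # Toy sequences, not data from the genome survey.
    print(find_ssrs("GGATATATATATCC"))
    print(find_ssrs("CAGCAGCAGCAGT"))
```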

  19. Length and GC content variability of introns among teleostean genomes in the light of the metabolic rate hypothesis.

    Directory of Open Access Journals (Sweden)

    Ankita Chaurasia

    Full Text Available A comparative analysis of five teleostean genomes, namely zebrafish, medaka, three-spine stickleback, fugu and pufferfish was performed with the aim to highlight the nature of the forces driving both length and base composition of introns (i.e., bpi and GCi). An inter-genome approach using orthologous intronic sequences was carried out, analyzing independently both variables in pairwise comparisons. An average length shortening of introns was observed at increasing average GCi values. The result was not affected by masking transposable and repetitive elements harbored in the intronic sequences. The routine metabolic rate (mass-specific, temperature-corrected using the Boltzmann factor) was measured for each species. A significant correlation held between average differences of metabolic rate, length and GC content, while environmental temperature of fish habitat was not correlated with bpi and GCi. Analyzing the concomitant effect of both variables, i.e., bpi and GCi, at increasing genomic GC content, a decrease of bpi and an increase of GCi was observed for the significant majority of the intronic sequences (from ∼40% to ∼90% in each pairwise comparison). The opposite event, concomitant increase of bpi and decrease of GCi, was counter-selected (from <1% to ∼10% in each pairwise comparison). The results further support the hypothesis that the metabolic rate plays a key role in shaping genome architecture and evolution of vertebrate genomes.
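
    The two intron variables compared above, length (bpi) and GC content (GCi), are straightforward to compute for a pair of orthologous introns. The Python sketch below is only an illustration of that pairwise comparison; the function names are hypothetical, ambiguous bases are assumed to be excluded from the GC denominator, and the toy sequences are not from the study.

```python
from typing import Tuple


def gc_content(seq: str) -> float:
    """Fraction of G and C among unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return (seq.count("G") + seq.count("C")) / acgt


def compare_introns(intron_a: str, intron_b: str) -> Tuple[int, float]:
    """Return (length difference, GC difference) of intron_b relative to intron_a."""
    delta_bpi = len(intron_b) - len(intron_a)
    delta_gci = gc_content(intron_b) - gc_content(intron_a)
    return delta_bpi, delta_gci


if __name__ == "__main__":
    # Toy orthologous pair: the second intron is shorter and GC-richer.
    delta_bpi, delta_gci = compare_introns("ATATATGC", "GCGCAT")
    print(delta_bpi, round(delta_gci, 3))
```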

  20. Generation and analysis of a large-scale expressed sequence Tag database from a full-length enriched cDNA library of developing leaves of Gossypium hirsutum L.

    Directory of Open Access Journals (Sweden)

    Min Lin

    Full Text Available BACKGROUND: Cotton (Gossypium hirsutum L.) is one of the world's most economically important crops. However, its entire genome has not been sequenced, and limited resources are available in GenBank for understanding the molecular mechanisms underlying leaf development and senescence. METHODOLOGY/PRINCIPAL FINDINGS: In this study, 9,874 high-quality ESTs were generated from a normalized, full-length cDNA library derived from pooled RNA isolated from throughout leaf development during the plant blooming stage. After clustering and assembly of these ESTs, 5,191 unique sequences, representing 1,652 contigs and 3,539 singletons, were obtained. The average unique sequence length was 682 bp. Annotation of these unique sequences revealed that 84.4% showed significant homology to sequences in the NCBI non-redundant protein database, and 57.3% had significant hits to known proteins in the Swiss-Prot database. Comparative analysis indicated that our library added 2,400 ESTs and 991 unique sequences to those known for cotton. The unigenes were functionally characterized by gene ontology annotation. We identified 1,339 and 200 unigenes as potential leaf senescence-related genes and transcription factors, respectively. Moreover, nine genes related to leaf senescence and eleven MYB transcription factors were randomly selected for quantitative real-time PCR (qRT-PCR), which revealed that these genes were regulated differentially during senescence. The qRT-PCR for three GhYLSs revealed that these genes are expressed preferentially in senescent leaves. CONCLUSIONS/SIGNIFICANCE: These EST resources will provide valuable sequence information for gene expression profiling analyses and functional genomics studies to elucidate their roles, as well as for studying the mechanisms of leaf development and senescence in cotton and discovering candidate genes related to important agronomic traits of cotton. These data will also facilitate future whole-genome sequence

  1. Read length and repeat resolution: Exploring prokaryote genomes using next-generation sequencing technologies

    KAUST Repository

    Cahill, Matt J.

    2010-07-12

    Background: There are a growing number of next-generation sequencing technologies. At present, the most cost-effective options also produce the shortest reads. However, even for prokaryotes, there is uncertainty concerning the utility of these technologies for the de novo assembly of complete genomes. This reflects an expectation that short reads will be unable to resolve small, but presumably abundant, repeats. Methodology/Principal Findings: Using a simple model of repeat assembly, we develop and test a technique that, for any read length, can estimate the occurrence of unresolvable repeats in a genome, and thus predict the number of gaps that would need to be closed to produce a complete sequence. We apply this technique to 818 prokaryote genome sequences. This provides a quantitative assessment of the relative performance of various lengths. Notably, unpaired reads of only 150nt can reconstruct approximately 50% of the analysed genomes with fewer than 96 repeat-induced gaps. Nonetheless, there is considerable variation amongst prokaryotes. Some genomes can be assembled to near contiguity using very short reads while others require much longer reads. Conclusions: Given the diversity of prokaryote genomes, a sequencing strategy should be tailored to the organism under study. Our results will provide researchers with a practical resource to guide the selection of the appropriate read length. 2010 Cahill et al.
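
    The intuition behind the record above is that a repeat at least as long as a read cannot be spanned by that read, so it leaves an ambiguity in the assembly. The sketch below is only a toy illustration of that intuition, not the repeat model developed by Cahill et al.; the function name `repeated_kmers` and the example sequence are assumptions introduced here. It lists every substring of read length that occurs more than once in a sequence.

```python
from collections import Counter


def repeated_kmers(genome: str, read_length: int) -> dict:
    """Return substrings of length `read_length` that occur more than once.

    Hypothetical helper: a read of this length that falls entirely inside
    one of these substrings cannot say which copy it came from.
    """
    if read_length <= 0:
        raise ValueError("read_length must be positive")
    counts = Counter(
        genome[i:i + read_length]
        for i in range(len(genome) - read_length + 1)
    )
    return {kmer: n for kmer, n in counts.items() if n > 1}


# Toy sequence (not real data) containing three copies of ACGT.
toy_genome = "ACGTTTACGTAAACGT"
print(repeated_kmers(toy_genome, 4))  # the 4-mer ACGT is ambiguous
print(repeated_kmers(toy_genome, 5))  # no 5-mer is repeated
```

    In this toy case, lengthening the read from 4 to 5 bases removes the only ambiguity, which mirrors the abstract's point that the read length needed depends on the repeat content of the genome under study.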

  2. Read length and repeat resolution: Exploring prokaryote genomes using next-generation sequencing technologies

    KAUST Repository

    Cahill, Matt J.; Köser, Claudio U.; Ross, Nicholas E.; Archer, John A.C.

    2010-01-01

    Background: There are a growing number of next-generation sequencing technologies. At present, the most cost-effective options also produce the shortest reads. However, even for prokaryotes, there is uncertainty concerning the utility of these technologies for the de novo assembly of complete genomes. This reflects an expectation that short reads will be unable to resolve small, but presumably abundant, repeats. Methodology/Principal Findings: Using a simple model of repeat assembly, we develop and test a technique that, for any read length, can estimate the occurrence of unresolvable repeats in a genome, and thus predict the number of gaps that would need to be closed to produce a complete sequence. We apply this technique to 818 prokaryote genome sequences. This provides a quantitative assessment of the relative performance of various lengths. Notably, unpaired reads of only 150nt can reconstruct approximately 50% of the analysed genomes with fewer than 96 repeat-induced gaps. Nonetheless, there is considerable variation amongst prokaryotes. Some genomes can be assembled to near contiguity using very short reads while others require much longer reads. Conclusions: Given the diversity of prokaryote genomes, a sequencing strategy should be tailored to the organism under study. Our results will provide researchers with a practical resource to guide the selection of the appropriate read length. 2010 Cahill et al.

  3. Genome-wide characterization of centromeric satellites from multiple mammalian genomes.

    Science.gov (United States)

    Alkan, Can; Cardone, Maria Francesca; Catacchio, Claudia Rita; Antonacci, Francesca; O'Brien, Stephen J; Ryder, Oliver A; Purgato, Stefania; Zoli, Monica; Della Valle, Giuliano; Eichler, Evan E; Ventura, Mario

    2011-01-01

    Despite its importance in cell biology and evolution, the centromere has remained the final frontier in genome assembly and annotation due to its complex repeat structure. However, isolation and characterization of the centromeric repeats from newly sequenced species are necessary for a complete understanding of genome evolution and function. In recent years, various genomes have been sequenced, but the characterization of the corresponding centromeric DNA has lagged behind. Here, we present a computational method (RepeatNet) to systematically identify higher-order repeat structures from unassembled whole-genome shotgun sequence and test whether these sequence elements correspond to functional centromeric sequences. We analyzed genome datasets from six species of mammals representing the diversity of the mammalian lineage, namely, horse, dog, elephant, armadillo, opossum, and platypus. We define candidate monomer satellite repeats and demonstrate centromeric localization for five of the six genomes. Our analysis revealed the greatest diversity of centromeric sequences in horse and dog in contrast to elephant and armadillo, which showed high-centromeric sequence homogeneity. We could not isolate centromeric sequences within the platypus genome, suggesting that centromeres in platypus are not enriched in satellite DNA. Our method can be applied to the characterization of thousands of other vertebrate genomes anticipated for sequencing in the near future, providing an important tool for annotation of centromeres.

  4. An RNA-Seq strategy to detect the complete coding and non-coding transcriptome including full-length imprinted macro ncRNAs.

    Directory of Open Access Journals (Sweden)

    Ru Huang

    Full Text Available Imprinted macro non-protein-coding (nc) RNAs are cis-repressor transcripts that silence multiple genes in at least three imprinted gene clusters in the mouse genome. Similar macro or long ncRNAs are abundant in the mammalian genome. Here we present the full coding and non-coding transcriptome of two mouse tissues, differentiated ES cells and fetal head, using an optimized RNA-Seq strategy. The data produced are highly reproducible in different sequencing locations and are able to detect the full length of imprinted macro ncRNAs such as Airn and Kcnq1ot1, whose lengths range from 80 to 118 kb. Transcripts show a more uniform read coverage when RNA is fragmented with RNA hydrolysis compared with cDNA fragmentation by shearing. Irrespective of the fragmentation method, all coding and non-coding transcripts longer than 8 kb show a gradual loss of sequencing tags towards the 3' end. Comparisons to published RNA-Seq datasets show that the strategy presented here is more efficient in detecting known functional imprinted macro ncRNAs and also indicate that standardization of RNA preparation protocols would increase the comparability of the transcriptome between different RNA-Seq datasets.

  5. Genomic Analysis of Terpene Synthase Family and Functional Characterization of Seven Sesquiterpene Synthases from Citrus sinensis

    Directory of Open Access Journals (Sweden)

    Berta Alquézar

    2017-08-01

    Full Text Available Citrus aroma and flavor, chief traits of fruit quality, are derived from the high essential oil content of most plant tissues, including leaves, stems, flowers, and fruits. Accumulated in secretory cavities, most components of these oils are volatile terpenes. They contribute to defense against herbivores and pathogens, and perhaps also protect tissues against abiotic stress. In spite of their importance, our understanding of the physiological, biochemical, and genetic regulation of citrus terpene volatiles is still limited. The availability of the sweet orange (Citrus sinensis L. Osbeck) genome sequence allowed us to characterize for the first time the terpene synthase (TPS) family in a citrus type. CsTPS is one of the largest angiosperm TPS families characterized so far, formed by 95 loci, of which just 55 encode putative functional TPSs. All angiosperm TPS families (TPS-a, TPS-b, TPS-c, TPS-e/f, and TPS-g) were represented in the sweet orange genome, with 28, 18, 2, 2, and 5 putative full-length genes, respectively. Additionally, sweet orange β-farnesene synthase, (Z)-β-cubebene/α-copaene synthase, two β-caryophyllene synthases, and three multiproduct enzymes yielding β-cadinene/α-copaene, β-elemene, and β-cadinene/ledene/allo-aromandendrene as major products were identified and functionally characterized via in vivo recombinant Escherichia coli assays.

  6. Full-length RNA structure prediction of the HIV-1 genome reveals a conserved core domain

    DEFF Research Database (Denmark)

    Sükösd, Zsuzsanna; Andersen, Ebbe Sloth; Seemann, Ernst Stefan

    2015-01-01

    of the HIV-1 genome is highly variable in most regions, with a limited number of stable and conserved RNA secondary structures. Most interestingly, a set of long-distance interactions forms a core organizing structure (COS) that organizes the genome into three major structural domains. Despite overlapping...

  7. Genome-wide identification of the regulatory targets of a transcription factor using biochemical characterization and computational genomic analysis

    Directory of Open Access Journals (Sweden)

    Jolly Emmitt R

    2005-11-01

    Full Text Available Background: A major challenge in computational genomics is the development of methodologies that allow accurate genome-wide prediction of the regulatory targets of a transcription factor. We present a method for target identification that combines experimental characterization of binding requirements with computational genomic analysis. Results: Our method identified potential target genes of the transcription factor Ndt80, a key transcriptional regulator involved in yeast sporulation, using the combined information of binding affinity, positional distribution, and conservation of the binding sites across multiple species. We have also developed a mathematical approach to compute the false positive rate and the total number of targets in the genome based on the multiple selection criteria. Conclusion: We have shown that combining biochemical characterization and computational genomic analysis leads to accurate identification of the genome-wide targets of a transcription factor. The method can be extended to other transcription factors and can complement other genomic approaches to transcriptional regulation.

  8. On the total number of genes and their length distribution in complete microbial genomes

    DEFF Research Database (Denmark)

    Skovgaard, Marie; Jensen, L.J.; Brunak, Søren

    2001-01-01

    In sequenced microbial genomes, some of the annotated genes are actually not protein-coding genes, but rather open reading frames that occur by chance. Therefore, the number of annotated genes is higher than the actual number of genes for most of these microbes. Comparison of the length distribution of the annotated genes with the length distribution of those matching a known protein reveals that too many short genes are annotated in many genomes. Here we estimate the true number of protein-coding genes for sequenced genomes. Although it is often claimed that Escherichia coli has about 4300 genes, we show that it probably has only ∼3800 genes, and that a similar discrepancy exists for almost all published genomes.

  9. Resolving prokaryotic taxonomy without rRNA: longer oligonucleotide word lengths improve genome and metagenome taxonomic classification.

    Directory of Open Access Journals (Sweden)

    Eric B Alsop

    Full Text Available Oligonucleotide signatures, especially tetranucleotide signatures, have been used as a method for homology binning by exploiting an organism's inherent biases towards the use of specific oligonucleotide words. Tetranucleotide signatures have been especially useful in environmental metagenomics samples, as many of these samples contain organisms from poorly classified phyla which cannot be easily identified using traditional homology methods, including NCBI BLAST. This study examines oligonucleotide signatures across 1,424 completed genomes from across the tree of life, substantially expanding upon previous work. A comprehensive analysis of mononucleotide through nonanucleotide word lengths suggests that longer word lengths substantially improve the classification of DNA fragments across a range of sizes of relevance to high-throughput sequencing. We find that, at present, heptanucleotide signatures represent an optimal balance between prediction accuracy and computational time for resolving taxonomy using both genomic and metagenomic fragments. We directly compare the ability of tetranucleotide and heptanucleotide word lengths (tetranucleotide signatures are the current standard for oligonucleotide word usage analyses) for taxonomic binning of metagenome reads. We present evidence that heptanucleotide word lengths consistently provide more taxonomic resolving power, particularly in distinguishing between closely related organisms that are often present in metagenomic samples. This implies that longer oligonucleotide word lengths should replace tetranucleotide signatures for most analyses. Finally, we show that the application of longer word lengths to metagenomic datasets leads to more accurate taxonomic binning of DNA scaffolds and has the potential to substantially improve taxonomic assignment and assembly of metagenomic data.
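
    An oligonucleotide signature of word length k, as discussed in the record above, is at its simplest the vector of relative frequencies of all 4^k possible words in a sequence. The sketch below shows that basic computation under stated assumptions: the function name `word_frequencies` and the toy sequence are illustrative, only A, C, G and T are counted, and none of the normalisation or classification steps of the study are reproduced.

```python
from collections import Counter
from itertools import product


def word_frequencies(seq: str, k: int) -> dict:
    """Relative frequency of every DNA word of length k in `seq`.

    Hypothetical helper: words containing characters other than A, C, G
    or T (for example N) are ignored when computing the frequencies.
    """
    seq = seq.upper()
    words = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[w] for w in words)
    if total == 0:
        raise ValueError("sequence contains no valid words of length k")
    return {w: counts[w] / total for w in words}


# Toy sequence (not real data): dinucleotide signature with 16 entries.
signature = word_frequencies("ACGTACGT", 2)
print(len(signature), signature["AC"])
```

    The vector grows as 4^k, from 256 entries for tetranucleotides to 16,384 for heptanucleotides, which is one way to see the trade-off between resolving power and computational cost that the abstract describes.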

  10. Characterizing Phage Genomes for Therapeutic Applications

    Directory of Open Access Journals (Sweden)

    Casandra W. Philipson

    2018-04-01

    Full Text Available Multi-drug resistance is increasing at alarming rates. The efficacy of phage therapy, treating bacterial infections with bacteriophages alone or in combination with traditional antibiotics, has been demonstrated in emergency cases in the United States and in other countries, but it has yet to be approved for widespread use in the US. One limiting factor is a lack of guidelines for assessing the genomic safety of phage candidates. We present the phage characterization workflow used by our team to generate data for submitting phages to the Federal Drug Administration (FDA) for authorized use. Essential analysis checkpoints and warnings are detailed for obtaining high-quality genomes, excluding undesirable candidates, rigorously assessing a phage genome for safety and evaluating sequencing contamination. This workflow has been developed in accordance with community standards for high-throughput sequencing of viral genomes as well as principles for ideal phages used for therapy. The feasibility and utility of the pipeline are demonstrated on two new phage genomes that meet all safety criteria. We propose these guidelines as a minimum standard for phages being submitted to the FDA for review as investigational new drug candidates.

  11. Full length channel Pressure Tube sagging under completely voided full length pressure tube of an Indian PHWR

    Energy Technology Data Exchange (ETDEWEB)

    Negi, Sujay, E-mail: negi.sujay@gmail.com [Indian Institute of Technology, Roorkee 247667 (India); Kumar, Ravi, E-mail: ravikfme@gmail.com [Indian Institute of Technology, Roorkee 247667 (India); Majumdar, P., E-mail: pmajum@barc.gov.in [Bhabha Atomic Research Centre, Mumbai 400085 (India); Mukopadhyay, D., E-mail: dmukho@barc.gov.in [Bhabha Atomic Research Centre, Mumbai 400085 (India)

    2017-03-15

    Highlights: • At 16 kW/m input, thermal stability was attained at 595 °C, without PT-CT contact. • At 20 kW/m step input, PT-CT contact occurred at 637 °C near bottom-center of the tube. • PT integrity was maintained throughout the experiment. - Abstract: An experimental investigation was conducted to simulate the sagging behavior of a full length Pressure Tube of a channel of 220 MWe Indian PHWR. The investigation aimed to recreate a condition resembling Loss of Coolant Accident (LOCA) with Emergency Core Cooling System (ECCS) failure in a nuclear power plant. A full length channel assembly immersed in moderator was subjected to electrical resistance heating of Pressure Tube (PT) to simulate the residual heat after shutting down of reactor. The temperature of PT started rising and the contact between PT and CT was established at the center of the tube where average bottom temperature was 637 °C. The integrity of PT was maintained throughout the experiment and the PT heat up was arrested on contact with the CT due to transfer of heat to the moderator.

  12. Cost-effective sequencing of full-length cDNA clones powered by a de novo-reference hybrid assembly.

    Science.gov (United States)

    Kuroshu, Reginaldo M; Watanabe, Junichi; Sugano, Sumio; Morishita, Shinichi; Suzuki, Yutaka; Kasahara, Masahiro

    2010-05-07

    Sequencing full-length cDNA clones is important to determine gene structures including alternative splice forms, and provides valuable resources for experimental analyses to reveal the biological functions of coded proteins. However, previous approaches for sequencing cDNA clones were expensive or time-consuming, and therefore, a fast and efficient sequencing approach was demanded. We developed a program, MuSICA 2, that assembles millions of short (36-nucleotide) reads collected from a single flow cell lane of Illumina Genome Analyzer to shotgun-sequence approximately 800 human full-length cDNA clones. MuSICA 2 performs a hybrid assembly in which an external de novo assembler is run first and the result is then improved by reference alignment of shotgun reads. We compared the MuSICA 2 assembly with 200 pooled full-length cDNA clones finished independently by the conventional primer-walking using Sanger sequencers. The exon-intron structure of the coding sequence was correct for more than 95% of the clones with coding sequence annotation when we excluded cDNA clones insufficiently represented in the shotgun library due to PCR failure (42 out of 200 clones excluded), and the nucleotide-level accuracy of coding sequences of those correct clones was over 99.99%. We also applied MuSICA 2 to full-length cDNA clones from Toxoplasma gondii, to confirm that its ability was competent even for non-human species. The entire sequencing and shotgun assembly takes less than 1 week and the consumables cost only approximately US$3 per clone, demonstrating a significant advantage over previous approaches.

  13. Construction of a full-length infectious bacterial artificial chromosome clone of duck enteritis virus vaccine strain

    Science.gov (United States)

    2013-01-01

    Background: Duck enteritis virus (DEV) is the causative agent of duck viral enteritis, which causes an acute, contagious and lethal disease of many species of waterfowl within the order Anseriformes. In recent years, two laboratories have reported on the successful construction of DEV infectious clones in viral vectors to express exogenous genes. The clones obtained were either created with deletion of viral genes and based on highly virulent strains or were constructed using a traditional overlapping fosmid DNA system. Here, we report the construction of a full-length infectious clone of DEV vaccine strain that was cloned into a bacterial artificial chromosome (BAC). Methods: A mini-F vector as a BAC that allows the maintenance of large circular DNA in E. coli was introduced into the intergenic region between UL15B and UL18 of a DEV vaccine strain by homologous recombination in chicken embryoblasts (CEFs). Then, the full-length DEV clone pDEV-vac was obtained by electroporating circular viral replication intermediates containing the mini-F sequence into E. coli DH10B and identified by enzyme digestion and sequencing. The infectivity of the pDEV-vac was validated by DEV reconstitution from CEFs transfected with pDEV-vac. The reconstructed virus without mini-F vector sequence was also rescued by co-transfecting the Cre recombinase expression plasmid pCAGGS-NLS/Cre and pDEV-vac into CEF cultures. Finally, the in vitro growth properties and immunoprotection capacity in ducks of the reconstructed viruses were also determined and compared with the parental virus. Results: The full genome of the DEV vaccine strain was successfully cloned into the BAC, and this BAC clone was infectious. The in vitro growth properties of these reconstructions were very similar to parental DEV, and ducks immunized with these viruses acquired protection against virulent DEV challenge. Conclusions: DEV vaccine virus was cloned as an infectious bacterial artificial chromosome maintaining full-length

  14. Fiscal 2000 report on result of the full-length cDNA structure analysis; 2000 nendo kanzen cho cDNA kozo kaiseki seika hokokusho

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    2001-03-01

    This paper explains the results of research on full-length cDNA structure analysis for the period from April, 2000 to March, 2001. The outline of human genome sequence was published in June, 2000. In Japan, human gene analysis was such that, as the basic technology of the bio industry, a millennium project was decided in the budget of fiscal 2000. The full-length cDNA structure analysis is the core of the project. The libraries of cDNA were prepared using full-length and more than 4-5kbp-long cDNAs by oligo-capping method. It began from determining partial sequence data at end cDNA, and then, with new clones selected therefrom, full-length human cDNA sequence data were determined. The partial sequence data determined by fiscal 2000 were 1,035,000 clones while the full-length sequence data were 12,144 clones. The sequence data obtained were analyzed by homology search and translated into amino acid coding sequences, with predictions conducted on protein functions. A clustering method was examined that selects new clones from partial sequences. Database was constructed on gene expression profiles and disease-related gene sequence data. (NEDO)

  15. Identification, Characterization and Full-Length Sequence Analysis of a Novel Polerovirus Associated with Wheat Leaf Yellowing Disease

    Directory of Open Access Journals (Sweden)

    Peipei Zhang

    2017-09-01

    Full Text Available To identify the pathogens responsible for leaf yellowing symptoms on wheat samples collected from Jinan, China, we tested for the presence of three known barley/wheat yellow dwarf viruses (BYDV-GAV, BYDV-PAV, and WYDV-GPV; the most likely pathogens) using RT-PCR. A sample that tested negative for the three viruses was selected for small RNA sequencing. Twenty-five million sequences were generated, among which 5% were of viral origin. A novel polerovirus was discovered and temporarily named wheat leaf yellowing-associated virus (WLYaV). The full genome of WLYaV corresponds to 5,772 nucleotides (nt), with six AUG-initiated open reading frames, one non-AUG-initiated open reading frame, and three untranslated regions, showing typical features of the family Luteoviridae. Sequence comparison and phylogenetic analyses suggested that WLYaV had the closest relationship with sugarcane yellow leaf virus (ScYLV), but the identities of the full genomic nucleotide sequence and the deduced amino acid sequence of the coat protein (CP) were 64.9 and 86.2%, respectively, below the species demarcation threshold (90%) in the family Luteoviridae. Furthermore, agroinoculation of Nicotiana benthamiana leaves with a cDNA clone of WLYaV caused yellowing symptoms on the plant. Our study adds a new polerovirus that is associated with wheat leaf yellowing disease, which would help to identify and control pathogens of wheat.

  16. Full-length fuel rod behavior under severe accident conditions

    International Nuclear Information System (INIS)

    Lombardo, N.J.; Lanning, D.D.; Panisko, F.E.

    1992-12-01

    This document presents an assessment of the severe accident phenomena observed from four Full-Length High-Temperature (FLHT) tests that were performed by the Pacific Northwest Laboratory (PNL) in the National Research Universal (NRU) reactor at Chalk River, Ontario, Canada. These tests were conducted for the US Nuclear Regulatory Commission (NRC) as part of the Severe Accident Research Program. The objectives of the test were to simulate conditions and provide information on the behavior of full-length fuel rods during hypothetical, small-break, loss-of-coolant severe accidents, in commercial light water reactors

  17. Full genome analysis of enterovirus D-68 strains circulating in Alberta, Canada.

    Science.gov (United States)

    Pabbaraju, Kanti; Wong, Sallene; Drews, Steven J; Tipples, Graham; Tellier, Raymond

    2016-07-01

    A widespread outbreak of enterovirus (EV)-D68 that started in the summer of 2014 has been reported in the USA and Canada. During the course of this outbreak, EV-D68 was identified as a possible cause of acute, unexplained severe respiratory illness and a temporal association was observed between acute flaccid paralysis with anterior myelitis and EV-D68 detection in the upper respiratory tract. In this study, four nasopharyngeal samples collected from patients in Alberta, Canada with a laboratory diagnosis of EV-D68 were used to determine the near full-length genome sequence directly from the specimens. Phylogenetic analysis was performed to study the genotypes and pathogenesis of the circulating strains. Our results support the contention that mutations in the VP1 gene and other regions of the genome causing altered antigenicity, as well as lack of immunity in the younger population, may be responsible for the increased severe respiratory disease outbreaks of EV-D68 worldwide. © 2015 Wiley Periodicals, Inc.

  18. Purification and characterization of recombinant full-length and protease domain of murine MMP-9 expressed in Drosophila S2 cells

    DEFF Research Database (Denmark)

    Rasch, Morten G; Lund, Ida K; Illemann, Martin

    2010-01-01

    Matrix metalloproteinase-9 (MMP-9) is a 92-kDa soluble pro-enzyme implicated in pathological events including cancer invasion. It is therefore an attractive target for therapeutic intervention studies in mouse models. Development of inhibitors requires sufficient amounts of correctly folded murine MMP-9. Constructs encoding zymogens of full-length murine MMP-9 and a version lacking the O-glycosylated linker region and hemopexin domains were therefore generated and expressed in stably transfected Drosophila S2 insect cells. After 7 days of induction the expression levels of the full-length and truncated versions were 5 mg/l and 2 mg/l, respectively. The products were >95% pure after gelatin Sepharose chromatography and possessed proteolytic activity when analyzed by gelatin zymography. Using the purified full-length murine MMP-9 we raised polyclonal antibodies by immunizations of rabbits...

  19. Comprehensive genomic characterization of campylobacter genus reveals some underlying mechanisms for its genomic diversification.

    Directory of Open Access Journals (Sweden)

    Yizhuang Zhou

    Full Text Available Campylobacter species are phenotypically diverse in many aspects including host habitats and pathogenicities, which demands comprehensive characterization of the entire Campylobacter genus to study their underlying genetic diversification. Up to now, 34 Campylobacter strains have been sequenced and published in public databases, providing a good opportunity to systematically analyze their genomic diversities. In this study, we first conducted genomic characterization, which includes genome-wide alignments, pan-genome analysis, and phylogenetic identification, to depict the genetic diversity of the Campylobacter genus. Afterward, we improved the tetranucleotide usage pattern-based naïve Bayesian classifier to identify the abnormal composition fragments (ACFs, fragments whose tetranucleotide frequency profiles differ significantly from the genomic tetranucleotide frequency profile), including horizontal gene transfers (HGTs), to explore the mechanisms for the genetic diversity of this organism. Finally, we analyzed the HGTs transferred via bacteriophage transductions. To our knowledge, this study is the first to use single nucleotide polymorphism information to construct a reliable microevolution phylogeny of 21 Campylobacter jejuni strains. Combined with the phylogeny of all the collected Campylobacter species based on genome-wide core gene information, comprehensive phylogenetic inference of all 34 Campylobacter organisms was determined. It was found that C. jejuni harbors a high fraction of ACFs possibly through intraspecies recombination, whereas other Campylobacter members possess numerous ACFs possibly via intragenus recombination. Furthermore, some Campylobacter strains have undergone significant ancient viral integration during their evolution process. The improved method is a powerful tool for bacterial genomic analysis. Moreover, the findings would provide useful information for future research on the Campylobacter genus.

  20. Mapping and characterizing N6-methyladenine in eukaryotic genomes using single molecule real-time sequencing.

    Science.gov (United States)

    Zhu, Shijia; Beaulaurier, John; Deikus, Gintaras; Wu, Tao; Strahl, Maya; Hao, Ziyang; Luo, Guanzheng; Gregory, James A; Chess, Andrew; He, Chuan; Xiao, Andrew; Sebra, Robert; Schadt, Eric E; Fang, Gang

    2018-05-15

    N6-methyladenine (m6dA) has been discovered as a novel form of DNA methylation prevalent in eukaryotes; however, methods for high-resolution mapping of m6dA events are still lacking. Single-molecule real-time (SMRT) sequencing has enabled the detection of m6dA events at single-nucleotide resolution in prokaryotic genomes, but its application to detecting m6dA in eukaryotic genomes has not been rigorously examined. Herein, we identified unique characteristics of eukaryotic m6dA methylomes that fundamentally differ from those of prokaryotes. Based on these differences, we describe the first approach for mapping m6dA events using SMRT sequencing specifically designed for the study of eukaryotic genomes, and provide appropriate strategies for designing experiments and carrying out sequencing in future studies. We apply the novel approach to study two eukaryotic genomes. For green algae, we construct the first complete genome-wide map of m6dA at single-nucleotide and single-molecule resolution. For human lymphoblastoid cells (hLCLs), joint analyses of SMRT sequencing and independent sequencing data suggest that putative m6dA events are enriched in the promoters of young, full-length LINE-1 elements (L1s). These analyses demonstrate a general method for rigorous mapping and characterization of m6dA events in eukaryotic genomes. Published by Cold Spring Harbor Laboratory Press.

  1. Direct recovery of infectious Pestivirus from a full-length RT-PCR amplicon

    DEFF Research Database (Denmark)

    Rasmussen, Thomas Bruun; Reimann, Ilona; Hoffmann, Bernd

    2008-01-01

    This study describes the use of a novel and rapid long reverse transcription (RT)-PCR for the generation of infectious full-length cDNA of pestiviruses. To produce rescued viruses, full-length RT-PCR amplicons of 12.3 kb, including a T7 promoter, were transcribed directly in vitro, and the resulting RNA transcripts were electroporated into ovine cells. Infectious virus was obtained after one cell culture passage. The rescued viruses had a phenotype similar to the parental Border Disease virus strain. Therefore, direct generation of infectious pestiviruses from full-length RT-PCR cDNA products...

  2. Phylogenetic and genomic characterization of a novel atypical porcine pestivirus in China.

    Science.gov (United States)

    Zhang, H; Wen, W; Hao, G; Hu, Y; Chen, H; Qian, P; Li, X

    2018-02-01

    Atypical porcine pestivirus (APPV) has been considered a novel pestivirus and the causative agent of congenital tremor type A-II. An APPV strain, CH-GX2016, was characterized from newborn piglets with clinical symptoms of congenital tremor in Guangxi, China. The genome of the APPV CH-GX2016 strain was 11,475 bp in length and encoded a polyprotein of 3,635 amino acids. This genome sequence exhibited 88.0% to 90.8% nucleotide sequence homology with other APPV reference sequences in GenBank. Phylogenetic analysis further showed that APPV CH-GX2016 is a novel pestivirus compared with previously described classical pestivirus strains. Therefore, APPV is present in pigs in China. © 2017 Blackwell Verlag GmbH.

  3. Purification and characterization of recombinant full-length and protease domain of murine MMP-9 expressed in Drosophila S2 cells

    DEFF Research Database (Denmark)

    Rasch, Morten G; Lund, Ida K.; Illemann, Martin

    2010-01-01

    -length and truncated versions were 5 mg/l and 2 mg/l, respectively. The products were >95% pure after gelatin Sepharose chromatography and possessed proteolytic activity when analyzed by gelatin zymography. Using the purified full-length murine MMP-9 we raised polyclonal antibodies by immunizations of rabbits...

  4. Construction of a Full-Length Enriched cDNA Library and Preliminary Analysis of Expressed Sequence Tags from Bengal Tiger Panthera tigris tigris

    Directory of Open Access Journals (Sweden)

    Changqing Liu

    2013-05-01

    Full Text Available In this study, a full-length enriched cDNA library was successfully constructed from the Bengal tiger, Panthera tigris tigris, the most well-known wild animal. Total RNA was extracted from Bengal tiger fibroblasts cultured in vitro. The titers of the primary and amplified libraries were 1.28 × 10^6 pfu/mL and 1.56 × 10^9 pfu/mL, respectively. The percentage of recombinants in the unamplified library was 90.2% and the average length of exogenous inserts was 0.98 kb. A total of 212 individual ESTs with sizes ranging from 356 to 1108 bp were then analyzed. The BLASTX score revealed that 48.1% of the sequences were classified as a strong match, 45.3% as nominal and 6.6% as a weak match. Among the ESTs with known putative function, 26.4% of ESTs were found to be related to all kinds of metabolism, 19.3% to information storage and processing, 11.3% to posttranslational modification, protein turnover, and chaperones, 11.3% to transport, 9.9% to signal transduction/cell communication, 9.0% to structural proteins, and 3.8% to the cell cycle, and only 6.6% of ESTs were classified as novel genes. By EST sequencing, a full-length gene coding for ferritin was identified and characterized. The recombinant plasmid pET32a-TAT-Ferritin was constructed, coding for the TAT-Ferritin fusion protein with two 6× His-tags at the N and C termini. By BCA assay, the concentration of soluble Trx-TAT-Ferritin recombinant protein was 2.32 ± 0.12 mg/mL. These results demonstrated that the reliability and representativeness of the cDNA library met the requirements of a standard cDNA library. This library provides a useful platform for functional genome and transcriptome research on Bengal tigers.

  5. Genomic Relatedness of Chlamydia Isolates Determined by Amplified Fragment Length Polymorphism Analysis

    OpenAIRE

    Meijer, Adam; Morré, Servaas A.; Van Den Brule, Adriaan J. C.; Savelkoul, Paul H. M.; Ossewaarde, Jacobus M.

    1999-01-01

    The genomic relatedness of 19 Chlamydia pneumoniae isolates (17 from respiratory origin and 2 from atherosclerotic origin), 21 Chlamydia trachomatis isolates (all serovars from the human biovar, an isolate from the mouse biovar, and a porcine isolate), 6 Chlamydia psittaci isolates (5 avian isolates and 1 feline isolate), and 1 Chlamydia pecorum isolate was studied by analyzing genomic amplified fragment length polymorphism (AFLP) fingerprints. The AFLP procedure was adapted from a previously...

  6. Molecular characterization, genomic distribution and evolutionary dynamics of Short INterspersed Elements in the termite genome.

    Science.gov (United States)

    Luchetti, Andrea; Mantovani, Barbara

    2011-02-01

    Short INterspersed Elements (SINEs) in invertebrates, and especially in inbred animal genomes such as that of termites, are poorly known; in this paper we characterize three new SINE families (Talub, Taluc and Talud) through the analysis of 341 sequences, either isolated from the Reticulitermes lucifugus genome or drawn from the GenBank EST collection. We further add new data to the only isopteran element known so far, Talua. These SINEs are tRNA-derived elements, with an average length ranging from 258 to 372 bp. The tails are made up of poly(A) or microsatellite motifs. Their copy number varies from 7.9 × 10^3 to 10^5 copies, well within the range observed for other metazoan genomes. Species distribution, age and target site duplication analysis indicate Talud as the oldest, possibly inactive SINE, which originated before the onset of Isoptera (~150 Myr ago). Taluc underwent substantial sequence changes throughout the evolution of termites, and data suggest it was silenced and then re-activated in the R. lucifugus lineage. Moreover, Taluc shares a conserved sequence block with other unrelated SINEs, as observed for some vertebrate and cephalopod elements. The study of the genomic environment showed that insertions are mainly surrounded by microsatellites and other SINEs, indicating a biased accumulation within non-coding regions. The evolutionary dynamics of Talu~ elements is explained through selective mechanisms acting in an inbred genome; in this respect, the study of termite SINE activity may provide an interesting framework to address the (co)evolution of mobile elements and the host genome.

  7. The Complete Sequence of a Human Parainfluenzavirus 4 Genome

    Directory of Open Access Journals (Sweden)

    Carmen Yea

    2009-06-01

    Full Text Available Although the human parainfluenza virus 4 (HPIV4) has been known for a long time, its genome, alone among the human paramyxoviruses, has not been completely sequenced to date. In this study we obtained the first complete genomic sequence of HPIV4 from a clinical isolate named SKPIV4 obtained at the Hospital for Sick Children in Toronto (Ontario, Canada). The coding regions for the N, P/V, M, F and HN proteins show very high identities (95% to 97%) with previously available partial sequences for HPIV4B. The sequence for the L protein and the non-coding regions represent new information. A surprising feature of the genome is its length, more than 17 kb, making it the longest genome within the genus Rubulavirus, although the length is well within the known range of 15 kb to 19 kb for the subfamily Paramyxovirinae. The availability of a complete genomic sequence will facilitate investigations on a respiratory virus that is still not completely characterized.

  8. Molecular characterisation of the full-length genome of olive latent virus 1 isolated from tomato.

    Science.gov (United States)

    Hasiów-Jaroszewska, Beata; Borodynko, Natasza; Pospieszny, Henryk

    2011-05-01

    Olive latent virus 1 (OLV-1) is a species of the Necrovirus genus. So far, it has been reported to infect olive, citrus and tulip. Here, we determined and analysed the complete genomic sequence of an isolate designated CM1, which was collected from a tomato plant in the Wielkopolska region of Poland and represents the prevalent isolate of OLV-1. The CM1 genome is a monopartite, single-stranded, positive-sense RNA of 3,699 nt with five open reading frames (ORFs) and small inter-cistronic regions. ORF1 encodes a polypeptide with a molecular weight of 23 kDa, and the read-through (RT) of its amber stop codon results in ORF1 RT, which encodes the viral RNA-dependent RNA polymerase. ORF2 and ORF3 encode two peptides of 8 kDa and 6 kDa, respectively, which appear to be involved in cell-to-cell movement. ORF4 is located at the 3' terminus and encodes a 30 kDa protein identified as the viral coat protein (CP). Differences were observed in the CP region of the four OLV-1 isolates whose sequences have been deposited in GenBank. Nucleotide sequence identities of the CP of the tomato CM1 isolate with those of the olive, citrus and tulip isolates were 91.8%, 89.5% and 92.5%, respectively. In contrast to other OLV-1 isolates, CM1 induced necrotic spots on tomato plants and elicited necrotic local lesions on Nicotiana benthamiana, followed by systemic infection. This is the third complete genomic sequence of OLV-1 reported and the first one from tomato.

  9. Gene organization in rice revealed by full-length cDNA mapping and gene expression analysis through microarray.

    Directory of Open Access Journals (Sweden)

    Kouji Satoh

    Full Text Available Rice (Oryza sativa L.) is a model organism for the functional genomics of monocotyledonous plants since the genome size is considerably smaller than those of other monocotyledonous plants. Although highly accurate genome sequences of indica and japonica rice are available, additional resources such as full-length complementary DNA (FL-cDNA) sequences are also indispensable for comprehensive analyses of gene structure and function. We cross-referenced 28.5K individual loci in the rice genome defined by mapping of 578K FL-cDNA clones with the 56K loci predicted in the TIGR genome assembly. Based on the annotation status and the presence of corresponding cDNA clones, genes were classified into 23K annotated expressed (AE) genes, 33K annotated non-expressed (ANE) genes, and 5.5K non-annotated expressed (NAE) genes. We developed a 60mer oligo-array for analysis of gene expression from each locus. Analysis of gene structures and expression levels revealed that the general features of gene structure and expression of NAE and ANE genes were considerably different from those of AE genes. The results also suggested that the cloning efficiency of rice FL-cDNA is associated with the transcription activity of the corresponding genetic locus, although other factors may also have an effect. Comparison of the coverage of FL-cDNA among gene families suggested that FL-cDNA from genes encoding rice- or eukaryote-specific domains, and those involved in regulatory functions were difficult to produce in bacterial cells. Collectively, these results indicate that rice genes can be divided into distinct groups based on transcription activity and gene structure, and that the coverage bias of FL-cDNA clones exists due to the incompatibility of certain eukaryotic genes in bacteria.

  10. A comprehensive characterization of simple sequence repeats in pepper genomes provides valuable resources for marker development in Capsicum.

    Science.gov (United States)

    Cheng, Jiaowen; Zhao, Zicheng; Li, Bo; Qin, Cheng; Wu, Zhiming; Trejo-Saavedra, Diana L; Luo, Xirong; Cui, Junjie; Rivera-Bustamante, Rafael F; Li, Shuaicheng; Hu, Kailin

    2016-01-07

    The sequences of the full set of pepper genomes, including nuclear, mitochondrial and chloroplast, are now available for use. However, the overall distribution of simple sequence repeats (SSRs) in these genomes and their practical implications for molecular marker development in Capsicum have not yet been described. Here, an average of 868,047.50, 45.50 and 30.00 SSR loci were identified in the nuclear, mitochondrial and chloroplast genomes of pepper, respectively. Subsequently, systematic comparisons of various species, genome types, motif lengths, repeat numbers and classified types were executed and discussed. In addition, a local database composed of 113,500 in silico unique SSR primer pairs was built using a homemade bioinformatics workflow. As a pilot study, 65 polymorphic markers were validated among a wide collection of 21 Capsicum genotypes, with allele number and polymorphic information content value per marker ranging from 2 to 6 and 0.05 to 0.64, respectively. Finally, a comparison of the clustering results with those of a previous study indicated the usability of the newly developed SSR markers. In summary, this first report on the comprehensive characterization of SSR motifs in pepper genomes and the very large set of SSR primer pairs will benefit various genetic studies in Capsicum.
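
    Identifying SSR loci, as in the record above, amounts to scanning a sequence for short motifs repeated in tandem. The sketch below shows one common way to do this with regular expressions. It is not the authors' workflow: the function name is hypothetical, and the minimum repeat counts per motif length are assumed defaults that a real analysis would set explicitly.

```python
import re


def find_ssrs(seq, min_repeats=None):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs.

    min_repeats maps motif length to the minimum number of tandem copies.
    The defaults below are illustrative assumptions, not values from the paper.
    """
    if min_repeats is None:
        min_repeats = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}
    seq = seq.upper()
    hits = []
    for motif_len, n in sorted(min_repeats.items()):
        # A motif of motif_len bases followed by at least n - 1 more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA", "ATAT"),
            # since those loci are reported under the shorter motif length.
            if any(
                motif == motif[:d] * (motif_len // d)
                for d in range(1, motif_len)
                if motif_len % d == 0
            ):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // motif_len))
    return sorted(hits)
```

    Start positions are 0-based. Imperfect or compound repeats are not handled here; dedicated SSR tools cover those cases.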

  11. Ranking of Prokaryotic Genomes Based on Maximization of Sortedness of Gene Lengths.

    Science.gov (United States)

    Bolshoy, A; Salih, B; Cohen, I; Tatarinova, T

    How variations of gene lengths (some genes become longer than their predecessors, while other genes become shorter, and the sizes of these fractions are randomly different from organism to organism) depend on organismal evolution and adaptation is still an open question. We propose to rank the genomes according to the lengths of their genes, and then find associations between the genome rank and various properties, such as growth temperature, nucleotide composition, and pathogenicity. This approach reveals evolutionary driving factors. The main purpose of this study is to test the effectiveness and robustness of several ranking methods. The selected method of evaluation is measuring the overall sortedness of the data. We have demonstrated that all considered methods give consistent results and that Bubble Sort and Simulated Annealing achieve the highest sortedness. Also, Bubble Sort is considerably faster than the Simulated Annealing method.
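
    The record above evaluates rankings by the "overall sortedness" of the data but the abstract does not define the measure. As a hedged illustration only, the sketch below uses one simple possibility: the fraction of value pairs that appear in non-decreasing order. Both the function name and this definition are assumptions, not the authors' metric.

```python
from itertools import combinations


def sortedness(values):
    """Fraction of pairs (i < j) with values[i] <= values[j].

    1.0 means fully sorted in non-decreasing order; 0.0 means strictly reversed.
    This pairwise definition is an illustrative assumption.
    """
    pairs = list(combinations(values, 2))
    if not pairs:
        return 1.0
    concordant = sum(1 for a, b in pairs if a <= b)
    return concordant / len(pairs)
```

    Under this definition, a candidate ordering of genomes could be scored by applying the function to the lengths of one gene family read in that order, then averaging over families; how the authors aggregate across genes is not stated in the abstract.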

  12. Identification, Characterization and Full-Length Sequence Analysis of a Novel Polerovirus Associated with Wheat Leaf Yellowing Disease.

    Science.gov (United States)

    Zhang, Peipei; Liu, Yan; Liu, Wenwen; Cao, Mengji; Massart, Sebastien; Wang, Xifeng

    2017-01-01

    To identify the pathogens responsible for leaf yellowing symptoms on wheat samples collected from Jinan, China, we tested for the presence of the three most likely pathogens, the known barley/wheat yellow dwarf viruses (BYDV-GAV, BYDV-PAV, WYDV-GPV), using RT-PCR. A sample that tested negative for the three viruses was selected for small RNA sequencing. Twenty-five million sequences were generated, among which 5% were of viral origin. A novel polerovirus was discovered and temporarily named wheat leaf yellowing-associated virus (WLYaV). The full genome of WLYaV corresponds to 5,772 nucleotides (nt), with six AUG-initiated open reading frames, one non-AUG-initiated open reading frame, and three untranslated regions, showing typical features of the family Luteoviridae. Sequence comparison and phylogenetic analyses suggested that WLYaV had the closest relationship with sugarcane yellow leaf virus (ScYLV), but the identities of the full genomic nucleotide sequence and the deduced amino acid sequence of the coat protein (CP) were 64.9% and 86.2%, respectively, below the species demarcation threshold (90%) in the family Luteoviridae. Furthermore, agroinoculation of Nicotiana benthamiana leaves with a cDNA clone of WLYaV caused yellowing symptoms on the plant. Our study adds a new polerovirus that is associated with wheat leaf yellowing disease, which would help to identify and control pathogens of wheat.

  13. Methylation-Sensitive Amplification Length Polymorphism (MS-AFLP) Microarrays for Epigenetic Analysis of Human Genomes.

    Science.gov (United States)

    Alonso, Sergio; Suzuki, Koichi; Yamamoto, Fumiichiro; Perucho, Manuel

    2018-01-01

    Somatic, and on a minor scale also germ line, epigenetic aberrations are fundamental to carcinogenesis, cancer progression, and tumor phenotype. DNA methylation is the most extensively studied and arguably the best understood of the epigenetic mechanisms that become altered in cancer. Both somatic loss of methylation (hypomethylation) and gain of methylation (hypermethylation) are found in the genome of malignant cells. In general, the cancer cell epigenome is globally hypomethylated, while some regions, typically gene-associated CpG islands, become hypermethylated. Given the profound impact that DNA methylation exerts on the transcriptional profile and genomic stability of cancer cells, its characterization is essential to fully understand the complexity of cancer biology, improve tumor classification, and ultimately advance cancer patient management and treatment. A plethora of methods have been devised to analyze and quantify DNA methylation alterations. Several of the early-developed methods relied on the use of methylation-sensitive restriction enzymes, whose activity depends on the methylation status of their recognition sequences. Among these techniques, methylation-sensitive amplification length polymorphism (MS-AFLP) was developed in the early 2000s, and successfully adapted from its original gel electrophoresis fingerprinting format to a microarray format that notably increased its throughput and allowed the quantification of the methylation changes. This array-based platform interrogates over 9500 independent loci putatively amplified by the MS-AFLP technique, corresponding to the NotI sites mapped throughout the human genome.

  14. Technology development for gene discovery and full-length sequencing

    Energy Technology Data Exchange (ETDEWEB)

    Marcelo Bento Soares

    2004-07-19

    In previous years, with support from the U.S. Department of Energy, we developed methods for construction of normalized and subtracted cDNA libraries, and constructed hundreds of high-quality libraries for production of Expressed Sequence Tags (ESTs). Our clones were made widely available to the scientific community through the IMAGE Consortium, and millions of ESTs were produced from our libraries either by collaborators or by our own sequencing laboratory at the University of Iowa. During this grant period, we focused on (1) the development of a method for preferential cloning of tissue-specific and/or rare transcripts, (2) its utilization to expedite EST-based gene discovery for the NIH Mouse Brain Molecular Anatomy Project, (3) further development and optimization of a method for construction of full-length-enriched cDNA libraries, and (4) modification of a plasmid vector to maximize efficiency of full-length cDNA sequencing by the transposon-mediated approach. It is noteworthy that the technology developed for preferential cloning of rare mRNAs enabled identification of over 2,000 mouse transcripts differentially expressed in the hippocampus. In addition, the method that we optimized for construction of full-length-enriched cDNA libraries was successfully utilized for the production of approximately fifty libraries from the developing mouse nervous system, from which over 2,500 full-ORF-containing cDNAs have been identified and accurately sequenced in their entirety either by our group or by the NIH-Mammalian Gene Collection Program Sequencing Team.

  15. Genome mapping and characterization of the Anopheles gambiae heterochromatin

    Directory of Open Access Journals (Sweden)

    Sharakhova Maria V

    2010-08-01

    Full Text Available Background: Heterochromatin plays an important role in chromosome function and gene regulation. Despite the availability of polytene chromosomes and genome sequence, the heterochromatin of the major malaria vector Anopheles gambiae has not been mapped and characterized. Results: To determine the extent of heterochromatin within the An. gambiae genome, genes were physically mapped to the euchromatin-heterochromatin transition zone of polytene chromosomes. The study found that a minimum of 232 genes reside in 16.6 Mb of mapped heterochromatin. Gene ontology analysis revealed that heterochromatin is enriched in genes with DNA-binding and regulatory activities. Immunostaining of the An. gambiae chromosomes with antibodies against Drosophila melanogaster heterochromatin protein 1 (HP1) and the nuclear envelope protein lamin Dm0 identified the major invariable sites of the proteins' localization in all regions of pericentric heterochromatin, diffuse intercalary heterochromatin, and euchromatic region 9C of the 2R arm, but not in the compact intercalary heterochromatin. To better understand the molecular differences among chromatin types, novel Bayesian statistical models were developed to analyze genome features. The study found that heterochromatin and euchromatin differ in gene density and the coverage of retroelements and segmental duplications. The pericentric heterochromatin had the highest coverage of retroelements and tandem repeats, while intercalary heterochromatin was enriched with segmental duplications. We also provide evidence that the diffuse intercalary heterochromatin has a higher coverage of DNA transposable elements, minisatellites, and satellites than does the compact intercalary heterochromatin. The investigation of the 42-Mb assembly of unmapped genomic scaffolds showed that it has molecular characteristics similar to cytologically mapped heterochromatin. Conclusions: Our results demonstrate that Anopheles polytene chromosomes

  16. Full mitochondrial genome sequences of two endemic Philippine hornbill species (Aves: Bucerotidae) provide evidence for pervasive mitochondrial DNA recombination

    Directory of Open Access Journals (Sweden)

    Bleidorn Christoph

    2011-01-01

    Full Text Available Background: Although nowadays it is broadly accepted that mitochondrial DNA (mtDNA) may undergo recombination, the frequency of such recombination remains controversial. Its estimation is not straightforward, as recombination under homoplasmy (i.e., among identical mt genomes) is likely to be overlooked. In species with tandem duplications of large mtDNA fragments the detection of recombination can be facilitated, as it can lead to gene conversion among duplicates. Although the mechanisms for concerted evolution in mtDNA are not fully understood yet, recombination rates have been estimated from "one per speciation event" down to 850 years or even "during every replication cycle". Results: Here we present the first complete mt genomes of the avian family Bucerotidae, i.e., those of two Philippine hornbills, Aceros waldeni and Penelopides panini. The mt genomes are characterized by a tandemly duplicated region encompassing part of cytochrome b, 3 tRNAs, NADH6, and the control region. The duplicated fragments are identical to each other except for a short section in domain I and for the length of repeat motifs in domain III of the control region. Due to the heteroplasmy with regard to the number of these repeat motifs, there is some size variation in both genomes; with around 21,657 bp (A. waldeni) and 22,737 bp (P. panini), they significantly exceed the hitherto longest known avian mt genomes, those of the albatrosses. We discovered concerted evolution between the duplicated fragments within individuals. The existence of differences between individuals in coding genes as well as in the control region, which are maintained between duplicates, indicates that recombination apparently occurs frequently, i.e., in every generation. Conclusions: The homogenised duplicates are interspersed by a short fragment which shows no sign of recombination. We hypothesize that this region corresponds to the so-called Replication Fork Barrier (RFB), which has been

  17. Construction of a Full-Length Enriched cDNA Library and Preliminary Analysis of Expressed Sequence Tags from Bengal Tiger Panthera tigris tigris

    Science.gov (United States)

    Liu, Changqing; Liu, Dan; Guo, Yu; Lu, Taofeng; Li, Xiangchen; Zhang, Minghai; Ma, Jianzhang; Ma, Yuehui; Guan, Weijun

    2013-01-01

    In this study, a full-length enriched cDNA library was successfully constructed from the Bengal tiger, Panthera tigris tigris, the most well-known wild animal. Total RNA was extracted from Bengal tiger fibroblasts cultured in vitro. The titers of the primary and amplified libraries were 1.28 × 10^6 pfu/mL and 1.56 × 10^9 pfu/mL, respectively. The percentage of recombinants in the unamplified library was 90.2% and the average length of exogenous inserts was 0.98 kb. A total of 212 individual ESTs with sizes ranging from 356 to 1108 bp were then analyzed. The BLASTX score revealed that 48.1% of the sequences were classified as a strong match, 45.3% as nominal and 6.6% as a weak match. Among the ESTs with known putative function, 26.4% of ESTs were found to be related to all kinds of metabolism, 19.3% to information storage and processing, 11.3% to posttranslational modification, protein turnover, and chaperones, 11.3% to transport, 9.9% to signal transduction/cell communication, 9.0% to structural proteins, and 3.8% to the cell cycle, and only 6.6% of ESTs were classified as novel genes. By EST sequencing, a full-length gene coding for ferritin was identified and characterized. The recombinant plasmid pET32a-TAT-Ferritin was constructed, coding for the TAT-Ferritin fusion protein with two 6× His-tags at the N and C termini. By BCA assay, the concentration of soluble Trx-TAT-Ferritin recombinant protein was 2.32 ± 0.12 mg/mL. These results demonstrated that the reliability and representativeness of the cDNA library met the requirements of a standard cDNA library. This library provides a useful platform for functional genome and transcriptome research on Bengal tigers. PMID:23708105

  18. Advantages of genome sequencing by long-read sequencer using SMRT technology in medical area.

    Science.gov (United States)

    Nakano, Kazuma; Shiroma, Akino; Shimoji, Makiko; Tamotsu, Hinako; Ashimine, Noriko; Ohki, Shun; Shinzato, Misuzu; Minami, Maiko; Nakanishi, Tetsuhiro; Teruya, Kuniko; Satou, Kazuhito; Hirano, Takashi

    2017-07-01

    PacBio RS II is the first commercialized third-generation DNA sequencer able to sequence a single molecule DNA in real-time without amplification. PacBio RS II's sequencing technology is novel and unique, enabling the direct observation of DNA synthesis by DNA polymerase. PacBio RS II confers four major advantages compared to other sequencing technologies: long read lengths, high consensus accuracy, a low degree of bias, and simultaneous capability of epigenetic characterization. These advantages surmount the obstacle of sequencing genomic regions such as high/low G+C, tandem repeat, and interspersed repeat regions. Moreover, PacBio RS II is ideal for whole genome sequencing, targeted sequencing, complex population analysis, RNA sequencing, and epigenetics characterization. With PacBio RS II, we have sequenced and analyzed the genomes of many species, from viruses to humans. Herein, we summarize and review some of our key genome sequencing projects, including full-length viral sequencing, complete bacterial genome and almost-complete plant genome assemblies, and long amplicon sequencing of a disease-associated gene region. We believe that PacBio RS II is not only an effective tool for use in the basic biological sciences but also in the medical/clinical setting.

  19. Characterization of mango (Mangifera indica L.) transcriptome and chloroplast genome.

    Science.gov (United States)

    Azim, M Kamran; Khan, Ishtaiq A; Zhang, Yong

    2014-05-01

    We characterized the mango leaf transcriptome and chloroplast genome using next generation DNA sequencing. The RNA-seq output of the mango transcriptome generated >12 million reads (total nucleotides sequenced >1 Gb). De novo transcriptome assembly generated 30,509 unigenes with lengths in the range of 300 to ≥3,000 nt and 67× depth of coverage. Blast searching against nonredundant nucleotide databases and several Viridiplantae genomic datasets annotated 24,593 mango unigenes (80% of total) and identified Citrus sinensis as the closest neighbor of mango with 9,141 (37%) matched sequences. The annotation with gene ontology and Clusters of Orthologous Group terms categorized unigene sequences into 57 and 25 classes, respectively. More than 13,500 unigenes were assigned to 293 KEGG pathways. Besides major plant biology related pathways, KEGG based gene annotation pointed out the active presence of an array of biochemical pathways involved in (a) biosynthesis of bioactive flavonoids, flavones and flavonols, (b) biosynthesis of terpenoids and lignins and (c) plant hormone signal transduction. The mango transcriptome sequences revealed 235 proteases belonging to five catalytic classes of proteolytic enzymes. The draft genome of the mango chloroplast (cp) was obtained by a combination of Sanger and next generation sequencing. The draft mango cp genome size is 151,173 bp, with a pair of inverted repeats of 27,093 bp separated by small and large single copy regions. Out of 139 genes in the mango cp genome, 91 were found to be protein coding. Sequence analysis revealed the cp genome of C. sinensis as the closest neighbor of mango. We found 51 short repeats in the mango cp genome that are supposed to be associated with extensive rearrangements. This is the first report of transcriptome and chloroplast genome analysis of any Anacardiaceae family member.

  20. Full-length Ebola glycoprotein accumulates in the endoplasmic reticulum

    Directory of Open Access Journals (Sweden)

    Bhattacharyya Suchita

    2011-01-01

    Full Text Available The Filoviridae family comprises Ebola and Marburg viruses, which are known to cause lethal hemorrhagic fever. However, there is no effective anti-viral therapy or licensed vaccines currently available for these human pathogens. The envelope glycoprotein (GP) of Ebola virus, which mediates entry into target cells, is cytotoxic and this effect maps to a highly glycosylated mucin-like region in the surface subunit of GP (GP1). However, the mechanism underlying this cytotoxic property of GP is unknown. To gain insight into the basis of this GP-induced cytotoxicity, HEK293T cells were transiently transfected with full-length and mucin-deleted (Δmucin) Ebola GP plasmids and GP localization was examined relative to the nucleus, endoplasmic reticulum (ER), Golgi, and early and late endosomes using deconvolution fluorescent microscopy. Full-length Ebola GP was observed to accumulate in the ER. In contrast, GPΔmucin was uniformly expressed throughout the cell and did not localize in the ER. The Ebola major matrix protein VP40 was also co-expressed with GP to investigate its influence on GP localization. GP and VP40 co-expression did not alter GP localization to the ER. Also, when VP40 was co-expressed with the nucleoprotein (NP), it localized to the plasma membrane while NP accumulated in distinct cytoplasmic structures lined with vimentin. These latter structures are consistent with aggresomes and may serve as assembly sites for filoviral nucleocapsids. Collectively, these data suggest that full-length GP, but not GPΔmucin, accumulates in the ER in close proximity to the nuclear membrane, which may underscore its cytotoxic property.

  1. Isolation and characterization of full-length cDNA clones coding for cholinesterase from fetal human tissues

    International Nuclear Information System (INIS)

    Prody, C.A.; Zevin-Sonkin, D.; Gnatt, A.; Goldberg, O.; Soreq, H.

    1987-01-01

    To study the primary structure and regulation of human cholinesterases, oligodeoxynucleotide probes were prepared according to a consensus peptide sequence present in the active site of both human serum pseudocholinesterase and Torpedo electric organ true acetylcholinesterase. Using these probes, the authors isolated several cDNA clones from λgt10 libraries of fetal brain and liver origins. These include 2.4-kilobase cDNA clones that code for a polypeptide containing a putative signal peptide and the N-terminal, active site, and C-terminal peptides of human BtChoEase, suggesting that they code either for BtChoEase itself or for a very similar but distinct fetal form of cholinesterase. In RNA blots of poly(A)+ RNA from the cholinesterase-producing fetal brain and liver, these cDNAs hybridized with a single 2.5-kilobase band. Blot hybridization to human genomic DNA revealed that these fetal BtChoEase cDNA clones hybridize with DNA fragments with a total length of 17.5 kilobases, and signal intensities indicated that these sequences are not present in many copies. Both the cDNA-encoded protein and its nucleotide sequence display striking homology to parallel sequences published for Torpedo AcChoEase. These findings demonstrate extensive homologies between the fetal BtChoEase encoded by these clones and other cholinesterases of various forms and species.

  2. Whole genome characterization of non-tissue culture adapted HRSV strains in severely infected children

    Directory of Open Access Journals (Sweden)

    Kumaria Rajni

    2011-07-01

    Background: Human respiratory syncytial virus (HRSV) is the most important virus causing lower respiratory infection in young children. The complete genetic characterization of RSV clinical strains is a prerequisite for understanding HRSV infection in the clinical context. Current information about the genetic structure of the HRSV genome has largely been obtained using tissue culture adapted viruses. During tissue culture adaptation, genetic changes can be introduced into the virus genome, which may obscure subtle variations in the genetic structure of different RSV strains. Methods: In this study we describe a novel Sanger sequencing strategy which allowed the complete genetic characterisation of 14 clinical HRSV strains. The viruses were sequenced directly from the nasal washes of severely ill hospitalized children, without prior passage of the viruses in tissue culture. Results: The analysis of nucleotide sequences suggested that vRNA length is a variable factor among primary strains, while the phylogenetic analysis suggests selective pressure for change. The G gene showed the greatest sequence variation (2-6.4%), while the small hydrophobic protein and matrix genes were completely conserved across all clinical strains studied. A number of sequence changes in the F, L, M2-1 and M2-2 genes were observed that have not been described in laboratory isolates. The gene junction regions showed more sequence variability, and in particular the intergenic regions showed the highest level of sequence variation. Although the clinical strains grew more slowly than the HRSVA2 virus isolate in tissue culture, the HRSVA2 isolate and clinical strains formed similar virus structures, such as virus filaments and inclusion bodies, in infected cells, supporting the clinical relevance of these virus structures. Conclusion: This is the first report to describe the complete genetic characterization of HRSV clinical strains that have been sequenced directly from clinical

  3. Pre-Steady-State Kinetic Analysis of Truncated and Full-Length Saccharomyces cerevisiae DNA Polymerase Eta

    Directory of Open Access Journals (Sweden)

    Jessica A. Brown

    2010-01-01

    Understanding polymerase fidelity is an important objective towards ascertaining the overall stability of an organism's genome. Saccharomyces cerevisiae DNA polymerase η (yPolη), a Y-family DNA polymerase, is known to efficiently bypass DNA lesions (e.g., pyrimidine dimers) in vivo. Using pre-steady-state kinetic methods, we examined both full-length and a truncated version of yPolη which contains only the polymerase domain. In the absence of yPolη's C-terminal residues 514–632, the DNA binding affinity was weakened by 2-fold and the base substitution fidelity dropped by 3-fold. Thus, the C-terminus of yPolη may interact with DNA and slightly alter the conformation of the polymerase domain during catalysis. In general, yPolη discriminated between a correct and incorrect nucleotide more during the incorporation step (50-fold on average) than the ground-state binding step (18-fold on average). Blunt-end additions of dATP or pyrene nucleotide 5′-triphosphate revealed the importance of base stacking during the binding of incorrect incoming nucleotides.

  4. Genome-wide identification and characterization of long intergenic non-coding RNAs in Ganoderma lucidum.

    Directory of Open Access Journals (Sweden)

    Jianqin Li

    Ganoderma lucidum is a white-rot fungus best known for its medicinal activities. We have previously sequenced its genome and annotated the protein coding genes. However, long non-coding RNAs in the G. lucidum genome have not been analyzed. In this study, we have systematically identified and characterized long intergenic non-coding RNAs (lincRNAs) in G. lucidum. We developed a computational pipeline, which was used to analyze RNA-Seq data derived from G. lucidum samples collected from three developmental stages. A total of 402 lincRNA candidates were identified, with an average length of 609 bp. Analysis of their adjacent protein-coding genes (apcGenes) revealed that 46 apcGenes belong to the pathways of triterpenoid biosynthesis and lignin degradation, or to families of cytochrome P450, mating type B genes, and carbohydrate-active enzymes. To determine if lincRNAs and these apcGenes have any interactions, the corresponding pairs of lincRNAs and apcGenes were analyzed in detail. We developed a modified 3' RACE method to analyze the transcriptional direction of a transcript. Among the 46 lincRNAs, 37 were found to be unidirectionally transcribed, and 9 were found to be bidirectionally transcribed. The expression profiles of 16 of these 37 lincRNAs were found to be highly correlated with those of the apcGenes across the three developmental stages. Among them, 11 are positively correlated (r>0.8) and 5 are negatively correlated (r<-0.8). The co-localization and co-expression of lincRNAs and those apcGenes playing important functions is consistent with the notion that lincRNAs might be important regulators of cellular processes. In summary, this represents the very first study to identify and characterize lincRNAs in the genomes of basidiomycetes. The results obtained here have laid the foundation for the study of potential lincRNA-mediated expression regulation of genes in G. lucidum.
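
    The correlation cut-offs quoted in this abstract (r>0.8 and r<-0.8) can be illustrated with a minimal sketch. It assumes that r is a Pearson coefficient computed over one expression value per developmental stage; the abstract does not name the statistic, and the function names and expression values below are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length expression profiles."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined for a constant profile")
    return sxy / sqrt(sxx * syy)


def classify_pair(linc_expr, gene_expr, cutoff=0.8):
    """Label a lincRNA/apcGene pair using the abstract's |r| > 0.8 cut-off."""
    r = pearson_r(linc_expr, gene_expr)
    if r > cutoff:
        return "positive"
    if r < -cutoff:
        return "negative"
    return "uncorrelated"


# Hypothetical expression values for three developmental stages.
print(classify_pair([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
print(classify_pair([1.0, 2.0, 3.0], [6.0, 4.0, 2.0]))
```

    With only three stages per profile, a high |r| is easy to reach by chance, so a sketch like this is best read as a screening step rather than as evidence of regulation.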

  5. Genomic characterization of large heterochromatic gaps in the human genome assembly.

    Directory of Open Access Journals (Sweden)

    Nicolas Altemose

    2014-05-01

    The largest gaps in the human genome assembly correspond to multi-megabase heterochromatic regions composed primarily of two related families of tandem repeats, Human Satellites 2 and 3 (HSat2,3). The abundance of repetitive DNA in these regions challenges standard mapping and assembly algorithms, and as a result, the sequence composition and potential biological functions of these regions remain largely unexplored. Furthermore, existing genomic tools designed to predict consensus-based descriptions of repeat families cannot be readily applied to complex satellite repeats such as HSat2,3, which lack a consistent repeat unit reference sequence. Here we present an alignment-free method to characterize complex satellites using whole-genome shotgun read datasets. Utilizing this approach, we classify HSat2,3 sequences into fourteen subfamilies and predict their chromosomal distributions, resulting in a comprehensive satellite reference database to further enable genomic studies of heterochromatic regions. We also identify 1.3 Mb of non-repetitive sequence interspersed with HSat2,3 across 17 unmapped assembly scaffolds, including eight annotated gene predictions. Finally, we apply our satellite reference database to high-throughput sequence data from 396 males to estimate array size variation of the predominant HSat3 array on the Y chromosome, confirming that satellite array sizes can vary between individuals over an order of magnitude (7 to 98 Mb) and further demonstrating that array sizes are distributed differently within distinct Y haplogroups. In summary, we present a novel framework for generating initial reference databases for unassembled genomic regions enriched with complex satellite DNA, and we further demonstrate the utility of these reference databases for studying patterns of sequence variation within human populations.

  6. Comparative analysis of full genomic sequences among different genotypes of dengue virus type 3

    Directory of Open Access Journals (Sweden)

    Lin Ting-Hsiang

    2008-05-01

    Background: Although a previous study demonstrated that the envelope protein of dengue viruses is under purifying selection pressure, little is known about the genetic differences among full-length viral genomes of DENV-3. In our study, complete genomic sequences of DENV-3 strains collected from different geographical locations and isolation years were determined, and the sequence diversity as well as selection pressure sites in the DENV genome other than within the E gene were also analyzed. Results: Using maximum likelihood and Bayesian approaches, our phylogenetic analysis revealed that Taiwan's indigenous DENV-3 isolates from the 1994 and 1998 dengue/DHF epidemics and one 1999 sporadic case belonged to three different genotypes (I, II, and III), each associated with DENV-3 circulating in Indonesia, Thailand and Sri Lanka, respectively. Sequence diversity and selection pressure of different genomic regions among different DENV-3 genotypes were further examined to understand global DENV-3 evolution. The highest nucleotide sequence diversity among the fully sequenced DENV-3 strains was found in the nonstructural protein 2A (mean ± SD: 5.84 ± 0.54) and envelope protein gene regions (mean ± SD: 5.04 ± 0.32). Further analysis found that positive selection pressure on DENV-3 may occur in the non-structural protein 1 gene region, and the positive selection site was detected at position 178 of the NS1 gene. Conclusion: Our study confirmed that the envelope protein is under purifying selection pressure although it presented higher sequence diversity. The detection of positive selection pressure in the non-structural protein along genotype II indicated that DENV-3 originating from Southeast Asia needs to be monitored for the emergence of strains with epidemic potential, for better epidemic prevention and vaccine development.

  7. A comparative phylogenetic analysis of full-length mariner elements

    Indian Academy of Sciences (India)

    Mariner like elements (MLEs) are widely distributed type II transposons with an open reading frame (ORF) for transposase. We studied comparative phylogenetic evolution and inverted terminal repeat (ITR) conservation of MLEs from Indian saturniid silkmoth, Antheraea mylitta with other full length MLEs submitted in the ...

  8. Length and GC content variability of introns among teleostean genomes in the light of the metabolic rate hypothesis.

    Science.gov (United States)

    Chaurasia, Ankita; Tarallo, Andrea; Bernà, Luisa; Yagi, Mitsuharu; Agnisola, Claudio; D'Onofrio, Giuseppe

    2014-01-01

    A comparative analysis of five teleostean genomes, namely zebrafish, medaka, three-spine stickleback, fugu and pufferfish was performed with the aim to highlight the nature of the forces driving both length and base composition of introns (i.e., bpi and GCi). An inter-genome approach using orthologous intronic sequences was carried out, analyzing independently both variables in pairwise comparisons. An average length shortening of introns was observed at increasing average GCi values. The result was not affected by masking transposable and repetitive elements harbored in the intronic sequences. The routine metabolic rate (mass specific temperature-corrected using the Boltzmann's factor) was measured for each species. A significant correlation held between average differences of metabolic rate, length and GC content, while environmental temperature of fish habitat was not correlated with bpi and GCi. Analyzing the concomitant effect of both variables, i.e., bpi and GCi, at increasing genomic GC content, a decrease of bpi and an increase of GCi was observed for the significant majority of the intronic sequences (from ∼ 40% to ∼ 90%, in each pairwise comparison). The opposite event, concomitant increase of bpi and decrease of GCi, was counter selected (from hypothesis that the metabolic rate plays a key role in shaping genome architecture and evolution of vertebrate genomes.
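
    The two intron variables named in this abstract, length (bpi) and GC content (GCi), and the concomitant change it describes can be sketched as follows. This is a minimal illustration under stated assumptions: GCi is taken as the fraction of G and C among unambiguous bases, bpi as the raw sequence length, and the function names and toy sequences are hypothetical rather than taken from the study.

```python
def gc_fraction(seq):
    """GC fraction (GCi) of a sequence, ignoring ambiguous bases such as N."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        raise ValueError("sequence has no unambiguous bases")
    return (seq.count("G") + seq.count("C")) / acgt


def intron_profile(seq):
    """Return (bpi, GCi): intron length in bp (all characters) and GC fraction."""
    return len(seq), gc_fraction(seq)


def concomitant_change(intron_a, intron_b):
    """Compare an orthologous intron pair, going from genome A to genome B."""
    bpi_a, gci_a = intron_profile(intron_a)
    bpi_b, gci_b = intron_profile(intron_b)
    if bpi_b < bpi_a and gci_b > gci_a:
        return "shorter_and_GC_richer"
    if bpi_b > bpi_a and gci_b < gci_a:
        return "longer_and_GC_poorer"
    return "other"


# Toy orthologous pair: a long AT-rich intron and a short GC-rich one.
print(concomitant_change("ATATATATAT", "GCGCAT"))
```

    Counting how often each label occurs across all orthologous introns of a genome pair would give the kind of proportion the abstract reports for each pairwise comparison.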

  9. An integrated PCR colony hybridization approach to screen cDNA libraries for full-length coding sequences.

    Science.gov (United States)

    Pollier, Jacob; González-Guzmán, Miguel; Ardiles-Diaz, Wilson; Geelen, Danny; Goossens, Alain

    2011-01-01

    cDNA-Amplified Fragment Length Polymorphism (cDNA-AFLP) is a commonly used technique for genome-wide expression analysis that does not require prior sequence knowledge. Typically, quantitative expression data and sequence information are obtained for a large number of differentially expressed gene tags. However, most of the gene tags do not correspond to full-length (FL) coding sequences, which is a prerequisite for subsequent functional analysis. A medium-throughput screening strategy, based on integration of polymerase chain reaction (PCR) and colony hybridization, was developed that allows in parallel screening of a cDNA library for FL clones corresponding to incomplete cDNAs. The method was applied to screen for the FL open reading frames of a selection of 163 cDNA-AFLP tags from three different medicinal plants, leading to the identification of 109 (67%) FL clones. Furthermore, the protocol allows for the use of multiple probes in a single hybridization event, thus significantly increasing the throughput when screening for rare transcripts. The presented strategy offers an efficient method for the conversion of incomplete expressed sequence tags (ESTs), such as cDNA-AFLP tags, to FL-coding sequences.

  10. Detection and full genome characterization of two beta CoV viruses related to Middle East respiratory syndrome from bats in Italy.

    Science.gov (United States)

    Moreno, Ana; Lelli, Davide; de Sabato, Luca; Zaccaria, Guendalina; Boni, Arianna; Sozzi, Enrica; Prosperi, Alice; Lavazza, Antonio; Cella, Eleonora; Castrucci, Maria Rita; Ciccozzi, Massimo; Vaccari, Gabriele

    2017-12-19

    Middle East respiratory syndrome coronavirus (MERS-CoV), which belongs to the beta group of coronaviruses, can infect multiple host species and causes severe disease in humans. Multiple surveillance and phylogenetic studies suggest a bat origin. In this study, we describe the detection and full genome characterization of two CoVs closely related to MERS-CoV from two Italian bats, Pipistrellus kuhlii and Hypsugo savii. Pools of viscera were tested by a pan-coronavirus RT-PCR. Virus isolation was attempted by inoculation in different cell lines. Full genome sequencing was performed using the Ion Torrent platform and phylogenetic trees were constructed using IQtree software. Similarity plots of CoV clade c genomes were generated by using SSE v1.2. The three dimensional macromolecular structure (3DMMS) of the receptor binding domain (RBD) in the S protein was predicted by a sequence-homology method using the protein data bank (PDB). Both samples tested positive by the pan-coronavirus RT-PCR (IT-batCoVs) and their genome organization showed a pattern identical to that of MERS-CoV. Phylogenetic analysis showed a monophyletic group placed in the Beta2c clade formed by MERS-CoV sequences originating from humans and camels and bat-related sequences from Africa, Italy and China. The comparison of the secondary structure and 3DMMS of the RBD of IT-batCoVs with MERS, HKU4 and HKU5 bat sequences showed two aa deletions located in a region corresponding to the external subdomain of MERS-RBD in IT-batCoV and HKU5 RBDs. This study reported two beta CoVs closely related to MERS that were obtained from two bats belonging to two commonly recorded species in Italy (P. kuhlii and H. savii). The analysis of the RBD showed a similar structure in IT-batCoVs and HKU5 with respect to HKU4 sequences. Since the RBD of HKU4 but not HKU5 can bind to the human DPP4 receptor for MERS-CoV, it is possible to suggest the absence of DPP4-binding potential for IT-batCoVs as well. More surveillance studies are needed to better

  11. Irradiation performance of full-length metallic IFR fuels

    International Nuclear Information System (INIS)

    Tsai, H.; Neimark, L.A.

    1992-07-01

    An assembly irradiation of 169 full-length U-Pu-Zr metallic fuel pins was successfully completed in FFTF to a goal burnup of 10 at.%. All test fuel pins maintained their cladding integrity during the irradiation. Postirradiation examination showed minimal fuel/cladding mechanical interaction and excellent stability of the fuel column. Fission-gas release was normal and consistent with the existing data base from irradiation testing of shorter metallic fuel pins in EBR-II

  12. Novel full-length major histocompatibility complex class I allele discovery and haplotype definition in pig-tailed macaques.

    Science.gov (United States)

    Semler, Matthew R; Wiseman, Roger W; Karl, Julie A; Graham, Michael E; Gieger, Samantha M; O'Connor, David H

    2017-11-13

    Pig-tailed macaques (Macaca nemestrina, Mane) are important models for human immunodeficiency virus (HIV) studies. Their infectability with minimally modified HIV makes them a uniquely valuable animal model to mimic human infection with HIV and progression to acquired immunodeficiency syndrome (AIDS). However, variation in the pig-tailed macaque major histocompatibility complex (MHC) and the impact of individual transcripts on the pathogenesis of HIV and other infectious diseases is understudied compared to that of rhesus and cynomolgus macaques. In this study, we used Pacific Biosciences single-molecule real-time circular consensus sequencing to describe full-length MHC class I (MHC-I) transcripts for 194 pig-tailed macaques from three breeding centers. We then used the full-length sequences to infer Mane-A and Mane-B haplotypes containing groups of MHC-I transcripts that co-segregate due to physical linkage. In total, we characterized full-length open reading frames (ORFs) for 313 Mane-A, Mane-B, and Mane-I sequences that defined 86 Mane-A and 106 Mane-B MHC-I haplotypes. Pacific Biosciences technology allows us to resolve these Mane-A and Mane-B haplotypes to the level of synonymous allelic variants. The newly defined haplotypes and transcript sequences containing full-length ORFs provide an important resource for infectious disease researchers as certain MHC haplotypes have been shown to provide exceptional control of simian immunodeficiency virus (SIV) replication and prevention of AIDS-like disease in nonhuman primates. The increased allelic resolution provided by Pacific Biosciences sequencing also benefits transplant research by allowing researchers to more specifically match haplotypes between donors and recipients to the level of nonsynonymous allelic variation, thus reducing the risk of graft-versus-host disease.

  13. The function analysis of full-length cDNA sequence from IRM-2 mouse cDNA library

    International Nuclear Information System (INIS)

    Wang Qin; Liu Xiaoqiu; Xu Chang; Du Liqing; Sun Zhijuan; Wang Yan; Liu Qiang; Song Li; Li Jin; Fan Feiyue

    2013-01-01

    Objective: To identify the function of full-length cDNA sequences from an IRM-2 mouse cDNA library. Methods: Full-length cDNA products were amplified by PCR from the IRM-2 mouse cDNA library according to twenty-one expressed sequence tags. The expression of the full-length cDNAs was detected after mouse embryonic fibroblasts were exposed to 6.5 Gy of γ-ray radiation, and the effect on the growth of radiosensitive AT5B1VA cells transfected with the full-length cDNAs was investigated. Results: The expression of the No.4, 5 and 2 full-length cDNAs from IRM-2 mice was higher than that of the parental ICR and 615 mice after mouse embryonic fibroblasts were irradiated with γ-rays, and the survival rate of AT5B1VA cells transfected with the No.4, 5 and 2 full-length cDNAs was high. Conclusion: The No.4, 5 and 2 full-length cDNAs of the IRM-2 mouse confer high radioresistance. (authors)

  14. Seismic inference of 57 stars using full-length Kepler data sets

    Directory of Open Access Journals (Sweden)

    Creevey Orlagh

    2017-01-01

    We present stellar properties (mass, age, radius, distances) of 57 stars from a seismic inference using full-length data sets from Kepler. These stars comprise active stars, planet hosts, solar analogs, and binary systems. We validate the distances derived from the astrometric Gaia-Tycho solution. Ensemble analysis of the stellar properties reveals a trend of the mixing-length parameter with surface gravity and effective temperature. We derive a linear relationship with the seismic quantity ⟨r02⟩ to estimate the stellar age. Finally, we define the stellar regimes where the Kjeldsen et al. (2008) empirical surface correction for 1D model frequencies is valid.

  15. The full-length form of the Drosophila amyloid precursor protein is involved in memory formation.

    Science.gov (United States)

    Bourdet, Isabelle; Preat, Thomas; Goguel, Valérie

    2015-01-21

    The amyloid precursor protein (APP) plays a central role in Alzheimer's disease (AD), a pathology that first manifests as a memory decline. Understanding the role of APP in normal cognition is fundamental in understanding the progression of AD, and mammalian studies have pointed to a role of secreted APPα in memory. In Drosophila, we recently showed that APPL, the fly APP ortholog, is required for associative memory. In the present study, we aimed to characterize which form of APPL is involved in this process. We show that expression of a secreted-APPL form in the mushroom bodies, the center for olfactory memory, is able to rescue the memory deficit caused by APPL partial loss of function. We next assessed the impact on memory of the Drosophila α-secretase kuzbanian (KUZ), the enzyme initiating the nonamyloidogenic pathway that produces secreted APPLα. Strikingly, KUZ overexpression not only failed to rescue the memory deficit caused by APPL loss of function, it exacerbated this deficit. We further show that in addition to an increase in secreted-APPL forms, KUZ overexpression caused a decrease of membrane-bound full-length species that could explain the memory deficit. Indeed, we observed that transient expression of a constitutive membrane-bound mutant APPL form is sufficient to rescue the memory deficit caused by APPL reduction, revealing for the first time a role of full-length APPL in memory formation. Our data demonstrate that, in addition to secreted APPL, the noncleaved form is involved in memory, raising the possibility that secreted and full-length APPL act together in memory processes.

  16. Genomic characterization of Burkholderia pseudomallei isolates selected for medical countermeasures testing: comparative genomics associated with differential virulence.

    Directory of Open Access Journals (Sweden)

    Jason W Sahl

    Burkholderia pseudomallei is the causative agent of melioidosis and a potential bioterrorism agent. In the development of medical countermeasures against B. pseudomallei infection, the US Food and Drug Administration (FDA) Animal Rule recommends using well-characterized strains in animal challenge studies. In this study, whole genome sequence data were generated for 6 B. pseudomallei isolates previously identified as candidates for animal challenge studies; an additional 5 isolates that were associated with human inhalational melioidosis were also sequenced. A core genome single nucleotide polymorphism (SNP) phylogeny inferred from a concatenated SNP alignment of the 11 isolates sequenced in this study and a diverse global collection of isolates demonstrated the diversity of the proposed Animal Rule isolates. To understand the genomic composition of each isolate, a large-scale blast score ratio (LS-BSR) analysis was performed on the entire pan-genome; this demonstrated the variable composition of genes across the panel and also helped to identify genes unique to individual isolates. In addition, a set of ~550 genes associated with pathogenesis in B. pseudomallei was screened against the 11 sequenced genomes with LS-BSR. Differential gene distribution for 54 virulence-associated genes was observed between genomes, and three of these genes were correlated with differential virulence observed in animal challenge studies using BALB/c mice. Differentially conserved genes and SNPs associated with disease severity were identified and could be the basis for future studies investigating the pathogenesis of B. pseudomallei. Overall, the genetic characterization of the 11 proposed Animal Rule isolates provides context for future studies involving B. pseudomallei pathogenesis, differential virulence, and efficacy of therapeutics.
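
    The blast score ratio behind the LS-BSR analysis mentioned in this abstract is a simple quotient: the bit score of a gene aligned against a genome, divided by the bit score of that gene aligned against itself. The sketch below only illustrates that arithmetic and one way of flagging genes unique to an isolate. It does not reproduce the LS-BSR tool, and the function names, the 0.8 and 0.4 cut-offs, and the scores are assumptions chosen for illustration.

```python
def blast_score_ratio(query_bits, self_bits):
    """BSR = bit score against a genome / bit score of the gene against itself."""
    if self_bits <= 0:
        raise ValueError("self bit score must be positive")
    return query_bits / self_bits


def genes_unique_to(isolate, bsr_table, present=0.8, absent=0.4):
    """List genes conserved in `isolate` and effectively absent everywhere else.

    bsr_table maps gene -> {isolate name: BSR}. The cut-offs are illustrative.
    """
    unique = []
    for gene, row in bsr_table.items():
        others = [bsr for name, bsr in row.items() if name != isolate]
        if row.get(isolate, 0.0) >= present and all(b < absent for b in others):
            unique.append(gene)
    return sorted(unique)


# Hypothetical bit scores: geneA aligns well only in isolate "iso1".
table = {
    "geneA": {"iso1": blast_score_ratio(475.0, 500.0),
              "iso2": blast_score_ratio(50.0, 500.0)},
    "geneB": {"iso1": blast_score_ratio(396.0, 400.0),
              "iso2": blast_score_ratio(388.0, 400.0)},
}
print(genes_unique_to("iso1", table))
```

    Normalizing by the self score puts every gene on a 0 to 1 scale regardless of its length, which is what makes a single pair of cut-offs usable across a whole pan-genome.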

  17. Full length prototype SSC dipole test results

    International Nuclear Information System (INIS)

    Strait, J.; Brown, B.C.; Carson, J.

    1987-01-01

    Results are presented from tests of the first full length prototype SSC dipole magnet. The cryogenic behavior of the magnet during a slow cooldown to 4.5 K and a slow warmup to room temperature has been measured. Magnetic field quality was measured at currents up to 2000 A. Averaged over the body field, all harmonics with the exception of b2 and b8 are at or within the tolerances specified by the SSC Central Design Group. (The values of b2 and b8 result from known design and construction defects which will be corrected in later magnets.) Using an NMR probe, the average body field strength is measured to be 10.283 G/A, with point to point variations on the order of one part in 1000. Data are presented on quench behavior of the magnet up to 3500 A (approximately 55% of full field), including longitudinal and transverse velocities for the first 250 msec of the quench.

  18. De novo assembly of the Aedes aegypti genome using Hi-C yields chromosome-length scaffolds.

    Science.gov (United States)

    Dudchenko, Olga; Batra, Sanjit S; Omer, Arina D; Nyquist, Sarah K; Hoeger, Marie; Durand, Neva C; Shamim, Muhammad S; Machol, Ido; Lander, Eric S; Aiden, Aviva Presser; Aiden, Erez Lieberman

    2017-04-07

    The Zika outbreak, spread by the Aedes aegypti mosquito, highlights the need to create high-quality assemblies of large genomes in a rapid and cost-effective way. Here we combine Hi-C data with existing draft assemblies to generate chromosome-length scaffolds. We validate this method by assembling a human genome, de novo, from short reads alone (67× coverage). We then combine our method with draft sequences to create genome assemblies of the mosquito disease vectors Ae. aegypti and Culex quinquefasciatus, each consisting of three scaffolds corresponding to the three chromosomes in each species. These assemblies indicate that almost all genomic rearrangements among these species occur within, rather than between, chromosome arms. The genome assembly procedure we describe is fast, inexpensive, and accurate, and can be applied to many species.

  19. EuMicroSatdb: A database for microsatellites in the sequenced genomes of eukaryotes

    Directory of Open Access Journals (Sweden)

    Grover Atul

    2007-07-01

    Background: Microsatellites have immense utility as molecular markers in different fields like genome characterization and mapping, phylogeny and evolutionary biology. Existing microsatellite databases are of limited utility for experimental and computational biologists with regard to their content and information output. EuMicroSatdb (Eukaryotic MicroSatellite database; http://ipu.ac.in/usbt/EuMicroSatdb.htm) is a web based relational database for easy and efficient positional mining of microsatellites from sequenced eukaryotic genomes. Description: A user friendly web interface has been developed for microsatellite data retrieval using Active Server Pages (ASP). The backend database codes for data extraction and assembly have been written using Perl based scripts and C++. Precise, need-based microsatellite data retrieval is possible using different input parameters like microsatellite type (simple perfect or compound perfect), repeat unit length (mono- to hexa-nucleotide), repeat number, microsatellite length and chromosomal location in the genome. Furthermore, information about clustering of different microsatellites in the genome can also be retrieved. Finally, to facilitate primer design for PCR amplification of any desired microsatellite locus, 200 bp upstream and downstream sequences are provided. Conclusion: The database allows easy systematic retrieval of comprehensive information about simple and compound microsatellites, microsatellite clusters and their locus coordinates in 31 sequenced eukaryotic genomes. The information content of the database is useful in different areas of research like gene tagging, genome mapping, population genetics, germplasm characterization and in understanding microsatellite dynamics in eukaryotic genomes.
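
    The retrieval parameters listed in this abstract (repeat unit length from mono- to hexa-nucleotide, repeat number, locus position, and 200 bp flanking sequence for primer design) can be made concrete with a minimal sketch of a simple perfect microsatellite search. It is not the database's Perl or C++ backend: the regular-expression approach, the function name, and the default minimum of five repeats are assumptions made for illustration.

```python
import re


def find_perfect_microsatellites(seq, unit_min=1, unit_max=6,
                                 min_repeats=5, flank=200):
    """Find simple perfect tandem repeats and return them with flanking sequence.

    Coordinates are 0-based, end-exclusive. The lazy quantifier makes the
    regex try the shortest repeat unit first at each position.
    """
    seq = seq.upper()
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (unit_min, unit_max, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq):
        unit = match.group(1)
        start, end = match.start(), match.end()
        hits.append({
            "unit": unit,
            "repeats": (end - start) // len(unit),
            "start": start,
            "end": end,
            # Flanks are truncated at the sequence ends.
            "upstream": seq[max(0, start - flank):start],
            "downstream": seq[end:end + flank],
        })
    return hits


# Toy locus: six AC repeats between short flanks; flank=3 keeps output small.
for hit in find_perfect_microsatellites("GGTTACACACACACACTTGG", flank=3):
    print(hit)
```

    Compound microsatellites and microsatellite clusters, which the database also reports, would need an extra pass that merges adjacent hits, and are left out of this sketch.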

  20. Development of three full-length infectious cDNA clones of distinct brassica yellows virus genotypes for agrobacterium-mediated inoculation.

    Science.gov (United States)

    Zhang, Xiao-Yan; Dong, Shu-Wei; Xiang, Hai-Ying; Chen, Xiang-Ru; Li, Da-Wei; Yu, Jia-Lin; Han, Cheng-Gui

    2015-02-02

    Brassica yellows virus (BrYV) is a newly identified species in the genus Polerovirus within the family Luteoviridae. BrYV, which is prevalent throughout Mainland China and South Korea, is an important virus infecting cruciferous crops. Based on six BrYV genomic sequences of isolates from oilseed rape, rutabaga, radish, and cabbage, three genotypes, BrYV-A, BrYV-B, and BrYV-C, exist, which mainly differ in the 5' terminal half of the genome. BrYV is an aphid-transmitted and phloem-limited virus. The use of infectious cDNA clones is an alternative means of infecting plants that allows reverse genetic studies to be performed. In this study, full-length cDNA clones of BrYV-A, recombinant BrYV-5B3A, and BrYV-C were constructed under control of the cauliflower mosaic virus 35S promoter. An agrobacterium-mediated inoculation system for Nicotiana benthamiana was developed using these cDNA clones. Three days after infiltration with full-length BrYV cDNA clones, necrotic symptoms were observed in the inoculated leaves of N. benthamiana; however, no obvious symptoms appeared in the upper leaves. Reverse transcription-PCR (RT-PCR) and western blot detection of samples from the upper leaves showed that the maximum infection efficiency of BrYVs could reach 100%. The infectivity of the BrYV-A, BrYV-5B3A, and BrYV-C cDNA clones was further confirmed by northern hybridization. The system developed here will be useful for further studies of BrYV, such as host range, pathogenicity, viral gene functions, and plant-virus-vector interactions, and especially for discerning the differences among the three genotypes.

  1. Genome-wide macrosynteny among Fusarium species in the Gibberella fujikuroi complex revealed by amplified fragment length polymorphisms.

    Directory of Open Access Journals (Sweden)

    Lieschen De Vos

    The Gibberella fujikuroi complex includes many Fusarium species that cause significant losses in yield and quality of agricultural and forestry crops. Due to their economic importance, whole-genome sequence information has rapidly become available for species including Fusarium circinatum, Fusarium fujikuroi and Fusarium verticillioides, each of which represents one of the three main clades known in this complex. However, no previous studies have explored the genomic commonalities and differences among these fungi. In this study, a previously completed genetic linkage map for an interspecific cross between Fusarium temperatum and F. circinatum, together with genomic sequence data, was utilized to consider the level of synteny between the three Fusarium genomes. Regions that are homologous amongst the Fusarium genomes examined were identified using in silico and pyrosequenced amplified fragment length polymorphism (AFLP) fragment analyses. Homology was determined using BLAST analysis of the sequences, with 777 homologous regions aligned to F. fujikuroi and F. verticillioides. This also made it possible to assign the linkage groups from the interspecific cross to their corresponding chromosomes in F. verticillioides and F. fujikuroi, as well as to assign two previously unmapped supercontigs of F. verticillioides to probable chromosomal locations. We further found evidence of a reciprocal translocation between the distal ends of chromosomes 8 and 11, which apparently originated before the divergence of F. circinatum and F. temperatum. Overall, a remarkable level of macrosynteny was observed among the three Fusarium genomes, when comparing AFLP fragments. This study not only demonstrates how in silico AFLPs can aid in the integration of a genetic linkage map to the physical genome, but it also highlights the benefits of using this tool to study genomic synteny and architecture.

  2. Genomic, proteomic and morphological characterization of two novel broad host lytic bacteriophages ΦPD10.3 and ΦPD23.1 infecting pectinolytic Pectobacterium spp. and Dickeya spp.

    Directory of Open Access Journals (Sweden)

    Robert Czajkowski

    Pectinolytic Pectobacterium spp. and Dickeya spp. are necrotrophic bacterial pathogens of many important crops, including potato, worldwide. This study reports on the isolation and characterization of broad host lytic bacteriophages able to infect the dominant Pectobacterium spp. and Dickeya spp. affecting potato in Europe, viz. Pectobacterium carotovorum subsp. carotovorum (Pcc), P. wasabiae (Pwa) and Dickeya solani (Dso), with the objective of assessing their potential as biological disease control agents. Two lytic bacteriophages infecting strains of Pcc, Pwa and Dso were isolated from potato samples collected from two potato fields in central Poland. The ΦPD10.3 and ΦPD23.1 phages have morphology similar to other members of the Myoviridae family and the Caudovirales order, with a head diameter of 85 and 86 nm and length of tails of 117 and 121 nm, respectively. They were characterized for optimal multiplicity of infection, the rate of adsorption to the Pcc, Pwa and Dso cells, the latent period and the burst size. The phages were genotypically characterized with RAPD-PCR and RFLP techniques. The structural proteomes of both phages were obtained by fractionation of phage proteins by SDS-PAGE. Phage protein identification was performed by liquid chromatography-mass spectrometry (LC-MS) analysis. Pulsed-field gel electrophoresis (PFGE), genome sequencing and comparative genome analysis were used to gain knowledge of the length, organization and function of the ΦPD10.3 and ΦPD23.1 genomes. The potential use of ΦPD10.3 and ΦPD23.1 phages for the biocontrol of Pectobacterium spp. and Dickeya spp. infections in potato is discussed.

  3. Retrieval of a million high-quality, full-length microbial 16S and 18S rRNA gene sequences without primer bias

    DEFF Research Database (Denmark)

    Karst, Søren Michael; Dueholm, Morten Simonsen; McIlroy, Simon Jon

    2018-01-01

    Small subunit ribosomal RNA (SSU rRNA) genes, 16S in bacteria and 18S in eukaryotes, have been the standard phylogenetic markers used to characterize microbial diversity and evolution for decades. However, the reference databases of full-length SSU rRNA gene sequences are skewed to well-studied e...

  4. Genomic regions, cellular components and gene regulatory basis underlying pod length variations in cowpea (V. unguiculata L. Walp).

    Science.gov (United States)

    Xu, Pei; Wu, Xinyi; Muñoz-Amatriaín, María; Wang, Baogen; Wu, Xiaohua; Hu, Yaowen; Huynh, Bao-Lam; Close, Timothy J; Roberts, Philip A; Zhou, Wen; Lu, Zhongfu; Li, Guojing

    2017-05-01

    Cowpea (V. unguiculata L. Walp) is a climate resilient legume crop important for food security. Cultivated cowpea (V. unguiculata L) generally comprises the bushy, short-podded grain cowpea dominant in Africa and the climbing, long-podded vegetable cowpea popular in Asia. How selection has contributed to the diversification of the two types of cowpea remains largely unknown. In the current study, a novel genotyping assay for over 50 000 SNPs was employed to delineate genomic regions governing pod length. Major, minor and epistatic QTLs were identified through QTL mapping. Seventy-two SNPs associated with pod length were detected by genome-wide association studies (GWAS). Population stratification analysis revealed subdivision among a cowpea germplasm collection consisting of 299 accessions, which is consistent with pod length groups. Genomic scan for selective signals suggested that domestication of vegetable cowpea was accompanied by selection of multiple traits including pod length, while the further improvement process was featured by selection of pod length primarily. Pod growth kinetics assay demonstrated that more durable cell proliferation rather than cell elongation or enlargement was the main reason for longer pods. Transcriptomic analysis suggested the involvement of sugar, gibberellin and nutritional signalling in regulation of pod length. This study establishes the basis for map-based cloning of pod length genes in cowpea and for marker-assisted selection of this trait in breeding programmes. © 2016 The Authors. Plant Biotechnology Journal published by Society for Experimental Biology and The Association of Applied Biologists and John Wiley & Sons Ltd.

  5. Identification and genomic characterization of a novel porcine parvovirus (PPV6) in China.

    Science.gov (United States)

    Ni, Jianqiang; Qiao, Caixia; Han, Xue; Han, Tao; Kang, Wenhua; Zi, Zhanchao; Cao, Zhen; Zhai, Xinyan; Cai, Xuepeng

    2014-12-02

    Parvoviruses are classified into two subfamilies based on their host range: the Parvovirinae, which infect vertebrates, and the Densovirinae, which mainly infect insects and other arthropods. In recent years, a number of novel parvoviruses belonging to the subfamily Parvovirinae have been identified from various animal species and humans, including human parvovirus 4 (PARV4), porcine hokovirus, ovine partetravirus, porcine parvovirus 4 (PPV4), and porcine parvovirus 5 (PPV5). Using sequence-independent single primer amplification (SISPA), a novel parvovirus within the subfamily Parvovirinae that was distinct from any known parvoviruses was identified and five full-length genome sequences were determined and analyzed. A novel porcine parvovirus, provisionally named PPV6, was initially identified from aborted pig fetuses in China. Retrospective studies revealed the prevalence of PPV6 in aborted pig fetuses and piglets (50% and 75%, respectively) was apparently higher than that in finishing pigs and sows (15.6% and 3.8%, respectively). Furthermore, the prevalence of PPV6 in finishing pigs was similar in affected and unaffected farms (i.e. 16.7% vs. 13.6%-21.7%). This finding indicates that animal age, perhaps due to increased innate immune resistance, strongly influences the level of PPV6 viremia. Complete genome sequencing and multiple alignments have shown that the nearly full-length genome sequences were approximately 6,100 nucleotides in length and shared 20.5%-42.6% DNA sequence identity with other members of the Parvovirinae subfamily. Phylogenetic analysis showed that PPV6 was significantly distinct from other known parvoviruses and was most closely related to PPV4. Our findings and review of published parvovirus sequences suggested that a novel porcine parvovirus is currently circulating in China and might be classified into the novel genus Copiparvovirus within the subfamily Parvovirinae. However, the clinical manifestations of PPV6 are still unknown in that the

  6. Resolving prokaryotic taxonomy without rRNA: longer oligonucleotide word lengths improve genome and metagenome taxonomic classification.

    Science.gov (United States)

    Alsop, Eric B; Raymond, Jason

    2013-01-01

    Oligonucleotide signatures, especially tetranucleotide signatures, have been used as a method for homology binning by exploiting an organism's inherent biases towards the use of specific oligonucleotide words. Tetranucleotide signatures have been especially useful in environmental metagenomics samples as many of these samples contain organisms from poorly classified phyla which cannot be easily identified using traditional homology methods, including NCBI BLAST. This study examines oligonucleotide signatures across 1,424 completed genomes from across the tree of life, substantially expanding upon previous work. A comprehensive analysis of mononucleotide through nonanucleotide word lengths suggests that longer word lengths substantially improve the classification of DNA fragments across a range of sizes of relevance to high throughput sequencing. We find that, at present, heptanucleotide signatures represent an optimal balance between prediction accuracy and computational time for resolving taxonomy using both genomic and metagenomic fragments. We directly compare the ability of tetranucleotide and heptanucleotide word lengths (tetranucleotide signatures are the current standard for oligonucleotide word usage analyses) for taxonomic binning of metagenome reads. We present evidence that heptanucleotide word lengths consistently provide more taxonomic resolving power, particularly in distinguishing between closely related organisms that are often present in metagenomic samples. This implies that longer oligonucleotide word lengths should replace tetranucleotide signatures for most analyses. Finally, we show that the application of longer word lengths to metagenomic datasets leads to more accurate taxonomic binning of DNA scaffolds and has the potential to substantially improve taxonomic assignment and assembly of metagenomic data.
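
    As a rough illustration of what an oligonucleotide signature is, the sketch below counts every overlapping word of length k in a DNA fragment and normalizes the counts to frequencies. It is not the classifier used in the study: the function names `kmer_signature` and `signature_distance`, the Manhattan distance, and the example sequence are all assumptions chosen for illustration, and reverse complements are not merged.

```python
from collections import Counter
from itertools import product


def kmer_signature(seq: str, k: int) -> dict[str, float]:
    """Return the relative frequency of every DNA word of length k in seq."""
    seq = seq.upper()
    counts = Counter(
        seq[i:i + k]
        for i in range(len(seq) - k + 1)
        if set(seq[i:i + k]) <= set("ACGT")  # skip windows with N or other codes
    )
    total = sum(counts.values())
    # Report all 4**k possible words so that signatures of different
    # fragments share the same keys and can be compared directly.
    words = ("".join(p) for p in product("ACGT", repeat=k))
    return {w: (counts[w] / total if total else 0.0) for w in words}


def signature_distance(a: dict[str, float], b: dict[str, float]) -> float:
    """Manhattan distance between two signatures built with the same k."""
    return sum(abs(a[w] - b[w]) for w in a)


if __name__ == "__main__":
    fragment = "ACGTACGTTTGACA"  # hypothetical fragment, far too short for real use
    tetra = kmer_signature(fragment, 4)
    hepta = kmer_signature(fragment, 7)
    # The feature space grows from 4**4 to 4**7 words as k goes from 4 to 7.
    print(len(tetra), len(hepta))
```

    The jump in the number of possible words (256 for tetranucleotides, 16,384 for heptanucleotides) is one way to see why longer words can carry more resolving power but cost more to compute.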

  7. Construction of a normalized full-length cDNA library of cephalopod Amphioctopus fangsiao and development of microsatellite markers

    Science.gov (United States)

    Feng, Yanwei; Liu, Wenfen; Xu, Xin; Yang, Jianmin; Wang, Weijun; Wei, Xiumei; Liu, Xiangquan; Sun, Guohua

    2017-10-01

    Amphioctopus fangsiao is one of the most economically important species and has been considered to be a candidate for aquaculture. In order to facilitate its fine-scale genetic analyses, we constructed a normalized full-length library successfully and developed a set of microsatellite markers in this study. The normalized full-length library had a storage capacity of 6.9 × 10^5 independent clones. The recombination efficiency was 95% and the average size of inserted fragments was longer than 1000 bp. A total of 3440 high quality ESTs were obtained, which were assembled into 1803 unigenes. Of these unigenes, 450 (25%) were assigned into 33 Gene Ontology terms, 576 (31.9%) into 153 Kyoto Encyclopedia of Genes and Genomes pathways, and 275 (15.3%) into 22 Clusters of Orthologous Groups. Seventy-six polymorphic microsatellite markers were identified. The number of alleles per locus ranged from 4 to 17, and the observed and expected heterozygosities varied between 0.167 and 0.967 and between 0.326 and 0.944, respectively. Twelve loci were significantly deviated from Hardy-Weinberg equilibrium after Bonferroni correction and no linkage disequilibrium was found between different loci. This study provided not only a useful resource for the isolation of the functional genes, but also a set of informative microsatellites for the assessment of population structure and conservation genetics of A. fangsiao.
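
    For readers unfamiliar with the heterozygosity statistics quoted above, a minimal sketch follows. It assumes diploid genotypes at a single locus and uses the plain estimator 1 - sum(p_i^2) without a small-sample correction; the abstract does not say which estimator or software the authors used, and the genotypes shown are hypothetical.

```python
from collections import Counter


def observed_heterozygosity(genotypes: list[tuple[str, str]]) -> float:
    """Fraction of individuals carrying two different alleles at one locus."""
    return sum(a != b for a, b in genotypes) / len(genotypes)


def expected_heterozygosity(genotypes: list[tuple[str, str]]) -> float:
    """1 - sum(p_i^2) over allele frequencies p_i (no small-sample correction)."""
    alleles = Counter(allele for pair in genotypes for allele in pair)
    total = sum(alleles.values())
    return 1.0 - sum((n / total) ** 2 for n in alleles.values())


if __name__ == "__main__":
    # Hypothetical genotypes at one microsatellite locus (allele sizes in bp).
    locus = [("152", "156"), ("152", "152"), ("156", "160"), ("160", "160")]
    print(observed_heterozygosity(locus), expected_heterozygosity(locus))
```

    A large gap between the two values at a locus is the kind of signal that a Hardy-Weinberg test then evaluates formally.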

  8. Genetic Characterization and Comparative Genome Analysis of Brucella melitensis Isolates from India

    Directory of Open Access Journals (Sweden)

    Sarwar Azam

    2016-01-01

    Brucellosis is the most frequent zoonotic disease worldwide, with over 500,000 new human infections every year. Brucella melitensis, the most virulent species in humans, primarily affects goats and the zoonotic transmission occurs by ingestion of unpasteurized milk products or through direct contact with fetal tissues. Brucellosis is endemic in India but no information is available on population structure and genetic diversity of Brucella spp. in India. We performed multilocus sequence typing of four B. melitensis strains isolated from naturally infected goats from India. For more detailed genetic characterization, we carried out whole genome sequencing and comparative genome analysis of one of the B. melitensis isolates, Bm IND1. Genome analysis identified 141 unique SNPs, 78 VNTRs, 51 Indels, and 2 putative prophage integrations in the Bm IND1 genome. Our data may help to develop improved epidemiological typing tools and efficient preventive strategies to control brucellosis.

  9. Genomic characterization of the Taylorella genus.

    Directory of Open Access Journals (Sweden)

    Laurent Hébert

    The Taylorella genus comprises two species: Taylorella equigenitalis, which causes contagious equine metritis, and Taylorella asinigenitalis, a closely-related species mainly found in donkeys. We herein report on the first genome sequence of T. asinigenitalis, analyzing and comparing it with the recently-sequenced T. equigenitalis genome. The T. asinigenitalis genome contains a single circular chromosome of 1,638,559 bp with a 38.3% GC content and 1,534 coding sequences (CDS). While 212 CDSs were T. asinigenitalis-specific, 1,322 had orthologs in T. equigenitalis. Two hundred and thirty-four T. equigenitalis CDSs had no orthologs in T. asinigenitalis. Analysis of the basic nutrition metabolism of both Taylorella species showed that malate, glutamate and alpha-ketoglutarate may be their main carbon and energy sources. For both species, we identified four different secretion systems and several proteins potentially involved in binding and colonization of host cells, suggesting a strong potential for interaction with their host. T. equigenitalis seems better-equipped than T. asinigenitalis in terms of virulence since we identified numerous proteins potentially involved in pathogenicity, including hemagglutinin-related proteins, a type IV secretion system, TonB-dependent lactoferrin and transferrin receptors, and YadA and Hep_Hag domains containing proteins. This is the first molecular characterization of Taylorella genus members, and the first molecular identification of factors potentially involved in T. asinigenitalis and T. equigenitalis pathogenicity and host colonization. This study facilitates a genetic understanding of growth phenotypes, animal host preference and pathogenic capacity, paving the way for future functional investigations into this largely unknown genus.

  10. Subtype-independent near full-length HIV-1 genome sequencing and assembly to be used in large molecular epidemiological studies and clinical management.

    Science.gov (United States)

    Grossmann, Sebastian; Nowak, Piotr; Neogi, Ujjwal

    2015-01-01

    HIV-1 near full-length genome (HIV-NFLG) sequencing from plasma is an attractive multidimensional tool to apply in large-scale population-based molecular epidemiological studies. It also enables genotypic resistance testing (GRT) for all drug target sites allowing effective intervention strategies for control and prevention in high-risk population groups. Thus, the main objective of this study was to develop a simplified subtype-independent, cost- and labour-efficient HIV-NFLG protocol that can be used in clinical management as well as in molecular epidemiological studies. Plasma samples (n=30) were obtained from HIV-1B (n=10), HIV-1C (n=10), CRF01_AE (n=5) and CRF01_AG (n=5) infected individuals with minimum viral load >1120 copies/ml. The amplification was performed with two large amplicons of 5.5 kb and 3.7 kb, sequenced with 17 primers to obtain HIV-NFLG. GRT was validated against ViroSeq™ HIV-1 Genotyping System. After excluding four plasma samples with low-quality RNA, a total of 26 samples were attempted. Among them, NFLG was obtained from 24 (92%) samples with the lowest viral load being 3000 copies/ml. High (>99%) concordance was observed between HIV-NFLG and ViroSeq™ when determining the drug resistance mutations (DRMs). The N384I connection mutation was additionally detected by NFLG in two samples. Our high efficiency subtype-independent HIV-NFLG is a simple and promising approach to be used in large-scale molecular epidemiological studies. It will facilitate the understanding of the HIV-1 pandemic population dynamics and outline effective intervention strategies. Furthermore, it can potentially be applicable in clinical management of drug resistance by evaluating DRMs against all available antiretrovirals in a single assay.

  11. Purification and Fibrillation of Full-Length Recombinant PrP

    OpenAIRE

    Makarava, Natallia; Baskakov, Ilia V.

    2012-01-01

    Misfolding and aggregation of prion protein (PrP) is related to several neurodegenerative diseases in humans such as Creutzfeldt–Jakob disease, fatal familial insomnia, and Gerstmann–Sträussler–Scheinker disease. Certain applications in the prion field require recombinant PrP of high purity and quality. Here, we report an experimental procedure for expression and purification of full-length mammalian PrP. This protocol has been proved to yield PrP of extremely high purity that lac...

  12. Increased circulating full-length betatrophin levels in drug-naïve metabolic syndrome.

    Science.gov (United States)

    Liu, Dan; Li, Sheyu; He, He; Yu, Chuan; Li, Xiaodan; Liang, Libo; Chen, Yi; Li, Jianwei; Li, Jianshu; Sun, Xin; Tian, Haoming; An, Zhenmei

    2017-03-14

    Betatrophin is a newly identified circulating adipokine playing a role in the regulation of glucose homeostasis and lipid metabolism. However, its role in metabolic syndrome (MetS) remains unknown. Therefore, we aimed to compare the circulating betatrophin concentrations between patients with MetS and healthy controls. We recruited 47 patients with MetS and 47 age- and sex-matched healthy controls. Anthropometric and biochemical measurements were performed, and serum betatrophin levels were detected by ELISA. Full-length betatrophin levels in patients with MetS were significantly higher than those in controls (694.84 ± 365.51 pg/ml versus 356.64 ± 287.92 pg/ml; P <0.001). In contrast, no significant difference in total betatrophin levels was found between the two groups (1.20 ± 0.79 ng/ml versus 1.31 ± 1.08 ng/ml; P = 0.524). Full-length betatrophin level was positively correlated with fasting plasma glucose (FPG) (r = 0.357, P = 0.014) and 2-hour plasma glucose (2hPG) (r = 0.38, P <0.01). Binary logistic regression models indicated that subjects in the tertile of the highest full-length betatrophin level experienced higher odds of having MetS (OR, 8.6; 95% CI 2.8-26.8; P <0.001). Our study showed that full-length betatrophin concentrations were increased in drug-naïve MetS patients.

  13. Characterization of Binary Organogels Based on Some Azobenzene Compounds and Alkyloxybenzoic Acids with Different Chain Lengths

    Directory of Open Access Journals (Sweden)

    Yongmei Hu

    2014-01-01

    In this work the gelation behaviors of binary organogels composed of azobenzene amino derivatives and alkyloxybenzoic acids with different lengths of alkyl chains in various organic solvents were investigated and characterized. The corresponding gelation behaviors in 20 solvents were characterized and shown as new binary organic systems. The results showed that the lengths of the substituent alkyl chains played an important role in gel formation by the gelator mixtures in the tested organic solvents. Longer methylene chains in the molecular skeletons of these gelators seem more suitable for gelation of the tested solvents. Morphological characterization showed that these gelator molecules have the tendency to self-assemble into various aggregates, from lamella, wrinkle, and belt to dot, with change of solvents and gelator mixtures. Spectral characterization demonstrated different H-bond formation and hydrophobic forces existing in the gels, depending on the different substituent chains in the molecular skeletons. Meanwhile, these organogels can self-assemble to form monomolecular or multilayer nanostructures owing to the different lengths of the alkyl substituent chains. Possible assembly modes for the present xerogels were proposed. The present investigation may provide new clues for the design of new nanomaterials and functional textile materials with special microstructures.

  14. Characterization of the first complete genome sequence of an Impatiens necrotic spot orthotospovirus isolate from the United States and worldwide phylogenetic analyses of INSV isolates.

    Science.gov (United States)

    Zhao, Kaixi; Margaria, Paolo; Rosa, Cristina

    2018-05-10

    Impatiens necrotic spot orthotospovirus (INSV) can impact economically important ornamental plants and vegetables worldwide. Characterization studies on INSV are limited. For most INSV isolates, there are no complete genome sequences available. This lack of genomic information has a negative impact on the understanding of the INSV genetic diversity and evolution. Here we report the first complete nucleotide sequence of a US INSV isolate. INSV-UP01 was isolated from an impatiens in Pennsylvania, US. RT-PCR was used to clone its full-length genome and Vector NTI to assemble overlapping sequences. Phylogenetic trees were constructed by using MEGA7 software to show the phylogenetic relationships with other available INSV sequences worldwide. This US isolate has genome and biological features classical of INSV species and clusters in the Western Hemisphere clade, but its origin appears to be recent. Furthermore, INSV-UP01 might have been involved in a recombination event with an Italian isolate belonging to the Asian clade. Our analyses support that INSV isolates infect a broad plant-host range, group by geographic origin rather than by host, and are subject to frequent recombination events. These results justify the need to generate and analyze complete genome sequences of orthotospoviruses in general and INSV in particular.

  15. Pre-Steady-State Kinetic Analysis of Truncated and Full-Length Saccharomyces cerevisiae DNA Polymerase Eta

    Science.gov (United States)

    Brown, Jessica A.; Zhang, Likui; Sherrer, Shanen M.; Taylor, John-Stephen; Burgers, Peter M. J.; Suo, Zucai

    2010-01-01

    Understanding polymerase fidelity is an important objective towards ascertaining the overall stability of an organism's genome. Saccharomyces cerevisiae DNA polymerase η (yPolη), a Y-family DNA polymerase, is known to efficiently bypass DNA lesions (e.g., pyrimidine dimers) in vivo. Using pre-steady-state kinetic methods, we examined both full-length and a truncated version of yPolη which contains only the polymerase domain. In the absence of yPolη's C-terminal residues 514–632, the DNA binding affinity was weakened by 2-fold and the base substitution fidelity dropped by 3-fold. Thus, the C-terminus of yPolη may interact with DNA and slightly alter the conformation of the polymerase domain during catalysis. In general, yPolη discriminated between a correct and incorrect nucleotide more during the incorporation step (50-fold on average) than the ground-state binding step (18-fold on average). Blunt-end additions of dATP or pyrene nucleotide 5′-triphosphate revealed the importance of base stacking during the binding of incorrect incoming nucleotides. PMID:20798853

  16. Complete Genomes of Classical Swine Fever Virus Cloned into Bacterial Artificial Chromosomes

    DEFF Research Database (Denmark)

    Rasmussen, Thomas Bruun; Reimann, I.; Uttenthal, Åse

    Complete genome amplification of viral RNA provides a new tool for the generation of modified pestiviruses. We have used our full-genome amplification strategy for generation of amplicons representing complete genomes of classical swine fever virus. The amplicons were cloned directly into a stabl...... single-copy bacterial artificial chromosome (BAC) generating full-length pestivirus DNAs from which infectious RNA transcripts could be also derived. Our strategy allows construction of stable infectious BAC DNAs from a single full-length PCR product....

  17. Characterizing Phage Genomes for Therapeutic Applications.

    Science.gov (United States)

    Philipson, Casandra W; Voegtly, Logan J; Lueder, Matthew R; Long, Kyle A; Rice, Gregory K; Frey, Kenneth G; Biswas, Biswajit; Cer, Regina Z; Hamilton, Theron; Bishop-Lilly, Kimberly A

    2018-04-10

    Multi-drug resistance is increasing at alarming rates. The efficacy of phage therapy, treating bacterial infections with bacteriophages alone or in combination with traditional antibiotics, has been demonstrated in emergency cases in the United States and in other countries; however, it remains to be approved for widespread use in the US. One limiting factor is a lack of guidelines for assessing the genomic safety of phage candidates. We present the phage characterization workflow used by our team to generate data for submitting phages to the Food and Drug Administration (FDA) for authorized use. Essential analysis checkpoints and warnings are detailed for obtaining high-quality genomes, excluding undesirable candidates, rigorously assessing a phage genome for safety and evaluating sequencing contamination. This workflow has been developed in accordance with community standards for high-throughput sequencing of viral genomes as well as principles for ideal phages used for therapy. The feasibility and utility of the pipeline are demonstrated on two new phage genomes that meet all safety criteria. We propose these guidelines as a minimum standard for phages being submitted to the FDA for review as investigational new drug candidates.

  18. Coordinated Changes in Mutation and Growth Rates Induced by Genome Reduction

    Directory of Open Access Journals (Sweden)

    Issei Nishimura

    2017-07-01

    Genome size is determined during evolution, but it can also be altered by genetic engineering in laboratories. The systematic characterization of reduced genomes provides valuable insights into the cellular properties that are quantitatively described by the global parameters related to the dynamics of growth and mutation. In the present study, we analyzed a small collection of W3110 Escherichia coli derivatives containing either the wild-type genome or reduced genomes of various lengths to examine whether the mutation rate, a global parameter representing genomic plasticity, was affected by genome reduction. We found that the mutation rates of these cells increased with genome reduction. The correlation between genome length and mutation rate, which has been reported for the evolution of bacteria, was also identified, intriguingly, for genome reduction. Gene function enrichment analysis indicated that the deletion of many of the genes encoding membrane and transport proteins plays a role in the mutation rate changes mediated by genome reduction. Furthermore, the increase in the mutation rate with genome reduction was highly associated with a decrease in the growth rate in a nutrition-dependent manner; thus, poorer media showed a larger change that was of higher significance. This negative correlation was strongly supported by experimental evidence that the serial transfer of the reduced genome improved the growth rate and reduced the mutation rate to a large extent. Taken together, the global parameters corresponding to the genome, growth, and mutation showed a coordinated relationship, which might be an essential working principle for balancing the cellular dynamics appropriate to the environment.

  19. A simple method for the parallel deep sequencing of full influenza A genomes

    DEFF Research Database (Denmark)

    Kampmann, Marie-Louise; Fordyce, Sarah Louise; Avila Arcos, Maria del Carmen

    2011-01-01

    Given the major threat of influenza A to human and animal health, and its ability to evolve rapidly through mutation and reassortment, tools that enable its timely characterization are necessary to help monitor its evolution and spread. For this purpose, deep sequencing can be a very valuable tool....... This study reports a comprehensive method that enables deep sequencing of the complete genomes of influenza A subtypes using the Illumina Genome Analyzer IIx (GAIIx). By using this method, the complete genomes of nine viruses were sequenced in parallel, representing the 2009 pandemic H1N1 virus, H5N1 virus...

  20. Lengths of Orthologous Prokaryotic Proteins Are Affected by Evolutionary Factors

    Directory of Open Access Journals (Sweden)

    Tatiana Tatarinova

    2015-01-01

    Proteins of the same functional family (for example, kinases) may have significantly different lengths. It is an open question whether such variation in length is random or appears as a response to some unknown evolutionary driving factors. The main purpose of this paper is to demonstrate the existence of factors affecting prokaryotic gene lengths. We believe that the ranking of genomes according to lengths of their genes, followed by the calculation of coefficients of association between genome rank and genome property, is a reasonable approach in revealing such evolutionary driving factors. As we demonstrated earlier, our chosen approach, Bubble Sort, combines stability, accuracy, and computational efficiency as compared to other ranking methods. Application of Bubble Sort to the set of 1390 prokaryotic genomes confirmed that genes of Archaeal species are generally shorter than Bacterial ones. We observed that gene lengths are affected by various factors: within each domain, different phyla have preferences for short or long genes; thermophiles tend to have shorter genes than the soil-dwellers; halophiles tend to have longer genes. We also found that species with overrepresentation of cytosines and guanines in the third position of the codon (GC3 content) tend to have longer genes than species with low GC3 content.
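
    The two quantities this abstract relates, GC3 content and a length-based ranking of genomes, can be sketched as follows. This is only an illustration under stated assumptions: the ranking here simply sorts genomes by median gene length and is not the authors' Bubble Sort procedure, and the function names and the toy genome data are hypothetical.

```python
from statistics import median


def gc3_content(cds: str) -> float:
    """Fraction of G or C at the third position of each complete codon."""
    third = cds.upper()[2::3]  # indices 2, 5, 8, ... are third codon positions
    if not third:
        raise ValueError("sequence is shorter than one codon")
    return sum(base in "GC" for base in third) / len(third)


def rank_by_gene_length(genomes: dict[str, list[int]]) -> list[str]:
    """Order genome names from shortest to longest median gene length.

    A stand-in for the ranking step; the study uses its own pairwise
    comparison scheme, which is not reproduced here.
    """
    return sorted(genomes, key=lambda name: median(genomes[name]))


if __name__ == "__main__":
    # Hypothetical gene lengths (bp) for two made-up genomes.
    toy = {"genome_A": [900, 1000, 1100], "genome_B": [700, 800, 900]}
    print(rank_by_gene_length(toy))
    print(gc3_content("ATGGCCAAA"))
```

    Once each genome has a rank and a property such as GC3, a rank correlation between the two is one plausible way to quantify the association the abstract describes.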

  1. Human uroporphyrinogen III synthase: Molecular cloning, nucleotide sequence, and expression of a full-length cDNA

    International Nuclear Information System (INIS)

    Tsai, Shihfeng; Bishop, D.F.; Desnick, R.J.

    1988-01-01

    Uroporphyrinogen III synthase, the fourth enzyme in the heme biosynthetic pathway, is responsible for conversion of the linear tetrapyrrole, hydroxymethylbilane, to the cyclic tetrapyrrole, uroporphyrinogen III. The deficient activity of URO-synthase is the enzymatic defect in the autosomal recessive disorder congenital erythropoietic porphyria. To facilitate the isolation of a full-length cDNA for human URO-synthase, the human erythrocyte enzyme was purified to homogeneity and 81 nonoverlapping amino acids were determined by microsequencing the N terminus and four tryptic peptides. Two synthetic oligonucleotide mixtures were used to screen 1.2 × 10^6 recombinants from a human adult liver cDNA library. Eight clones were positive with both oligonucleotide mixtures. Of these, dideoxy sequencing of the 1.3 kilobase insert from clone pUROS-2 revealed 5' and 3' untranslated sequences of 196 and 284 base pairs, respectively, and an open reading frame of 798 base pairs encoding a protein of 265 amino acids with a predicted molecular mass of 28,607 Da. The isolation and expression of this full-length cDNA for human URO-synthase should facilitate studies of the structure, organization, and chromosomal localization of this heme biosynthetic gene as well as the characterization of the molecular lesions causing congenital erythropoietic porphyria.
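
    The figures in this abstract are mutually consistent, which a few lines of arithmetic can confirm. The sketch assumes that the reported 798 bp open reading frame includes the stop codon (the abstract does not say so explicitly); `protein_length` is a hypothetical helper name.

```python
def protein_length(orf_bp: int, includes_stop: bool = True) -> int:
    """Number of amino acids encoded by an open reading frame of orf_bp bases."""
    if orf_bp % 3:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_bp // 3
    # The stop codon occupies one codon but adds no residue.
    return codons - 1 if includes_stop else codons


if __name__ == "__main__":
    five_utr, orf, three_utr = 196, 798, 284  # bp, as reported in the abstract
    print(protein_length(orf))           # 265 amino acids
    print(five_utr + orf + three_utr)    # 1278 bp, i.e. roughly the 1.3 kb insert
```

    Under that assumption, 798 / 3 = 266 codons gives the 265 residues reported, and the untranslated regions plus the reading frame add up to about 1.3 kilobases.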

  2. Sequencing and characterization of the complete mitochondrial genome of Japanese Swellshark (Cephalloscyllium umbratile)

    OpenAIRE

    Zhu, Ke-Cheng; Liang, Yin-Yin; Wu, Na; Guo, Hua-Yang; Zhang, Nan; Jiang, Shi-Gui; Zhang, Dian-Chang

    2017-01-01

    To further comprehend the genome features of Cephalloscyllium umbratile (Carcharhiniformes), an endangered species, the complete mitochondrial DNA (mtDNA) was sequenced and annotated for the first time. The full-length mtDNA of C. umbratile was 16,697 bp and contained ribosomal RNA (rRNA) genes, 13 protein-coding genes (PCGs), 23 transfer RNA (tRNA) genes, and a major non-coding control region. Each PCG was initiated by an authoritative ATN codon, except for COX1, which was initiated by a GTG codon. Seven of 13 PC...

  3. Genome-wide analysis of LTR-retrotransposons in oil palm.

    Science.gov (United States)

    Beulé, Thierry; Agbessi, Mawussé Dt; Dussert, Stephane; Jaligot, Estelle; Guyot, Romain

    2015-10-15

    The oil palm (Elaeis guineensis Jacq.) is a major cultivated crop and the world's largest source of edible vegetable oil. The genus Elaeis comprises two species: E. guineensis, the commercial African oil palm, and E. oleifera, which is used in oil palm genetic breeding. The recent publication of both the African oil palm genome assembly and the first draft sequence of its Latin American relative now allows us to tackle the challenge of understanding the genome composition, structure and evolution of these palm genomes through the annotation of their repeated sequences. In this study, we identified, annotated and compared Transposable Elements (TE) from the African and Latin American oil palms. In a first step, Transposable Element databases were built through de novo detection in both genome sequences, then the TE content of both genomes was estimated. Then putative full-length retrotransposons with Long Terminal Repeats (LTRs) were further identified in the E. guineensis genome for characterization of their structural diversity, copy number and chromosomal distribution. Finally, their relative expression in several tissues was determined through in silico analysis of publicly available transcriptome data. Our results reveal a congruence in the transpositional history of LTR retrotransposons between E. oleifera and E. guineensis, especially the Sto-4 family. Also, we have identified and described 583 full-length LTR-retrotransposons in the Elaeis guineensis genome. Our work shows that these elements are most likely no longer mobile and that no recent insertion event has occurred. Moreover, the analysis of chromosomal distribution suggests a preferential insertion of Copia elements in gene-rich regions, whereas Gypsy elements appear to be evenly distributed throughout the genome. Considering the high proportion of LTR retrotransposons in the oil palm genome, our work will contribute to a greater understanding of their impact on genome organization and evolution.

  4. Internally deleted WNV genomes isolated from exotic birds in New Mexico: function in cells, mosquitoes, and mice.

    Science.gov (United States)

    Pesko, Kendra N; Fitzpatrick, Kelly A; Ryan, Elizabeth M; Shi, Pei-Yong; Zhang, Bo; Lennon, Niall J; Newman, Ruchi M; Henn, Matthew R; Ebel, Gregory D

    2012-05-25

    Most RNA viruses exist in their hosts as a heterogeneous population of related variants. Due to error prone replication, mutants are constantly generated which may differ in individual fitness from the population as a whole. Here we characterize three WNV isolates that contain, along with full-length genomes, mutants with large internal deletions to structural and nonstructural protein-coding regions. The isolates were all obtained from lorikeets that died from WNV at the Rio Grande Zoo in Albuquerque, NM between 2005 and 2007. The deletions are approximately 2kb, in frame, and result in the elimination of the complete envelope, and portions of the prM and NS-1 proteins. In Vero cell culture, these internally deleted WNV genomes function as defective interfering particles, reducing the production of full-length virus when introduced at high multiplicities of infection. In mosquitoes, the shortened WNV genomes reduced infection and dissemination rates, and virus titers overall, and were not detected in legs or salivary secretions at 14 or 21 days post-infection. In mice, inoculation with internally deleted genomes did not attenuate pathogenesis relative to full-length or infectious clone derived virus, and shortened genomes were not detected in mice at the time of death. These observations provide evidence that large deletions may occur within flavivirus populations more frequently than has generally been appreciated and suggest that they impact population phenotype minimally. Additionally, our findings suggest that highly similar mutants may frequently occur in particular vertebrate hosts. Copyright © 2012 Elsevier Inc. All rights reserved.

  5. The evolutionary rates of HCV estimated with subtype 1a and 1b sequences over the ORF length and in different genomic regions.

    Directory of Open Access Journals (Sweden)

    Manqiong Yuan

    Full Text Available Considerable progress has been made in the HCV evolutionary analysis, since the software BEAST was released. However, prior information, especially the prior evolutionary rate, which plays a critical role in BEAST analysis, is always difficult to ascertain due to various uncertainties. Providing a proper prior HCV evolutionary rate is thus of great importance. 176 full-length sequences of HCV subtype 1a and 144 of 1b were assembled by taking into consideration the balance of the sampling dates and the even dispersion in phylogenetic trees. According to the HCV genomic organization and biological functions, each dataset was partitioned into nine genomic regions and two routinely amplified regions. A uniform prior rate was applied to the BEAST analysis for each region and also the entire ORF. All the obtained posterior rates for 1a are of a magnitude of 10^-3 substitutions/site/year and in a bell-shaped distribution. Significantly lower rates were estimated for 1b and some of the rate distribution curves resulted in a one-sided truncation, particularly under the exponential model. This indicates that some of the rates for subtype 1b are less accurate, so they were adjusted by including more sequences to improve the temporal structure. Among the various HCV subtypes and genomic regions, the evolutionary patterns are dissimilar. Therefore, an applied estimation of the HCV epidemic history requires the proper selection of the rate priors, which should match the actual dataset so that they can fit for the subtype, the genomic region and even the length. By referencing the findings here, future evolutionary analysis of the HCV subtype 1a and 1b datasets may become more accurate and hence prove useful for tracing their patterns.

  6. A high HIV-1 strain variability in London, UK, revealed by full-genome analysis: Results from the ICONIC project

    Science.gov (United States)

    Frampton, Dan; Gallo Cassarino, Tiziano; Raffle, Jade; Hubb, Jonathan; Ferns, R. Bridget; Waters, Laura; Tong, C. Y. William; Kozlakidis, Zisis; Hayward, Andrew; Kellam, Paul; Pillay, Deenan; Clark, Duncan; Nastouli, Eleni; Leigh Brown, Andrew J.

    2018-01-01

    Background & methods The ICONIC project has developed an automated high-throughput pipeline to generate HIV nearly full-length genomes (NFLG, i.e. from gag to nef) from next-generation sequencing (NGS) data. The pipeline was applied to 420 HIV samples collected at University College London Hospitals NHS Trust and Barts Health NHS Trust (London) and sequenced using an Illumina MiSeq at the Wellcome Trust Sanger Institute (Cambridge). Consensus genomes were generated and subtyped using COMET, and unique recombinants were studied with jpHMM and SimPlot. Maximum-likelihood phylogenetic trees were constructed using RAxML to identify transmission networks using the Cluster Picker. Results The pipeline generated sequences of at least 1Kb of length (median = 7.46Kb, IQR = 4.01Kb) for 375 out of the 420 samples (89%), with 174 (46.4%) being NFLG. A total of 365 sequences (169 of them NFLG) corresponded to unique subjects and were included in the down-stream analyses. The most frequent HIV subtypes were B (n = 149, 40.8%) and C (n = 77, 21.1%) and the circulating recombinant form CRF02_AG (n = 32, 8.8%). We found 14 different CRFs (n = 66, 18.1%) and multiple URFs (n = 32, 8.8%) that involved recombination between 12 different subtypes/CRFs. The most frequent URFs were B/CRF01_AE (4 cases) and A1/D, B/C, and B/CRF02_AG (3 cases each). Most URFs (19/26, 73%) lacked breakpoints in the PR+RT pol region, rendering them undetectable if only that was sequenced. Twelve (37.5%) of the URFs could have emerged within the UK, whereas the rest were probably imported from sub-Saharan Africa, South East Asia and South America. For 2 URFs we found highly similar pol sequences circulating in the UK. We detected 31 phylogenetic clusters using the full dataset: 25 pairs (mostly subtypes B and C), 4 triplets and 2 quadruplets. Some of these were not consistent across different genes due to inter- and intra-subtype recombination. Clusters involved 70 sequences, 19.2% of the dataset. 
Conclusions

  7. TypeLoader: A fast and efficient automated workflow for the annotation and submission of novel full-length HLA alleles.

    Science.gov (United States)

    Surendranath, V; Albrecht, V; Hayhurst, J D; Schöne, B; Robinson, J; Marsh, S G E; Schmidt, A H; Lange, V

    2017-07-01

    Recent years have seen a rapid increase in the discovery of novel allelic variants of the human leukocyte antigen (HLA) genes. Commonly, only the exons encoding the peptide binding domains of novel HLA alleles are submitted. As a result, the IPD-IMGT/HLA Database lacks sequence information outside those regions for the majority of known alleles. This has implications for the application of the new sequencing technologies, which deliver sequence data often covering the complete gene. As these technologies simplify the characterization of the complete gene regions, it is desirable for novel alleles to be submitted as full-length sequences to the database. However, the manual annotation of full-length alleles and the generation of specific formats required by the sequence repositories are error-prone and time-consuming. We have developed TypeLoader to address both these facets. With only the full-length sequence as a starting point, TypeLoader performs automatic sequence annotation and subsequently handles all steps involved in preparing the specific formats for submission with very little manual intervention. TypeLoader is routinely used at the DKMS Life Science Lab and has aided in the successful submission of more than 900 novel HLA alleles as full-length sequences to the European Nucleotide Archive repository and the IPD-IMGT/HLA Database with a 95% reduction in the time spent on annotation and submission when compared with handling these processes manually. TypeLoader is implemented as a web application and can be easily installed and used on a standalone Linux desktop system or within a Linux client/server architecture. TypeLoader is downloadable from http://www.github.com/DKMS-LSL/typeloader. © 2017 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.

  8. The GAAS metagenomic tool and its estimations of viral and microbial average genome size in four major biomes.

    Directory of Open Access Journals (Sweden)

    Florent E Angly

    2009-12-01

    Full Text Available Metagenomic studies characterize both the composition and diversity of uncultured viral and microbial communities. BLAST-based comparisons have typically been used for such analyses; however, sampling biases, high percentages of unknown sequences, and the use of arbitrary thresholds to find significant similarities can decrease the accuracy and validity of estimates. Here, we present Genome relative Abundance and Average Size (GAAS), a complete software package that provides improved estimates of community composition and average genome length for metagenomes in both textual and graphical formats. GAAS implements a novel methodology to control for sampling bias via length normalization, to adjust for multiple BLAST similarities by similarity weighting, and to select significant similarities using relative alignment lengths. In benchmark tests, the GAAS method was robust to both high percentages of unknown sequences and to variations in metagenomic sequence read lengths. Re-analysis of the Sargasso Sea virome using GAAS indicated that standard methodologies for metagenomic analysis may dramatically underestimate the abundance and importance of organisms with small genomes in environmental systems. Using GAAS, we conducted a meta-analysis of microbial and viral average genome lengths in over 150 metagenomes from four biomes to determine whether genome lengths vary consistently between and within biomes, and between microbial and viral communities from the same environment. Significant differences between biomes and within aquatic sub-biomes (oceans, hypersaline systems, freshwater, and microbialites) suggested that average genome length is a fundamental property of environments driven by factors at the sub-biome level. The behavior of paired viral and microbial metagenomes from the same environment indicated that microbial and viral average genome sizes are independent of each other, but indicative of community responses to stressors and

  9. Isolation and sequence characterization of DNA-A genome of a new begomovirus strain associated with severe leaf curling symptoms of Jatropha curcas L.

    KAUST Repository

    Chauhan, Sushma

    2018-04-22

    Begomoviruses, which belong to the family Geminiviridae, are associated with several disease symptoms, such as mosaic and leaf curling in Jatropha curcas. The molecular characterization of these viral strains will help in developing management strategies to control the disease. In this study, J. curcas plants that were infected with begomovirus and showed acute leaf curling symptoms were identified. The DNA-A segment from the pathogenic viral strain was isolated and sequenced. The sequenced genome was assembled and characterized in detail. The full-length DNA-A sequence was covered by primer walking. The genome sequence showed the general organization of DNA-A from begomovirus by the distribution of ORFs in both viral and anti-viral strands. The genome size ranged from 2844 to 2852 bp. Three strains with minor nucleotide variations were identified, and a phylogenetic analysis was performed by comparing the DNA-A segments from other reported begomovirus isolates. The maximum sequence similarity was observed with Euphorbia yellow mosaic virus (FN435995). In the phylogenetic tree, no clustering was observed with previously reported begomovirus strains isolated from the J. curcas host. The strains isolated in this study belong to a new begomoviral strain that elicits symptoms of leaf curling in J. curcas. The results indicate that the probable origin of the strains is Jatropha mosaic virus infecting J. gossypiifolia. The strains isolated in this study are referred to as Jatropha curcas leaf curl India virus (JCLCIV) based on the major symptoms exhibited by the host J. curcas.

  10. Simulations of The Dalles Dam Proposed Full Length Spillwall

    Energy Technology Data Exchange (ETDEWEB)

    Rakowski, Cynthia L.; Perkins, William A.; Richmond, Marshall C.; Serkowski, John A.

    2008-02-25

    This report presents results of a computational fluid dynamics (CFD) modeling study to evaluate the impacts of a full-length spillwall at The Dalles Dam. The full-length spillwall is being designed and evaluated as a structural means to improve tailrace egress and thus survival of juvenile fish passing through the spillway. During the course of this study, full-length spillwalls at Bays 6/7 and 8/9 were considered. The U.S. Army Corps of Engineers (USACE) has proposed extending the spillwall constructed in the stilling basin between spillway Bays 6 and 7 about 590 ft farther downstream. It is believed that the extension of the spillwall will improve egress conditions for downstream juvenile salmonids by moving them more rapidly into the thalweg of the river, hence reducing their exposure to predators. A numerical model was created, validated, and applied to The Dalles Dam tailrace. The models were designed to assess impacts to flow, tailrace egress, navigation, and adult salmon passage of a proposed spillwall extension. The more extensive model validation undertaken in this study greatly improved our confidence in the numerical model to represent the flow conditions in The Dalles tailrace. This study used these validated CFD models to simulate the potential impacts of a spillwall extension for The Dalles Dam tailrace for two locations. We determined the following: (1) The construction of an extended wall (between Bays 6/7) will not adversely impact entering or exiting the navigation lock. Impact should be less if a wall were constructed between Bays 8/9. (2) The construction of a wall between Bays 6/7 will increase the water surface elevation between the wall and the Washington shore. Although the increased water surface elevation would be beneficial to adult upstream migrants in that it decreases velocities on the approach to the adult ladder, the increased flow depth would enhance dissolved gas production, impacting potential operations of the project because of

  11. Importance of Viral Sequence Length and Number of Variable and Informative Sites in Analysis of HIV Clustering.

    Science.gov (United States)

    Novitsky, Vlad; Moyo, Sikhulile; Lei, Quanhong; DeGruttola, Victor; Essex, M

    2015-05-01

    To improve the methodology of HIV cluster analysis, we addressed how analysis of HIV clustering is associated with parameters that can affect the outcome of viral clustering. The extent of HIV clustering and tree certainty was compared between 401 HIV-1C near full-length genome sequences and subgenomic regions retrieved from the LANL HIV Database. Sliding window analysis was based on 99 windows of 1,000 bp and 45 windows of 2,000 bp. Potential associations between the extent of HIV clustering and sequence length and the number of variable and informative sites were evaluated. The near full-length genome HIV sequences showed the highest extent of HIV clustering and the highest tree certainty. At the bootstrap threshold of 0.80 in maximum likelihood (ML) analysis, 58.9% of near full-length HIV-1C sequences but only 15.5% of partial pol sequences (ViroSeq) were found in clusters. Among HIV-1 structural genes, pol showed the highest extent of clustering (38.9% at a bootstrap threshold of 0.80), although it was significantly lower than in the near full-length genome sequences. The extent of HIV clustering was significantly higher for sliding windows of 2,000 bp than 1,000 bp. We found a strong association between the sequence length and proportion of HIV sequences in clusters, and a moderate association between the number of variable and informative sites and the proportion of HIV sequences in clusters. In HIV cluster analysis, the extent of detectable HIV clustering is directly associated with the length of viral sequences used, as well as the number of variable and informative sites. Near full-length genome sequences could provide the most informative HIV cluster analysis. Selected subgenomic regions with a high extent of HIV clustering and high tree certainty could also be considered as a second choice.
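
    The sliding window analysis mentioned in this abstract can be sketched generically as follows. The abstract gives the window sizes (1,000 and 2,000 bp) but not the step size or the alignment length, so the values used in the usage lines are hypothetical and are not meant to reproduce the 99 and 45 windows reported.

```python
def sliding_windows(seq_len, window, step):
    """Return half-open (start, end) coordinates of every full window.

    Only windows that fit entirely within the sequence are returned.
    """
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    return [(s, s + window) for s in range(0, seq_len - window + 1, step)]

# Hypothetical alignment length and step size, chosen only for illustration.
alignment_len = 10_000
step_bp = 500

windows_1kb = sliding_windows(alignment_len, 1_000, step_bp)
windows_2kb = sliding_windows(alignment_len, 2_000, step_bp)

print(len(windows_1kb), windows_1kb[0], windows_1kb[-1])
print(len(windows_2kb), windows_2kb[0], windows_2kb[-1])
```

    Each window would then be extracted from the alignment and analysed separately, for example by building a tree per window and recording the proportion of sequences found in clusters.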

  12. Characterization of Transposable Elements in Laccaria bicolor

    Energy Technology Data Exchange (ETDEWEB)

    Labbe, Jessy L [ORNL; Murat, Claude [INRA, Nancy, France; Morin, Emmanuelle [INRA, Nancy, France; Tuskan, Gerald A [ORNL; Le Tacon, F [UMR, France; Martin, Francis [INRA, Nancy, France

    2012-01-01

    Background: The publicly available Laccaria bicolor genome sequence has provided a considerable genomic resource allowing systematic identification of transposable elements (TEs) in this symbiotic ectomycorrhizal fungus. Using a TE-specific annotation pipeline we have characterized and analyzed TEs in the L. bicolor S238N-H82 genome. Methodology/Principal Findings: TEs occupy 24% of the 60 Mb L. bicolor genome and represent 25,787 full-length and partial element copies distributed within 172 families. The most abundant elements were the Copia-like. TEs are not randomly distributed across the genome, but are tightly nested or clustered. The majority of TEs are ancient except some terminal inverted repeats (TIRs), long terminal repeats (LTRs) and a large retrotransposon derivative (LARD) element. There were three main periods of TE expansion in L. bicolor: the first from 57 to 10 Mya, the second from 5 to 1 Mya and the most recent from 500,000 years ago until now. LTR retrotransposons are closely related to retrotransposons found in another basidiomycete, Coprinopsis cinerea. Conclusions: This analysis represents an initial characterization of TEs in the L. bicolor genome, contributes to genome assembly and to a greater understanding of the role TEs played in genome organization and evolution, and provides a valuable resource for the ongoing Laccaria Pan-Genome project supported by the U.S.-DOE Joint Genome Institute.

  13. Characterization of Equine Infectious Anemia Virus Integration in the Horse Genome

    Directory of Open Access Journals (Sweden)

    Qiang Liu

    2015-06-01

    Full Text Available Human immunodeficiency virus (HIV-1) has a unique integration profile in the human genome relative to murine and avian retroviruses. Equine infectious anemia virus (EIAV) is another well-studied lentivirus that can also be used as a promising retro-transfection vector, but its integration into its native host has not been characterized. In this study, we mapped 477 integration sites of the EIAV strain EIAVFDDV13 in fetal equine dermal (FED) cells during in vitro infection. Published integration sites of EIAV and HIV-1 in the human genome were also analyzed as references. Our results demonstrated that EIAVFDDV13 tended to integrate into genes and AT-rich regions, and it avoided integrating into transcription start sites (TSS), which is consistent with EIAV and HIV-1 integration in the human genome. Notably, the integration of EIAVFDDV13 favored long interspersed elements (LINEs) and DNA transposons in the horse genome, whereas the integration of HIV-1 favored short interspersed elements (SINEs) in the human genome. The chromosomal environment near LINEs or DNA transposons potentially influences viral transcription and may be related to the unique EIAV latency states in equids. The data on EIAV integration in its natural host will facilitate studies on lentiviral infection and lentivirus-based therapeutic vectors.

  14. Full-length model of the human galectin-4 and insights into dynamics of inter-domain communication

    Science.gov (United States)

    Rustiguel, Joane K.; Soares, Ricardo O. S.; Meisburger, Steve P.; Davis, Katherine M.; Malzbender, Kristina L.; Ando, Nozomi; Dias-Baruffi, Marcelo; Nonato, Maria Cristina

    2016-09-01

    Galectins are proteins involved in diverse cellular contexts due to their capacity to decipher and respond to the information encoded by β-galactoside sugars. In particular, human galectin-4, normally expressed in the healthy gastrointestinal tract, displays differential expression in cancerous tissues and is considered a potential drug target for liver and lung cancer. Galectin-4 is a tandem-repeat galectin characterized by two carbohydrate recognition domains connected by a linker-peptide. Despite their relevance to cell function and pathogenesis, structural characterization of full-length tandem-repeat galectins has remained elusive. Here, we investigate galectin-4 using X-ray crystallography, small- and wide-angle X-ray scattering, molecular modelling, molecular dynamics simulations, and differential scanning fluorimetry assays and describe for the first time a structural model for human galectin-4. Our results provide insight into the structural role of the linker-peptide and shed light on the dynamic characteristics of the mechanism of carbohydrate recognition among tandem-repeat galectins.

  15. Isolation and characterization of an isoamylase gene from rye

    Directory of Open Access Journals (Sweden)

    Ke Zheng

    2013-12-01

    Full Text Available Genomic DNA and cDNA sequences of an isoamylase gene were isolated and characterized from the rye genome. The full lengths of the rye isoamylase gene are 7351 bp for genomic DNA and 2364 bp for cDNA. There are 18 exons and 17 introns in the genomic sequence, which shares a similar organization with homologous genes from Aegilops tauschii, maize, rice and Arabidopsis. Exon regions of rye and other plant isoamylase genes are more conserved than the introns. High sequence similarity (> 95%) was observed in mature proteins of isoamylase genes originating from rye, Ae. tauschii, wheat and barley. The transcript profile revealed that rye isoamylase is mainly expressed in the seed endosperm with a maximum level at the middle developmental stage (15 DPA). A phylogenetic tree based on the deduced aa sequences of mature proteins from rye and other plant isoamylases indicated that rye isoamylase is more closely related to Ae. tauschii wDBE1 and wheat iso1. This is the first report on identification and characterization of the isoamylase gene from rye, making it possible to explore the roles of this enzyme for amylopectin development in rye and triticale.

  16. Genomic diversity among Danish field strains of Mycoplasma hyosynoviae assessed by amplified fragment length polymorphism analysis

    DEFF Research Database (Denmark)

    Kokotovic, Branko; Friis, Niels F.; Nielsen, Elisabeth O.

    2002-01-01

    Genomic diversity among strains of Mycoplasma hyosynoviae isolated in Denmark was assessed by using amplified fragment length polymorphism (AFLP) analysis. Ninety-six strains, obtained from different specimens and geographical locations during 30 years and the type strain of M. hyosynoviae S16(T......) were concurrently examined for variance in BglII-MfeI and EcoRI-Csp6I-A AFLP markers. A total of 56 different genomic fingerprints having an overall similarity between 77 and 96% were detected. No correlation between AFLP variability and period of isolation or anatomical site of isolation could...

  17. Whole-genome characterization of Uruguayan strains of avian infectious bronchitis virus reveals extensive recombination between the two major South American lineages.

    Science.gov (United States)

    Marandino, Ana; Tomás, Gonzalo; Panzera, Yanina; Greif, Gonzalo; Parodi-Talice, Adriana; Hernández, Martín; Techera, Claudia; Hernández, Diego; Pérez, Ruben

    2017-10-01

    Infectious bronchitis virus (Gammacoronavirus, Coronaviridae) is a genetically variable RNA virus that causes one of the most persistent respiratory diseases in poultry. The virus is classified in genotypes and lineages with different epidemiological relevance. Two lineages of the GI genotype (11 and 16) have been widely circulating for decades in South America. GI-11 is an exclusive South American lineage while the GI-16 lineage is distributed in Asia, Europe and South America. Here, we obtained the whole genome of two Uruguayan strains of the GI-11 and GI-16 lineages using Illumina high-throughput sequencing. The strains sequenced here are the first obtained in South America for the infectious bronchitis virus and provide new insights into the origin, spreading and evolution of viral variants. The complete genomes of the GI-11 and GI-16 strains have 27,621 and 27,638 nucleotides, respectively, and possess the same genomic organization. Phylogenetic incongruence analysis reveals that both strains have a mosaic genome that arose by recombination between Euro Asiatic strains of the GI-16 lineage and ancestral South American GI-11 viruses. The recombination occurred in South America and produced two viral variants that have retained the full-length S1 sequences of the parental lineages but are extremely similar in the rest of their genomes. These recombinant viruses have been extraordinarily successful, persisting in the continent for several years with a notably wide geographic distribution. Our findings reveal a singular viral dynamics and emphasize the importance of complete genomic characterization to understand the emergence and evolutionary history of viral variants. Copyright © 2017 Elsevier B.V. All rights reserved.

  18. Construction and evaluation of normalized cDNA libraries enriched with full-length sequences for rapid discovery of new genes from Sisal (Agave sisalana Perr.) different developmental stages.

    Science.gov (United States)

    Zhou, Wen-Zhao; Zhang, Yan-Mei; Lu, Jun-Ying; Li, Jun-Feng

    2012-10-12

    To provide a resource of sisal-specific expressed sequence data and facilitate this powerful approach in new gene research, the preparation of normalized cDNA libraries enriched with full-length sequences is necessary. Four libraries were produced with RNA pooled from Agave sisalana multiple tissues to increase efficiency of normalization and maximize the number of independent genes by SMART™ method and the duplex-specific nuclease (DSN). This procedure kept the proportion of full-length cDNAs in the subtracted/normalized libraries and dramatically enhanced the discovery of new genes. Sequencing of 3875 cDNA clones of libraries revealed 3320 unigenes with an average insert length about 1.2 kb, indicating that the non-redundancy of libraries was about 85.7%. These unigene functions were predicted by comparing their sequences to functional domain databases and extensively annotated with Gene Ontology (GO) terms. Comparative analysis of sisal unigenes and other plant genomes revealed that four putative MADS-box genes and knotted-like homeobox (knox) gene were obtained from a total of 1162 full-length transcripts. Furthermore, real-time PCR showed that the characteristics of their transcripts mainly depended on the tight expression regulation of a number of genes during the leaf and flower development. Analysis of individual library sequence data indicated that the pooled-tissue approach was highly effective in discovering new genes and preparing libraries for efficient deep sequencing.
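
    The non-redundancy figure quoted in this abstract follows directly from the clone and unigene counts. The short check below assumes that non-redundancy is defined as unigenes divided by sequenced clones, which is consistent with the reported 85.7% but is not spelled out in the abstract.

```python
# Counts taken from the abstract.
sequenced_clones = 3875
unigenes = 3320

# Assumed definition: fraction of sequenced clones that are distinct unigenes.
non_redundancy_pct = 100 * unigenes / sequenced_clones
redundant_clones = sequenced_clones - unigenes

print(round(non_redundancy_pct, 1), redundant_clones)
```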

  19. Expression of full-length and splice forms of FoxP3 in rheumatoid arthritis

    DEFF Research Database (Denmark)

    Ryder, L R; Woetmann, A; Madsen, H O

    2010-01-01

    OBJECTIVE: The aim of our study was to compare the presence of full-length and alternative splice forms of FoxP3 mRNA in CD4 cells from rheumatoid arthritis (RA) patients and healthy controls. METHODS: A quantitative real-time polymerase chain reaction (QRT-PCR) method was used to measure...... the amount of FoxP3 mRNA full-length and splice forms. CD4-positive T cells were isolated from peripheral blood from 50 RA patients by immunomagnetic separation, and the FoxP3 mRNA expression was compared with the results from 10 healthy controls. RESULTS: We observed an increased expression of full......-length FoxP3 mRNA in RA patients when compared to healthy controls, as well as an increase in CD25 mRNA expression, but no corresponding increase in CTLA-4 mRNA expression. The presence of an alternative splice form of FoxP3 lacking exon 2 was confirmed in both RA patients and healthy controls...

  20. EXPRESSION AND CHARACTERIZATION OF FULL-LENGTH HUMAN HEME OXYGENASE-1: PRESENCE OF INTACT MEMBRANE-BINDING REGION LEADS TO INCREASED BINDING AFFINITY FOR NADPH-CYTOCHROME P450 REDUCTASE

    Science.gov (United States)

    Huber, Warren J.; Backes, Wayne L.

    2009-01-01

    Heme oxygenase (HO) is the chief regulatory enzyme in the oxidative degradation of heme to biliverdin. In the process of heme degradation, this NADPH and cytochrome P450 reductase (CPR)-dependent oxidation of heme also releases free iron and carbon monoxide. Much of the recent research involving heme oxygenase is done using a 30-kDa soluble form of the enzyme, which lacks the membrane binding region (C-terminal 23 amino acids). The goal of this study was to express and purify a full-length human HO-1 (hHO-1) protein; however, due to the lability of the full-length form, a rapid purification procedure was required. This was accomplished by use of a GST-tagged hHO-1 construct. Although the procedure permitted the generation of a full-length HO-1, this form was contaminated with a 30-kDa degradation product that could not be eliminated. Therefore, we attempted to remove a putative secondary thrombin cleavage site by a conservative mutation of amino acid 254, which replaces lysine with arginine. This mutation allowed the expression and purification of a full-length hHO-1 protein. Unlike wild-type HO-1, the K254R mutant could be purified to a single 32-kDa protein capable of degrading heme at the same rate as the wild-type enzyme. The K254R full-length form had a specific activity of ~200–225 nmol bilirubin hr⁻¹ nmol⁻¹ HO-1 as compared to ~140–150 nmol bilirubin hr⁻¹ nmol⁻¹ for the WT form, which contains the 30-kDa contaminant. This is a 2–3-fold increase from the previously reported soluble 30-kDa HO-1, suggesting that the C-terminal 23 amino acids are essential for maximal catalytic activity. Because the membrane spanning domain is present, the full-length hHO-1 has the potential to incorporate into phospholipid membranes, which can be reconstituted at known concentrations, in combination with other ER-resident enzymes. PMID:17915953

  1. Full mitochondrial genome sequences of two endemic Philippine hornbill species (Aves: Bucerotidae) provide evidence for pervasive mitochondrial DNA recombination.

    Science.gov (United States)

    Sammler, Svenja; Bleidorn, Christoph; Tiedemann, Ralph

    2011-01-14

    Although nowadays it is broadly accepted that mitochondrial DNA (mtDNA) may undergo recombination, the frequency of such recombination remains controversial. Its estimation is not straightforward, as recombination under homoplasmy (i.e., among identical mt genomes) is likely to be overlooked. In species with tandem duplications of large mtDNA fragments the detection of recombination can be facilitated, as it can lead to gene conversion among duplicates. Although the mechanisms for concerted evolution in mtDNA are not fully understood yet, recombination rates have been estimated from "one per speciation event" down to 850 years or even "during every replication cycle". Here we present the first complete mt genome of the avian family Bucerotidae, i.e., that of two Philippine hornbills, Aceros waldeni and Penelopides panini. The mt genomes are characterized by a tandemly duplicated region encompassing part of cytochrome b, 3 tRNAs, NADH6, and the control region. The duplicated fragments are identical to each other except for a short section in domain I and for the length of repeat motifs in domain III of the control region. Due to the heteroplasmy with regard to the number of these repeat motifs, there is some size variation in both genomes; with around 21,657 bp (A. waldeni) and 22,737 bp (P. panini), they significantly exceed the hitherto longest known avian mt genomes, that of the albatrosses. We discovered concerted evolution between the duplicated fragments within individuals. The existence of differences between individuals in coding genes as well as in the control region, which are maintained between duplicates, indicates that recombination apparently occurs frequently, i.e., in every generation. The homogenised duplicates are interspersed by a short fragment which shows no sign of recombination. We hypothesize that this region corresponds to the so-called Replication Fork Barrier (RFB), which has been described from the chicken mitochondrial genome.
As this RFB

  2. Characterization of probiotic Escherichia coli isolates with a novel pan-genome microarray

    DEFF Research Database (Denmark)

    Willenbrock, Hanni; Hallin, Peter Fischer; Wassenaar, Trudy

    2007-01-01

    of the same species are rapidly becoming available, allowing for the definition and characterization of a whole species as a population of genomes - the 'pan-genome'. Results: Using 32 Escherichia coli and Shigella genome sequences we estimate the pan- and core genome of the species. We designed a high...

  3. Generation of a reliable full-length cDNA of infectious Tembusu virus using a PCR-based protocol.

    Science.gov (United States)

    Liang, Te; Liu, Xiaoxiao; Cui, Shulin; Qu, Shenghua; Wang, Dan; Liu, Ning; Wang, Fumin; Ning, Kang; Zhang, Bing; Zhang, Dabing

    2016-02-02

    Full-length cDNA of Tembusu virus (TMUV) cloned in a plasmid has been found to be unstable in bacterial hosts. Using a PCR-based protocol, we generated a stable full-length cDNA of TMUV. Different cDNA fragments of TMUV were amplified by reverse transcription (RT)-PCR, and cloned into plasmids. Fragmented cDNAs were amplified and assembled by fusion PCR to produce a full-length cDNA using the recombinant plasmids as templates. Subsequently, a full-length RNA was transcribed from the full-length cDNA in vitro and transfected into BHK-21 cells; infectious viral particles were rescued successfully. Following several passages in BHK-21 cells, the rescued virus was compared with the parental virus by genetic marker checks, growth curve determinations and animal experiments. These assays clearly demonstrated the genetic and biological stabilities of the rescued virus. The present work will be useful for future investigations on the molecular mechanisms involved in replication and pathogenesis of TMUV. Copyright © 2015 Elsevier B.V. All rights reserved.

  4. Characterizing genomic alterations in cancer by complementary functional associations.

    Science.gov (United States)

    Kim, Jong Wook; Botvinnik, Olga B; Abudayyeh, Omar; Birger, Chet; Rosenbluh, Joseph; Shrestha, Yashaswi; Abazeed, Mohamed E; Hammerman, Peter S; DiCara, Daniel; Konieczkowski, David J; Johannessen, Cory M; Liberzon, Arthur; Alizad-Rahvar, Amir Reza; Alexe, Gabriela; Aguirre, Andrew; Ghandi, Mahmoud; Greulich, Heidi; Vazquez, Francisca; Weir, Barbara A; Van Allen, Eliezer M; Tsherniak, Aviad; Shao, Diane D; Zack, Travis I; Noble, Michael; Getz, Gad; Beroukhim, Rameen; Garraway, Levi A; Ardakani, Masoud; Romualdi, Chiara; Sales, Gabriele; Barbie, David A; Boehm, Jesse S; Hahn, William C; Mesirov, Jill P; Tamayo, Pablo

    2016-05-01

    Systematic efforts to sequence the cancer genome have identified large numbers of mutations and copy number alterations in human cancers. However, elucidating the functional consequences of these variants, and their interactions to drive or maintain oncogenic states, remains a challenge in cancer research. We developed REVEALER, a computational method that identifies combinations of mutually exclusive genomic alterations correlated with functional phenotypes, such as the activation or gene dependency of oncogenic pathways or sensitivity to a drug treatment. We used REVEALER to uncover complementary genomic alterations associated with the transcriptional activation of β-catenin and NRF2, MEK-inhibitor sensitivity, and KRAS dependency. REVEALER successfully identified both known and new associations, demonstrating the power of combining functional profiles with extensive characterization of genomic alterations in cancer genomes.

  5. The Statistical Segment Length of DNA: Opportunities for Biomechanical Modeling in Polymer Physics and Next-Generation Genomics.

    Science.gov (United States)

    Dorfman, Kevin D

    2018-02-01

    The development of bright bisintercalating dyes for deoxyribonucleic acid (DNA) in the 1990s, most notably YOYO-1, revolutionized the field of polymer physics in the ensuing years. These dyes, in conjunction with modern molecular biology techniques, permit the facile observation of polymer dynamics via fluorescence microscopy and thus direct tests of different theories of polymer dynamics. At the same time, they have played a key role in advancing an emerging next-generation method known as genome mapping in nanochannels. The effect of intercalation on the bending energy of DNA as embodied by a change in its statistical segment length (or, alternatively, its persistence length) has been the subject of significant controversy. The precise value of the statistical segment length is critical for the proper interpretation of polymer physics experiments and controls the phenomena underlying the aforementioned genomics technology. In this perspective, we briefly review the model of DNA as a wormlike chain and a trio of methods (light scattering, optical or magnetic tweezers, and atomic force microscopy (AFM)) that have been used to determine the statistical segment length of DNA. We then outline the disagreement in the literature over the role of bisintercalation on the bending energy of DNA, and how a multiscale biomechanical approach could provide an important model for this scientifically and technologically relevant problem.
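    The wormlike-chain picture reviewed in this record can be made concrete with two standard polymer-physics relations: the statistical segment (Kuhn) length is twice the persistence length, and the mean-square end-to-end distance of an ideal wormlike chain of contour length L and persistence length l_p is 2 l_p L [1 - (l_p/L)(1 - exp(-L/l_p))]. The sketch below is illustrative only; the 50 nm persistence length and the contour length are commonly quoted textbook figures used here as assumptions, not values taken from the abstract, and the function names are hypothetical.

```python
import math


def kuhn_length(persistence_length: float) -> float:
    """Statistical segment (Kuhn) length of a wormlike chain: b = 2 * l_p."""
    return 2.0 * persistence_length


def wlc_mean_square_end_to_end(contour_length: float, persistence_length: float) -> float:
    """Mean-square end-to-end distance of an ideal wormlike chain.

    <R^2> = 2 * l_p * L * (1 - (l_p / L) * (1 - exp(-L / l_p)))
    Limits: <R^2> -> b * L for L >> l_p (flexible coil), <R^2> -> L^2 for L << l_p (rod).
    """
    L, lp = contour_length, persistence_length
    return 2.0 * lp * L * (1.0 - (lp / L) * (1.0 - math.exp(-L / lp)))


# Assumed illustrative values (not from the abstract), both in nanometres.
lp_nm = 50.0          # textbook persistence length often quoted for bare dsDNA
contour_nm = 16500.0  # roughly the contour length of a ~48.5 kbp DNA at 0.34 nm/bp

b_nm = kuhn_length(lp_nm)
rms_nm = math.sqrt(wlc_mean_square_end_to_end(contour_nm, lp_nm))
print(f"Kuhn length: {b_nm:.0f} nm, rms end-to-end distance: {rms_nm:.0f} nm")
```

    Changing `lp_nm` shows why the precise segment length matters: the predicted coil size scales with its square root, so any dye-induced change in stiffness shifts the interpretation of an experiment.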

  6. Construction experience with Fermilab-built full length 50mm SSC dipoles

    International Nuclear Information System (INIS)

    Blessing, M.J.; Hoffman, D.E.; Packer, M.D.; Gordon, M.; Higinbotham, W.; Sims, R.

    1992-03-01

    Fourteen full-length SSC dipole magnets are being built and tested at Fermilab. Their purpose is to verify the magnet design as well as transfer the construction technology to industry. Magnet design is summarized. Construction problems and their solutions are discussed. Topics include coil winding, curing and measuring, collaring, instrumentation, end clamp installation, yoking and electrical and mechanical interconnection.

  7. 3G vector-primer plasmid for constructing full-length-enriched cDNA libraries.

    Science.gov (United States)

    Zheng, Dong; Zhou, Yanna; Zhang, Zidong; Li, Zaiyu; Liu, Xuedong

    2008-09-01

    We designed a 3G vector-primer plasmid for the generation of full-length-enriched complementary DNA (cDNA) libraries. By employing the terminal transferase activity of reverse transcriptase and the modified strand replacement method, this plasmid (assembled with a polydT end and a deoxyguanosine [dG] end) combines priming full-length cDNA strand synthesis and directional cDNA cloning. As a result, the number of steps involved in cDNA library preparation is decreased while simplifying downstream gene manipulation, sequencing, and subcloning. The 3G vector-primer plasmid method yields fully represented plasmid primed libraries that are equivalent to those made by the SMART (switching mechanism at 5' end of RNA transcript) approach.

  8. Full-length high-temperature severe fuel damage test No. 2

    International Nuclear Information System (INIS)

    Hesson, G.M.; Lombardo, N.J.; Pilger, J.P.; Rausch, W.N.; King, L.L.; Hurley, D.E.; Parchen, L.J.; Panisko, F.E.

    1993-09-01

    Hazardous conditions associated with performing the Full-Length High-Temperature (FLHT) Severe Fuel Damage Test No. 2 experiment have been analyzed. Major hazards that could cause harm or damage are (1) radioactive fission products, (2) radiation fields, (3) reactivity changes, (4) hydrogen generation, (5) materials at high temperature, (6) steam explosion, and (7) steam pressure pulse. As a result of this analysis, it is concluded that with proper precautions the FLHT-2 test can be safely conducted.

  9. A new strategy for full-length Ebola virus glycoprotein expression in E. coli.

    Science.gov (United States)

    Zai, Junjie; Yi, Yinhua; Xia, Han; Zhang, Bo; Yuan, Zhiming

    2016-12-01

    Ebola virus (EBOV) causes severe hemorrhagic fever in humans and non-human primates with high rates of fatality. Glycoprotein (GP) is the only envelope protein of EBOV, which may play a critical role in virus attachment and entry as well as stimulating host protective immune responses. However, the lack of expression of full-length GP in Escherichia coli hinders the further study of its function in viral pathogenesis. In this study, the vp40 gene was fused to the full-length gp gene and cloned into a prokaryotic expression vector. We showed that the VP40-GP and GP-VP40 fusion proteins could be expressed in E. coli at 16 °C. In addition, it was shown that the position of vp40 in the fusion proteins affected the yields of the fusion proteins, with a higher level of production of the fusion protein when vp40 was upstream of gp compared to when it was downstream. The results provide a strategy for the expression of a large quantity of EBOV full-length GP, which is of importance for further analyzing the relationship between the structure and function of GP and developing an antibody for the treatment of EBOV infection.

  10. Sequence Analysis and Characterization of Active Human Alu Subfamilies Based on the 1000 Genomes Pilot Project.

    Science.gov (United States)

    Konkel, Miriam K; Walker, Jerilyn A; Hotard, Ashley B; Ranck, Megan C; Fontenot, Catherine C; Storer, Jessica; Stewart, Chip; Marth, Gabor T; Batzer, Mark A

    2015-08-29

    The goal of the 1000 Genomes Consortium is to characterize human genome structural variation (SV), including forms of copy number variations such as deletions, duplications, and insertions. Mobile element insertions, particularly Alu elements, are major contributors to genomic SV among humans. During the pilot phase of the project we experimentally validated 645 (611 intergenic and 34 exon targeted) polymorphic "young" Alu insertion events, absent from the human reference genome. Here, we report high resolution sequencing of 343 (322 unique) recent Alu insertion events, along with their respective target site duplications, precise genomic breakpoint coordinates, subfamily assignment, percent divergence, and estimated A-rich tail lengths. All the sequenced Alu loci were derived from the AluY lineage with no evidence of retrotransposition activity involving older Alu families (e.g., AluJ and AluS). AluYa5 is currently the most active Alu subfamily in the human lineage, followed by AluYb8, and many others including three newly identified subfamilies we have termed AluYb7a3, AluYb8b1, and AluYa4a1. This report provides the structural details of 322 unique Alu variants from individual human genomes collectively adding about 100 kb of genomic variation. Many Alu subfamilies are currently active in human populations, including a surprising level of AluY retrotransposition. Human Alu subfamilies exhibit continuous evolution with potential drivers sprouting new Alu lineages. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.

  11. Comparative genomics reveals insights into avian genome evolution and adaptation

    DEFF Research Database (Denmark)

    Zhang, Guojie; Li, Cai; Li, Qiye

    2014-01-01

    Birds are the most species-rich class of tetrapod vertebrates and have wide relevance across many research fields. We explored bird macroevolution using full genomes from 48 avian species representing all major extant clades. The avian genome is principally characterized by its constrained size, ...

  12. The characterization of twenty sequenced human genomes.

    Directory of Open Access Journals (Sweden)

    Kimberly Pelak

    2010-09-01

    We present the analysis of twenty human genomes to evaluate the prospects for identifying rare functional variants that contribute to a phenotype of interest. We sequenced at high coverage ten "case" genomes from individuals with severe hemophilia A and ten "control" genomes. We summarize the number of genetic variants emerging from a study of this magnitude, and provide a proof of concept for the identification of rare and highly-penetrant functional variants by confirming that the cause of hemophilia A is easily recognizable in this data set. We also show that the number of novel single nucleotide variants (SNVs) discovered per genome seems to stabilize at about 144,000 new variants per genome, after the first 15 individuals have been sequenced. Finally, we find that, on average, each genome carries 165 homozygous protein-truncating or stop loss variants in genes representing a diverse set of pathways.

  13. Isolation and sequence characterization of DNA-A genome of a new begomovirus strain associated with severe leaf curling symptoms of Jatropha curcas L.

    Science.gov (United States)

    Chauhan, Sushma; Rahman, Hifzur; Mastan, Shaik G; Pamidimarri, D V N Sudheer; Reddy, Muppala P

    2018-07-20

    Begomoviruses, which belong to the family Geminiviridae, are associated with several disease symptoms, such as mosaic and leaf curling, in Jatropha curcas. The molecular characterization of these viral strains will help in developing management strategies to control the disease. In this study, J. curcas plants that were infected with begomovirus and showed acute leaf curling symptoms were identified. The DNA-A segment from the pathogenic viral strain was isolated and sequenced. The sequenced genome was assembled and characterized in detail. The full-length DNA-A sequence was covered by primer walking. The genome sequence showed the general organization of DNA-A from begomovirus by the distribution of ORFs in both viral and anti-viral strands. The genome size ranged from 2844 bp to 2852 bp. Three strains with minor nucleotide variations were identified, and a phylogenetic analysis was performed by comparing the DNA-A segments from other reported begomovirus isolates. The maximum sequence similarity was observed with Euphorbia yellow mosaic virus (FN435995). In the phylogenetic tree, no clustering was observed with previously reported begomovirus strains isolated from J. curcas host. The strains isolated in this study belong to a new begomoviral strain that elicits symptoms of leaf curling in J. curcas. The results indicate that the probable origin of the strains is from Jatropha mosaic virus infecting J. gossypiifolia. The strains isolated in this study are referred to as Jatropha curcas leaf curl India virus (JCLCIV) based on the major symptoms exhibited by host J. curcas. Copyright © 2018 Elsevier B.V. All rights reserved.

  14. Molecular characterization and complete genome sequence of avian paramyxovirus type 4 prototype strain duck/Hong Kong/D3/75

    Directory of Open Access Journals (Sweden)

    Collins Peter L

    2008-10-01

    Background: Avian paramyxoviruses (APMVs) are frequently isolated from domestic and wild birds throughout the world. All APMVs, except avian metapneumovirus, are classified in the genus Avulavirus of the family Paramyxoviridae. At present, the APMVs of genus Avulavirus are divided into nine serological types (APMV 1–9). Newcastle disease virus represents APMV-1 and is the most characterized among all APMV types. Very little is known about the molecular characteristics and pathogenicity of APMV 2–9. Results: As a first step towards understanding the molecular genetics and pathogenicity of APMV-4, we have sequenced the complete genome of APMV-4 strain duck/Hong Kong/D3/75 and determined its pathogenicity in embryonated chicken eggs. The genome of APMV-4 is 15,054 nucleotides (nt) in length, which is consistent with the "rule of six". The genome contains six non-overlapping genes in the order 3'-N-P/V-M-F-HN-L-5'. The genes are flanked on either side by highly conserved transcription start and stop signals and have intergenic sequences varying in length from 9 to 42 nt. The genome contains a 55 nt leader region at 3' end. The 5' trailer region is 17 nt, which is the shortest in the family Paramyxoviridae. Analysis of mRNAs transcribed from the P gene showed that 35% of the transcripts were edited by insertion of one non-templated G residue at an editing site leading to production of V mRNAs. No message was detected that contained insertion of two non-templated G residues, indicating that the W mRNAs are inefficiently produced in APMV-4 infected cells. The cleavage site of the F protein (DIPQR↓F) does not conform to the preferred cleavage site of the ubiquitous intracellular protease furin. However, exogenous proteases were not required for the growth of APMV-4 in cell culture, indicating that the cleavage does not depend on a furin site. Conclusion: Phylogenic analysis of the nucleotide sequences of viruses of all five genera of the family
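    The "rule of six" mentioned in this record refers to the observation that efficiently replicating paramyxovirus genomes have a length that is an exact multiple of six nucleotides. A minimal sketch of that check, applied to the 15,054 nt genome length reported above, is given here; the function name is illustrative.

```python
def obeys_rule_of_six(genome_length_nt: int) -> bool:
    """Return True when the genome length is an exact multiple of six nucleotides."""
    if genome_length_nt <= 0:
        raise ValueError("genome_length_nt must be positive")
    return genome_length_nt % 6 == 0


# Genome length reported in the abstract for APMV-4 strain duck/Hong Kong/D3/75.
apmv4_length_nt = 15054
print(obeys_rule_of_six(apmv4_length_nt), apmv4_length_nt // 6)  # prints "True 2509"
```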

  15. Genomic characterization of DArT markers based on high-density linkage analysis and physical mapping to the Eucalyptus genome.

    Directory of Open Access Journals (Sweden)

    César D Petroli

    Diversity Arrays Technology (DArT) provides a robust, high throughput, cost-effective method to query thousands of sequence polymorphisms in a single assay. Despite the extensive use of this genotyping platform for numerous plant species, little is known regarding the sequence attributes and genome-wide distribution of DArT markers. We investigated the genomic properties of the 7,680 DArT marker probes of a Eucalyptus array, by sequencing them, constructing a high density linkage map and carrying out detailed physical mapping analyses to the Eucalyptus grandis reference genome. A consensus linkage map with 2,274 DArT markers anchored to 210 microsatellites and a framework map, with improved support for ordering, displayed extensive collinearity with the genome sequence. Only 1.4 Mbp of the 75 Mbp of still unplaced scaffold sequence was captured by 45 linkage mapped but physically unaligned markers to the 11 main Eucalyptus pseudochromosomes, providing compelling evidence for the quality and completeness of the current Eucalyptus genome assembly. A highly significant correspondence was found between the locations of DArT markers and predicted gene models, while most of the 89 DArT probes unaligned to the genome correspond to sequences likely absent in E. grandis, consistent with the pan-genomic feature of this multi-Eucalyptus species DArT array. These comprehensive linkage-to-physical mapping analyses provide novel data regarding the genomic attributes of DArT markers in plant genomes in general and for Eucalyptus in particular. DArT markers preferentially target the gene space and display a largely homogeneous distribution across the genome, thereby providing superb coverage for mapping and genome-wide applications in breeding and diversity studies. Data reported on these ubiquitous properties of DArT markers will be particularly valuable to researchers working on less-studied crop species who already count on DArT genotyping arrays but for

  16. High-resolution genomic fingerprinting of Campylobacter jejuni and Campylobacter coli by analysis of amplified fragment length polymorphisms

    DEFF Research Database (Denmark)

    Kokotovic, Branko; On, Stephen L.W.

    1999-01-01

    A method for high-resolution genomic fingerprinting of the enteric pathogens Campylobacter jejuni and Campylobacter coli, based on the determination of amplified fragment length polymorphism, is described. The potential of this method for molecular epidemiological studies of these species is evaluated with 50 type, reference, and well-characterised field strains. Amplified fragment length polymorphism fingerprints comprised over 60 bands detected in the size range 35-500 bp. Groups of outbreak strains, replicate subcultures, and 'genetically identical' strains from humans, poultry and cattle, proved indistinguishable by amplified fragment length polymorphism fingerprinting, but were differentiated from unrelated isolates. Previously unknown relationships between three hippurate-negative C. jejuni strains, and two C. coli var. hyoilei strains, were identified. These relationships corresponded...

  17. Characterizing the cancer genome in lung adenocarcinoma

    Science.gov (United States)

    Weir, Barbara A.; Woo, Michele S.; Getz, Gad; Perner, Sven; Ding, Li; Beroukhim, Rameen; Lin, William M.; Province, Michael A.; Kraja, Aldi; Johnson, Laura A.; Shah, Kinjal; Sato, Mitsuo; Thomas, Roman K.; Barletta, Justine A.; Borecki, Ingrid B.; Broderick, Stephen; Chang, Andrew C.; Chiang, Derek Y.; Chirieac, Lucian R.; Cho, Jeonghee; Fujii, Yoshitaka; Gazdar, Adi F.; Giordano, Thomas; Greulich, Heidi; Hanna, Megan; Johnson, Bruce E.; Kris, Mark G.; Lash, Alex; Lin, Ling; Lindeman, Neal; Mardis, Elaine R.; McPherson, John D.; Minna, John D.; Morgan, Margaret B.; Nadel, Mark; Orringer, Mark B.; Osborne, John R.; Ozenberger, Brad; Ramos, Alex H.; Robinson, James; Roth, Jack A.; Rusch, Valerie; Sasaki, Hidefumi; Shepherd, Frances; Sougnez, Carrie; Spitz, Margaret R.; Tsao, Ming-Sound; Twomey, David; Verhaak, Roel G. W.; Weinstock, George M.; Wheeler, David A.; Winckler, Wendy; Yoshizawa, Akihiko; Yu, Soyoung; Zakowski, Maureen F.; Zhang, Qunyuan; Beer, David G.; Wistuba, Ignacio I.; Watson, Mark A.; Garraway, Levi A.; Ladanyi, Marc; Travis, William D.; Pao, William; Rubin, Mark A.; Gabriel, Stacey B.; Gibbs, Richard A.; Varmus, Harold E.; Wilson, Richard K.; Lander, Eric S.; Meyerson, Matthew

    2008-01-01

    Somatic alterations in cellular DNA underlie almost all human cancers [1]. The prospect of targeted therapies [2] and the development of high-resolution, genome-wide approaches [3–8] are now spurring systematic efforts to characterize cancer genomes. Here we report a large-scale project to characterize copy-number alterations in primary lung adenocarcinomas. By analysis of a large collection of tumors (n = 371) using dense single nucleotide polymorphism arrays, we identify a total of 57 significantly recurrent events. We find that 26 of 39 autosomal chromosome arms show consistent large-scale copy-number gain or loss, of which only a handful have been linked to a specific gene. We also identify 31 recurrent focal events, including 24 amplifications and 7 homozygous deletions. Only six of these focal events are currently associated with known mutations in lung carcinomas. The most common event, amplification of chromosome 14q13.3, is found in ~12% of samples. On the basis of genomic and functional analyses, we identify NKX2-1 (NK2 homeobox 1, also called TITF1), which lies in the minimal 14q13.3 amplification interval and encodes a lineage-specific transcription factor, as a novel candidate proto-oncogene involved in a significant fraction of lung adenocarcinomas. More generally, our results indicate that many of the genes that are involved in lung adenocarcinoma remain to be discovered. PMID:17982442
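    The event counts in this record can be cross-checked with simple arithmetic. The sketch below assumes that the 57 recurrent events are the sum of the 26 arm-level and 31 focal events; that reading is an inference from the reported numbers, not a statement in the abstract. The sample estimate is likewise approximate because the abstract gives the frequency only as "~12%". Variable names are illustrative.

```python
# Counts quoted in the abstract.
arm_level_events = 26       # chromosome arms with consistent gain or loss
focal_amplifications = 24
focal_deletions = 7         # homozygous deletions
tumors = 371

# Focal events reported as 31 in total.
focal_events = focal_amplifications + focal_deletions

# Assumption: the 57 recurrent events are arm-level plus focal events.
total_events = arm_level_events + focal_events

# "~12% of samples" carry the 14q13.3 amplification; this is only a rough estimate.
approx_samples_with_14q13_3 = round(0.12 * tumors)

print(focal_events, total_events, approx_samples_with_14q13_3)  # prints "31 57 45"
```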

  18. Characterizing neutral genomic diversity and selection signatures in indigenous populations of Moroccan goats (Capra hircus) using WGS data

    Directory of Open Access Journals (Sweden)

    Badr Benjelloun

    2015-04-01

    Since the time of their domestication, goats (Capra hircus) have evolved in a large variety of locally adapted populations in response to different human and environmental pressures. In the present era, many indigenous populations are threatened with extinction due to their substitution by cosmopolitan breeds, while they might represent highly valuable genomic resources. It is thus crucial to characterize the neutral and adaptive genetic diversity of indigenous populations. A fine characterization of whole genome variation in farm animals is now possible by using new sequencing technologies. We sequenced the complete genome at 12X coverage of 44 goats geographically representative of the three phenotypically distinct indigenous populations in Morocco. The study of mitochondrial genomes showed a high diversity exclusively restricted to the haplogroup A. The 44 nuclear genomes showed a very high diversity (24 million variants) associated with low linkage disequilibrium. The overall genetic diversity was weakly structured according to geography and phenotypes. When looking for signals of positive selection in each population we identified many candidate genes, several of which gave insights into the metabolic pathways or biological processes involved in the adaptation to local conditions (e.g. panting in warm/desert conditions). This study highlights the interest of WGS data to characterize livestock genomic diversity. It illustrates the valuable genetic richness present in indigenous populations that have to be sustainably managed and may represent valuable genetic resources for the long-term preservation of the species.

  19. Arthropod phylogenetics in light of three novel millipede (Myriapoda: Diplopoda) mitochondrial genomes with comments on the appropriateness of mitochondrial genome sequence data for inferring deep level relationships.

    Directory of Open Access Journals (Sweden)

    Michael S Brewer

    BACKGROUND: Arthropods are the most diverse group of eukaryotic organisms, but their phylogenetic relationships are poorly understood. Herein, we describe three mitochondrial genomes representing orders of millipedes for which complete genomes had not been characterized. Newly sequenced genomes are combined with existing data to characterize the protein coding regions of myriapods and to attempt to reconstruct the evolutionary relationships within the Myriapoda and Arthropoda. RESULTS: The newly sequenced genomes are similar to previously characterized millipede sequences in terms of synteny and length. Unique translocations occurred within the newly sequenced taxa, including one half of the Appalachioria falcifera genome, which is inverted with respect to other millipede genomes. Across myriapods, amino acid conservation levels are highly dependent on the gene region. Additionally, individual loci varied in the level of amino acid conservation. Overall, most gene regions showed low levels of conservation at many sites. Attempts to reconstruct the evolutionary relationships suffered from questionable relationships and low support values. Analyses of phylogenetic informativeness show the lack of signal deep in the trees (i.e., genes evolve too quickly). As a result, the myriapod tree resembles previously published results but lacks convincing support, and, within the arthropod tree, well established groups were recovered as polyphyletic. CONCLUSIONS: The novel genome sequences described herein provide useful genomic information concerning millipede groups that had not been investigated. Taken together with existing sequences, the variety of compositions and evolution of myriapod mitochondrial genomes are shown to be more complex than previously thought. Unfortunately, the use of mitochondrial protein-coding regions in deep arthropod phylogenetics appears problematic, a result consistent with previously published studies. Lack of phylogenetic

  20. Full-length high-temperature severe fuel damage test No. 5

    International Nuclear Information System (INIS)

    Lanning, D.D.; Lombardo, N.J.; Hensley, W.K.; Fitzsimmons, D.E.; Panisko, F.E.; Hartwell, J.K.

    1993-09-01

    This report describes and presents data from a severe fuel damage test that was conducted in the National Research Universal (NRU) reactor at Chalk River Nuclear Laboratories (CRNL), Ontario, Canada. The test, designated FLHT-5, was the fourth in a series of full-length high-temperature (FLHT) tests on light-water reactor fuel. The tests were designed and performed by staff from the US Department of Energy's Pacific Northwest Laboratory (PNL), operated by Battelle Memorial Institute. The test operation and test results are described in this report. The fuel bundle in the FLHT-5 experiment included 10 unirradiated full-length pressurized-water reactor (PWR) rods, 1 irradiated PWR rod and 1 dummy gamma thermometer. The fuel rods were subjected to a very low coolant flow while operating at low fission power. This caused coolant boilaway, rod dryout and overheating to temperatures above 2600 K, severe fuel rod damage, hydrogen generation, and fission product release. The test assembly and its effluent path were extensively instrumented to record temperatures, pressures, flow rates, hydrogen evolution, and fission product release during the boilaway/heatup transient. Post-test gamma scanning of the upper plenum indicated significant iodine and cesium release and deposition. Both stack gas activity and on-line gamma spectrometer data indicated significant (∼50%) release of noble fission gases. Post-test visual examination of one side of the fuel bundle revealed no massive relocation and flow blockage; however, rundown of molten cladding was evident

  1. Comparative genome analysis and characterization of the Salmonella Typhimurium strain CCRJ_26 isolated from swine carcasses using whole-genome sequencing approach.

    Science.gov (United States)

    Panzenhagen, P H N; Cabral, C C; Suffys, P N; Franco, R M; Rodrigues, D P; Conte-Junior, C A

    2018-04-01

    Salmonella pathogenicity relies on virulence factors, many of which are clustered within the Salmonella pathogenicity islands. Salmonella also harbours mobile genetic elements such as virulence plasmids, prophage-like elements and antimicrobial resistance genes, which can increase its pathogenicity. Here, we have genetically characterized a selected S. Typhimurium strain (CCRJ_26) from our previous study, with a multidrug-resistant profile and a high-frequency PFGE clonal profile, which apparently persists in the pork production centre of Rio de Janeiro State, Brazil. By whole-genome sequencing, we described the virulence content of the strain's genome and characterized the repertoire of bacterial plasmids, antibiotic resistance genes and prophage-like elements. We show evidence that the genome of strain CCRJ_26 possibly represents a virulence-associated phenotype that may be virulent in human infection. Whole-genome sequencing technologies are still costly and remain underexplored for applied microbiology in Brazil. Hence, this genomic description of S. Typhimurium strain CCRJ_26 will help future molecular epidemiological studies. The analysis described here reveals a quick and useful pipeline for bacterial virulence characterization using a whole-genome sequencing approach. © 2018 The Society for Applied Microbiology.

  2. Genomic Characterization of the Genus Nairovirus (Family Bunyaviridae)

    Directory of Open Access Journals (Sweden)

    Jens H. Kuhn

    2016-06-01

    Full Text Available Nairovirus, one of five bunyaviral genera, includes seven species. Genomic sequence information is limited for members of the Dera Ghazi Khan, Hughes, Qalyub, Sakhalin, and Thiafora nairovirus species. We used next-generation sequencing and historical virus-culture samples to determine 14 complete and nine coding-complete nairoviral genome sequences to further characterize these species. Previously unsequenced viruses include Abu Mina, Clo Mor, Great Saltee, Hughes, Raza, Sakhalin, Soldado, and Tillamook viruses. In addition, we present genomic sequence information on additional isolates of previously sequenced Avalon, Dugbe, Sapphire II, and Zirqa viruses. Finally, we identify Tunis virus, previously thought to be a phlebovirus, as an isolate of Abu Hammad virus. Phylogenetic analyses indicate the need for reassignment of Sapphire II virus to Dera Ghazi Khan nairovirus and reassignment of Hazara, Tofla, and Nairobi sheep disease viruses to novel species. We also propose new species for the Kasokero group (Kasokero, Leopards Hill, Yogue viruses), the Keterah group (Gossas, Issyk-kul, Keterah/soft tick viruses) and the Burana group (Wēnzhōu tick virus, Huángpí tick virus 1, Tǎchéng tick virus 1). Our analyses emphasize the sister relationship of nairoviruses and arenaviruses, and indicate that several nairo-like viruses (Shāyáng spider virus 1, Xīnzhōu spider virus, Sānxiá water strider virus 1, South Bay virus, Wǔhàn millipede virus 2) require establishment of novel genera in a larger nairovirus-arenavirus supergroup.

  3. Organizational heterogeneity of vertebrate genomes.

    Directory of Open Access Journals (Sweden)

    Svetlana Frenkel

    Full Text Available Genomes of higher eukaryotes are mosaics of segments with various structural, functional, and evolutionary properties. The availability of whole-genome sequences allows the investigation of their structure as "texts" using different statistical and computational methods. One such method, referred to as Compositional Spectra (CS) analysis, is based on scoring the occurrences of fixed-length oligonucleotides (k-mers) in the target DNA sequence. CS analysis allows generating species- or region-specific characteristics of the genome, regardless of their length and the presence of coding DNA. In this study, we consider the heterogeneity of vertebrate genomes as a joint effect of regional variation in sequence organization superimposed on the differences in nucleotide composition. We estimated compositional and organizational heterogeneity of genome and chromosome sequences separately and found that both heterogeneity types vary widely among genomes as well as among chromosomes in all investigated taxonomic groups. The high correspondence of heterogeneity scores obtained on three genome fractions, coding, repetitive, and the remaining part of the noncoding DNA (the genome dark matter, GDM), allows the assumption that CS-heterogeneity may have functional relevance to genome regulation. Of special interest for such interpretation is the fact that natural GDM sequences display the highest deviation from the corresponding reshuffled sequences.
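    The k-mer scoring that CS analysis builds on can be illustrated with a minimal sketch. The abstract does not give the scoring details (word length, tolerated mismatches, distance measure), so the function names, the exact-match counting and the absolute-difference distance below are assumptions chosen for illustration, not the authors' method.

```python
from collections import Counter
from itertools import product


def kmer_spectrum(seq, k):
    """Relative frequency of every DNA k-mer in seq, using overlapping windows.

    Hypothetical helper: exact matching only; windows containing symbols
    other than A, C, G, T (for example N) are skipped.
    """
    seq = seq.upper()
    alphabet = set("ACGT")
    counts = Counter()
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if set(window) <= alphabet:
            counts[window] += 1
    total = sum(counts.values())
    all_kmers = ("".join(p) for p in product("ACGT", repeat=k))
    return {kmer: (counts[kmer] / total if total else 0.0) for kmer in all_kmers}


def spectrum_distance(a, b):
    """Sum of absolute frequency differences between two spectra (an assumed metric)."""
    return sum(abs(a[kmer] - b[kmer]) for kmer in a)


# Compare two short toy sequences by their dinucleotide spectra.
left = kmer_spectrum("ACGTACGTACGT", 2)
right = kmer_spectrum("AAAATTTTAAAA", 2)
print(spectrum_distance(left, right))
```

    Comparing spectra of segments from one genome gives a rough view of within-genome heterogeneity; comparing a natural sequence with a shuffled copy corresponds to the reshuffling contrast mentioned in the abstract.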

  4. The Complete Sequence of a Human Parainfluenzavirus 4 Genome

    Science.gov (United States)

    Yea, Carmen; Cheung, Rose; Collins, Carol; Adachi, Dena; Nishikawa, John; Tellier, Raymond

    2009-01-01

    Although the human parainfluenza virus 4 (HPIV4) has been known for a long time, its genome, alone among the human paramyxoviruses, has not been completely sequenced to date. In this study we obtained the first complete genomic sequence of HPIV4 from a clinical isolate named SKPIV4 obtained at the Hospital for Sick Children in Toronto (Ontario, Canada). The coding regions for the N, P/V, M, F and HN proteins show very high identities (95% to 97%) with previously available partial sequences for HPIV4B. The sequence for the L protein and the non-coding regions represent new information. A surprising feature of the genome is its length, more than 17 kb, making it the longest genome within the genus Rubulavirus, although the length is well within the known range of 15 kb to 19 kb for the subfamily Paramyxovirinae. The availability of a complete genomic sequence will facilitate investigations on a respiratory virus that is still not completely characterized. PMID:21994536

  5. Expression and characterization of full-length human heme oxygenase-1: the presence of intact membrane-binding region leads to increased binding affinity for NADPH cytochrome P450 reductase.

    Science.gov (United States)

    Huber, Warren J; Backes, Wayne L

    2007-10-30

    Heme oxygenase-1 (HO-1) is the chief regulatory enzyme in the oxidative degradation of heme to biliverdin. In the process of heme degradation, HO-1 receives the electrons necessary for catalysis from the flavoprotein NADPH cytochrome P450 reductase (CPR), releasing free iron and carbon monoxide. Much of the recent research involving heme oxygenase has been done using a 30 kDa soluble form of the enzyme, which lacks the membrane binding region (C-terminal 23 amino acids). The goal of this study was to express and purify a full-length human HO-1 (hHO-1) protein; however, due to the lability of the full-length form, a rapid purification procedure was required. This was accomplished by use of a glutathione S-transferase (GST)-tagged hHO-1 construct. Although the procedure permitted the generation of a full-length HO-1, this form was contaminated with a 30 kDa degradation product that could not be eliminated. Therefore, attempts were made to remove a putative secondary thrombin cleavage site by a conservative mutation of amino acid 254, which replaces arginine with lysine. This mutation allowed the expression and purification of a full-length hHO-1 protein. Unlike wild type (WT) HO-1, the R254K mutant could be purified to a single 32 kDa protein capable of degrading heme at the same rate as the WT enzyme. The R254K full-length form had a specific activity of approximately 200-225 nmol of bilirubin h⁻¹ nmol⁻¹ HO-1 as compared to approximately 140-150 nmol of bilirubin h⁻¹ nmol⁻¹ for the WT form, which contains the 30 kDa contaminant. This is a 2-3-fold increase from the previously reported soluble 30 kDa HO-1, suggesting that the C-terminal 23 amino acids are essential for maximal catalytic activity. Because the membrane-spanning domain is present, the full-length hHO-1 has the potential to incorporate into phospholipid membranes, which can be reconstituted at known concentrations, in combination with other endoplasmic reticulum resident enzymes.

  6. Preliminary Characterization of Mitochondrial Genome of Melipona scutellaris, a Brazilian Stingless Bee

    Directory of Open Access Journals (Sweden)

    Manuella Souza Silverio

    2014-01-01

    Full Text Available Bees are manufacturers of economically relevant products and play a pollinator role fundamental to ecosystems. Traditionally, studies of the genus Melipona have been based mostly on behavioral, social-organization and ecological aspects. Only recently has the evolutionary history of this genus been assessed using molecular markers, including mitochondrial genes. Even though these studies have shed light on the evolutionary history of the Melipona genus, a more accurate picture may emerge when full nuclear and mitochondrial genomes of Melipona species become available. Here we present the assembly, annotation, and characterization of a draft mitochondrial genome of the Brazilian stingless bee Melipona scutellaris using Melipona bicolor as a reference organism. Using Illumina MiSeq data, we achieved the annotation of all protein coding genes, as well as the genes for the two ribosomal subunits (16S and 12S) and the transfer RNA genes. Using the COI sequence as a DNA barcode, we found that M. cramptoni is the closest species to M. scutellaris.

  7. Genome-based microbial ecology of anammox granules in a full-scale wastewater treatment system

    NARCIS (Netherlands)

    Speth, D.R.; Zandt, M.H. in 't; Guerrero Cruz, S.; Dutilh, B.E.; Jetten, M.S.M.

    2016-01-01

    Partial-nitritation anammox (PNA) is a novel wastewater treatment procedure for energy-efficient ammonium removal. Here we use genome-resolved metagenomics to build a genome-based ecological model of the microbial community in a full-scale PNA reactor. Sludge from the bioreactor examined here is

  8. Characterizing immunoglobulin repertoire from whole blood by a personal genome sequencer.

    Directory of Open Access Journals (Sweden)

    Fan Gao

    Full Text Available In the human immune system, V(D)J recombination produces an enormously large repertoire of immunoglobulins (Ig) so that they can tackle different antigens from bacteria, viruses and tumor cells. Several studies have demonstrated the utility of next-generation sequencers such as Roche 454 and Illumina Genome Analyzer to characterize the repertoire of immunoglobulins. However, these techniques typically require separation of the B cell population from whole blood and require a few weeks for running the sequencers, so it may not be practical to implement them in clinical settings. Recently, the Ion Torrent personal genome sequencer has emerged as a tabletop sequencer that can be operated in a time-efficient and cost-effective manner. In this study, we explored the technical feasibility of using multiplex PCR to amplify V(D)J recombination for IgH directly from whole blood, then sequencing the amplicons on the Ion Torrent sequencer. The whole process, including data generation and analysis, can be completed in one day. We tested the method in a pilot study on patients with benign, atypical and malignant meningiomas. Despite the noisy data, we were able to compare the samples by their usage frequencies of the V segment, as well as their somatic hypermutation rates. In summary, our study suggested that it is technically feasible to perform clinical monitoring of V(D)J recombination within a day by personal genome sequencers.

  9. Characterization of noncoding regulatory DNA in the human genome.

    Science.gov (United States)

    Elkon, Ran; Agami, Reuven

    2017-08-08

    Genetic variants associated with common diseases are usually located in noncoding parts of the human genome. Delineation of the full repertoire of functional noncoding elements, together with efficient methods for probing their biological roles, is therefore of crucial importance. Over the past decade, DNA accessibility and various epigenetic modifications have been associated with regulatory functions. Mapping these features across the genome has enabled researchers to begin to document the full complement of putative regulatory elements. High-throughput reporter assays to probe the functions of regulatory regions have also been developed but these methods separate putative regulatory elements from the chromosome so that any effects of chromatin context and long-range regulatory interactions are lost. Definitive assignment of function(s) to putative cis-regulatory elements requires perturbation of these elements. Genome-editing technologies are now transforming our ability to perturb regulatory elements across entire genomes. Interpretation of high-throughput genetic screens that incorporate genome editors might enable the construction of an unbiased map of functional noncoding elements in the human genome.

  10. Pulp regeneration in a full-length human tooth root using a hierarchical nanofibrous microsphere system.

    Science.gov (United States)

    Li, Xiangwei; Ma, Chi; Xie, Xiaohua; Sun, Hongchen; Liu, Xiaohua

    2016-04-15

    While pulp regeneration using tissue engineering strategy has been explored for over a decade, successful regeneration of pulp tissues in a full-length human root with a one-end seal that truly simulates clinical endodontic treatment has not been achieved. To address this challenge, we designed and synthesized a unique hierarchical growth factor-loaded nanofibrous microsphere scaffolding system. In this system, vascular endothelial growth factor (VEGF) binds with heparin and is encapsulated in heparin-conjugated gelatin nanospheres, which are further immobilized in the nanofibers of an injectable poly(l-lactic acid) (PLLA) microsphere. This hierarchical microsphere system not only protects the VEGF from denaturation and degradation, but also provides excellent control of its sustained release. In addition, the nanofibrous PLLA microsphere integrates the extracellular matrix-mimicking architecture with a highly porous injectable form, efficiently accommodating dental pulp stem cells (DPSCs) and supporting their proliferation and pulp tissue formation. Our in vivo study showed the successful regeneration of pulp-like tissues that fulfilled the entire apical and middle thirds and reached the coronal third of the full-length root canal. In addition, a large number of blood vessels were regenerated throughout the canal. For the first time, our work demonstrates the success of pulp tissue regeneration in a full-length root canal, making it a significant step toward regenerative endodontics. The regeneration of pulp tissues in a full-length tooth root canal has been one of the greatest challenges in the field of regenerative endodontics, and one of the biggest barriers for its clinical application. In this study, we developed a unique approach to tackle this challenge, and for the first time, we successfully regenerated living pulp tissues in a full-length root canal, making it a significant step toward regenerative endodontics. This study will make positive scientific

  11. The Pinus taeda genome is characterized by diverse and highly diverged repetitive sequences

    Directory of Open Access Journals (Sweden)

    Yandell Mark

    2010-07-01

    Full Text Available Background: In today's age of genomic discovery, no attempt has been made to comprehensively sequence a gymnosperm genome. The largest genus in the coniferous family Pinaceae is Pinus, whose 110-120 species have extremely large genomes (c. 20-40 Gb, 2N = 24). The size and complexity of these genomes have prompted much speculation as to the feasibility of completing a conifer genome sequence. Conifer genomes are reputed to be highly repetitive, but there is little information available on the nature and identity of repetitive units in gymnosperms. The pines have extensive genetic resources, with approximately 329,000 ESTs from eleven species and genetic maps in eight species, including a dense genetic map of the twelve linkage groups in Pinus taeda. Results: We present here the Sanger sequence and annotation of ten P. taeda BAC clones and Genome Analyzer II whole genome shotgun (WGS) sequences representing 7.5% of the genome. Computational annotation of ten BACs predicts three putative protein-coding genes and at least fifteen likely pseudogenes in nearly one megabase of sequence. We found three conifer-specific LTR retroelements in the BACs, and tentatively identified at least 15 others based on evidence from the distantly related angiosperms. Alignment of WGS sequences to the BACs indicates that 80% of BAC sequences have similar copies (≥ 75% nucleotide identity) elsewhere in the genome, but only 23% have identical copies (99% identity). The three most common repetitive elements in the genome were identified and, when combined, represent less than 5% of the genome. Conclusions: This study indicates that the majority of repeats in the P. taeda genome are 'novel' and will therefore require additional BAC or genomic sequencing for accurate characterization. The pine genome contains a very large number of diverged and probably defunct repetitive elements. This study also provides new evidence that sequencing a pine genome using a WGS approach is

  12. Molecular cloning and characterization of the full-length cDNA encoding the tree shrew (Tupaia belangeri) CD28

    Science.gov (United States)

    Huang, Xiaoyan; Yan, Yan; Wang, Sha; Wang, Qinying; Shi, Jian; Shao, Zhanshe; Dai, Jiejie

    2017-11-01

    CD28 is one of the most important co-stimulatory molecules expressed by naive and primed T cells. The tree shrew (Tupaia belangeri), an animal model for studying mechanisms of human disease that is receiving extensive attention, requires essential research tools, in particular cellular markers and monoclonal antibodies for immunological studies. However, little has been known about tree shrew CD28 (tsCD28) until now. In this study, a 663 bp full-length CD28 cDNA, encoding a polypeptide of 220 amino acids, was cloned from tree shrew spleen lymphocytes. The nucleotide sequence of tsCD28 showed 85%, 76%, and 75% similarity with human, rat, and mouse, respectively, indicating that the relationship between tree shrew and human is much closer than that between human and rodents. The open reading frame (ORF) of the tsCD28 gene was predicted to comprise regions corresponding to the signal sequence, immunoglobulin variable-like (IgV) domain, transmembrane domain and cytoplasmic tail. We also compared its molecular characteristics with those of other mammals using software such as Clustal W 2.0. Our results showed that tsCD28 contains many features conserved in CD28 genes from other mammals, including a conserved signal peptide and glycosylation sites and several residues responsible for binding to the CD28R, and that the tsCD28 amino acid sequence is closely related to those of human and monkey. The crystal structure and surface charge revealed that the surface charges of most regions of the tree shrew CD28 molecule are similar to those of human CD28. However, in some areas the surface positive charge of tsCD28 was lower than that of human CD28 (hCD28), which may affect antibody binding. The present study is the first report of cloning and characterization of CD28 in tree shrew. This study provides a theoretical basis for the further study the structure and function of tree shrew CD28 and utilize tree shrew as an effective

  13. Quantitative measure of randomness and order for complete genomes

    Science.gov (United States)

    Kong, Sing-Guan; Fan, Wen-Lang; Chen, Hong-Da; Wigger, Jan; Torda, Andrew E.; Lee, H. C.

    2009-06-01

    We propose an order index, ϕ, which gives a quantitative measure of randomness and order of complete genomic sequences. It maps genomes to a number from 0 (random and of infinite length) to 1 (fully ordered) and applies regardless of sequence length. The 786 complete genomic sequences in GenBank were found to have ϕ values in a very narrow range, ϕ_g = 0.031 (+0.028/−0.015). We show this implies that genomes are halfway toward being completely random, or, at the “edge of chaos.” We further show that artificial “genomes” converted from literary classics have ϕ’s that almost exactly coincide with ϕ_g, but sequences of low information content do not. We infer that ϕ_g represents a high information-capacity “fixed point” in sequence space, and that genomes are driven to it by the dynamics of a robust growth and evolution process. We show that a growth process characterized by random segmental duplication can robustly drive genomes to the fixed point.

  14. Generation and analysis of large-scale expressed sequence tags (ESTs) from a full-length enriched cDNA library of porcine backfat tissue

    Directory of Open Access Journals (Sweden)

    Lee Hae-Young

    2006-02-01

    Full Text Available Background: Genome research in farm animals will expand our basic knowledge of the genetic control of complex traits, and the results will be applied in the livestock industry to improve meat quality and productivity, as well as to reduce the incidence of disease. A combination of quantitative trait locus mapping and microarray analysis is a useful approach to reduce the overall effort needed to identify genes associated with quantitative traits of interest. Results: We constructed a full-length enriched cDNA library from porcine backfat tissue. The estimated average size of the cDNA inserts was 1.7 kb, and the cDNA fullness ratio was 70%. In total, we deposited 16,110 high-quality sequences in the dbEST division of GenBank (accession numbers: DT319652-DT335761). For all the expressed sequence tags (ESTs), approximately 10.9 Mb of porcine sequence were generated with an average length of 674 bp per EST (range: 200–952 bp). Clustering and assembly of these ESTs resulted in a total of 5,008 unique sequences with 1,776 contigs (35.46%) and 3,232 singleton (64.54%) ESTs. From a total of 5,008 unique sequences, 3,154 (62.98%) were similar to other sequences, and 1,854 (37.02%) were identified as having no hit or low identity (Sus scrofa). Gene ontology (GO) annotation of unique sequences showed that approximately 31.7, 32.3, and 30.8% were assigned molecular function, biological process, and cellular component GO terms, respectively. A total of 1,854 putative novel transcripts resulted after comparison and filtering with the TIGR SsGI; these included a large percentage of singletons (80.64%) and a small proportion of contigs (13.36%). Conclusion: The sequence data generated in this study will provide valuable information for studying expression profiles using EST-based microarrays and assist in the condensation of current pig TCs into clusters representing longer stretches of cDNA sequences. The isolation of genes expressed in backfat tissue is the
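    The clustering summary in this record can be checked with simple arithmetic. The sketch below recomputes the shares from the counts given in the abstract; the helper name `share` and the assumption that the accession range is contiguous are illustrative choices, not part of the study.

```python
def share(part, total):
    """Percentage of total represented by part, rounded to two decimals."""
    return round(100.0 * part / total, 2)


# Counts taken from the abstract.
unique_sequences = 5008
contigs = 1776
singletons = 3232
with_similarity = 3154
no_hit_or_low_identity = 1854

# The two clustering classes partition the unique sequences.
assert contigs + singletons == unique_sequences
assert with_similarity + no_hit_or_low_identity == unique_sequences

print(share(contigs, unique_sequences))                 # 35.46
print(share(singletons, unique_sequences))              # 64.54
print(share(with_similarity, unique_sequences))         # 62.98
print(share(no_hit_or_low_identity, unique_sequences))  # 37.02

# Number of records in the accession range DT319652-DT335761,
# assuming the range is contiguous and inclusive.
deposited = 335761 - 319652 + 1
print(deposited)  # 16110
```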

  15. Evaluation of full-length, cleaved and nitrosylated serum surfactant protein D as biomarkers for COPD

    DEFF Research Database (Denmark)

    Duvoix, Annelyse; Miranda, Elena; Perez, Juan

    2011-01-01

    . Serum levels of SP-D are raised in individuals with COPD but there is no correlation between the serum level of SP-D and the severity of airflow obstruction. Serum SP-D is present in different forms that may have more utility as a biomarker for COPD. We report here the development of new monoclonal...... antibodies to full length and cleaved SP-D. We have assessed these and existing antibodies in 98 individuals with COPD recruited to the Evaluation of COPD Longitudinally to Identify Predictive Surrogate Endpoints (ECLIPSE) cohort. Our data show that neither monoclonal antibodies to full length nor cleaved SP...

  16. Northern Spotted Owl (Strix occidentalis caurina) Genome: Divergence with the Barred Owl (Strix varia) and Characterization of Light-Associated Genes.

    Science.gov (United States)

    Hanna, Zachary R; Henderson, James B; Wall, Jeffrey D; Emerling, Christopher A; Fuchs, Jérôme; Runckel, Charles; Mindell, David P; Bowie, Rauri C K; DeRisi, Joseph L; Dumbacher, John P

    2017-10-01

    We report here the assembly of a northern spotted owl (Strix occidentalis caurina) genome. We generated Illumina paired-end sequence data at 90× coverage using nine libraries with insert lengths ranging from ∼250 to 9,600 nt and read lengths from 100 to 375 nt. The genome assembly is comprised of 8,108 scaffolds totaling 1.26 × 10^9 nt in length with an N50 length of 3.98 × 10^6 nt. We calculated the genome-wide fixation index (FST) of S. o. caurina with the closely related barred owl (Strix varia) as 0.819. We examined 19 genes that encode proteins with light-dependent functions in our genome assembly as well as in that of the barn owl (Tyto alba). We present genomic evidence for loss of three of these in S. o. caurina and four in T. alba. We suggest that most light-associated gene functions have been maintained in owls and their loss has not proceeded to the same extent as in other dim-light-adapted vertebrates. © The Author 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
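    The N50 statistic quoted for the assembly follows the usual definition: the length L such that scaffolds of length at least L together cover at least half of the total assembly. A minimal sketch of that definition follows; the function name and the toy scaffold lengths are hypothetical and are not data from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Sort the lengths from longest to shortest and accumulate them; the
    length at which the running total first reaches half of the overall
    total is the N50.
    """
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("no lengths given")


# Toy example: total length is 32, so half is 16, reached after the two longest scaffolds.
print(n50([2, 2, 2, 3, 3, 4, 8, 8]))  # 8
```

    Because N50 depends only on the length distribution, it describes contiguity and says nothing about the correctness of the assembly.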

  17. Nine Loci for Ocular Axial Length Identified through Genome-wide Association Studies, Including Shared Loci with Refractive Error

    NARCIS (Netherlands)

    Cheng, Ching-Yu; Schache, Maria; Ikram, M. Kamran; Young, Terri L.; Guggenheim, Jeremy A.; Vitart, Veronique; Macgregor, Stuart; Verhoeven, Virginie J. M.; Barathi, Veluchamy A.; Liao, Jiemin; Hysi, Pirro G.; Bailey-Wilson, Joan E.; St Pourcain, Beate; Kemp, John P.; McMahon, George; Timpson, Nicholas J.; Evans, David M.; Montgomery, Grant W.; Mishra, Aniket; Wang, Ya Xing; Wang, Jie Jin; Rochtchina, Elena; Polasek, Ozren; Wright, Alan F.; Amin, Najaf; van Leeuwen, Elisabeth M.; Wilson, James F.; Pennell, Craig E.; van Duijn, Cornelia M.; de Jong, Paulus T. V. M.; Vingerling, Johannes R.; Zhou, Xin; Chen, Peng; Li, Ruoying; Tay, Wan-Ting; Zheng, Yingfeng; Chew, Merwyn; Burdon, Kathryn P.; Craig, Jamie E.; Iyengar, Sudha K.; Igo, Robert P.; Lass, Jonathan H.; Chew, Emily Y.; Haller, Toomas; Mihailov, Evelin; Metspalu, Andres; Wedenoja, Juho; Simpson, Claire L.; Wojciechowski, Robert; Chen, Wei

    2013-01-01

    Refractive errors are common eye disorders of public health importance worldwide. Ocular axial length (AL) is the major determinant of refraction and thus of myopia and hyperopia. We conducted a meta-analysis of genome-wide association studies for AL, combining 12,531 Europeans and 8,216 Asians. We

  18. Insertion of Introns: A Strategy to Facilitate Assembly of Infectious Full Length Clones

    DEFF Research Database (Denmark)

    Johansen, Ida Elisabeth; Lund, Ole Søgaard

    2008-01-01

    Some DNA fragments are difficult to clone in Escherichia coli by standard methods. It has been speculated that unintended transcription and translation result in expression of proteins that are toxic to the bacteria. This problem is frequently observed during assembly of infectious full-length vi...

  19. Rare HIV-1 Subtype J Genomes and a New H/U/CRF02_AG Recombinant Genome Suggests an Ancient Origin of HIV-1 in Angola.

    Science.gov (United States)

    Bártolo, Inês; Calado, Rita; Borrego, Pedro; Leitner, Thomas; Taveira, Nuno

    2016-08-01

    Angola has an extremely diverse HIV-1 epidemic fueled in part by the frequent interchange of people with the Democratic Republic of Congo (DRC) and Republic of Congo (RC). Characterization of HIV-1 strains circulating in Angola should help to better understand the origin of HIV-1 subtypes and recombinant forms and their transmission dynamics. In this study we characterize the first near full-length HIV-1 genomic sequences from HIV-1 infected individuals from Angola. Samples were obtained in 1993 from three HIV-1 infected patients living in Cabinda, Angola. Near full-length genomic sequences were obtained from virus isolates. Maximum likelihood phylogenetic tree inference and analyses of potential recombination patterns were performed to evaluate the sequence classifications and origins. Phylogenetic and recombination analyses revealed that one virus was a pure subtype J, another mostly subtype J with a small uncertain region, and the final virus was classified as a H/U/CRF02_AG recombinant. Consistent with their epidemiological data, the subtype J sequences were more closely related to each other than to other J sequences previously published. Based on the env gene, taxa from Angola occur throughout the global subtype J phylogeny. HIV-1 subtypes J and H are present in Angola at low levels since at least 1993. Low transmission efficiency and/or high recombination potential may explain their limited epidemic success in Angola and worldwide. The high diversity of rare subtypes in Angola suggests that Angola was part of the early establishment of the HIV-1 pandemic.

  20. Conformational change in full-length mouse prion: A site-directed spin-labeling study

    International Nuclear Information System (INIS)

    Inanami, Osamu; Hashida, Shukichi; Iizuka, Daisuke; Horiuchi, Motohiro; Hiraoka, Wakako; Shimoyama, Yuhei; Nakamura, Hideo; Inagaki, Fuyuhiko; Kuwabara, Mikinori

    2005-01-01

    The structure of the mouse prion (moPrP) was studied using site-directed spin-labeling electron spin resonance (SDSL-ESR). Since a previous NMR study by Hornemann et al. [Hornemann, Korth, Oesch, Riek, Wider, Wuethrich, Glockshuber, Recombinant full-length murine prion protein, mPrP (23-231): purification and spectroscopic characterization, FEBS Lett. 413 (1997) 277-281] has indicated that N96, D143, and T189 in moPrP are localized in a Cu2+ binding region, Helix1 and Helix2, respectively, three recombinant moPrP mutants (N96C, D143C, and T189C) were expressed in an Escherichia coli system, then refolded by dialysis under low pH and purified by reverse-phase HPLC. Using this preparation, we succeeded in preserving a target cysteine residue without altering the α-helix structure of moPrP and were able to apply SDSL-ESR with a methanethiosulfonate spin label to the full-length prion protein. Rotational correlation times (τ) of 1.1, 3.3, and 4.8 ns were evaluated from the X-band ESR spectra at pH 7.4 and 20 °C for N96R1, D143R1, and T189R1, respectively. These τ values indicate that the Cu2+ binding region is more flexible than Helix1 or Helix2. ESR spectra recorded at various temperatures revealed two phases with a transition point at around 20 °C in D143R1 and T189R1, but not in N96R1. As the pH was varied from 4.0 to 7.8, ESR spectra of T189R1 at 20 °C showed a gradual increase of τ from 2.9 to 4.8 ns. On the other hand, the pH-dependent conformational changes in N96R1 and D143R1 were negligible. These results indicated that T189 located in Helix2 possessed a structure sensitive to physiological pH changes; simultaneously, N96 in the Cu2+ binding region and D143 in Helix1 were conserved

  1. Impaired heat shock response in cells expressing full-length polyglutamine-expanded huntingtin.

    Directory of Open Access Journals (Sweden)

    Sidhartha M Chafekar

    Full Text Available The molecular mechanisms by which polyglutamine (polyQ)-expanded huntingtin (Htt) causes neurodegeneration in Huntington's disease (HD) remain unclear. The malfunction of cellular proteostasis has been suggested as central in HD pathogenesis and also as a target of therapeutic interventions for the treatment of HD. We present results that offer a previously unexplored perspective regarding impaired proteostasis in HD. We find that, under non-stress conditions, the proteostatic capacity of cells expressing full-length polyQ-expanded Htt is adequate. Yet, under stress conditions, the presence of polyQ-expanded Htt impairs the heat shock response, a key component of cellular proteostasis. This impaired heat shock response results in a reduced capacity to withstand the damage caused by cellular stress. We demonstrate that in cells expressing polyQ-expanded Htt the levels of heat shock transcription factor 1 (HSF1) are reduced, and, as a consequence, these cells have an impaired heat shock response. Also, we found reduced HSF1 and HSP70 levels in the striata of HD knock-in mice when compared to wild-type mice. Our results suggest that full-length, non-aggregated polyQ-expanded Htt blocks the effective induction of the heat shock response under stress conditions and may thus trigger the accumulation of cellular damage during the course of HD pathogenesis.

  2. [Sequencing and analysis of complete genome of rabies viruses isolated from Chinese Ferret-Badger and dog in Zhejiang province].

    Science.gov (United States)

    Lei, Yong-Liang; Wang, Xiao-Guang; Tao, Xiao-Yan; Li, Hao; Meng, Sheng-Li; Chen, Xiu-Ying; Liu, Fu-Ming; Ye, Bi-Feng; Tang, Qing

    2010-01-01

    Based on sequencing the full-length genomes of four rabies viruses isolated from Chinese Ferret-Badger and dog, we analyzed the genetic variation of rabies viruses at the molecular level, obtained information on the prevalence and variation of rabies viruses in Zhejiang, and enriched the genome database of rabies virus street strains isolated in China. Rabies viruses were isolated in suckling mice, overlapping fragments were amplified by RT-PCR, and full-length genomes were assembled; nucleotide and deduced protein similarities were analyzed, and phylogenetic relationships were determined among strains from Chinese Ferret-Badger, dog, sika deer, and vole and the vaccine strains in use. The four full-length genomes were sequenced completely and had the same genetic structure, with a length of 11,923 nts or 11,925 nts including 58 nts-Leader, 1353 nts-NP, 894 nts-PP, 609 nts-MP, 1575 nts-GP, 6386 nts-LP, and 2, 5, 5 nts-intergenic regions (IGRs), 423 nts-Pseudogene-like sequence (psi), 70 nts-Trailer. By BLAST and multi-sequence alignment, the four full-length genomes were consistent with the properties of Rhabdoviridae Lyssavirus. The nucleotide and amino acid sequences among Chinese strains had the highest similarity, especially among animals of the same species. For the four full-length genomes, the similarity at the amino acid level was dramatically higher than that at the nucleotide level, so most of the nucleotide mutations in these four genomes were synonymous. Compared with the reference rabies viruses, the five protein-coding regions showed no change in length and no recombination, only a few point mutations. It was evident that the five proteins appeared to be stable. The variation sites and types of the four genomes were similar to those of the reference vaccine or street strains. The four strains were genotype 1 according to the multi-sequence and phylogenetic analyses and possessed distinct regional characteristics of China. Therefore, these four rabies viruses are likely to be street viruses

  3. Production of enzymatically active recombinant full-length barley high pI alpha-glucosidase of glycoside family 31 by high cell-density fermentation of Pichia pastoris and affinity purification

    DEFF Research Database (Denmark)

    Næsted, Henrik; Kramhøft, Birte; Lok, F.

    2006-01-01

    Recombinant barley high pI alpha-glucosidase was produced by high cell-density fermentation of Pichia pastoris expressing the cloned full-length gene. The gene was amplified from a genomic clone and exons (coding regions) were assembled by overlap PCR. The resulting cDNA was expressed under contr...... nM x s(-1), and 85 s(-1) using maltose as substrate. This work presents the first production of fully active recombinant alpha-glucosidase of glycoside hydrolase family 31 from higher plants. (c) 2005 Elsevier Inc. All rights reserved....

  4. Live visualization of genomic loci with BiFC-TALE.

    Science.gov (United States)

    Hu, Huan; Zhang, Hongmin; Wang, Sheng; Ding, Miao; An, Hui; Hou, Yingping; Yang, Xiaojing; Wei, Wensheng; Sun, Yujie; Tang, Chao

    2017-01-11

    Tracking the dynamics of genomic loci is important for understanding the mechanisms of fundamental intracellular processes. However, fluorescent labeling and imaging of such loci in live cells have been challenging. One of the major reasons is the low signal-to-background ratio (SBR) of images mainly caused by the background fluorescence from diffuse full-length fluorescent proteins (FPs) in the living nucleus, hampering the application of live cell genomic labeling methods. Here, combining bimolecular fluorescence complementation (BiFC) and transcription activator-like effector (TALE) technologies, we developed a novel method for labeling genomic loci (BiFC-TALE), which largely reduces the background fluorescence level. Using BiFC-TALE, we demonstrated a significantly improved SBR by imaging telomeres and centromeres in living cells in comparison with the methods using full-length FP.

  5. Identification of the full-length β-actin sequence and expression profiles in the tree shrew (Tupaia belangeri).

    Science.gov (United States)

    Zheng, Yu; Yun, Chenxia; Wang, Qihui; Smith, Wanli W; Leng, Jing

    2015-02-01

    The tree shrew (Tupaia belangeri) diverges from the primate order (Primates) and is classified as a separate taxonomic group of mammals - Scandentia. It has been suggested that the tree shrew can be used as an animal model for studying human diseases; however, the genomic sequence of the tree shrew is largely unidentified. In the present study, we reported the full-length cDNA sequence of the housekeeping gene, β-actin, in the tree shrew. The amino acid sequence of β-actin in the tree shrew was compared to that of humans and other species; a simple phylogenetic relationship was discovered. Quantitative polymerase chain reaction (qPCR) and western blot analysis further demonstrated that the expression profiles of β-actin, as a general conservative housekeeping gene, in the tree shrew were similar to those in humans, although the expression levels varied among different types of tissue in the tree shrew. Our data provide evidence that the tree shrew has a close phylogenetic association with humans. These findings further enhance the potential that the tree shrew, as a species, may be used as an animal model for studying human disorders.

  6. Performance of initial full-length RHIC [Relativistic Heavy Ion Collider] dipoles

    International Nuclear Information System (INIS)

    Dahl, P.; Cottingham, J.; Garber, M.

    1987-01-01

    The first four full-length (9.7 m) R&D dipoles for the proposed Relativistic Heavy Ion Collider (RHIC) have been successfully tested. The magnets reached a quench plateau of approximately 4.5 T with very reasonable training - a field level comfortably above the design field of 3.45 T required for operation with beams of 100 GeV/amu gold nuclei. Measured field multipoles are considered to be quite acceptable for this series of R&D magnets.

  7. Near-Full Genome Characterisation of Two Natural Intergenotypic 2k/1b Recombinant Hepatitis C Virus Isolates

    Directory of Open Access Journals (Sweden)

    Victoria L. Demetriou

    2011-01-01

    Few natural intergenotypic hepatitis C virus (HCV) recombinants have been characterised, and only RF1_2k/1b has demonstrated widespread transmission. The near-full length genome sequences for two cases of 2k/1b recombinants (CYHCV037 and CYHCV093) sampled in Cyprus were obtained using strain-specific RT-PCR amplification and sequencing protocols. Sequence analysis confirmed their similarity with the original RF1_2k/1b strain from St. Petersburg, N687. These two isolates significantly contribute to the sequence data available on this recombinant and confirm its increasing spread among individuals from Eastern Europe, and its association with transmission through intravenous drug use. Phylogenetic analyses reveal clustering of the sequence 3′ to the recombination point, not seen in the topology of the 5′ sequences, implying a more complicated evolutionary history than that held to date. The increasing cases of HCV recombinant strains underline the requirement of their contribution to the standardised rules of HCV classification and nomenclature, molecular epidemiology, diagnosis, and treatment.

  8. Development and characterization of genomic microsatellite markers in Prosopis cineraria

    Directory of Open Access Journals (Sweden)

    Shashi Shekhar Anand

    2017-06-01

    Characterization of genetic diversity is essential for exploring the genetic resources available for plant development and improvement. Prosopis cineraria is an ecologically important species known for its many biological benefits. Because genetic resources for the species are lacking, it is crucial to unravel its population dynamics, which will inform plant improvement and conservation strategies. Of the 41 genomic microsatellite markers designed from an (AG)n-enriched library, 24 were subsequently used to characterize 30 genotypes from the Indian arid region. A total of 93 alleles, an average of 3.875 per locus, were amplified by the tested primer pairs. The average observed and expected heterozygosity was 0.5139 and 0.5786, respectively, with 23 primer pairs showing significant deviations from Hardy-Weinberg equilibrium. Polymorphic information content averaged 0.5102 and the overall polymorphism level was found to be 93.27%. STRUCTURE analysis and DARwin revealed the presence of 4 clusters among the 30 genotypes.
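    The summary statistics quoted in this record (expected heterozygosity, polymorphic information content, mean alleles per locus) can be illustrated with a minimal sketch. It assumes the textbook definitions, He = 1 - Σp_i² and the PIC formula of Botstein et al.; the abstract does not say which estimators or software the authors used, and the allele frequencies below are hypothetical, not data from the study.

```python
def expected_heterozygosity(freqs):
    """Expected heterozygosity for one locus: He = 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in freqs)


def polymorphic_information_content(freqs):
    """PIC for one locus: 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2."""
    homozygosity = sum(p * p for p in freqs)
    cross = 0.0
    for i in range(len(freqs) - 1):
        for j in range(i + 1, len(freqs)):
            cross += 2.0 * freqs[i] ** 2 * freqs[j] ** 2
    return 1.0 - homozygosity - cross


def mean_alleles_per_locus(total_alleles, n_loci):
    """Average number of alleles per marker."""
    return total_alleles / n_loci


if __name__ == "__main__":
    # Hypothetical locus with two equally frequent alleles.
    example = [0.5, 0.5]
    print(expected_heterozygosity(example))
    print(polymorphic_information_content(example))
    # Figures reported in the abstract: 93 alleles over 24 markers.
    print(mean_alleles_per_locus(93, 24))
```

    With 93 alleles over 24 markers, the last call reproduces the reported average of 3.875 alleles per locus.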

  9. Experimental annotation of the human genome using microarray technology.

    Science.gov (United States)

    Shoemaker, D D; Schadt, E E; Armour, C D; He, Y D; Garrett-Engele, P; McDonagh, P D; Loerch, P M; Leonardson, A; Lum, P Y; Cavet, G; Wu, L F; Altschuler, S J; Edwards, S; King, J; Tsang, J S; Schimmack, G; Schelter, J M; Koch, J; Ziman, M; Marton, M J; Li, B; Cundiff, P; Ward, T; Castle, J; Krolewski, M; Meyer, M R; Mao, M; Burchard, J; Kidd, M J; Dai, H; Phillips, J W; Linsley, P S; Stoughton, R; Scherer, S; Boguski, M S

    2001-02-15

    The most important product of the sequencing of a genome is a complete, accurate catalogue of genes and their products, primarily messenger RNA transcripts and their cognate proteins. Such a catalogue cannot be constructed by computational annotation alone; it requires experimental validation on a genome scale. Using 'exon' and 'tiling' arrays fabricated by ink-jet oligonucleotide synthesis, we devised an experimental approach to validate and refine computational gene predictions and define full-length transcripts on the basis of co-regulated expression of their exons. These methods can provide more accurate gene numbers and allow the detection of mRNA splice variants and identification of the tissue- and disease-specific conditions under which genes are expressed. We apply our technique to chromosome 22q under 69 experimental condition pairs, and to the entire human genome under two experimental conditions. We discuss implications for more comprehensive, consistent and reliable genome annotation, more efficient, full-length complementary DNA cloning strategies and application to complex diseases.

  10. Modeling insertional mutagenesis using gene length and expression in murine embryonic stem cells.

    Directory of Open Access Journals (Sweden)

    Alex S Nord

    2007-07-01

    High-throughput mutagenesis of the mammalian genome is a powerful means to facilitate analysis of gene function. Gene trapping in embryonic stem cells (ESCs) is the most widely used form of insertional mutagenesis in mammals. However, the rules governing its efficiency are not fully understood, and the effects of vector design on the likelihood of gene-trapping events have not been tested on a genome-wide scale. In this study, we used public gene-trap data to model gene-trap likelihood. Using the association of gene length and gene expression with gene-trap likelihood, we constructed spline-based regression models that characterize which genes are susceptible and which genes are resistant to gene-trapping techniques. We report results for three classes of gene-trap vectors, showing that both length and expression are significant determinants of trap likelihood for all vectors. Using our models, we also quantitatively identified hotspots of gene-trap activity, which represent loci where the high likelihood of vector insertion is controlled by factors other than length and expression. These formalized statistical models describe a high proportion of the variance in the likelihood of a gene being trapped by expression-dependent vectors and a lower, but still significant, proportion of the variance for vectors that are predicted to be independent of endogenous gene expression. The findings of significant expression and length effects reported here further the understanding of the determinants of vector insertion. Results from this analysis can be applied to help identify other determinants of this important biological phenomenon and could assist planning of large-scale mutagenesis efforts.

  11. [Complete genome sequencing and analyses of rabies viruses isolated from wild animals (Chinese Ferret-Badger) in Zhejiang province].

    Science.gov (United States)

    Lei, Yong-Liang; Wang, Xiao-Guang; Liu, Fu-Ming; Chen, Xiu-Ying; Ye, Bi-Feng; Mei, Jian-Hua; Lan, Jin-Quan; Tang, Qing

    2009-08-01

    Based on sequencing the full-length genomes of two rabies viruses from Chinese Ferret-Badger, we analyzed the genetic variation of rabies viruses at the molecular level to obtain information on the prevalence and variation of rabies viruses in Zhejiang and to enrich the genome database of rabies virus street strains isolated from Chinese wildlife. Overlapping fragments were amplified by RT-PCR and full-length genomes were assembled to analyze the nucleotide and deduced protein similarities and to perform phylogenetic analyses of the N genes from Chinese Ferret-Badger, sika deer, vole, dog, and vaccine strains. The two full-length genomes were completely sequenced and had the same genetic structure, with 11,923 nts including 58 nts-Leader, 1353 nts-NP, 894 nts-PP, 609 nts-MP, 1575 nts-GP, 6386 nts-LP, and 2, 5, 5 nts-intergenic regions (IGRs), 423 nts-Pseudogene-like sequence (Psi), 70 nts-Trailer. By BLAST and multi-sequence alignment, the two full-length genomes were consistent with the properties of Rhabdoviridae Lyssavirus. The nucleotide and amino acid sequences among Chinese strains had the highest similarity, especially among animals of the same species. For the two full-length genomes, the similarity at the amino acid level was dramatically higher than that at the nucleotide level, so the nucleotide mutations in these two genomes were most probably synonymous. Compared with the reference rabies viruses, the five protein-coding regions showed no changes in length and no recombination, only a few point mutations. It was evident that the five proteins appeared to be stable. The variation sites and types of the two ferret badger genomes were similar to those of the reference vaccine or street strains. The two strains were genotype 1 according to the multi-sequence and phylogenetic analyses and possessed distinct geographic characteristics of China. All the evidence suggested that these two ferret badgers

  12. Genome-based microbial ecology of anammox granules in a full-scale wastewater treatment system.

    Science.gov (United States)

    Speth, Daan R; In 't Zandt, Michiel H; Guerrero-Cruz, Simon; Dutilh, Bas E; Jetten, Mike S M

    2016-03-31

    Partial-nitritation anammox (PNA) is a novel wastewater treatment procedure for energy-efficient ammonium removal. Here we use genome-resolved metagenomics to build a genome-based ecological model of the microbial community in a full-scale PNA reactor. Sludge from the bioreactor examined here is used to seed reactors in wastewater treatment plants around the world; however, the role of most of its microbial community in ammonium removal remains unknown. Our analysis yielded 23 near-complete draft genomes that together represent the majority of the microbial community. We assign these genomes to distinct anaerobic and aerobic microbial communities. In the aerobic community, nitrifying organisms and heterotrophs predominate. In the anaerobic community, widespread potential for partial denitrification suggests a nitrite loop increases treatment efficiency. Of our genomes, 19 have no previously cultivated or sequenced close relatives and six belong to bacterial phyla without any cultivated members, including the most complete Omnitrophica (formerly OP3) genome to date.

  13. XenDB: Full length cDNA prediction and cross species mapping in Xenopus laevis

    Directory of Open Access Journals (Sweden)

    Giegerich Robert

    2005-09-01

    Background: Research using the model system Xenopus laevis has provided critical insights into the mechanisms of early vertebrate development and cell biology. Large scale sequencing efforts have provided an increasingly important resource for researchers. To take full advantage of the available sequence, we have analyzed 350,468 Xenopus laevis Expressed Sequence Tags (ESTs) both to identify full length protein encoding sequences and to develop a unique database system to support comparative approaches between X. laevis and other model systems. Description: Using a suffix array based clustering approach, we have identified 25,971 clusters and 40,877 singleton sequences. Generation of a consensus sequence for each cluster resulted in 31,353 tentative contig and 4,801 singleton sequences. Using both BLASTX and FASTY comparison to five model organisms and the NR protein database, more than 15,000 sequences are predicted to encode full length proteins and these have been matched to publicly available IMAGE clones when available. Each sequence has been compared to the KOG database and ~67% of the sequences have been assigned a putative functional category. Based on sequence homology to mouse and human, putative GO annotations have been determined. Conclusion: The results of the analysis have been stored in a publicly available database XenDB http://bibiserv.techfak.uni-bielefeld.de/xendb/. A unique capability of the database is the ability to batch upload cross species queries to identify potential Xenopus homologues and their associated full length clones. Examples are provided including mapping of microarray results and application of 'in silico' analysis. The ability to quickly translate the results of various species into 'Xenopus-centric' information should greatly enhance comparative embryological approaches. Supplementary material can be found at http://bibiserv.techfak.uni-bielefeld.de/xendb/.

  14. Characterization and long term operation of a novel superconducting undulator with 15 mm period length in a synchrotron light source

    Directory of Open Access Journals (Sweden)

    S. Casalbuoni

    2016-11-01

    A new cryogen-free full scale (1.5 m long) superconducting undulator with a period length of 15 mm (SCU15) has been successfully tested in the ANKA storage ring. This represents a very important milestone in the development of superconducting undulators for third and fourth generation light sources carried out by the collaboration between the Karlsruhe Institute of Technology and the industrial partner Babcock Noell GmbH. SCU15 is the first full length device worldwide that, with beam, reaches a higher peak field than expected for the same geometry (vacuum gap and period length) from an ideal cryogenic permanent magnet undulator built with the best material available, PrFeB. After a summary of the design and main parameters of the device, we present here the characterization in terms of spectral properties and the long term operation of the SCU15 in the ANKA storage ring.

  15. Sequencing and comparative genome analysis of two pathogenic Streptococcus gallolyticus subspecies: genome plasticity, adaptation and virulence.

    Directory of Open Access Journals (Sweden)

    I-Hsuan Lin

    Streptococcus gallolyticus infections in humans are often associated with bacteremia, infective endocarditis and colon cancers. The disease manifestations differ depending on the subspecies of S. gallolyticus causing the infection. Here, we present the complete genomes of S. gallolyticus ATCC 43143 (biotype I) and S. pasteurianus ATCC 43144 (biotype II.2). The genomic differences between the two biotypes were characterized with comparative genomic analyses. The chromosomes of ATCC 43143 and ATCC 43144 are 2.36 and 2.10 Mb in length and encode 2246 and 1869 CDS, respectively. The organization and genomic contents of both genomes were most similar to the recently published S. gallolyticus UCN34, where 2073 (92%) and 1607 (86%) of the ATCC 43143 and ATCC 43144 CDS were conserved in UCN34, respectively. There are around 600 CDS conserved in all Streptococcus genomes, indicating that the Streptococcus genus has a small core-genome (constituting around 30% of total CDS) and substantial evolutionary plasticity. We identified eight and five regions of genome plasticity in ATCC 43143 and ATCC 43144, respectively. Within these regions, several proteins were recognized to contribute to the fitness and virulence of each of the two subspecies. We have also predicted putative cell-surface associated proteins that could play a role in adherence to host tissues, leading to persistent infections causing sub-acute and chronic diseases in humans. This study showed evidence that S. gallolyticus still possesses genes that make it suited to a rumen environment, whereas the ability of S. pasteurianus to live in the rumen is reduced. The genome heterogeneity and genetic diversity between the two biotypes, especially in membrane proteins and lipoproteins, most likely contribute to the differences in the pathogenesis of the two S. gallolyticus biotypes and the type of disease an infected patient eventually develops.

  16. Transformation of natural genetic variation into Haemophilus influenzae genomes.

    Directory of Open Access Journals (Sweden)

    Joshua Chang Mell

    2011-07-01

    Many bacteria are able to efficiently bind and take up double-stranded DNA fragments, and the resulting natural transformation shapes bacterial genomes, transmits antibiotic resistance, and allows escape from immune surveillance. The genomes of many competent pathogens show evidence of extensive historical recombination between lineages, but the actual recombination events have not been well characterized. We used DNA from a clinical isolate of Haemophilus influenzae to transform competent cells of a laboratory strain. To identify which of the ~40,000 polymorphic differences had recombined into the genomes of four transformed clones, their genomes and their donor and recipient parents were deep sequenced to high coverage. Each clone was found to contain ~1000 donor polymorphisms in 3-6 contiguous runs (8.1±4.5 kb in length) that collectively comprised ~1-3% of each transformed chromosome. Seven donor-specific insertions and deletions were also acquired as parts of larger donor segments, but the presence of other structural variation flanking 12 of 32 recombination breakpoints suggested that these often disrupt the progress of recombination events. This is the first genome-wide analysis of chromosomes directly transformed with DNA from a divergent genotype, connecting experimental studies of transformation with the high levels of natural genetic variation found in isolates of the same species.

  17. First complete genome sequence of canine bocavirus 2 in mainland China

    Directory of Open Access Journals (Sweden)

    S.-L. Zhai

    2017-07-01

    We obtained the first full-length genome sequence of canine bocavirus 2 (CBoV2) from the faeces of a healthy dog in Guangzhou city, Guangdong province, mainland China. The genome of GZHD15 consisted of 5059 nucleotides. Sequence analysis suggested that GZHD15 was close to a previously circulated Hong Kong isolate.

  18. SeqEntropy: genome-wide assessment of repeats for short read sequencing.

    Directory of Open Access Journals (Sweden)

    Hsueh-Ting Chu

    BACKGROUND: Recent studies on genome assembly from short-read sequencing data reported the limitation of this technology to reconstruct the entire genome even at very high depth coverage. We investigated the limitation from the perspective of information theory to evaluate the effect of repeats on short-read genome assembly using idealized (error-free) reads at different lengths. METHODOLOGY/PRINCIPAL FINDINGS: We define a metric H(k) to be the entropy of sequencing reads at a read length k and use the relative loss of entropy ΔH(k) to measure the impact of repeats on the reconstruction of a whole genome from sequences of length k. In our experiments, we found that entropy loss correlates well with de novo assembly coverage of a genome, and a score of ΔH(k)>1% indicates a severe loss in genome reconstruction fidelity. The minimal read lengths to achieve ΔH(k)<1% are different for various organisms and are independent of the genome size. For example, in order to meet the threshold of ΔH(k)<1%, a read length of 60 bp is needed for the sequencing of the human genome (3.2×10^9 bp) and 320 bp for the sequencing of fruit fly (1.8×10^8 bp). We also calculated the ΔH(k) scores for 2725 prokaryotic chromosomes and plasmids at several read lengths. Our results indicate that the levels of repeats in different genomes are diverse and the entropy of sequencing reads provides a measurement for the repeat structures. CONCLUSIONS/SIGNIFICANCE: The proposed entropy-based measurement, which can be calculated in seconds to minutes in most cases, provides a rapid quantitative evaluation of the limitation of idealized short-read genome sequencing. Moreover, the calculation can be parallelized to scale up to large eukaryotic genomes. This approach may be useful to tune the sequencing parameters to achieve better genome assemblies when a closely related genome is already available.
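    The idea of a read-length-dependent entropy can be sketched as follows. This is only an illustration under stated assumptions: H(k) is taken here to be the Shannon entropy of the distribution of all length-k substrings (idealized error-free reads) of a single-strand sequence, and ΔH(k) the loss relative to the maximum log2(N) reached when all N reads are distinct. The abstract does not give the exact definitions used by SeqEntropy, so the function names and formulas below are hypothetical stand-ins.

```python
import math
from collections import Counter


def read_entropy(seq, k):
    """Shannon entropy (bits) of all length-k substrings of seq."""
    n_reads = len(seq) - k + 1
    if n_reads <= 0:
        raise ValueError("k must not exceed the sequence length")
    counts = Counter(seq[i:i + k] for i in range(n_reads))
    return -sum((c / n_reads) * math.log2(c / n_reads) for c in counts.values())


def relative_entropy_loss(seq, k):
    """Fraction of the maximum entropy lost because some reads repeat."""
    n_reads = len(seq) - k + 1
    h_max = math.log2(n_reads)  # reached when every read is unique
    if h_max == 0.0:
        return 0.0
    return (h_max - read_entropy(seq, k)) / h_max


if __name__ == "__main__":
    # Toy sequence: a tandem repeat, so short reads are ambiguous.
    toy = "ACGTACGT"
    for k in (2, 4, 6):
        print(k, relative_entropy_loss(toy, k))
```

    In this toy case the loss shrinks as k grows past the repeat unit, which mirrors the abstract's point that a minimum read length is needed before repeats stop degrading reconstruction.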

  19. Amplification and pyrosequencing of near-full-length hepatitis C virus for typing and monitoring antiviral resistant strains.

    Science.gov (United States)

    Trémeaux, P; Caporossi, A; Ramière, C; Santoni, E; Tarbouriech, N; Thélu, M-A; Fusillier, K; Geneletti, L; François, O; Leroy, V; Burmeister, W P; André, P; Morand, P; Larrat, S

    2016-05-01

    Directly acting antiviral drugs have contributed considerable progress to hepatitis C virus (HCV) treatment, but they show variable activity depending on virus genotypes and subtypes. Therefore, accurate genotyping including recombinant form detection is still of major importance, as is the detection of resistance-associated mutations in case of therapeutic failure. To meet these goals, an approach to amplify the HCV near-complete genome with a single long-range PCR and sequence it with Roche GS Junior was developed. After optimization, the overall amplification success rate was 73% for usual genotypes (i.e. HCV 1a, 1b, 3a and 4a, 16/22) and 45% for recombinant forms RF_2k/1b (5/11). After pyrosequencing and subsequent de novo assembly, a near-full-length genomic consensus sequence was obtained for 19 of 21 samples. The genotype and subtype were confirmed by phylogenetic analysis for every sample, including the suspected recombinant forms. Resistance-associated mutations were detected in seven of 13 samples at baseline, in the NS3 (n = 3) or NS5A (n = 4) region. Of these samples, the treatment of one patient included daclatasvir, and that patient experienced a relapse. Virus sequences from pre- and posttreatment samples of four patients who experienced relapse after sofosbuvir-based therapy were compared: the selected variants seem too far from the NS5B catalytic site to be held responsible. Although tested on a limited set of samples and with technical improvements still necessary, this assay has proven to be successful for both genotyping and resistance-associated variant detection on several HCV types. Copyright © 2016 European Society of Clinical Microbiology and Infectious Diseases. Published by Elsevier Ltd. All rights reserved.

  20. Genome size of 14 species of fireflies (Insecta, Coleoptera, Lampyridae)

    Directory of Open Access Journals (Sweden)

    Gui-Chun Liu

    2017-11-01

    Eukaryotic genome size data are important both as the basis for comparative research into genome evolution and as estimators of the cost and difficulty of genome sequencing programs for non-model organisms. In this study, the genome sizes of 14 species of fireflies (Lampyridae; two genera in Lampyrinae, three genera in Luciolinae, and one genus in subfamily incertae sedis) were estimated by propidium iodide (PI)-based flow cytometry. The haploid genome sizes of Lampyridae ranged from 0.42 to 1.31 pg, a 3.1-fold span. Genome sizes of the fireflies varied within the tested subfamilies and genera. Lamprigera and Pyrocoelia species had large and small genome sizes, respectively. No correlation was found between genome size and morphological traits such as body length, body width, eye width, and antennal length. Our data provide additional information on genome size estimation of the firefly family Lampyridae. Furthermore, this study will help clarify the cost and difficulty of genome sequencing programs for non-model organisms and will help promote studies on firefly genome evolution.
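    For readers who think of sequencing effort in base pairs rather than picograms, the reported range can be converted with a short sketch. It assumes the commonly used factor of 1 pg of DNA ≈ 978 Mb; the abstract itself reports only picogram values, so the megabase figures printed here are derived, not taken from the study.

```python
MB_PER_PG = 978.0  # assumed conversion factor: 1 pg of DNA is about 978 Mb


def pg_to_mb(picograms):
    """Convert a haploid genome size from picograms to megabases."""
    return picograms * MB_PER_PG


if __name__ == "__main__":
    smallest_pg, largest_pg = 0.42, 1.31  # range reported in the abstract
    print(pg_to_mb(smallest_pg))          # about 411 Mb
    print(pg_to_mb(largest_pg))           # about 1281 Mb
    print(largest_pg / smallest_pg)       # fold span, about 3.1
```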

  1. Characterization of genomic alterations in radiation-associated breast cancer among childhood cancer survivors, using comparative genomic hybridization (CGH) arrays.

    Directory of Open Access Journals (Sweden)

    Xiaohong R Yang

    Ionizing radiation is an established risk factor for breast cancer. Epidemiologic studies of radiation-exposed cohorts have been primarily descriptive; molecular events responsible for the development of radiation-associated breast cancer have not been elucidated. In this study, we used array comparative genomic hybridization (array-CGH) to characterize genome-wide copy number changes in breast tumors collected in the Childhood Cancer Survivor Study (CCSS). Array-CGH data were obtained from 32 cases who developed a second primary breast cancer following chest irradiation at early ages for the treatment of their first cancers, mostly Hodgkin lymphoma. The majority of these cases developed breast cancer before age 45 (91%, n = 29), had invasive ductal tumors (81%, n = 26), estrogen receptor (ER)-positive staining (68%, n = 19 out of 28), and high proliferation as indicated by high Ki-67 staining (77%, n = 17 out of 22). Genomic regions with low-copy number gains and losses and high-level amplifications were similar to what has been reported in sporadic breast tumors; however, the frequency of amplifications of the 17q12 region containing human epidermal growth factor receptor 2 (HER2) was much higher among CCSS cases (38%, n = 12). Our findings suggest that second primary breast cancers in CCSS were enriched for an "amplifier" genomic subgroup with highly proliferative breast tumors. Future investigation in a larger irradiated cohort will be needed to confirm our findings.

  2. Preliminary characterization of mitochondrial genome of Melipona scutellaris, a Brazilian stingless bee.

    Science.gov (United States)

    Silverio, Manuella Souza; Rodovalho, Vinícius de Rezende; Bonetti, Ana Maria; de Oliveira, Guilherme Corrêa; Cuadros-Orellana, Sara; Ueira-Vieira, Carlos; Rodrigues dos Santos, Anderson

    2014-01-01

    Bees produce economically relevant products and play a pollinator role fundamental to ecosystems. Traditionally, studies of the genus Melipona have been based mostly on behavioral, social-organization, and ecological aspects. Only recently has the evolutionary history of this genus been assessed using molecular markers, including mitochondrial genes. Even though these studies have shed light on the evolutionary history of the Melipona genus, a more accurate picture may emerge when full nuclear and mitochondrial genomes of Melipona species become available. Here we present the assembly, annotation, and characterization of a draft mitochondrial genome of the Brazilian stingless bee Melipona scutellaris using Melipona bicolor as a reference organism. Using Illumina MiSeq data, we achieved the annotation of all protein-coding genes, as well as the genes for the two ribosomal subunits (16S and 12S) and the transfer RNA genes. Using the COI sequence as a DNA barcode, we found that M. cramptoni is the closest species to M. scutellaris.

  3. Amplified-fragment length polymorphism fingerprinting of Mycoplasma species

    DEFF Research Database (Denmark)

    Kokotovic, Branko; Friis, N.F.; Jensen, J.S.

    1999-01-01

    Amplified-fragment length polymorphism (AFLP) is a whole-genome fingerprinting method based on selective amplification of restriction fragments. The potential of the method for the characterization of mycoplasmas was investigated in a total of 50 strains of human and animal origin, including...... Mycoplasma genitalium (n = 11), Mycoplasma pneumoniae (n = 5), Mycoplasma hominis (n = 5), Mycoplasma hyopneumoniae (n = 9), Mycoplasma flocculare (n = 5), Mycoplasma hyosynoviae (n = 10), and Mycoplasma dispar (n = 5). AFLP templates were prepared by the digestion of mycoplasmal DNA with BglII and Mfe...... to discriminate the analyzed strains at species and intraspecies levels as well. Each of the tested Mycoplasma species developed a banding pattern entirely different from those obtained from other species under analysis. Subtle intraspecies genomic differences were detected among strains of all of the Mycoplasma...

  4. Characterization of European Yersinia enterocolitica 1A strains using restriction fragment length polymorphism and multilocus sequence analysis.

    Science.gov (United States)

    Murros, A; Säde, E; Johansson, P; Korkeala, H; Fredriksson-Ahomaa, M; Björkroth, J

    2016-10-01

    Yersinia enterocolitica is currently divided into two subspecies: subsp. enterocolitica including highly pathogenic strains of biotype 1B and subsp. palearctica including nonpathogenic strains of biotype 1A and moderately pathogenic strains of biotypes 2-5. In this work, we characterized 162 Y. enterocolitica strains of biotype 1A and 50 strains of biotypes 2-4 isolated from human, animal and food samples by restriction fragment length polymorphism using the HindIII restriction enzyme. Phylogenetic relatedness of 20 representative Y. enterocolitica strains including 15 biotype 1A strains was further studied by the multilocus sequence analysis of four housekeeping genes (glnA, gyrB, recA and HSP60). In all the analyses, biotype 1A strains formed a separate genomic group, which differed from Y. enterocolitica subsp. enterocolitica and from the strains of biotypes 2-4 of Y. enterocolitica subsp. palearctica. Based on these results, biotype 1A strains considered nonpathogenic should not be included in subspecies palearctica containing pathogenic strains of biotypes 2-5. Yersinia enterocolitica strains are currently divided into six biotypes and two subspecies. Strains of biotype 1A, which are phenotypically and genotypically very heterogeneous, are classified as subspecies palearctica. In this study, European Y. enterocolitica 1A strains isolated from both human and nonhuman sources were characterized using restriction fragment length polymorphism and multilocus sequence analysis. The European biotype 1A strains formed a separate group, which differed from strains belonging to subspecies enterocolitica and palearctica. This may indicate that the current division between the two subspecies is not sufficient considering the strain diversity within Y. enterocolitica. © 2016 The Society for Applied Microbiology.

  5. Comparative genomics and stx phage characterization of LEE-negative Shiga toxin-producing Escherichia coli

    Directory of Open Access Journals (Sweden)

    Susan Renee Steyert

    2012-11-01

    Full Text Available Infections by Escherichia coli and Shigella species are among the leading causes of death due to diarrheal disease in the world. Shiga toxin producing Escherichia coli (STEC) that do not encode the locus of enterocyte effacement (LEE-negative STEC) often possess Shiga toxin gene variants and have been isolated from humans and a variety of animal sources. In this study, we compare the genomes of nine LEE-negative STEC harboring various stx alleles with four complete reference LEE-positive STEC isolates. Compared to a representative collection of prototype E. coli and Shigella isolates representing each of the pathotypes, the whole genome phylogeny demonstrated that these isolates are diverse. Whole genome comparative analysis of the 13 genomes revealed that, in addition to the absence of the LEE pathogenicity island, phage-encoded genes, including non-LEE encoded effectors, were absent from all nine LEE-negative STEC genomes. Several plasmid-encoded virulence factors reportedly identified in LEE-negative STEC isolates were identified in only a subset of the nine LEE-negative isolates, further confirming the diversity of this group. In combination with whole genome analysis, we characterized the lambdoid phages harboring the various stx alleles and determined their genomic insertion sites. Although the integrase gene sequence corresponded with genomic location, it was not correlated with stx variant, further highlighting the mosaic nature of these phages. The transcription of these phages in different genomic backgrounds was examined. Expression of the Shiga toxin genes, stx1 and/or stx2, as well as the Q genes, was examined with quantitative reverse transcriptase polymerase chain reaction (qRT-PCR) assays. A wide range of basal and induced toxin induction was observed. Overall, this is a first significant foray into the genome space of this unexplored group of emerging and divergent pathogens.

  6. Score-based prediction of genomic islands in prokaryotic genomes using hidden Markov models

    Directory of Open Access Journals (Sweden)

    Surovcik Katharina

    2006-03-01

    Full Text Available Background: Horizontal gene transfer (HGT) is considered a strong evolutionary force shaping the content of microbial genomes in a substantial manner. It is the difference in speed enabling the rapid adaptation to changing environmental demands that distinguishes HGT from gene genesis, duplications or mutations. For a precise characterization, algorithms are needed that identify transfer events with high reliability. Frequently, the transferred pieces of DNA have a considerable length, comprise several genes and are called genomic islands (GIs) or more specifically pathogenicity or symbiotic islands. Results: We have implemented the program SIGI-HMM that predicts GIs and the putative donor of each individual alien gene. It is based on the analysis of codon usage (CU) of each individual gene of a genome under study. CU of each gene is compared against a carefully selected set of CU tables representing microbial donors or highly expressed genes. Multiple tests are used to identify putatively alien genes, to predict putative donors and to mask putatively highly expressed genes. Thus, we determine the states and emission probabilities of an inhomogeneous hidden Markov model working on gene level. For the transition probabilities, we draw upon classical test theory with the intention of integrating a sensitivity controller in a consistent manner. SIGI-HMM was written in JAVA and is publicly available. It accepts as input any file created according to the EMBL-format. It generates output in the common GFF format readable for genome browsers. Benchmark tests showed that the output of SIGI-HMM is in agreement with known findings. Its predictions were both consistent with annotated GIs and with predictions generated by different methods. Conclusion: SIGI-HMM is a sensitive tool for the identification of GIs in microbial genomes. It allows to interactively analyze genomes in detail and to generate or to test hypotheses about the origin of acquired
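
    The codon-usage comparison at the heart of the approach described above can be illustrated with a minimal sketch. It is not SIGI-HMM's scoring: the function names and the simple absolute-difference distance are assumptions chosen for illustration, and the record does not specify the statistical tests the program actually applies.

```python
from collections import Counter


def codon_usage(cds: str) -> dict:
    """Relative frequency of each codon in a coding sequence (frame 0)."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3          # ignore a trailing partial codon
    counts = Counter(cds[i:i + 3] for i in range(0, usable, 3))
    total = sum(counts.values())
    return {codon: n / total for codon, n in counts.items()}


def usage_distance(u: dict, v: dict) -> float:
    """Illustrative distance (assumption): sum of absolute frequency differences."""
    return sum(abs(u.get(c, 0.0) - v.get(c, 0.0)) for c in set(u) | set(v))


# Toy comparison of one gene against a hypothetical reference table.
gene = codon_usage("ATGAAAAAATAA")
reference = codon_usage("ATGAAGAAGTAA")
print(gene)
print(usage_distance(gene, reference))
```

    A gene whose usage lies far from the host table but close to some donor table would be a candidate alien gene under this kind of comparison; how the distances are turned into calls is where a real tool differs from this sketch.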

  7. A new HCV genotype 6 subtype designated 6v was confirmed with three complete genome sequences.

    Science.gov (United States)

    Wang, Yizhong; Xia, Xueshan; Li, Chunhua; Maneekarn, Niwat; Xia, Wenjie; Zhao, Wenhua; Feng, Yue; Kung, Hsiang Fu; Fu, Yongshui; Lu, Ling

    2009-03-01

    Although hepatitis C virus (HCV) genotype 6 is classified into 21 subtypes, 6a-6u, new variants continue to be identified. This study aimed to characterize the full-length genomes of three novel HCV genotype 6 variants: KMN02, KM046 and KM181. From sera of patients with HCV infection, the entire HCV genome was amplified by RT-PCR followed by direct DNA sequencing and phylogenetic analysis. The sera contained HCV genomes of 9461, 9429, and 9461 nt in length, and each harboured a single ORF of 9051 nt. The genomes showed 95.3-98.1% nucleotide similarity to each other and 72.2-75.4% similarity to 23 genotype 6 reference sequences, which represent subtypes 6a-6u and unassigned variants km41 and gz52557. Phylogenetic analyses demonstrated that they were genotype 6, but were subtypically distinct. Based on the current criteria of HCV classification, they were designated to represent a new subtype, 6v. Analysis of E1 and NS5B region partial sequences revealed two additional related variants, CMBD-14 and CMBD-86, that had previously been reported in northern Thailand and whose sequences had been deposited in GenBank. Three novel HCV genotype 6 variants were entirely sequenced and designated subtype 6v.

  8. How genome complexity can explain the difficulty of aligning reads to genomes.

    Science.gov (United States)

    Phan, Vinhthuy; Gao, Shanshan; Tran, Quang; Vo, Nam S

    2015-01-01

    Although it is frequently observed that aligning short reads to genomes becomes harder if they contain complex repeat patterns, there has not been much effort to quantify the relationship between complexity of genomes and difficulty of short-read alignment. Existing measures of sequence complexity seem unsuitable for the understanding and quantification of this relationship. We investigated several measures of complexity and found that length-sensitive measures of complexity had the highest correlation to accuracy of alignment. In particular, the rate of distinct substrings of length k, where k is similar to the read length, correlated very highly to alignment performance in terms of precision and recall. We showed how to compute this measure efficiently in linear time, making it useful in practice to estimate quickly the difficulty of alignment for new genomes without having to align reads to them first. We showed how the length-sensitive measures could provide additional information for choosing aligners that would align consistently accurately on new genomes. We formally established a connection between genome complexity and the accuracy of short-read aligners. The relationship between genome complexity and alignment accuracy provides additional useful information for selecting suitable aligners for new genomes. Further, this work suggests that the complexity of genomes sometimes should be thought of in terms of specific computational problems, such as the alignment of short reads to genomes.
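
    The length-sensitive measure mentioned above, the rate of distinct substrings of length k, can be sketched as follows. This is a minimal illustration that stores every k-mer in a set; it is not the linear-time method the authors describe, and normalizing by the number of length-k windows is an assumption, since the record does not give the exact definition.

```python
def distinct_kmer_rate(genome: str, k: int) -> float:
    """Distinct length-k substrings divided by the number of length-k windows."""
    n_windows = len(genome) - k + 1
    if k <= 0 or n_windows <= 0:
        raise ValueError("k must be positive and no longer than the sequence")
    distinct = {genome[i:i + k] for i in range(n_windows)}
    return len(distinct) / n_windows


# Toy sequences (hypothetical): a pure dinucleotide repeat and a varied one.
repetitive = "ACACACACACACACAC"
varied = "ACGTTGCAAGCTTAGC"
print(distinct_kmer_rate(repetitive, 4))
print(distinct_kmer_rate(varied, 4))
```

    A low rate means many reads of that length would map equally well to several places, which is the situation in which alignment accuracy is expected to drop.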

  9. Detecting microsatellites within genomes: significant variation among algorithms

    Directory of Open Access Journals (Sweden)

    Rivals Eric

    2007-04-01

    Full Text Available Background: Microsatellites are short, tandemly-repeated DNA sequences which are widely distributed among genomes. Their structure, role and evolution can be analyzed based on exhaustive extraction from sequenced genomes. Several dedicated algorithms have been developed for this purpose. Here, we compared the detection efficiency of five of them (TRF, Mreps, Sputnik, STAR, and RepeatMasker). Results: Our analysis was first conducted on the human X chromosome, and microsatellite distributions were characterized by microsatellite number, length, and divergence from a pure motif. The algorithms work with user-defined parameters, and we demonstrate that the parameter values chosen can strongly influence microsatellite distributions. The five algorithms were then compared by fixing parameter settings, and the analysis was extended to three other genomes (Saccharomyces cerevisiae, Neurospora crassa and Drosophila melanogaster) spanning a wide range of size and structure. Significant differences for all characteristics of microsatellites were observed among algorithms, but not among genomes, for both perfect and imperfect microsatellites. Striking differences were detected for short microsatellites (below 20 bp), regardless of motif. Conclusion: Since the algorithm used strongly influences empirical distributions, studies analyzing microsatellite evolution based on a comparison between empirical and theoretical size distributions should therefore be considered with caution. We also discuss why a typological definition of microsatellites limits our capacity to capture their genomic distributions.
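
    How much a user-defined parameter can change what counts as a microsatellite is easy to see in a toy detector for perfect repeats. The sketch below is not any of the five compared tools; the function name, the default motif limit of 6 bp and the default minimum length of 12 bp are assumptions made for illustration.

```python
def find_perfect_microsatellites(seq, max_motif=6, min_length=12):
    """Return sorted (start, end, motif) tuples for perfect tandem repeats."""
    seq = seq.upper()
    hits = []
    for m in range(1, max_motif + 1):
        i = 0
        while i + m < len(seq):
            # Extend the run while each base equals the base m positions back.
            j = i + m
            while j < len(seq) and seq[j] == seq[j - m]:
                j += 1
            motif = seq[i:i + m]
            # A motif built from a shorter unit (e.g. "ACAC") is reported
            # under the shorter period instead, so skip it here.
            primitive = motif not in (motif + motif)[1:-1]
            if j - i >= max(min_length, 2 * m) and primitive:
                hits.append((i, j, motif))
                i = j - m + 1      # a new run cannot start before this point
            else:
                i += 1
    return sorted(hits)


toy = "GTTATTATTATTAC"             # hypothetical sequence with (TTA)x4
print(find_perfect_microsatellites(toy))
print(find_perfect_microsatellites(toy, min_length=20))
```

    The same 12 bp repeat is reported with the default threshold and disappears when the minimum length is raised to 20 bp, which mirrors the sensitivity to parameter choice reported for short microsatellites.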

  10. Species-Specific Expression of Full-Length and Alternatively Spliced Variant Forms of CDK5RAP2.

    Directory of Open Access Journals (Sweden)

    John S Y Park

    Full Text Available CDK5RAP2 is one of the primary microcephaly genes that are associated with reduced brain size and mental retardation. We have previously shown that human CDK5RAP2 exists as a full-length form (hCDK5RAP2) or an alternatively spliced variant form (hCDK5RAP2-V1) that is lacking exon 32. The equivalent of hCDK5RAP2-V1 has been reported in rat and mouse but the presence of full-length equivalent hCDK5RAP2 in rat and mouse has not been examined. Here, we demonstrate that rat expresses both a full length and an alternatively spliced variant form of CDK5RAP2 that are equivalent to our previously reported hCDK5RAP2 and hCDK5RAP2-V1, respectively. However, mouse expresses only one form of CDK5RAP2 that is equivalent to the human and rat alternatively spliced variant forms. Knowledge of this expression of different forms of CDK5RAP2 in human, rat and mouse is essential in selecting the appropriate model for studies of CDK5RAP2 and primary microcephaly but our findings further indicate the evolutionary divergence of mouse from the human and rat species.

  11. Highly efficient full-length hepatitis C virus genotype 1 (strain TN) infectious culture system

    DEFF Research Database (Denmark)

    Li, Yi-Ping; Ramirez, Santseharay; Jensen, Sanne B

    2012-01-01

    Chronic infection with hepatitis C virus (HCV) is an important cause of end stage liver disease worldwide. In the United States, most HCV-related disease is associated with genotype 1 infection, which remains difficult to treat. Drug and vaccine development was hampered by inability to culture...... full-length TN infection dose-dependently. Given the unique importance of genotype 1 for pathogenesis, this infectious 1a culture system represents an important advance in HCV research. The approach used and the mutations identified might permit culture development for other HCV isolates, thus......) culture systems in Huh7.5 cells. Here, we developed a highly efficient genotype 1a (strain TN) full-length culture system. We initially found that the LSG substitutions conferred viability to an intergenotypic recombinant composed of TN 5' untranslated region (5'UTR)-NS5A and JFH1 NS5B-3'UTR; recovered...

  12. Characterization of the complete mitochondrial genome of Khawia sinensis belongs among platyhelminths, cestodes.

    Science.gov (United States)

    Feng, Yan; Feng, Han-Li; Fang, Yi-Hui; Su, Ying-Bing

    2017-06-01

    Khawia sinensis is an important species in freshwater fish causing considerable economic losses to the breeding industry. This is the first mt genome of a caryophyllidean cestode characterised. The entire mt genome of K. sinensis is 13,759 bp in length. This mt genome contains 12 protein-coding genes, 22 transfer RNA genes, two ribosomal RNA genes and two non-coding regions. The arrangement of the K. sinensis mt genome is the same as that of other tapeworms; however, the incomplete stop codon (A) is more frequent than in other species. Phylogenetic analyses based on concatenated amino-acid sequences of the 12 protein-coding genes of 17 tapeworms including K. sinensis were conducted to assess the relationship of K. sinensis with other species. The results indicated that K. sinensis is closely related to cestode species. This complete mt genome of K. sinensis will enrich the mitochondrial genome databases of tapeworms and provide important molecular markers for ecology, diagnostics, population variation and evolution of K. sinensis and other species. Copyright © 2017 Elsevier Inc. All rights reserved.

  13. Endocrine-Therapy-Resistant ESR1 Variants Revealed by Genomic Characterization of Breast-Cancer-Derived Xenografts

    Directory of Open Access Journals (Sweden)

    Shunqiang Li

    2013-09-01

    Full Text Available To characterize patient-derived xenografts (PDXs) for functional studies, we made whole-genome comparisons with originating breast cancers representative of the major intrinsic subtypes. Structural and copy number aberrations were found to be retained with high fidelity. However, at the single-nucleotide level, variable numbers of PDX-specific somatic events were documented, although they were only rarely functionally significant. Variant allele frequencies were often preserved in the PDXs, demonstrating that clonal representation can be transplantable. Estrogen-receptor-positive PDXs were associated with ESR1 ligand-binding-domain mutations, gene amplification, or an ESR1/YAP1 translocation. These events produced different endocrine-therapy-response phenotypes in human, cell line, and PDX endocrine-response studies. Hence, deeply sequenced PDX models are an important resource for the search for genome-forward treatment options and capture endocrine-drug-resistance etiologies that are not observed in standard cell lines. The originating tumor genome provides a benchmark for assessing genetic drift and clonal representation after transplantation.

  14. Comparative genomic characterization of citrus-associated Xylella fastidiosa strains

    Directory of Open Access Journals (Sweden)

    Nunes Luiz R

    2007-12-01

    Full Text Available Background: The xylem-inhabiting bacterium Xylella fastidiosa (Xf) is the causal agent of Pierce's disease (PD) in vineyards and citrus variegated chlorosis (CVC) in orange trees. Both of these economically-devastating diseases are caused by distinct strains of this complex group of microorganisms, which has motivated researchers to conduct extensive genomic sequencing projects with Xf strains. This sequence information, along with other molecular tools, has been used to estimate the evolutionary history of the group and provide clues to understand the capacity of Xf to infect different hosts, causing a variety of symptoms. Nonetheless, although significant amounts of information have been generated from Xf strains, a large proportion of these efforts has concentrated on the study of North American strains, limiting our understanding about the genomic composition of South American strains – which is particularly important for CVC-associated strains. Results: This paper describes the first genome-wide comparison among South American Xf strains, involving 6 distinct citrus-associated bacteria. Comparative analyses performed through a microarray-based approach allowed identification and characterization of large mobile genetic elements that seem to be exclusive to South American strains. Moreover, a large-scale sequencing effort, based on Suppressive Subtraction Hybridization (SSH), identified 290 new ORFs, distributed in 135 Groups of Orthologous Elements, throughout the genomes of these bacteria. Conclusion: Results from microarray-based comparisons provide further evidence concerning activity of horizontally transferred elements, reinforcing their importance as major mediators in the evolution of Xf. Moreover, the microarray-based genomic profiles showed similarity between Xf strains 9a5c and Fb7, which is unexpected, given the geographical and chronological differences associated with the isolation of these microorganisms. The newly

  15. Purification and Fibrillation of Full-Length Recombinant PrP.

    Science.gov (United States)

    Makarava, Natallia; Savtchenko, Regina; Baskakov, Ilia V

    2017-01-01

    Misfolding and aggregation of prion protein are related to several neurodegenerative diseases in humans such as Creutzfeldt-Jakob disease, fatal familial insomnia, and Gerstmann-Straussler-Scheinker disease. A growing number of applications in the prion field including assays for detection of PrP Sc and methods for production of PrP Sc de novo require recombinant prion protein (PrP) of high purity and quality. Here, we report an experimental procedure for expression and purification of full-length mammalian prion protein. This protocol has been proved to yield PrP of extremely high purity that lacks PrP adducts, oxidative modifications, or truncation, which is typically generated as a result of spontaneous oxidation or degradation. We also describe methods for preparation of amyloid fibrils from recombinant PrP in vitro. Recombinant PrP fibrils can be used as a noninfectious synthetic surrogate of PrP Sc for development of prion diagnostics including generation of PrP Sc -specific antibody.

  16. Genome-wide characterization of microsatellites and marker development in the carcinogenic liver fluke Clonorchis sinensis

    Science.gov (United States)

    Nguyen, Thao T.B.; Arimatsu, Yuji; Hong, Sung-Jong; Brindley, Paul J.; Blair, David; Laha, Thewarach; Sripa, Banchob

    2015-01-01

    Clonorchis sinensis is an important carcinogenic human liver fluke endemic in East and Southeast Asia. Several conventional molecular markers have been used for identification and genetic diversity studies; however, no information about microsatellites of this liver fluke has been published so far. We here report microsatellite characterization and marker development for genetic diversity study in C. sinensis using a genome-wide bioinformatics approach. Based on our search criteria, a total of 256,990 microsatellites (≥ 12 base pairs) were identified from the genome database of C. sinensis, with hexa-nucleotide motif being the most abundant (51%) followed by penta-nucleotide (18.3%) and tri-nucleotide (12.7%). The tetra-nucleotide, di-nucleotide and mononucleotide motifs accounted for 9.75%, 7.63% and 0.14%, respectively. The total length of all microsatellites accounts for 0.72% of 547 Mb of the whole genome size and the frequency of microsatellites was found to be one microsatellite in every 2.13 kb of DNA. For the di-, tri-, and tetra-nucleotide, the repeat numbers redundant are six (28%), four (45%) and three (76%), respectively. The ATC repeat is the most abundant microsatellite, followed by AT, AAT and AC, respectively. Within 40 microsatellite loci developed, 24 microsatellite markers showed potential to differentiate between C. sinensis and O. viverrini. Seven of the 24 loci were heterozygous, with observed heterozygosity ranging from 0.467 to 1. Four primer sets could amplify both C. sinensis and O. viverrini DNA with different sizes. This study provides basic information on C. sinensis microsatellites, and the genome-wide markers developed may be a useful tool for genetic study of C. sinensis. PMID:25782682

  17. Genome-wide characterization of microsatellites and marker development in the carcinogenic liver fluke Clonorchis sinensis.

    Science.gov (United States)

    Nguyen, Thao T B; Arimatsu, Yuji; Hong, Sung-Jong; Brindley, Paul J; Blair, David; Laha, Thewarach; Sripa, Banchob

    2015-06-01

    Clonorchis sinensis is an important carcinogenic human liver fluke endemic in East and Southeast Asia. There are several conventional molecular markers that have been used for identification and genetic diversity; however, no information about microsatellites of this liver fluke is published so far. We here report microsatellite characterization and marker development for a genetic diversity study in C. sinensis, using a genome-wide bioinformatics approach. Based on our search criteria, a total of 256,990 microsatellites (≥12 base pairs) were identified from a genome database of C. sinensis, with hexanucleotide motif being the most abundant (51%) followed by pentanucleotide (18.3%) and trinucleotide (12.7%). The tetranucleotide, dinucleotide, and mononucleotide motifs accounted for 9.75, 7.63, and 0.14%, respectively. The total length of all microsatellites accounts for 0.72% of 547 Mb of the whole genome size, and the frequency of microsatellites was found to be one microsatellite in every 2.13 kb of DNA. For the di-, tri-, and tetranucleotide, the repeat numbers redundant are six (28%), four (45%), and three (76%), respectively. The ATC repeat is the most abundant microsatellites followed by AT, AAT, and AC, respectively. Within 40 microsatellite loci developed, 24 microsatellite markers showed potential to differentiate between C. sinensis and Opisthorchis viverrini. Seven out of 24 loci showed to be heterozygous with observed heterozygosity that ranged from 0.467 to 1. Four primer sets could amplify both C. sinensis and O. viverrini DNA with different sizes. This study provides basic information of C. sinensis microsatellites, and the genome-wide markers developed may be a useful tool for the genetic study of C. sinensis.
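
    The density figure quoted in this abstract can be cross-checked from the other reported numbers. The short calculation below assumes 1 Mb = 1,000,000 bp; the mean microsatellite length it prints is derived here from the reported totals and is not a value stated in the record.

```python
# Figures taken from the abstract.
genome_size_bp = 547_000_000        # 547 Mb, assuming 1 Mb = 1e6 bp
n_microsatellites = 256_990
ssr_fraction = 0.0072               # 0.72% of the genome

# One microsatellite per this many kilobases of genome.
spacing_kb = genome_size_bp / n_microsatellites / 1000

# Derived, not reported: average length of a microsatellite in bp.
mean_length_bp = ssr_fraction * genome_size_bp / n_microsatellites

print(f"{spacing_kb:.2f} kb per microsatellite")
print(f"{mean_length_bp:.1f} bp mean microsatellite length")
```

    The spacing agrees with the reported one microsatellite per 2.13 kb, and the derived mean length sits above the 12 bp search threshold, so the three reported totals are mutually consistent.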

  18. Full-length mRNA sequencing uncovers a widespread coupling between transcription initiation and mRNA processing.

    Science.gov (United States)

    Anvar, Seyed Yahya; Allard, Guy; Tseng, Elizabeth; Sheynkman, Gloria M; de Klerk, Eleonora; Vermaat, Martijn; Yin, Raymund H; Johansson, Hans E; Ariyurek, Yavuz; den Dunnen, Johan T; Turner, Stephen W; 't Hoen, Peter A C

    2018-03-29

    The multifaceted control of gene expression requires tight coordination of regulatory mechanisms at transcriptional and post-transcriptional level. Here, we studied the interdependence of transcription initiation, splicing and polyadenylation events on single mRNA molecules by full-length mRNA sequencing. In MCF-7 breast cancer cells, we find 2700 genes with interdependent alternative transcription initiation, splicing and polyadenylation events, both in proximal and distant parts of mRNA molecules, including examples of coupling between transcription start sites and polyadenylation sites. The analysis of three human primary tissues (brain, heart and liver) reveals similar patterns of interdependency between transcription initiation and mRNA processing events. We predict thousands of novel open reading frames from full-length mRNA sequences and obtained evidence for their translation by shotgun proteomics. The mapping database rescues 358 previously unassigned peptides and improves the assignment of others. By recognizing sample-specific amino-acid changes and novel splicing patterns, full-length mRNA sequencing improves proteogenomics analysis of MCF-7 cells. Our findings demonstrate that our understanding of transcriptome complexity is far from complete and provide a basis to reveal largely unresolved mechanisms that coordinate transcription initiation and mRNA processing.

  19. Genomic diversity and evolution of the lyssaviruses.

    Directory of Open Access Journals (Sweden)

    Olivier Delmas

    2008-04-01

    Full Text Available Lyssaviruses are RNA viruses with single-strand, negative-sense genomes responsible for rabies-like diseases in mammals. To date, genomic and evolutionary studies have most often utilized partial genome sequences, particularly of the nucleoprotein and glycoprotein genes, with little consideration of genome-scale evolution. Herein, we report the first genomic and evolutionary analysis using complete genome sequences of all recognised lyssavirus genotypes, including 14 new complete genomes of field isolates from 6 genotypes and one genotype that is completely sequenced for the first time. In doing so we significantly increase the extent of genome sequence data available for these important viruses. Our analysis of these genome sequence data reveals that all lyssaviruses have the same genomic organization. A phylogenetic analysis reveals strong geographical structuring, with the greatest genetic diversity in Africa, and an independent origin for the two known genotypes that infect European bats. We also suggest that multiple genotypes may exist within the diversity of viruses currently classified as 'Lagos Bat'. In sum, we show that rigorous phylogenetic techniques based on full length genome sequence provide the best discriminatory power for genotype classification within the lyssaviruses.

  20. The first complete genome sequences of clinical isolates of human coronavirus 229E

    NARCIS (Netherlands)

    Farsani, Seyed Mohammad Jazaeri; Dijkman, Ronald; Jebbink, Maarten F.; Goossens, Herman; Ieven, Margareta; Deijs, Martin; Molenkamp, Richard; van der Hoek, Lia

    2012-01-01

    Human coronavirus 229E was identified in the mid-1960s, yet still only one full-genome sequence is available. This full-length sequence has been determined from the cDNA-clone Inf-1 that is based on the lab-adapted strain VR-740. Lab-adaptation might have resulted in genomic changes, due to

  1. Preliminary Genomic Characterization of Ten Hardwood Tree Species from Multiplexed Low Coverage Whole Genome Sequencing.

    Directory of Open Access Journals (Sweden)

    Margaret Staton

    Full Text Available Forest health issues are on the rise in the United States, resulting from introduction of alien pests and diseases, coupled with abiotic stresses related to climate change. Increasingly, forest scientists are finding genetic/genomic resources valuable in addressing forest health issues. For a set of ten ecologically and economically important native hardwood tree species representing a broad phylogenetic spectrum, we used low coverage whole genome sequencing from multiplex Illumina paired ends to economically profile their genomic content. For six species, the genome content was further analyzed by flow cytometry in order to determine the nuclear genome size. Sequencing yielded a depth of 0.8X to 7.5X, from which in silico analysis yielded preliminary estimates of gene and repetitive sequence content in the genome for each species. Thousands of genomic SSRs were identified, with a clear predisposition toward dinucleotide repeats and AT-rich repeat motifs. Flanking primers were designed for SSR loci for all ten species, ranging from 891 loci in sugar maple to 18,167 in redbay. In summary, we have demonstrated that useful preliminary genome information including repeat content, gene content and useful SSR markers can be obtained at low cost and time input from a single lane of Illumina multiplex sequence.

  2. Telomere Length Reprogramming in Embryos and Stem Cells

    Directory of Open Access Journals (Sweden)

    Keri Kalmbach

    2014-01-01

    Full Text Available Telomeres protect and cap linear chromosome ends, yet these genomic buffers erode over an organism’s lifespan. Short telomeres have been associated with many age-related conditions in humans, and genetic mutations resulting in short telomeres in humans manifest as syndromes of precocious aging. In women, telomere length limits a fertilized egg’s capacity to develop into a healthy embryo. Thus, telomere length must be reset with each subsequent generation. Although telomerase is purportedly responsible for restoring telomere DNA, recent studies have elucidated the role of alternative telomere lengthening mechanisms in the reprogramming of early embryos and stem cells, which we review here.

  3. Assessing the genetic diversity of Cu resistance in mine tailings through high-throughput recovery of full-length copA genes

    Science.gov (United States)

    Li, Xiaofang; Zhu, Yong-Guan; Shaban, Babak; Bruxner, Timothy J. C.; Bond, Philip L.; Huang, Longbin

    2015-01-01

    Characterizing the genetic diversity of microbial copper (Cu) resistance at the community level remains challenging, mainly due to the polymorphism of the core functional gene copA. In this study, a local BLASTN method using a copA database built in this study was developed to recover full-length putative copA sequences from an assembled tailings metagenome; these sequences were then screened for potentially functioning CopA using conserved metal-binding motifs, inferred by evolutionary trace analysis of CopA sequences from known Cu resistant microorganisms. In total, 99 putative copA sequences were recovered from the tailings metagenome, out of which 70 were found with high potential to be functioning in Cu resistance. Phylogenetic analysis of selected copA sequences detected in the tailings metagenome showed that topology of the copA phylogeny is largely congruent with that of the 16S-based phylogeny of the tailings microbial community obtained in our previous study, indicating that the development of copA diversity in the tailings might be mainly through vertical descent with few lateral gene transfer events. The method established here can be used to explore copA (and potentially other metal resistance genes) diversity in any metagenome and has the potential to exhaust the full-length gene sequences for downstream analyses. PMID:26286020

  4. Leveraging Comparative Genomics to Identify and Functionally Characterize Genes Associated with Sperm Phenotypes in Python bivittatus (Burmese Python)

    Directory of Open Access Journals (Sweden)

    Kristopher J. L. Irizarry

    2016-01-01

    Comparative genomics approaches provide a means of leveraging functional genomics information from a highly annotated model organism’s genome (such as the mouse genome) in order to make physiological inferences about the role of genes and proteins in a less characterized organism’s genome (such as the Burmese python). We employed a comparative genomics approach to produce the functional annotation of Python bivittatus genes encoding proteins associated with sperm phenotypes. We identify 129 gene-phenotype relationships in the python which are implicated in 10 specific sperm phenotypes. Results obtained through our systematic analysis identified subsets of python genes exhibiting associations with gene ontology annotation terms. Functional annotation data was represented in a semantic scatter plot. Together, these newly annotated Python bivittatus genome resources provide a high resolution framework from which the biology relating to reptile spermatogenesis, fertility, and reproduction can be further investigated. Applications of our research include (1) production of genetic diagnostics for assessing fertility in domestic and wild reptiles; (2) enhanced assisted reproduction technology for endangered and captive reptiles; and (3) novel molecular targets for biotechnology-based approaches aimed at reducing fertility and reproduction of invasive reptiles. Additional enhancements to reptile genomic resources will further enhance their value.

  5. Non-destructive testing of full-length bonded rock bolts based on HHT signal analysis

    Science.gov (United States)

    Shi, Z. M.; Liu, L.; Peng, M.; Liu, C. C.; Tao, F. J.; Liu, C. S.

    2018-04-01

    Full-length bonded rock bolts are commonly used in mining, tunneling and slope engineering because of their simple design and resistance to corrosion. However, the length of a rock bolt and grouting quality do not often meet the required design standards in practice because of the concealment and complexity of bolt construction. Non-destructive testing is preferred when testing a rock bolt's quality because of the convenience, low cost and wide detection range. In this paper, a signal analysis method for the non-destructive sound wave testing of full-length bonded rock bolts is presented, which is based on the Hilbert-Huang transform (HHT). First, we introduce the HHT analysis method to calculate the bolt length and identify defect locations based on sound wave reflection test signals, which includes decomposing the test signal via empirical mode decomposition (EMD), selecting the intrinsic mode functions (IMF) using the Pearson Correlation Index (PCI) and calculating the instantaneous phase and frequency via the Hilbert transform (HT). Second, six model tests are conducted using different grouting defects and bolt protruding lengths to verify the effectiveness of the HHT analysis method. Lastly, the influence of the bolt protruding length on the test signal, identification of multiple reflections from defects, bolt end and protruding end, and mode mixing from EMD are discussed. The HHT analysis method can identify the bolt length and grouting defect locations from signals that contain noise at multiple reflected interfaces. The reflection from the long protruding end creates an irregular test signal with many frequency peaks on the spectrum. The reflections from defects barely change the original signal because they are low energy, which cannot be adequately resolved using existing methods. The HHT analysis method can identify reflections from the long protruding end of the bolt and multiple reflections from grouting defects based on mutations in the instantaneous
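    The IMF selection step named in this abstract can be illustrated with a minimal sketch. It assumes the components have already been produced by some EMD routine (not shown) and keeps those whose Pearson correlation with the test signal reaches a cutoff. The function names and the 0.3 threshold are hypothetical choices for illustration; the abstract does not state the authors' selection criterion.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("sequences must have equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant series carries no correlation information
    return sxy / math.sqrt(sxx * syy)


def select_imfs(signal, imfs, threshold=0.3):
    """Return indices of components with |r| >= threshold (threshold is an assumption)."""
    return [i for i, imf in enumerate(imfs)
            if abs(pearson(signal, imf)) >= threshold]


# Synthetic stand-ins for EMD output: one dominant and one weak component.
n = 200
strong = [math.sin(2 * math.pi * 5 * k / n) for k in range(n)]
weak = [0.05 * math.sin(2 * math.pi * 40 * k / n) for k in range(n)]
signal = [a + b for a, b in zip(strong, weak)]
kept = select_imfs(signal, [strong, weak])
```

    In this synthetic case only the dominant component passes the cutoff; with real reflection signals the threshold would need to be tuned.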

  6. Genome-based microbial ecology of anammox granules in a full-scale wastewater treatment system

    OpenAIRE

    Speth, D.R.; Zandt, M.H. in 't; Guerrero Cruz, S.; Dutilh, B.E.; Jetten, M.S.M.

    2016-01-01

    Partial-nitritation anammox (PNA) is a novel wastewater treatment procedure for energy-efficient ammonium removal. Here we use genome-resolved metagenomics to build a genome-based ecological model of the microbial community in a full-scale PNA reactor. Sludge from the bioreactor examined here is used to seed reactors in wastewater treatment plants around the world; however, the role of most of its microbial community in ammonium removal remains unknown. Our analysis yielded 23 near-complete d...

  7. Genome-wide generation and use of informative intron-spanning and intron-length polymorphism markers for high-throughput genetic analysis in rice

    Science.gov (United States)

    Badoni, Saurabh; Das, Sweta; Sayal, Yogesh K.; Gopalakrishnan, S.; Singh, Ashok K.; Rao, Atmakuri R.; Agarwal, Pinky; Parida, Swarup K.; Tyagi, Akhilesh K.

    2016-01-01

    We developed genome-wide 84634 ISM (intron-spanning marker) and 16510 InDel-fragment length polymorphism-based ILP (intron-length polymorphism) markers from genes physically mapped on 12 rice chromosomes. These genic markers revealed much higher amplification-efficiency (80%) and polymorphic-potential (66%) among rice accessions even by a cost-effective agarose gel-based assay. A wider level of functional molecular diversity (17–79%) and well-defined precise admixed genetic structure was assayed by 3052 genome-wide markers in a structured population of indica, japonica, aromatic and wild rice. Six major grain weight QTLs (11.9–21.6% phenotypic variation explained) were mapped on five rice chromosomes of a high-density (inter-marker distance: 0.98 cM) genetic linkage map (IR 64 x Sonasal) anchored with 2785 known/candidate gene-derived ISM and ILP markers. The designing of multiple ISM and ILP markers (2 to 4 markers/gene) in an individual gene will broaden the user-preference to select suitable primer combination for efficient assaying of functional allelic variation/diversity and realistic estimation of differential gene expression profiles among rice accessions. The genomic information generated in our study is made publicly accessible through a user-friendly web-resource, “Oryza ISM-ILP marker” database. The known/candidate gene-derived ISM and ILP markers can be enormously deployed to identify functionally relevant trait-associated molecular tags by optimal-resource expenses, leading towards genomics-assisted crop improvement in rice. PMID:27032371

  8. DNA methylation alteration is a major consequence of genome doubling in autotetraploid Brassica rapa

    Directory of Open Access Journals (Sweden)

    Xu Yanhao

    2017-01-01

    Polyploids are typically classified as autopolyploids or allopolyploids based on the origin of their chromosome sets. Autopolyploidy is much more common than traditionally believed. Allopolyploidization, accompanied by genomic and transcriptomic changes, has been well investigated. In this study, genetic, DNA methylation and gene expression changes in autotetraploid Brassica rapa were investigated. No genetic alteration was detected using an amplified fragment length polymorphism (AFLP) approach. Using a cDNA-AFLP approach, approximately 0.58% of fragments showed changes in gene expression in autotetraploid B. rapa. The methylation-sensitive amplification polymorphism (MSAP) analysis showed that approximately 1.7% of the fragments underwent DNA methylation changes upon genome doubling, with hypermethylation and demethylation changes equally affected. Fragments displaying changes in gene expression and methylation status were isolated and then sequenced and characterized, respectively. This study showed that variation in cytosine methylation is a major consequence of genome doubling in autotetraploid Brassica rapa.

  9. Targeted isolation, sequence assembly and characterization of two white spruce (Picea glauca) BAC clones for terpenoid synthase and cytochrome P450 genes involved in conifer defence reveal insights into a conifer genome

    Directory of Open Access Journals (Sweden)

    Ritland Carol

    2009-08-01

    Background: Conifers are a large group of gymnosperm trees which are separated from the angiosperms by more than 300 million years of independent evolution. Conifer genomes are extremely large and contain considerable amounts of repetitive DNA. Currently, conifer sequence resources exist predominantly as expressed sequence tags (ESTs) and full-length (FL)cDNAs. There is no genome sequence available for a conifer or any other gymnosperm. Conifer defence-related genes often group into large families with closely related members. The goals of this study are to assess the feasibility of targeted isolation and sequence assembly of conifer BAC clones containing specific genes from two large gene families, and to characterize large segments of genomic DNA sequence for the first time from a conifer. Results: We used a PCR-based approach to identify BAC clones for two target genes, a terpene synthase (3-carene synthase; 3CAR) and a cytochrome P450 (CYP720B4), from a non-arrayed genomic BAC library of white spruce (Picea glauca). Shotgun genomic fragments isolated from the BAC clones were sequenced to a depth of 15.6- and 16.0-fold coverage, respectively. Assembly and manual curation yielded sequence scaffolds of 172 kbp (3CAR) and 94 kbp (CYP720B4) long. Inspection of the genomic sequences revealed the intron-exon structures, the putative promoter regions and putative cis-regulatory elements of these genes. Sequences related to transposable elements (TEs), high complexity repeats and simple repeats were prevalent and comprised approximately 40% of the sequenced genomic DNA. An in silico simulation of the effect of sequencing depth on the quality of the sequence assembly provides direction for future efforts of conifer genome sequencing. Conclusion: We report the first targeted cloning, sequencing, assembly, and annotation of large segments of genomic DNA from a conifer. We demonstrate that genomic BAC clones for individual members of multi-member gene

  10. Characterization of amylose nanoparticles prepared via nanoprecipitation: Influence of chain length distribution.

    Science.gov (United States)

    Chang, Yanjiao; Yang, Jingde; Ren, Lili; Zhou, Jiang

    2018-08-15

    The influence of chain length distribution of amylose on size and structure of the amylose nanoparticles (ANPs) prepared through nanoprecipitation was investigated. Amylose with different chain length distributions was obtained by β-amylase treating amylose paste for different times and measured by size exclusion chromatography (SEC) and fluorophore-assisted carbohydrate electrophoresis (FACE). ANPs prepared via precipitation were characterized by using dynamic light scattering (DLS), scanning electron microscopy (SEM) and X-ray diffraction (XRD). Results showed that the β-amylase treatments led to decrease in chain length of amylose, and it was the most important factor affecting size of ANPs. When hydrolysis degree of amylose was 52.8%, mean size of ANPs decreased from 206.4 nm to 102.7 nm. All the ANPs displayed a V-type crystalline structure and the effect of amylose chain length on crystallinity of the precipitated ANPs was negligible in the investigated range. Copyright © 2018 Elsevier Ltd. All rights reserved.

  11. Complete mitochondrial genome of Cynopterus sphinx (Pteropodidae: Cynopterus).

    Science.gov (United States)

    Li, Linmiao; Li, Min; Wu, Zhengjun; Chen, Jinping

    2015-01-01

    We have characterized the complete mitochondrial genome of Cynopterus sphinx (Pteropodidae: Cynopterus) and described its organization in this study. The total length of the C. sphinx complete mitochondrial genome was 16,895 bp, with a base composition of 32.54% A, 14.05% G, 25.82% T and 27.59% C. The complete mitochondrial genome included 13 protein-coding genes, 22 tRNA genes, 2 rRNA genes (12S rRNA and 16S rRNA) and 1 control region (D-loop). The control region was 1435 bp long and contained the sequence CATACG repeated 64 times. Three protein-coding genes (ND1, COI and ND4) ended with an incomplete stop codon (TA or T).
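    The summary statistics reported in this record (base composition and a tandem repeat count) are simple to compute from a sequence. The sketch below is illustrative only: the function names are hypothetical, the toy sequences are not the C. sphinx genome, and nothing here reflects the authors' actual pipeline. As a consistency check, the four reported percentages sum to 100%, and 64 copies of the 6-bp unit CATACG would span 384 bp of the 1435-bp control region.

```python
import re
from collections import Counter


def base_composition(seq):
    """Percentage of A, C, G and T among the unambiguous bases of seq."""
    counts = Counter(b for b in seq.upper() if b in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    return {b: 100.0 * counts[b] / total for b in "ACGT"}


def longest_tandem_run(seq, unit):
    """Largest number of back-to-back copies of unit found anywhere in seq."""
    runs = re.findall(f"(?:{re.escape(unit.upper())})+", seq.upper())
    return max((len(run) // len(unit) for run in runs), default=0)


# Toy inputs, not real data.
composition = base_composition("AACG")
repeat_copies = longest_tandem_run("GG" + "CATACG" * 64 + "TT", "CATACG")
reported_total = 32.54 + 14.05 + 25.82 + 27.59  # percentages from the abstract
```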

  12. Characterization of genome sequences and clinical features of coxsackievirus A6 strains collected in Hyogo, Japan in 1999-2013.

    Science.gov (United States)

    Ogi, Miki; Yano, Yoshihiko; Chikahira, Masatsugu; Takai, Denshi; Oshibe, Tomohiro; Arashiro, Takeshi; Hanaoka, Nozomu; Fujimoto, Tsuguto; Hayashi, Yoshitake

    2017-08-01

    Coxsackievirus A6 (CV-A6) is an enterovirus, which is known to cause herpangina. However, since 2009 it has frequently been isolated from children with hand, foot, and mouth disease (HFMD). In Japan, CV-A6 has been linked to HFMD outbreaks in 2011 and 2013. In this study, full-length genome sequences of CV-A6 strains were analyzed to identify associations with clinical manifestations. Five thousand six hundred and twelve children with suspected enterovirus infection (0-17 years old) between 1999 and 2013 in Hyogo Prefecture, Japan, were enrolled. Enterovirus infection was confirmed with reverse transcriptase-PCR in 753 children (791 samples), 127 of whom (133 samples) were positive for CV-A6 based on the direct sequencing of the VP4 region. The complete genomes of CV-A6 from 22 positive patients with different clinical manifestations were investigated. A phylogenetic analysis divided these 22 strains into two clusters based on the VP1 region; cluster I contained strains collected in 1999-2009 and mostly related to herpangina, and cluster II contained strains collected in 2011-2013 and related to HFMD outbreak. Based on the full-length polyprotein analysis, the amino acid differences between the strains in cluster I and II were 97.7 ± 0.28%. Amino acid differences were detected in 17 positions within the polyprotein. Strains collected in 1999-2009 and those in 2011-2013 were separately clustered by phylogenetic analysis based on 5'UTR and 3Dpol region, as well as VP1 region. In conclusion, HFMD outbreaks caused by CV-A6 have recently been frequent in Japan, and the accumulation of genomic change might be associated with the clinical course. © 2017 Wiley Periodicals, Inc.

  13. Structure and function of the first full-length murein peptide ligase (Mpl) cell wall recycling protein.

    Directory of Open Access Journals (Sweden)

    Debanu Das

    2011-03-01

    Bacterial cell walls contain peptidoglycan, an essential polymer made by enzymes in the Mur pathway. These proteins are specific to bacteria, which makes them targets for drug discovery. MurC, MurD, MurE and MurF catalyze the synthesis of the peptidoglycan precursor UDP-N-acetylmuramoyl-L-alanyl-γ-D-glutamyl-meso-diaminopimelyl-D-alanyl-D-alanine by the sequential addition of amino acids onto UDP-N-acetylmuramic acid (UDP-MurNAc). MurC-F enzymes have been extensively studied by biochemistry and X-ray crystallography. In gram-negative bacteria, ∼30-60% of the bacterial cell wall is recycled during each generation. Part of this recycling process involves the murein peptide ligase (Mpl), which attaches the breakdown product, the tripeptide L-alanyl-γ-D-glutamyl-meso-diaminopimelate, to UDP-MurNAc. We present the crystal structure at 1.65 Å resolution of a full-length Mpl from the permafrost bacterium Psychrobacter arcticus 273-4 (PaMpl). Although the Mpl structure has similarities to Mur enzymes, it has unique sequence and structure features that are likely related to its role in cell wall recycling, a function that differentiates it from the MurC-F enzymes. We have analyzed the sequence-structure relationships that are unique to Mpl proteins and compared them to MurC-F ligases. We have also characterized the biochemical properties of this enzyme (optimal temperature, pH and magnesium binding profiles and kinetic parameters). Although the structure does not contain any bound substrates, we have identified ∼30 residues that are likely to be important for recognition of the tripeptide and UDP-MurNAc substrates, as well as features that are unique to Psychrobacter Mpl proteins. These results provide the basis for future mutational studies for more extensive function characterization of the Mpl sequence-structure relationships.

  14. Comparative analysis of the full genome sequence of European bat lyssavirus type 1 and type 2 with other lyssaviruses and evidence for a conserved transcription termination and polyadenylation motif in the G-L 3' non-translated region.

    Science.gov (United States)

    Marston, D A; McElhinney, L M; Johnson, N; Müller, T; Conzelmann, K K; Tordo, N; Fooks, A R

    2007-04-01

    We report the first full-length genomic sequences for European bat lyssavirus type-1 (EBLV-1) and type-2 (EBLV-2). The EBLV-1 genomic sequence was derived from a virus isolated from a serotine bat in Hamburg, Germany, in 1968 and the EBLV-2 sequence was derived from a virus isolate from a human case of rabies that occurred in Scotland in 2002. A long-distance PCR strategy was used to amplify the open reading frames (ORFs), followed by standard and modified RACE (rapid amplification of cDNA ends) techniques to amplify the 3' and 5' ends. The lengths of each complete viral genome for EBLV-1 and EBLV-2 were 11 966 and 11 930 base pairs, respectively, and follow the standard rhabdovirus genome organization of five viral proteins. Comparison with other lyssavirus sequences demonstrates variation in degrees of homology, with the genomic termini showing a high degree of complementarity. The nucleoprotein was the most conserved, both intra- and intergenotypically, followed by the polymerase (L), matrix and glycoproteins, with the phosphoprotein being the most variable. In addition, we have shown that the two EBLVs utilize a conserved transcription termination and polyadenylation (TTP) motif, approximately 50 nt upstream of the L gene start codon. All available lyssavirus sequences to date, with the exception of Pasteur virus (PV) and PV-derived isolates, use the second TTP site. This observation may explain differences in pathogenicity between lyssavirus strains, dependent on the length of the untranslated region, which might affect transcriptional activity and RNA stability.

  15. Revised genomic consensus for the hypermethylated CpG island region of the human L1 transposon and integration sites of full length L1 elements from recombinant clones made using methylation-tolerant host strains

    DEFF Research Database (Denmark)

    Crowther, P J; Doherty, J P; Linsenmeyer, M E

    1991-01-01

    Efficient recovery of clones from the 5' end of the human L1 dispersed repetitive elements necessitates the use of deletion mcr- host strains since this region contains a CpG island which is hypermethylated in vivo. Clones recovered with conventional mcr+ hosts seem to have been derived preferentially from L1 members which have accumulated mutations that have removed sites of methylation. We present a revised consensus from the 5' presumptive control region of these elements. This revised consensus contains a consensus RNA polymerase III promoter which would permit the synthesis of transcripts from the 5' end of full length L1 elements. Such potential transcripts are likely to exhibit a high degree of secondary structure. In addition, we have determined the flanking sequences for 6 full length L1 elements. The majority of full length L1 clones show no convincing evidence for target site...

  16. Early Epstein-Barr Virus Genomic Diversity and Convergence toward the B95.8 Genome in Primary Infection.

    Science.gov (United States)

    Weiss, Eric R; Lamers, Susanna L; Henderson, Jennifer L; Melnikov, Alexandre; Somasundaran, Mohan; Garber, Manuel; Selin, Liisa; Nusbaum, Chad; Luzuriaga, Katherine

    2018-01-15

    Over 90% of the world's population is persistently infected with Epstein-Barr virus. While EBV does not cause disease in most individuals, it is the common cause of acute infectious mononucleosis (AIM) and has been associated with several cancers and autoimmune diseases, highlighting a need for a preventive vaccine. At present, very few primary, circulating EBV genomes have been sequenced directly from infected individuals. While low levels of diversity and low viral evolution rates have been predicted for double-stranded DNA (dsDNA) viruses, recent studies have demonstrated appreciable diversity in common dsDNA pathogens (e.g., cytomegalovirus). Here, we report 40 full-length EBV genome sequences obtained from matched oral wash and B cell fractions from a cohort of 10 AIM patients. Both intra- and interpatient diversity were observed across the length of the entire viral genome. Diversity was most pronounced in viral genes required for establishing latent infection and persistence, with appreciable levels of diversity also detected in structural genes, including envelope glycoproteins. Interestingly, intrapatient diversity declined significantly over time (P < 0.01), and this was particularly evident on comparison of viral genomes sequenced from B cell fractions in early primary infection and convalescence (P < 0.001). B cell-associated viral genomes were observed to converge, becoming nearly identical to the B95.8 reference genome over time (Spearman rank-order correlation test; r = -0.5589, P = 0.0264). The reduction in diversity was most marked in the EBV latency genes. In summary, our data suggest independent convergence of diverse viral genome sequences toward a reference-like strain within a relatively short period following primary EBV infection. IMPORTANCE Identification of viral proteins with low variability and high immunogenicity is important for the development of a protective vaccine. Knowledge of genome diversity within circulating viral

  17. HLA diversity in the 1000 genomes dataset.

    Directory of Open Access Journals (Sweden)

    Pierre-Antoine Gourraud

    The 1000 Genomes Project aims to provide a deep characterization of human genome sequence variation by sequencing at a level that should allow the genome-wide detection of most variants with frequencies as low as 1%. However, in the major histocompatibility complex (MHC), only the top 10 most frequent haplotypes are in the 1% frequency range whereas thousands of haplotypes are present at lower frequencies. Given the limitation of both the coverage and the read length of the sequences generated by the 1000 Genomes Project, the highly variable positions that define HLA alleles may be difficult to identify. We used classical Sanger sequencing techniques to type the HLA-A, HLA-B, HLA-C, HLA-DRB1 and HLA-DQB1 genes in the available 1000 Genomes samples and combined the results with the 103,310 variants in the MHC region genotyped by the 1000 Genomes Project. Using pairwise identity-by-descent distances between individuals and principal component analysis, we established the relationship between ancestry and genetic diversity in the MHC region. As expected, both the MHC variants and the HLA phenotype can identify the major ancestry lineage, informed mainly by the most frequent HLA haplotypes. To some extent, regions of the genome with similar genetic or similar recombination rate have similar properties. An MHC-centric analysis underlines departures between the ancestral background of the MHC and the genome-wide picture. Our analysis of linkage disequilibrium (LD) decay in these samples suggests that overestimation of pairwise LD occurs due to a limited sampling of the MHC diversity. This collection of HLA-specific MHC variants, available on the dbMHC portal, is a valuable resource for future analyses of the role of MHC in population and disease studies.
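    Pairwise LD, as discussed in this abstract, is commonly summarized by the r² statistic between two biallelic sites. The sketch below computes it from phased haplotypes coded as 0/1 pairs. The function name and input encoding are assumptions for illustration, and the abstract does not say which LD measure or software the authors used.

```python
def ld_r_squared(haplotypes):
    """r^2 between two biallelic loci from phased haplotypes.

    haplotypes: iterable of (a, b) pairs, each allele coded 0 or 1.
    Returns 0.0 when either locus is monomorphic (r^2 is undefined there;
    returning 0.0 is a convention chosen for this sketch).
    """
    haps = list(haplotypes)
    n = len(haps)
    if n == 0:
        raise ValueError("no haplotypes supplied")
    p_a = sum(a for a, _ in haps) / n                      # allele-1 frequency, locus A
    p_b = sum(b for _, b in haps) / n                      # allele-1 frequency, locus B
    p_ab = sum(1 for a, b in haps if a == 1 and b == 1) / n  # 1-1 haplotype frequency
    d = p_ab - p_a * p_b                                   # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return 0.0
    return d * d / denom


perfect_ld = ld_r_squared([(1, 1), (1, 1), (0, 0), (0, 0)])
no_ld = ld_r_squared([(1, 1), (1, 0), (0, 1), (0, 0)])
```

    Estimating LD decay would then amount to computing this value for many site pairs and relating it to the physical distance between them.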

  18. The ability to form full-length intron RNA circles is a general property of nuclear group I introns

    DEFF Research Database (Denmark)

    Nielsen, Henrik; Fiskaa, Tonje; Birgisdottir, Asa Birna

    2003-01-01

    ...at the expense of the host. The circularization pathway has distinct structural requirements that differ from those of splicing and appears to be specifically suppressed in vivo. The ability to form full-length circles is found in all types of nuclear group I introns, including those from the Tetrahymena ribosomal DNA. The biological function of the full-length circles is not known, but the fact that the circles contain the entire genetic information of the intron suggests a role in intron mobility.

  19. Genome-wide identification, characterization and evolutionary analysis of long intergenic noncoding RNAs in cucumber.

    Directory of Open Access Journals (Sweden)

    Zhiqiang Hao

    Long intergenic noncoding RNAs (lincRNAs) are intergenic transcripts with a length of at least 200 nt that lack coding potential. Emerging evidence suggests that lincRNAs from animals participate in many fundamental biological processes. However, the systemic identification of lincRNAs has been undertaken in only a few plants. We chose to use cucumber (Cucumis sativus) as a model to analyze lincRNAs due to its importance as a model plant for studying sex differentiation and fruit development and the rich genomic and transcriptome data available. The application of a bioinformatics pipeline to multiple types of gene expression data resulted in the identification and characterization of 3,274 lincRNAs. Next, 10 lincRNAs targeted by 17 miRNAs were also explored. Based on co-expression analysis between lincRNAs and mRNAs, 94 lincRNAs were annotated, which may be involved in response to stimuli, multi-organism processes, reproduction, reproductive processes, and growth. Finally, examination of the evolution of lincRNAs showed that most lincRNAs are under purifying selection, while 16 lincRNAs are under natural selection. Our results provide a rich resource for further validation of cucumber lincRNAs and their function. The identification of lincRNAs targeted by miRNAs offers new clues for investigations into the role of lincRNAs in regulating gene expression. Finally, evaluation of the lincRNAs suggested that some lincRNAs are under positive and balancing selection.

  20. The GAAS metagenomic tool and its estimations of viral and microbial average genome size in four major biomes.

    Science.gov (United States)

    Angly, Florent E; Willner, Dana; Prieto-Davó, Alejandra; Edwards, Robert A; Schmieder, Robert; Vega-Thurber, Rebecca; Antonopoulos, Dionysios A; Barott, Katie; Cottrell, Matthew T; Desnues, Christelle; Dinsdale, Elizabeth A; Furlan, Mike; Haynes, Matthew; Henn, Matthew R; Hu, Yongfei; Kirchman, David L; McDole, Tracey; McPherson, John D; Meyer, Folker; Miller, R Michael; Mundt, Egbert; Naviaux, Robert K; Rodriguez-Mueller, Beltran; Stevens, Rick; Wegley, Linda; Zhang, Lixin; Zhu, Baoli; Rohwer, Forest

    2009-12-01

    Metagenomic studies characterize both the composition and diversity of uncultured viral and microbial communities. BLAST-based comparisons have typically been used for such analyses; however, sampling biases, high percentages of unknown sequences, and the use of arbitrary thresholds to find significant similarities can decrease the accuracy and validity of estimates. Here, we present Genome relative Abundance and Average Size (GAAS), a complete software package that provides improved estimates of community composition and average genome length for metagenomes in both textual and graphical formats. GAAS implements a novel methodology to control for sampling bias via length normalization, to adjust for multiple BLAST similarities by similarity weighting, and to select significant similarities using relative alignment lengths. In benchmark tests, the GAAS method was robust to both high percentages of unknown sequences and to variations in metagenomic sequence read lengths. Re-analysis of the Sargasso Sea virome using GAAS indicated that standard methodologies for metagenomic analysis may dramatically underestimate the abundance and importance of organisms with small genomes in environmental systems. Using GAAS, we conducted a meta-analysis of microbial and viral average genome lengths in over 150 metagenomes from four biomes to determine whether genome lengths vary consistently between and within biomes, and between microbial and viral communities from the same environment. Significant differences between biomes and within aquatic sub-biomes (oceans, hypersaline systems, freshwater, and microbialites) suggested that average genome length is a fundamental property of environments driven by factors at the sub-biome level. The behavior of paired viral and microbial metagenomes from the same environment indicated that microbial and viral average genome sizes are independent of each other, but indicative of community responses to stressors and environmental conditions.
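    The idea of length normalization mentioned in this abstract can be shown with a toy calculation: a longer genome yields more sequence reads per organism, so raw hit counts are divided by genome length before being turned into relative abundances. The sketch below is a simplified illustration under that assumption, not the GAAS implementation; it omits the similarity weighting and alignment-length filtering the abstract describes, and all names and numbers are hypothetical.

```python
def length_normalized_abundance(hits, genome_lengths):
    """Relative abundance per genome after dividing hit counts by genome length."""
    weights = {g: hits[g] / genome_lengths[g] for g in hits}
    total = sum(weights.values())
    if total == 0:
        raise ValueError("no hits to normalize")
    return {g: w / total for g, w in weights.items()}


def average_genome_length(abundance, genome_lengths):
    """Abundance-weighted mean genome length."""
    return sum(abundance[g] * genome_lengths[g] for g in abundance)


# Hypothetical example: equal hit counts to a 5 kb and a 50 kb genome.
hits = {"small_genome": 100, "large_genome": 100}
lengths = {"small_genome": 5_000, "large_genome": 50_000}

abundance = length_normalized_abundance(hits, lengths)
avg_length = average_genome_length(abundance, lengths)

# Without normalization, each genome would be assigned half the community.
naive_avg_length = sum(
    hits[g] / sum(hits.values()) * lengths[g] for g in hits
)
```

    In this example the small genome accounts for 10/11 of the community after normalization, and the estimated average genome length drops from 27,500 bp (naive) to about 9,091 bp, which mirrors the abstract's point that unnormalized analyses can underestimate organisms with small genomes.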

  1. Genomic diversity and introgression in O. sativa reveal the impact of domestication and breeding on the rice genome.

    Directory of Open Access Journals (Sweden)

    Keyan Zhao

    2010-05-01

    The domestication of Asian rice (Oryza sativa) was a complex process punctuated by episodes of introgressive hybridization among and between subpopulations. Deep genetic divergence between the two main varietal groups (Indica and Japonica) suggests domestication from at least two distinct wild populations. However, genetic uniformity surrounding key domestication genes across divergent subpopulations suggests cultural exchange of genetic material among ancient farmers. In this study, we utilize a novel 1,536 SNP panel genotyped across 395 diverse accessions of O. sativa to study genome-wide patterns of polymorphism, to characterize population structure, and to infer the introgression history of domesticated Asian rice. Our population structure analyses support the existence of five major subpopulations (indica, aus, tropical japonica, temperate japonica and GroupV) consistent with previous analyses. Our introgression analysis shows that most accessions exhibit some degree of admixture, with many individuals within a population sharing the same introgressed segment due to artificial selection. Admixture mapping and association analysis of amylose content and grain length illustrate the potential for dissecting the genetic basis of complex traits in domesticated plant populations. Genes in these regions control a myriad of traits including plant stature, blast resistance, and amylose content. These analyses highlight the power of population genomics in agricultural systems to identify functionally important regions of the genome and to decipher the role of human-directed breeding in refashioning the genomes of a domesticated species.

  2. High yield purification of full-length functional hERG K+ channels produced in Saccharomyces cerevisiae

    DEFF Research Database (Denmark)

    Molbaek, Karen; Scharff-Poulsen, Peter; Hélix-Nielsen, Claus

    2015-01-01

    knowledge this is the first reported high-yield production and purification of full length, tetrameric and functional hERG. This significant breakthrough will be paramount in obtaining hERG crystal structures, and in establishment of new high-throughput hERG drug safety screening assays....

  3. Oenococcus oeni in Chilean Red Wines: Technological and Genomic Characterization

    Directory of Open Access Journals (Sweden)

    Jaime Romero

    2018-02-01

    The presence and load of species of LAB at the end of the malolactic fermentation (MLF) were investigated in 16 wineries from the different Chilean valleys (Limarí, Casablanca, Maipo, Rapel, and Maule Valleys) during 2012 and 2013, using PCR-RFLP and qPCR. Oenococcus oeni was observed in 80% of the samples collected. Dominance of O. oeni was reflected in the bacterial load (O. oeni/total bacteria) measured by qPCR, corresponding to >85% in most of the samples. A total of 178 LAB isolates were identified after sequencing molecular markers, 95 of them corresponded to O. oeni. Further genetic analyses were performed using MLST (7 genes) including 10 commercial strains; the results indicated that commercial strains were grouped together, while autochthonous strains distributed among different genetic clusters. To pre-select some autochthonous O. oeni, these isolates were also characterized based on technological tests such as ethanol tolerance (12 and 15%), SO2 resistance (0 and 80 mg l−1), and pH (3.1 and 3.6) and malic acid transformation (1.5 and 4 g l−1). For comparison purposes, commercial strain VP41 was also tested. Based on their technological performance, only 3 isolates were selected for further examination (genome analysis), and they were able to reduce malic acid concentration, to grow at low pH 3.1, 15% ethanol and 80 mg l−1 SO2. The genome analyses of three selected isolates were examined and compared to PSU-1 and VP41 strains to study their potential contribution to the organoleptic properties of the final product. The presence and homology of genes potentially related to aromatic profile were compared among those strains. The results indicated high conservation of malolactic enzyme (>99%) and the absence of some genes related to odor such as phenolic acid decarboxylase, in autochthonous strains. Genomic analysis also revealed that these strains shared 470 genes with VP41 and PSU-1 and that autochthonous strains harbor an interesting

  4. Genome Sequences of Gordonia Phages BaxterFox, Kita, Nymphadora, and Yeezy

    OpenAIRE

    Pope, Welkin H.; Bandla, Sharanya; Colbert, Alexandra K.; Eichinger, Fiona G.; Gamburg, Michelle B.; Horiates, Stavroula G.; Jamison, Jerrica M.; Julian, Dana R.; Moore, Whitney A.; Murthy, Pranav; Powell, Meghan C.; Smith, Sydney V.; Mezghani, Nadia; Milliken, Katherine A.; Thompson, Paige K.

    2016-01-01

    Gordonia phages BaxterFox, Kita, Nymphadora, and Yeezy are newly characterized phages of Gordonia terrae, isolated from soil samples in Pittsburgh, Pennsylvania. These phages have genome lengths between 50,346 and 53,717 bp, and encode on average 84 predicted proteins. All have G+C content of 66.6%.
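G+C content, as reported in the record above, is the percentage of G and C bases among the bases of a sequence. A minimal Python sketch of that calculation follows; the function name `gc_content` and the toy sequence are illustrative assumptions, not data from the phage genomes.

```python
def gc_content(sequence: str) -> float:
    """Return the G+C percentage of a DNA sequence.

    Only unambiguous bases (A, C, G, T) are counted; any other
    symbol (e.g. N) is ignored. This is a simplifying assumption.
    """
    bases = [b for b in sequence.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no A/C/G/T bases")
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


# Hypothetical toy sequence: 8 of its 10 bases are G or C.
toy_sequence = "GCGCGATCGC"
print(f"G+C content: {gc_content(toy_sequence):.1f}%")
```

For a real genome the same function would be applied to the full sequence read from a FASTA file.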

  5. Genomic and karyotypic variation in Drosophila parasitoids (Hymenoptera, Cynipoidea, Figitidae)

    Directory of Open Access Journals (Sweden)

    Vladimir Gokhman

    2011-08-01

    Drosophila melanogaster Meigen, 1830 has served as a model insect for over a century. Sequencing of the 11 additional Drosophila Fallen, 1823 species marks substantial progress in comparative genomics of this genus. By comparison, practically nothing is known about the genome size or genome sequences of parasitic wasps of Drosophila. Here, we present the first comparative analysis of genome size and karyotype structures of Drosophila parasitoids of the Leptopilina Förster, 1869 and Ganaspis Förster, 1869 species. The gametic genome size of Ganaspis xanthopoda (Ashmead, 1896) is larger than those of the three Leptopilina species studied. The genome sizes of all parasitic wasps studied here are also larger than those known for all Drosophila species. Surprisingly, genome sizes of these Drosophila parasitoids exceed the average value known for all previously studied Hymenoptera. The haploid chromosome number of both Leptopilina heterotoma (Thomson, 1862) and L. victoriae Nordlander, 1980 is ten. A chromosomal fusion appears to have produced a distinct karyotype for L. boulardi (Barbotin, Carton et Keiner-Pillault, 1979) (n = 9), whose genome size is smaller than that of wasps of the L. heterotoma clade. Like L. boulardi, the haploid chromosome number for G. xanthopoda is also nine. Our studies reveal a positive, but nonlinear, correlation between the genome size and total chromosome length in Drosophila parasitoids. These Drosophila parasitoids differ widely in their host range, and utilize different infection strategies to overcome host defense. Their comparative genomics, in relation to their exceptionally well-characterized hosts, will prove to be valuable for understanding the molecular basis of the host-parasite arms race and how such mechanisms shape the genetic structures of insect communities.

  6. A comprehensive evaluation of rodent malaria parasite genomes and gene expression

    KAUST Repository

    Otto, Thomas D

    2014-10-30

    Background: Rodent malaria parasites (RMP) are used extensively as models of human malaria. Draft RMP genomes have been published for Plasmodium yoelii, P. berghei ANKA (PbA) and P. chabaudi AS (PcAS). Although availability of these genomes made a significant impact on recent malaria research, these genomes were highly fragmented and were annotated with little manual curation. The fragmented nature of the genomes has hampered genome wide analysis of Plasmodium gene regulation and function. Results: We have greatly improved the genome assemblies of PbA and PcAS, newly sequenced the virulent parasite P. yoelii YM genome, sequenced additional RMP isolates/lines and have characterized genotypic diversity within RMP species. We have produced RNA-seq data and utilized it to improve gene-model prediction and to provide quantitative, genome-wide, data on gene expression. Comparison of the RMP genomes with the genome of the human malaria parasite P. falciparum and RNA-seq mapping permitted gene annotation at base-pair resolution. Full-length chromosomal annotation permitted a comprehensive classification of all subtelomeric multigene families including the 'Plasmodium interspersed repeat genes' (pir). Phylogenetic classification of the pir family, combined with pir expression patterns, indicates functional diversification within this family. Conclusions: Complete RMP genomes, RNA-seq and genotypic diversity data are excellent and important resources for gene-function and post-genomic analyses and to better interrogate Plasmodium biology. Genotypic diversity between P. chabaudi isolates makes this species an excellent parasite to study genotype-phenotype relationships. The improved classification of multigene families will enhance studies on the role of (variant) exported proteins in virulence and immune evasion/modulation.

  7. Genomic characterization of a novel poxvirus from a flying fox: evidence for a new genus?

    Science.gov (United States)

    O'Dea, Mark A; Tu, Shin-Lin; Pang, Stanley; De Ridder, Thomas; Jackson, Bethany; Upton, Chris

    2016-09-01

    The carcass of an Australian little red flying fox (Pteropus scapulatus) which died following entrapment on a fence was submitted to the laboratory for Australian bat lyssavirus exclusion testing, which was negative. During post-mortem, multiple nodules were noted on the wing membranes, and therefore degenerate PCR primers targeting the poxvirus DNA polymerase gene were used to screen for poxviruses. The poxvirus PCR screen was positive and sequencing of the PCR product demonstrated very low, but significant, similarity with the DNA polymerase gene from members of the Poxviridae family. Next-generation sequencing of DNA extracted from the lesions returned a contig of 132 353 nucleotides (nt), which was further extended to produce a near full-length viral genome of 133 492 nt. Analysis of the genome revealed it to be AT-rich with inverted terminal repeats of at least 1314 nt and to contain 143 predicted genes. The genome contains a surprisingly large number (29) of genes not found in other poxviruses, one of which appears to be a homologue of the mammalian TNF-related apoptosis-inducing ligand (TRAIL) gene. Phylogenetic analysis indicates that the poxvirus described here is not closely related to any other poxvirus isolated from bats or other species, and that it likely should be placed in a new genus.

  8. Comparative genomics reveals insights into avian genome evolution and adaptation

    Science.gov (United States)

    Zhang, Guojie; Li, Cai; Li, Qiye; Li, Bo; Larkin, Denis M.; Lee, Chul; Storz, Jay F.; Antunes, Agostinho; Greenwold, Matthew J.; Meredith, Robert W.; Ödeen, Anders; Cui, Jie; Zhou, Qi; Xu, Luohao; Pan, Hailin; Wang, Zongji; Jin, Lijun; Zhang, Pei; Hu, Haofu; Yang, Wei; Hu, Jiang; Xiao, Jin; Yang, Zhikai; Liu, Yang; Xie, Qiaolin; Yu, Hao; Lian, Jinmin; Wen, Ping; Zhang, Fang; Li, Hui; Zeng, Yongli; Xiong, Zijun; Liu, Shiping; Zhou, Long; Huang, Zhiyong; An, Na; Wang, Jie; Zheng, Qiumei; Xiong, Yingqi; Wang, Guangbiao; Wang, Bo; Wang, Jingjing; Fan, Yu; da Fonseca, Rute R.; Alfaro-Núñez, Alonzo; Schubert, Mikkel; Orlando, Ludovic; Mourier, Tobias; Howard, Jason T.; Ganapathy, Ganeshkumar; Pfenning, Andreas; Whitney, Osceola; Rivas, Miriam V.; Hara, Erina; Smith, Julia; Farré, Marta; Narayan, Jitendra; Slavov, Gancho; Romanov, Michael N; Borges, Rui; Machado, João Paulo; Khan, Imran; Springer, Mark S.; Gatesy, John; Hoffmann, Federico G.; Opazo, Juan C.; Håstad, Olle; Sawyer, Roger H.; Kim, Heebal; Kim, Kyu-Won; Kim, Hyeon Jeong; Cho, Seoae; Li, Ning; Huang, Yinhua; Bruford, Michael W.; Zhan, Xiangjiang; Dixon, Andrew; Bertelsen, Mads F.; Derryberry, Elizabeth; Warren, Wesley; Wilson, Richard K; Li, Shengbin; Ray, David A.; Green, Richard E.; O’Brien, Stephen J.; Griffin, Darren; Johnson, Warren E.; Haussler, David; Ryder, Oliver A.; Willerslev, Eske; Graves, Gary R.; Alström, Per; Fjeldså, Jon; Mindell, David P.; Edwards, Scott V.; Braun, Edward L.; Rahbek, Carsten; Burt, David W.; Houde, Peter; Zhang, Yong; Yang, Huanming; Wang, Jian; Jarvis, Erich D.; Gilbert, M. Thomas P.; Wang, Jun

    2015-01-01

    Birds are the most species-rich class of tetrapod vertebrates and have wide relevance across many research fields. We explored bird macroevolution using full genomes from 48 avian species representing all major extant clades. The avian genome is principally characterized by its constrained size, which predominantly arose because of lineage-specific erosion of repetitive elements, large segmental deletions, and gene loss. Avian genomes furthermore show a remarkably high degree of evolutionary stasis at the levels of nucleotide sequence, gene synteny, and chromosomal structure. Despite this pattern of conservation, we detected many non-neutral evolutionary changes in protein-coding genes and noncoding regions. These analyses reveal that pan-avian genomic diversity covaries with adaptations to different lifestyles and convergent evolution of traits. PMID:25504712

  9. Genome-wide analysis of the WRKY transcription factors in Aegilops tauschii.

    Science.gov (United States)

    Ma, Jianhui; Zhang, Daijing; Shao, Yun; Liu, Pei; Jiang, Lina; Li, Chunxi

    2014-01-01

    The WRKY transcription factors (TFs) play important roles in responding to abiotic and biotic stress in plants. However, due to its unfinished genome sequencing, relatively few WRKY TFs with full-length coding sequences (CDSs) have been identified in wheat. Instead, the Aegilops tauschii genome, which is the D-genome progenitor of the hexaploid wheat genome, provides important resources for the discovery of new genes. In this study, we performed a bioinformatics analysis to identify WRKY TFs with full-length CDSs from the A. tauschii genome. A detailed evolutionary analysis for all these TFs was conducted, and quantitative real-time PCR was carried out to investigate the expression patterns of the abiotic stress-related WRKY TFs under different abiotic stress conditions in A. tauschii seedlings. A total of 93 WRKY TFs were identified from A. tauschii, and 79 of them were found to be newly discovered genes compared with wheat. Gene phylogeny, gene structure and chromosome location of the 93 WRKY TFs were fully analyzed. These studies provide a global view of the WRKY TFs from A. tauschii and a firm foundation for further investigations in both A. tauschii and wheat. © 2015 S. Karger AG, Basel.

  10. Full genome sequences and molecular characterization of tick-borne encephalitis virus strains isolated from human patients

    Czech Academy of Sciences Publication Activity Database

    Formanová, P.; Černý, Jiří; Černá Bolfíková, B.; Valdés, James J.; Kozlová, I.; Dzhioev, Y.; Růžek, Daniel

    2015-01-01

    Vol. 6, No. 1 (2015), pp. 38-46 ISSN 1877-959X R&D Projects: GA ČR GAP502/11/2116; GA ČR GAP302/12/2490 Institutional support: RVO:60077344 Keywords: tick-borne encephalitis virus * tick-borne encephalitis * genome analysis * human patients Subject RIV: EE - Microbiology, Virology Impact factor: 2.690, year: 2015

  11. Identification and Characterization of Microsatellite Markers Derived from the Whole Genome Analysis of Taenia solium.

    Directory of Open Access Journals (Sweden)

    Mónica J Pajuelo

    2015-12-01

    Infections with Taenia solium are the most common cause of adult acquired seizures worldwide, and are the leading cause of epilepsy in developing countries. A better understanding of the genetic diversity of T. solium will improve parasite diagnostics and the understanding of transmission pathways in endemic areas, thereby facilitating the design of future control measures and interventions. Microsatellite markers are useful genome features, which enable strain typing and identification in complex pathogen genomes. Here we describe microsatellite identification and characterization in T. solium, providing information that will assist in global efforts to control this important pathogen. For genome sequencing, T. solium cysts and proglottids were collected from Huancayo and Puno in Peru, respectively. Using next generation sequencing (NGS) and de novo assembly, we assembled two draft genomes and one hybrid genome. Microsatellite sequences were identified and 36 of them were selected for further analysis. Twenty T. solium isolates were collected from Tumbes in the northern region, and twenty from Puno in the southern region of Peru. The size-polymorphism of the selected microsatellites was determined with multi-capillary electrophoresis. We analyzed the association between microsatellite polymorphism and the geographic origin of the samples. The predicted size of the hybrid (proglottid genome combined with cyst genome) T. solium genome was 111 Mb with a GC content of 42.54%. A total of 7,979 contigs (>1,000 nt) were obtained. We identified 9,129 microsatellites in the Puno-proglottid genome and 9,936 in the Huancayo-cyst genome, with 5 or more repeats, ranging from mono- to hexa-nucleotide. Seven microsatellites were polymorphic and 29 were monomorphic within the analyzed isolates. T. solium tapeworms were classified into two genetic groups that correlated with the North/South geographic origin of the parasites. The availability of draft genomes for T. solium represents a

  12. High resolution aquifer characterization using crosshole GPR full-waveform tomography

    Science.gov (United States)

    Gueting, N.; Vienken, T.; Klotzsche, A.; Van Der Kruk, J.; Vanderborght, J.; Caers, J.; Vereecken, H.; Englert, A.

    2016-12-01

    Limited knowledge about the spatial distribution of aquifer properties typically constrains our ability to predict subsurface flow and transport. Here, we investigate the value of using high resolution full-waveform inversion of cross-borehole ground penetrating radar (GPR) data for aquifer characterization. By stitching together GPR tomograms from multiple adjacent crosshole planes, we are able to image, with a decimeter scale resolution, the dielectric permittivity and electrical conductivity of an alluvial aquifer along cross-sections of 50 m length and 10 m depth. A logistic regression model is employed to predict the spatial distribution of lithological facies on the basis of the GPR results. Vertical profiles of porosity and hydraulic conductivity from direct-push, flowmeter and grain size data suggest that the GPR predicted facies classification is meaningful with regard to porosity and hydraulic conductivity, even though the distributions of individual facies show some overlap and the absolute hydraulic conductivities from the different methods (direct-push, flowmeter, grain size) differ up to approximately one order of magnitude. Comparison of the GPR predicted facies architecture with tracer test data suggests that the plume splitting observed in a tracer experiment was caused by a hydraulically low-conductive sand layer with a thickness of only a few decimeters. Because this sand layer is identified by GPR full-waveform inversion but not by conventional GPR ray-based inversion we conclude that the improvement in spatial resolution due to full-waveform inversion is crucial to detect small-scale aquifer structures that are highly relevant for solute transport.
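The record above mentions a logistic regression model that predicts lithological facies from the GPR results. A minimal Python sketch of two-class logistic regression fitted by gradient descent follows, using only the standard library. The two features (standardized permittivity and conductivity), the six training points, and the class labels are invented for illustration and are not taken from the study, whose actual model and number of facies may differ.

```python
import math


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, features) -> float:
    """Probability that a sample belongs to class 1."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return sigmoid(z)


def fit_logistic(X, y, learning_rate=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n_features = len(X[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for features, label in zip(X, y):
            error = predict_proba(weights, bias, features) - label
            for j in range(n_features):
                grad_w[j] += error * features[j]
            grad_b += error
        weights = [w - learning_rate * g / len(X) for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / len(X)
    return weights, bias


# Hypothetical, already standardized (permittivity, conductivity) pairs.
X = [(-1.2, -0.8), (-0.9, -1.1), (-1.0, -0.5),
     (1.1, 0.9), (0.8, 1.2), (1.0, 0.6)]
y = [0, 0, 0, 1, 1, 1]  # 0 and 1 stand for two made-up facies classes

w, b = fit_logistic(X, y)
print(f"P(class 1 | low values)  = {predict_proba(w, b, (-1.0, -1.0)):.3f}")
print(f"P(class 1 | high values) = {predict_proba(w, b, (1.0, 1.0)):.3f}")
```

In practice a library implementation with regularization and a multi-class formulation would likely be preferred; the sketch only shows the mapping from tomogram-derived values to a class probability.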

  13. De Novo Genome and Transcriptome Assembly of the Canadian Beaver (Castor canadensis)

    Directory of Open Access Journals (Sweden)

    Si Lok

    2017-02-01

    The Canadian beaver (Castor canadensis) is the largest indigenous rodent in North America. We report a draft annotated assembly of the beaver genome, the first for a large rodent and the first mammalian genome assembled directly from uncorrected and moderate coverage (< 30 ×) long reads generated by single-molecule sequencing. The genome size is 2.7 Gb estimated by k-mer analysis. We assembled the beaver genome using the new Canu assembler optimized for noisy reads. The resulting assembly was refined using Pilon supported by short reads (80 ×) and checked for accuracy by congruency against an independent short read assembly. We scaffolded the assembly using the exon–gene models derived from 9805 full-length open reading frames (FL-ORFs) constructed from the beaver leukocyte and muscle transcriptomes. The final assembly comprised 22,515 contigs with an N50 of 278,680 bp and an N50-scaffold of 317,558 bp. Maximum contig and scaffold lengths were 3.3 and 4.2 Mb, respectively, with a combined scaffold length representing 92% of the estimated genome size. The completeness and accuracy of the scaffold assembly were demonstrated by the precise exon placement for 91.1% of the 9805 assembled FL-ORFs and 83.1% of the BUSCO (Benchmarking Universal Single-Copy Orthologs) gene set used to assess the quality of genome assemblies. Well-represented were genes involved in dentition and enamel deposition, defining characteristics of rodents with which the beaver is well-endowed. The study provides insights for genome assembly and an important genomics resource for Castoridae and rodent evolutionary biology.
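The N50 values quoted in the record above summarize assembly contiguity: N50 is the length of the shortest contig (or scaffold) in the smallest set of longest contigs that together cover at least half of the total assembly length. A minimal Python sketch follows; the function name `n50` and the toy contig lengths are illustrative assumptions, not values from the beaver assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths."""
    if not lengths:
        raise ValueError("no lengths given")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first contig that brings the running sum to at least
        # half of the total length defines the N50.
        if running >= half_total:
            return length


# Hypothetical toy assembly: total length 30, half is 15.
# The longest contigs 10 + 6 = 16 reach that half, so N50 is 6.
toy_contigs = [2, 3, 4, 5, 6, 10]
print("N50:", n50(toy_contigs))
```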

  14. Stable preparations of tyrosine hydroxylase provide the solution structure of the full-length enzyme

    Science.gov (United States)

    Bezem, Maria T.; Baumann, Anne; Skjærven, Lars; Meyer, Romain; Kursula, Petri; Martinez, Aurora; Flydal, Marte I.

    2016-01-01

    Tyrosine hydroxylase (TH) catalyzes the rate-limiting step in the biosynthesis of catecholamine neurotransmitters. TH is a highly complex enzyme at mechanistic, structural, and regulatory levels, and the preparation of kinetically and conformationally stable enzyme for structural characterization has been challenging. Here, we report on improved protocols for purification of recombinant human TH isoform 1 (TH1), which provide large amounts of pure, stable, active TH1 with an intact N-terminus. TH1 purified through fusion with a His-tagged maltose-binding protein on amylose resin was representative of the iron-bound functional enzyme, showing high activity and stabilization by the natural feedback inhibitor dopamine. TH1 purified through fusion with a His-tagged ZZ domain on TALON is remarkably stable, as it was partially inhibited by resin-derived cobalt. This more stable enzyme preparation provided high-quality small-angle X-ray scattering (SAXS) data and reliable structural models of full-length tetrameric TH1. The SAXS-derived model reveals an elongated conformation (Dmax = 20 nm) for TH1, different arrangement of the catalytic domains compared with the crystal structure of truncated forms, and an N-terminal region with an unstructured tail that hosts the phosphorylation sites and a separated Ala-rich helical motif that may have a role in regulation of TH by interacting with binding partners. PMID:27462005

  15. Quench start localization in full-length SSC R&D dipoles

    International Nuclear Information System (INIS)

    Devred, A.; Chapman, M.; Cortella, J.; Desportes, A.; Kaugerts, J.; Kirk, T.; Mirk, K.; Schermer, R.; Tompkins, J.C.; Turner, J.; Bleadon, M.; Brown, B.C.; Hanft, R.; Kuchnir, M.; Lamm, M.; Mantsch, P.; Mazur, P.O.; Orris, D.; Peoples, J.; Strait, J.; Tool, G.; Caspi, S.; Gilbert, W.; Meuser, R.; Peters, C.; Rechen, J.; Royet, J.; Scanlan, R.; Taylor, C.; Zbasnik, J.

    1989-04-01

    Full-length SSC R&D dipole magnets instrumented with four voltage taps on each turn of the inner quarter coils have been tested. These voltage taps enable accurate location of the point at which the quenches start and detailed studies of quench development in the coil. Attention here is focused on localizing the quench source. After recalling the basic mechanism of a quench (why it occurs and how it propagates), the method of quench origin analysis is described: the quench propagation velocity on the turn where the quench occurs is calculated, and the quench location is then verified by reiterating the analysis on the adjacent turns. Last, the velocity value, which appears to be higher than previously measured, is discussed.

  16. Assessing Telomere Length Using Surface Enhanced Raman Scattering

    Science.gov (United States)

    Zong, Shenfei; Wang, Zhuyuan; Chen, Hui; Cui, Yiping

    2014-11-01

    Telomere length can provide valuable insight into telomeres and telomerase related diseases, including cancer. Here, we present a brand-new optical telomere length measurement protocol using surface enhanced Raman scattering (SERS). In this protocol, two single strand DNA are used as SERS probes. They are labeled with two different Raman molecules and can specifically hybridize with telomeres and centromere, respectively. First, genome DNA is extracted from cells. Then the telomere and centromere SERS probes are added into the genome DNA. After hybridization with genome DNA, excess SERS probes are removed by magnetic capturing nanoparticles. Finally, the genome DNA with SERS probes attached is dropped onto a SERS substrate and subjected to SERS measurement. Longer telomeres result in more attached telomere probes, thus a stronger SERS signal. Consequently, SERS signal can be used as an indicator of telomere length. Centromere is used as the inner control. By calibrating the SERS intensity of telomere probe with that of the centromere probe, SERS based telomere measurement is realized. This protocol does not require polymerase chain reaction (PCR) or electrophoresis procedures, which greatly simplifies the detection process. We anticipate that this easy-operation and cost-effective protocol is a fine alternative for the assessment of telomere length.
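The calibration step described in the record above, where the telomere probe signal is normalized by the centromere probe signal used as the inner control, can be sketched as a simple ratio. The Python example below is illustrative only: the function name, the sample names, and the intensity values are invented, and the published protocol may apply additional corrections.

```python
def relative_telomere_signal(telomere_intensity: float,
                             centromere_intensity: float) -> float:
    """Normalize the telomere-probe SERS intensity by the
    centromere-probe intensity (the inner control)."""
    if centromere_intensity <= 0:
        raise ValueError("centromere intensity must be positive")
    return telomere_intensity / centromere_intensity


# Hypothetical peak intensities (arbitrary units) for two samples.
samples = {
    "sample_A": (300.0, 150.0),
    "sample_B": (120.0, 160.0),
}
for name, (telomere, centromere) in samples.items():
    ratio = relative_telomere_signal(telomere, centromere)
    print(f"{name}: relative telomere signal = {ratio:.2f}")
```

A higher ratio would then be read as a longer average telomere length relative to the other sample, in line with the reasoning in the abstract.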

  17. Genome-Based Comparison of Clostridioides difficile: Average Amino Acid Identity Analysis of Core Genomes.

    Science.gov (United States)

    Cabal, Adriana; Jun, Se-Ran; Jenjaroenpun, Piroon; Wanchai, Visanu; Nookaew, Intawat; Wongsurawat, Thidathip; Burgess, Mary J; Kothari, Atul; Wassenaar, Trudy M; Ussery, David W

    2018-02-14

    Infections due to Clostridioides difficile (previously known as Clostridium difficile) are a major problem in hospitals, where cases can be caused by community-acquired strains as well as by nosocomial spread. Whole genome sequences from clinical samples contain a lot of information, but it needs to be analyzed and compared in such a way that the outcome is useful for clinicians or epidemiologists. Here, we compare 663 publicly available complete genome sequences of C. difficile using average amino acid identity (AAI) scores. This analysis revealed that most of these genomes (640, 96.5%) clearly belong to the same species, while the remaining 23 genomes produce four distinct clusters within the Clostridioides genus. The main C. difficile cluster can be further divided into sub-clusters, depending on the chosen cutoff. We demonstrate that MLST, whether based on partial or full gene length, results in biased estimates of genetic differences and does not capture the true degree of similarity or differences of complete genomes. Presence of genes coding for C. difficile toxins A and B (ToxA/B), as well as the binary C. difficile toxin (CDT), was deduced from their unique PfamA domain architectures. Out of the 663 C. difficile genomes, 535 (80.7%) contained at least one copy of ToxA or ToxB, while these genes were missing from 128 genomes. Although some clusters were enriched for toxin presence, these genes are variably present in a given genetic background. The CDT genes were found in 191 genomes, which were restricted to a few clusters only, and only one cluster lacked the toxin A/B genes consistently. A total of 310 genomes contained ToxA/B without CDT (47%). Further, published metagenomic data from stools were used to assess the presence of C. difficile sequences in blinded cases of C. difficile infection (CDI) and controls, to test if metagenomic analysis is sensitive enough to detect the pathogen, and to establish strain relationships between cases from the same

  18. Isolation and characterization of repeat elements of the oak genome and their application in population analysis

    International Nuclear Information System (INIS)

    Fluch, S.; Burg, K.

    1998-01-01

    Four minisatellite sequence elements have been identified and isolated from the genome of the oak species Quercus petraea and Quercus robur. Minisatellites 1 and 2 are putative members of repeat families, while minisatellites 3 and 4 show repeat length variation among individuals of test populations. A 590 base pair (bp) long element has also been identified which reveals individual-specific autoradiographic patterns when used as probe in Southern hybridisations of genomic oak DNA. (author)

  19. Pharmacological efficacy of anti-IL-1β scFv, Fab and full-length antibodies in treatment of rheumatoid arthritis.

    Science.gov (United States)

    Qi, Jianying; Ye, Xianlong; Ren, Guiping; Kan, Fangming; Zhang, Yu; Guo, Mo; Zhang, Zhiyi; Li, Deshan

    2014-02-01

    Rheumatoid arthritis (RA) is a chronic autoimmune inflammatory disease that mainly causes the synovial joint inflammation and cartilage destruction. Interleukin-1β (IL-1β) is an important proinflammatory cytokine involved in the pathogenesis of RA. In this study, we constructed and expressed anti-IL-1β-full-length antibody in CHO-K1-SV, anti-IL-1β-Fab and anti-IL-1β-scFv in Rosetta. We compared the therapeutic efficacy of three anti-IL-1β antibodies for CIA mice. Mice with CIA were subcutaneously injected with humanized anti-IL-1β-scFv, anti-IL-1β-Fab or anti-IL-1β-full-length antibody. The effects of treatment were determined by arthritis severity score, autoreactive humoral, cellular immune responses, histological lesion and cytokines production. Compared with anti-IL-1β-scFv treatments, anti-IL-1β-Fab and anti-IL-1β-full-length antibody therapy resulted in more significant effect in alleviating the severity of arthritis by preventing bone damage and cartilage destruction, reducing humoral and cellular immune responses, and down-regulating the expression of IL-1β, IL-6, IL-2, IFN-γ, TNF-α and MMP-3 in inflammatory tissue. The therapeutic effects of anti-IL-1β-Fab and anti-IL-1β-full-length antibodies on CIA mice had no significant difference. However, production of anti-IL-1β-full-length antibody in eukaryotic system is, in general, time-consuming and more expensive than that of anti-IL-1β-Fab in prokaryotic systems. In conclusion, as a small molecule antibody, anti-IL-1β-Fab is an ideal candidate for RA therapy. Copyright © 2013 Elsevier Ltd. All rights reserved.

  20. Enhancing faba bean (Vicia faba L.) genome resources.

    Science.gov (United States)

    Cooper, James W; Wilson, Michael H; Derks, Martijn F L; Smit, Sandra; Kunert, Karl J; Cullis, Christopher; Foyer, Christine H

    2017-04-01

    Grain legume improvement is currently impeded by a lack of genomic resources. The paucity of genome information for faba bean can be attributed to the intrinsic difficulties of assembling/annotating its giant (~13 Gb) genome. In order to address this challenge, RNA-sequencing analysis was performed on faba bean (cv. Wizard) leaves. Read alignment to the faba bean reference transcriptome identified 16 300 high quality unigenes. In addition, Illumina paired-end sequencing was used to establish a baseline for genomic information assembly. Genomic reads were assembled de novo into contigs with a size range of 50-5000 bp. Over 85% of sequences did not align to known genes, of which ~10% could be aligned to known repetitive genetic elements. Over 26 000 of the reference transcriptome unigenes could be aligned to DNA-sequencing (DNA-seq) reads with high confidence. Moreover, this comparison identified 56 668 potential splice points in all identified unigenes. Sequence length data were extended at 461 putative loci through alignment of DNA-seq contigs to full-length, publicly available linkage marker sequences. Reads also yielded coverages of 3466× and 650× for the chloroplast and mitochondrial genomes, respectively. Inter- and intraspecies organelle genome comparisons established core legume organelle gene sets, and revealed polymorphic regions of faba bean organelle genomes. © The Author 2017. Published by Oxford University Press on behalf of the Society for Experimental Biology.

  1. Whole Genome Sequencing of Enterovirus species C Isolates by High-throughput Sequencing: Development of Generic Primers

    Directory of Open Access Journals (Sweden)

    Maël Bessaud

    2016-08-01

    Enteroviruses are among the most common viruses infecting humans and can cause diverse clinical syndromes ranging from minor febrile illness to severe and potentially fatal diseases. Enterovirus species C (EV-C) consists of more than 20 types, among which the 3 serotypes of polioviruses, the etiological agents of poliomyelitis, are included. Biodiversity and evolution of EV-C genomes are shaped by frequent recombination events. Therefore, identification and characterization of circulating EV-C strains require the sequencing of different genomic regions. A simple method was developed to quickly sequence the entire genome of EV-C isolates. Four overlapping fragments were produced separately by RT-PCR performed with generic primers. The four amplicons were then pooled and purified prior to sequencing by a high-throughput technique. The method was assessed on a panel of EV-Cs belonging to a wide range of types. It can be used to determine full-length genome sequences through de novo assembly of thousands of reads. It was also able to discriminate reads from closely related viruses in mixtures. By decreasing the workload compared to classical Sanger-based techniques, this method will serve as a valuable tool for sequencing large panels of EV-Cs isolated in cell cultures during environmental surveillance or from patients, including vaccine-derived polioviruses.

  2. Biochemical characterization of a recombinant Japanese encephalitis virus RNA-dependent RNA polymerase

    Directory of Open Access Journals (Sweden)

    Kim Chan-Mi

    2007-07-01

    Background: Japanese encephalitis virus (JEV) NS5 is a viral nonstructural protein that carries both methyltransferase and RNA-dependent RNA polymerase (RdRp) domains. It is a key component of the viral RNA replicase complex that presumably includes other viral nonstructural and cellular proteins. The biochemical properties of JEV NS5 have not been characterized due to the lack of a robust in vitro RdRp assay system, and the molecular mechanisms for the initiation of RNA synthesis by JEV NS5 remain to be elucidated. Results: To characterize the biochemical properties of JEV RdRp, we expressed in Escherichia coli and purified an enzymatically active full-length recombinant JEV NS5 protein with a hexahistidine tag at the N-terminus. The purified NS5 protein, but not the mutant NS5 protein with an Ala substitution at the first Asp of the RdRp-conserved GDD motif, exhibited template- and primer-dependent RNA synthesis activity using a poly(A) RNA template. The NS5 protein was able to use both plus- and minus-strand 3'-untranslated regions of the JEV genome as templates in the absence of a primer, with the latter RNA being a better template. Analysis of the RNA synthesis initiation site using the 3'-end 83 nucleotides of the JEV genome as a minimal RNA template revealed that the NS5 protein specifically initiates RNA synthesis from an internal site, U81, at the two nucleotides upstream of the 3'-end of the template. Conclusion: As a first step toward the understanding of the molecular mechanisms for JEV RNA replication and ultimately for the in vitro reconstitution of viral RNA replicase complex, we for the first time established an in vitro JEV RdRp assay system with a functional full-length recombinant JEV NS5 protein and characterized the mechanisms of RNA synthesis from nonviral and viral RNA templates. The full-length recombinant JEV NS5 will be useful for the elucidation of the structure-function relationship of this enzyme and for the

  3. Lyso-myristoyl phosphatidylcholine micelles sustain the activity of Dengue non-structural (NS) protein 3 protease domain fused with the full-length NS2B.

    Science.gov (United States)

    Huang, Qiwei; Li, Qingxin; Joy, Joma; Chen, Angela Shuyi; Ruiz-Carrillo, David; Hill, Jeffrey; Lescar, Julien; Kang, Congbao

    2013-12-01

    Dengue virus (DENV), a member of the flavivirus genus, affects 50-100 million people in tropical and sub-tropical regions. The DENV protease domain is located at the N-terminus of the NS3 protease and requires for its enzymatic activity a hydrophilic segment of the NS2B that acts as a cofactor. The protease is an important antiviral drug target because it plays a crucial role in virus replication by cleaving the genome-coded polypeptide into mature functional proteins. Currently, there are no drugs to inhibit DENV protease activity. Most structural and functional studies have been conducted using protein constructs containing the NS3 protease domain connected to a soluble segment of the NS2B membrane protein via a nine-residue linker. For in vitro structural and functional studies, it would be useful to produce a natural form of the DENV protease containing the NS3 protease domain and the full-length NS2B protein. Herein, we describe the expression and purification of a natural form of DENV protease (NS2BFL-NS3pro) containing the full-length NS2B protein and the protease domain of NS3 (NS3pro). The protease was expressed and purified in detergent micelles necessary for its folding. Our results show that this purified protein was active in detergent micelles such as lyso-myristoyl phosphatidylcholine (LMPC). These findings should facilitate further structural and functional studies of the protease and will facilitate drug discovery targeting DENV. Copyright © 2013 Elsevier Inc. All rights reserved.

  4. Full Mitochondrial Genome Sequence of the Sugar Beet Wireworm Limonius californicus (Coleoptera: Elateridae), a Common Agricultural Pest.

    Science.gov (United States)

    Gerritsen, Alida T; New, Daniel D; Robison, Barrie D; Rashed, Arash; Hohenlohe, Paul; Forney, Larry; Rashidi, Mahnaz; Wilson, Cathy M; Settles, Matthew L

    2016-01-21

    We report here the full mitochondrial genome sequence of Limonius californicus, a species of click beetle that is an agricultural pest in its larval form. The circular genome is 16.5 kb and contains 13 protein-coding genes, 2 rRNA genes, and 22 tRNA genes. Copyright © 2016 Gerritsen et al.

  5. On the normalization of the minimum free energy of RNAs by sequence length.

    Directory of Open Access Journals (Sweden)

    Edoardo Trotta

    The minimum free energy (MFE) of ribonucleic acids (RNAs) increases at an apparent linear rate with sequence length. Simple indices, obtained by dividing the MFE by the number of nucleotides, have been used for a direct comparison of the folding stability of RNAs of various sizes. Although this normalization procedure has been used in several studies, the relationship between normalized MFE and length has not yet been investigated in detail. Here, we demonstrate that the variation of MFE with sequence length is not linear and is significantly biased by the mathematical formula used for the normalization procedure. For this reason, the normalized MFEs strongly decrease as hyperbolic functions of length and produce unreliable results when applied to the comparison of sequences of different sizes. We also propose a simple modification of the normalization formula that corrects the bias, enabling the use of the normalized MFE for RNAs longer than 40 nt. Using the new corrected normalized index, we analyzed the folding free energies of different human RNA families, showing that most of them present an average MFE density more negative than expected for a typical genomic sequence. Furthermore, we found that a well-defined and restricted range of MFE density characterizes each RNA family, suggesting the use of our corrected normalized index to improve RNA prediction algorithms. Finally, in coding and functional human RNAs the MFE density appears scarcely correlated with sequence length, consistent with a negligible role of thermodynamic stability demands in determining RNA size.
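
The length bias described in this abstract can be illustrated with a toy model. The sketch below assumes, purely for illustration, an affine trend MFE(L) = a * (L - L0) with made-up constants a and L0 (neither value comes from the paper). Dividing by L gives a * (1 - L0 / L), which varies hyperbolically with L, while dividing by (L - L0) removes the length dependence. This is not necessarily the corrected index the author proposes.

```python
def mfe_model(length, slope=-0.3, offset=10.0):
    """Toy affine MFE trend (kcal/mol); slope and offset are hypothetical values."""
    return slope * (length - offset)


def naive_index(mfe, length):
    """MFE divided by the number of nucleotides."""
    return mfe / length


def shifted_index(mfe, length, offset=10.0):
    """MFE divided by (length - offset); only defined for length > offset."""
    if length <= offset:
        raise ValueError("length must exceed offset")
    return mfe / (length - offset)


lengths = [50, 100, 500, 1000]
# The naive index drifts toward the slope as length grows (hyperbolic bias).
naive = [naive_index(mfe_model(n), n) for n in lengths]
# The shifted index stays at the slope for every length in this toy model.
shifted = [shifted_index(mfe_model(n), n) for n in lengths]
```

In this toy model the naive index becomes steadily more negative with length even though the underlying per-nucleotide slope never changes, which is the kind of artifact the abstract warns about.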

  6. Cocrystallization studies of full-length recombinant butyrylcholinesterase (BChE) with cocaine

    Energy Technology Data Exchange (ETDEWEB)

    Asojo, Oluwatoyin Ajibola; Asojo, Oluyomi Adebola; Ngamelue, Michelle N.; Homma, Kohei; Lockridge, Oksana (Nebraska-Med)

    2011-09-16

    Human butyrylcholinesterase (BChE; EC 3.1.1.8) is a 340 kDa tetrameric glycoprotein that is present in human serum at about 5 mg l⁻¹ and has well documented therapeutic effects on cocaine toxicity. BChE holds promise as a therapeutic that reduces and finally eliminates the rewarding effects of cocaine, thus weaning an addict from the drug. There have been extensive computational studies of cocaine hydrolysis by BChE. Since there are no reported structures of BChE with cocaine or any of the hydrolysis products, full-length monomeric recombinant wild-type BChE was cocrystallized with cocaine. The refined 3 Å resolution structure appears to retain the hydrolysis product benzoic acid in sufficient proximity to form a hydrogen bond to the active-site Ser198.

  7. Genomic characterization of Haemophilus parasuis SH0165, a highly virulent strain of serovar 5 prevalent in China.

    Directory of Open Access Journals (Sweden)

    Zhuofei Xu

    Haemophilus parasuis can be either a commensal bacterium of the porcine respiratory tract or an opportunistic pathogen causing Glässer's disease, a severe systemic disease that has led to significant economic losses in the pig industry worldwide. We determined the complete genomic sequence of H. parasuis SH0165, a highly virulent strain of serovar 5, which was isolated from a hog pen in North China. The single circular chromosome was 2,269,156 base pairs in length and contained 2,031 protein-coding genes. Together with the full spectrum of genes detected by the analysis of metabolic pathways, we confirmed that H. parasuis generates ATP via both fermentation and respiration, and possesses an intact TCA cycle for anabolism. In addition to possessing the complete pathway essential for the biosynthesis of heme, this pathogen was also found to be well-equipped with different iron acquisition systems, such as the TonB system and ABC-type transport complexes, to overcome iron limitation during infection and persistence. We identified a number of genes encoding potential virulence factors, such as type IV fimbriae and surface polysaccharides. Analysis of the genome confirmed that H. parasuis is naturally competent, as genes related to DNA uptake are present. A nine-mer DNA uptake signal sequence (ACAAGCGGT), identical to that found in Actinobacillus pleuropneumoniae and Mannheimia haemolytica, followed by similar downstream motifs, was identified in the SH0165 genome. Genomic and phylogenetic comparisons with other Pasteurellaceae species further indicated that H. parasuis was closely related to another swine-pathogenic bacterium, A. pleuropneumoniae. The comprehensive genetic analysis presented here provides a foundation for future research on the metabolism, natural competence and virulence of H. parasuis.
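
Locating a short signal such as the nine-mer ACAAGCGGT amounts to an exact-match scan of both strands. The following is a minimal sketch of such a scan, not the authors' pipeline; the function names and the toy sequence are hypothetical, and it treats the sequence as linear, so matches spanning the origin of a circular chromosome would be missed.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_motif(genome, motif="ACAAGCGGT"):
    """Return sorted (0-based position, strand) pairs for every exact match.

    Minus-strand hits are reported at the position where the reverse
    complement of the motif starts on the plus strand.
    """
    genome = genome.upper()
    hits = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        start = genome.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = genome.find(pattern, start + 1)  # allow overlapping matches
    return sorted(hits)


# Toy sequence (hypothetical), with one copy of the motif on each strand.
toy = "GG" + "ACAAGCGGT" + "TTT" + "ACCGCTTGT" + "A"
toy_hits = find_motif(toy)
```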

  8. Multiple different defense mechanisms are activated in the young transgenic tobacco plants which express the full length genome of the Tobacco mosaic virus, and are resistant against this virus.

    Science.gov (United States)

    Jada, Balaji; Soitamo, Arto J; Siddiqui, Shahid Aslam; Murukesan, Gayatri; Aro, Eva-Mari; Salakoski, Tapio; Lehto, Kirsi

    2014-01-01

    Previously described transgenic tobacco lines express the full length infectious Tobacco mosaic virus (TMV) genome under the 35S promoter (Siddiqui et al., 2007. Mol Plant Microbe Interact, 20: 1489-1494). Through their young stages these plants exhibit strong resistance against both the endogenously expressed and exogenously inoculated TMV, but at the age of about 7-8 weeks they break into TMV infection, with typical severe virus symptoms. Infections with some other viruses (Potato viruses Y, A, and X) induce the breaking of the TMV resistance and lead to synergistic proliferation of both viruses. To deduce the gene functions related to this early resistance, we have performed microarray analysis of the transgenic plants during the early resistant stage, and after the resistance break, and also of TMV-infected wild type tobacco plants. Comparison of these transcriptomes to those of corresponding wild type healthy plants indicated that 1362, 1150 and 550 transcripts were up-regulated in the transgenic plants before and after the resistance break, and in the TMV-infected wild type tobacco plants, respectively, and 1422, 1200 and 480 transcripts were down-regulated in these plants, respectively. These transcriptome alterations were distinctly different between the three types of plants, and it appears that several different mechanisms, such as the enhanced expression of the defense, hormone signaling and protein degradation pathways contributed to the TMV-resistance in the young transgenic plants. In addition to these alterations, we also observed a distinct and unique gene expression alteration in these plants, which was the strong suppression of the translational machinery. This may also contribute to the resistance by slowing down the synthesis of viral proteins. Viral replication potential may also be suppressed, to some extent, by the reduction of the translation initiation and elongation factors eIF-3 and eEF1A and B, which are required for the TMV replication

  9. Quality scores for 32,000 genomes

    DEFF Research Database (Denmark)

    Land, Miriam L.; Hyatt, Doug; Jun, Se-Ran

    2014-01-01

    Background More than 80% of the microbial genomes in GenBank are of ‘draft’ quality (12,553 draft vs. 2,679 finished, as of October, 2013). We have examined all the microbial DNA sequences available for complete, draft, and Sequence Read Archive genomes in GenBank as well as three other major...... public databases, and assigned quality scores for more than 30,000 prokaryotic genome sequences. Results Scores were assigned using four categories: the completeness of the assembly, the presence of full-length rRNA genes, tRNA composition and the presence of a set of 102 conserved genes in prokaryotes....... Most (~88%) of the genomes had quality scores of 0.8 or better and can be safely used for standard comparative genomics analysis. We compared genomes across factors that may influence the score. We found that although sequencing depth coverage of over 100x did not ensure a better score, sequencing read...

  10. Characterization of genome in tetraploid StY species of Elymus (Triticeae: Poaceae) using sequential FISH and GISH.

    Science.gov (United States)

    Liu, Ruijuan; Wang, Richard R-C; Yu, Feng; Lu, Xingwang; Dou, Quanwen

    2017-08-01

    Genomes of ten species of Elymus, either presumed or known as tetraploid StY, were characterized using fluorescence in situ hybridization (FISH) and genomic in situ hybridization (GISH). These tetraploid species could be grouped into three categories. Type I included the species reported to have the StY genome (Roegneria pendulina, R. nutans, R. glaberrima, R. ciliaris, and Elymus nevskii) and the species presumed to have it (R. sinica, R. breviglumis, and R. dura), whose genome could be separated into two sets based on different GISH intensities. The Type I genome constitution was deemed to be putative StY. The St genome was mainly characterized by intense hybridization with pAs1, fewer AAG sites, and linked distribution of 5S rDNA and 18S-26S rDNA, while the Y genome showed less intense hybridization with pAs1, more varied AAG sites, and isolated distribution of 5S rDNA and 18S-26S rDNA. Nevertheless, further genomic variations were detected among the different StY species. Type II included E. alashanicus, whose genome could be easily separated based on GISH pattern. FISH and GISH patterns suggested that E. alashanicus comprised a modified St genome and an unknown genome. Type III included E. longearistatus, whose genome could not be separated by GISH and was designated as St l Y l . Notably, a close relationship between S l and Y l genomes was observed.

  11. Failure Mode and Effects Analysis (FMEA) of the solid state full length rod control system

    International Nuclear Information System (INIS)

    Shopsky, W.E.

    1977-01-01

    The Full Length Rod Control System (FLRCS) controls the power to the rod drive mechanisms for rod movement in response to signals received from the Reactor Control System or from signals generated through Reactor Operator action. Rod movement is used to control reactivity of the reactor during plant operation. The Full Length Rod Control System is designed to perform its reactivity control function in conjunction with the Reactor Control and Protection System, to maintain the reactor core within design safety limits. By the use of a Failure Mode and Effects Analysis, it is shown that the FLRCS will perform its reactivity control functions considering the loss of single active components. That is, sufficient fault-limiting control circuits are provided, which block control rod movement and/or indicate the presence of a fault condition at the Control Board. Reactor operator action or automatic reactor trip will thus mitigate the consequences of potential failure of the FLRCS. The analysis also qualitatively demonstrates the reliability of the FLRCS to perform its intended function.

  12. Informational laws of genome structures

    Science.gov (United States)

    Bonnici, Vincenzo; Manca, Vincenzo

    2016-06-01

    In recent years, the analysis of genomes by means of strings of length k occurring in the genomes, called k-mers, has provided important insights into the basic mechanisms and design principles of genome structures. In the present study, we focus on the proper choice of the value of k for applying information theoretic concepts that express intrinsic aspects of genomes. The value k = lg2(n), where n is the genome length, is determined to be the best choice in the definition of some genomic informational indexes that are studied and computed for seventy genomes. These indexes, which are based on information entropies and on suitable comparisons with random genomes, suggest five informational laws, which all of the considered genomes obey. Moreover, an informational genome complexity measure is proposed, which is a generalized logistic map that balances entropic and anti-entropic components of genomes and is related to their evolutionary dynamics. Finally, applications to computational synthetic biology are briefly outlined.
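
The choice of k from the genome length and an entropy computed over k-mers can be sketched as follows. This is an illustrative sketch only: it reads lg2(n) as the base-2 logarithm of n, rounds to the nearest integer, and uses the plain Shannon entropy of the empirical distribution of overlapping k-mers. All three choices are assumptions, and the paper's own indexes may be defined differently.

```python
import math
from collections import Counter


def choose_k(n):
    """k = log2(n), rounded to the nearest integer (the rounding rule is an assumption)."""
    return max(1, round(math.log2(n)))


def kmer_entropy(genome, k):
    """Shannon entropy, in bits, of the empirical distribution of overlapping k-mers."""
    counts = Counter(genome[i:i + k] for i in range(len(genome) - k + 1))
    total = sum(counts.values())
    return -sum(c / total * math.log2(c / total) for c in counts.values())


# Toy "genome" (hypothetical): n = 16, so k = 4.
toy = "ACGT" * 4
k = choose_k(len(toy))
h = kmer_entropy(toy, k)
```

For a real genome of a few million bases this rule gives k in the low twenties, so counting k-mers in a dictionary, as done here, is feasible but memory-hungry.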

  13. A Novel Strategy to Engineer Pre-Vascularized Full-Length Dental Pulp-like Tissue Constructs.

    Science.gov (United States)

    Athirasala, Avathamsa; Lins, Fernanda; Tahayeri, Anthony; Hinds, Monica; Smith, Anthony J; Sedgley, Christine; Ferracane, Jack; Bertassoni, Luiz E

    2017-06-12

    The requirement for immediate vascularization of engineered dental pulp poses a major hurdle towards successful implementation of pulp regeneration as an effective therapeutic strategy for root canal therapy, especially in adult teeth. Here, we demonstrate a novel strategy to engineer pre-vascularized, cell-laden hydrogel pulp-like tissue constructs in full-length root canals for dental pulp regeneration. We utilized gelatin methacryloyl (GelMA) hydrogels with tunable physical and mechanical properties to determine the microenvironmental conditions (microstructure, degradation, swelling and elastic modulus) that enhanced viability, spreading and proliferation of encapsulated odontoblast-like cells (OD21), and the formation of endothelial monolayers by endothelial colony forming cells (ECFCs). GelMA hydrogels with higher polymer concentration (15% w/v) and stiffness enhanced OD21 cell viability, spreading and proliferation, as well as endothelial cell spreading and monolayer formation. We then fabricated pre-vascularized, full-length, dental pulp-like tissue constructs by dispensing OD21 cell-laden GelMA hydrogel prepolymer in root canals of extracted teeth and fabricating 500 µm channels throughout the root canals. ECFCs seeded into the microchannels successfully formed monolayers and underwent angiogenic sprouting within 7 days in culture. In summary, the proposed approach is a simple and effective strategy for engineering of pre-vascularized dental pulp constructs offering potentially beneficial translational outcomes.

  14. Lengths of Orthologous Prokaryotic Proteins Are Affected by Evolutionary Factors.

    Science.gov (United States)

    Tatarinova, Tatiana; Salih, Bilal; Dien Bard, Jennifer; Cohen, Irit; Bolshoy, Alexander

    2015-01-01

    Proteins of the same functional family (for example, kinases) may have significantly different lengths. It is an open question whether such variation in length is random or appears as a response to some unknown evolutionary driving factors. The main purpose of this paper is to demonstrate the existence of factors affecting prokaryotic gene lengths. We believe that the ranking of genomes according to lengths of their genes, followed by the calculation of coefficients of association between genome rank and genome property, is a reasonable approach to revealing such evolutionary driving factors. As we demonstrated earlier, our chosen approach, Bubble Sort, combines stability, accuracy, and computational efficiency as compared to other ranking methods. Application of Bubble Sort to the set of 1390 prokaryotic genomes confirmed that genes of Archaeal species are generally shorter than Bacterial ones. We observed that gene lengths are affected by various factors: within each domain, different phyla have preferences for short or long genes; thermophiles tend to have shorter genes than soil-dwellers; halophiles tend to have longer genes. We also found that species with overrepresentation of cytosines and guanines in the third position of the codon (GC3 content) tend to have longer genes than species with low GC3 content.

  15. Study of canine parvovirus evolution: comparative analysis of full-length VP2 gene sequences from Argentina and international field strains.

    Science.gov (United States)

    Gallo Calderón, Marina; Wilda, Maximiliano; Boado, Lorena; Keller, Leticia; Malirat, Viviana; Iglesias, Marcela; Mattion, Nora; La Torre, Jose

    2012-02-01

    The continuous emergence of new strains of canine parvovirus (CPV), poorly protected by current vaccination, is a concern among breeders, veterinarians, and dog owners around the world. Therefore, the understanding of the genetic variation in emerging CPV strains is crucial for the design of disease control strategies, including vaccines. In this paper, we obtained the sequences of the full-length gene encoding the main capsid protein (VP2) of 11 representative Argentine field strains of canine parvovirus type 2 (CPV-2), selected from a total of 75 positive samples studied in our laboratory in the last 9 years. A comparative sequence analysis was performed on 9 CPV-2c, one CPV-2a, and one CPV-2b Argentine strains with respect to international strains reported in the GenBank database. In agreement with previous reports, a high degree of identity was found among CPV-2c Argentine strains (99.6-100% and 99.7-100% at nucleotide and amino acid levels, respectively). However, the appearance of a new substitution in the 440 position (T440A) in four CPV-2c Argentine strains obtained after the year 2009 gives support to the variability observed for this position located within the VP2 three-fold spike. This is the first report on the genetic characterization of the full-length VP2 gene of emerging CPV strains in South America and shows that all the Argentine CPV-2c isolates cluster together with European and North American CPV-2c strains.
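
Percent identity and substitution labels of the form T440A can be derived from a pairwise comparison of aligned sequences. The sketch below assumes the two sequences are already aligned, of equal length and gap-free; the function names and the five-residue toy peptides are hypothetical, so the reported position is relative to the toy alignment and not to VP2 residue numbering.

```python
def percent_identity(ref, query):
    """Percentage of aligned positions at which the two sequences agree."""
    if len(ref) != len(query) or not ref:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(r == q for r, q in zip(ref, query))
    return 100.0 * matches / len(ref)


def substitutions(ref, query):
    """Substitutions as <reference residue><1-based position><query residue>."""
    return [
        f"{r}{pos}{q}"
        for pos, (r, q) in enumerate(zip(ref, query), start=1)
        if r != q
    ]


# Toy aligned peptides (hypothetical), differing at the last position.
reference = "MSDGT"
isolate = "MSDGA"
identity = percent_identity(reference, isolate)
changes = substitutions(reference, isolate)
```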

  16. Length and nucleotide sequence polymorphism at the trnL and trnF non-coding regions of chloroplast genomes among Saccharum and Erianthus species

    Science.gov (United States)

    The aneupolyploidy genome of sugarcane (Saccharum hybrids spp.) and lack of a classical genetic linkage map make genetics research most difficult for sugarcane. Whole genome sequencing and genetic characterization of sugarcane and related taxa are far behind other crops. In this study, universal PCR...

  17. Assessment and optimization of Theileria parva sporozoite full-length p67 antigen expression in mammalian cells

    Science.gov (United States)

    Delivery of various forms of recombinant Theileria parva sporozoite antigen (p67) has been shown to elicit antibody responses in cattle capable of providing protection against East Coast fever, the clinical disease caused by T. parva. Previous formulations of full-length and shorter recombinant vers...

  18. Keep your Sox on: Community genomics-directed isolation and microscopic characterization of the dominant subsurface sulfur-oxidizing bacterium in a sediment aquifer

    Science.gov (United States)

    Mullin, S. W.; Wrighton, K. C.; Luef, B.; Wilkins, M. J.; Handley, K. M.; Williams, K. H.; Banfield, J. F.

    2012-12-01

    Community genomics and proteomics (proteogenomics) can be used to predict the metabolic potential of complex microbial communities and provide insight into microbial activity and nutrient cycling in situ. Inferences regarding the physiology of specific organisms then can guide isolation efforts, which, if successful, can yield strains that can be metabolically and structurally characterized to further test metagenomic predictions. Here we used proteogenomic data from an acetate-stimulated, sulfidic sediment column deployed in a groundwater well in Rifle, CO to direct laboratory amendment experiments to isolate a bacterial strain potentially involved in sulfur oxidation for physiological and microscopic characterization (Handley et al, submitted 2012). Field strains of Sulfurovum (genome r9c2) were predicted to be capable of CO2 fixation via the reverse TCA cycle and sulfur oxidation (Sox and SQR) coupled to either nitrate reduction (Nap, Nir, Nos) in anaerobic environments or oxygen reduction in microaerobic (cbb3 and bd oxidases) environments; however, key genes for sulfur oxidation (soxXAB) were not identified. Sulfidic groundwater and sediment from the Rifle site were used to inoculate cultures that contained various sulfur species, with and without nitrate and oxygen. We isolated a bacterium, Sulfurovum sp. OBA, whose 16S rRNA gene shares 99.8 % identity to the gene of the dominant genomically characterized strain (genome r9c2) in the Rifle sediment column. The 16S rRNA gene of the isolate most closely matches (95 % sequence identity) the gene of Sulfurovum sp. NBC37-1, a genome-sequenced deep-sea sulfur oxidizer. Strain OBA grew via polysulfide, colloidal sulfur, and tetrathionate oxidation coupled to nitrate reduction under autotrophic and mixotrophic conditions. Strain OBA also grew heterotrophically, oxidizing glucose, fructose, mannose, and maltose with nitrate as an electron acceptor. Over the range of oxygen concentrations tested, strain OBA was not

  19. Morphology Characterization of PP/Clay Nanocomposites Across the Length Scales of the Structural Architecture

    NARCIS (Netherlands)

    Szazdi, Laszlo; Abranyi, Agnes; Pukansky Jr, Bela; Vancso, Gyula J.; Pukanszky, B.; Pukanszky, Bela

    2006-01-01

    The structure and rheological properties of a large number of layered silicate poly(propylene) nanocomposites were studied with widely varying compositions. Morphology characterization at different length scales was achieved by SEM, TEM, and XRD. Rheological measurements supplied additional

  20. Determination and analysis of the full-length chicken parvovirus genome.

    Science.gov (United States)

    Viral enteric disease in poultry is an ongoing problem in many parts of the world. Many enteric viruses have been identified in turkeys and chickens, including avian astroviruses, rotaviruses, reoviruses, and coronaviruses. Through the application of a molecular screening method targeting particle-a...

  1. Molecular cloning, characterization and expression of phenylalanine ...

    African Journals Online (AJOL)

    A full-length cDNA and genomic DNA of phenylalanine ammonia-lyase gene, which catalyzes the first step in the flavonoid biosynthetic pathway, were isolated from Ginkgo biloba for the first time (designated as GbPAL, GenBank Accession No. EU071050). The cDNA and genomic DNA sequences of GbPAL were the same, ...

  2. Unexpected structural complexity of supernumerary marker chromosomes characterized by microarray comparative genomic hybridization

    Directory of Open Access Journals (Sweden)

    Hing Anne V

    2008-04-01

    Background: Supernumerary marker chromosomes (SMCs) are structurally abnormal extra chromosomes that cannot be unambiguously identified by conventional banding techniques. In the past, SMCs have been characterized using a variety of different molecular cytogenetic techniques. Although these techniques can sometimes identify the chromosome of origin of SMCs, they are cumbersome to perform and are not available in many clinical cytogenetic laboratories. Furthermore, they cannot precisely determine the region or breakpoints of the chromosome(s) involved. In this study, we describe four patients who possess one or more SMCs (a total of eight SMCs in all four patients) that were characterized by microarray comparative genomic hybridization (array CGH). Results: In at least one SMC from all four patients, array CGH uncovered unexpected complexity, in the form of complex rearrangements, that could have gone undetected using other molecular cytogenetic techniques. Although array CGH accurately defined the chromosome content of all but two minute SMCs, fluorescence in situ hybridization was necessary to determine the structure of the markers. Conclusion: The increasing use of array CGH in clinical cytogenetic laboratories will provide an efficient method for more comprehensive characterization of SMCs. Improved SMC characterization, facilitated by array CGH, will allow for more accurate SMC/phenotype correlation.

  3. Comparative analysis of the full genome of Helicobacter pylori isolate Sahul64 identifies genes of high divergence.

    Science.gov (United States)

    Lu, Wei; Wise, Michael J; Tay, Chin Yen; Windsor, Helen M; Marshall, Barry J; Peacock, Christopher; Perkins, Tim

    2014-03-01

    Isolates of Helicobacter pylori can be classified phylogeographically. High genetic diversity and rapid microevolution are a hallmark of H. pylori genomes, a phenomenon that is proposed to play a functional role in persistence and colonization of diverse human populations. To provide further genomic evidence in the lineage of H. pylori and to further characterize diverse strains of this pathogen in different human populations, we report the finished genome sequence of Sahul64, an H. pylori strain isolated from an indigenous Australian. Our analysis identified genes that were highly divergent compared to the 38 publicly available genomes, which include genes involved in the biosynthesis and modification of lipopolysaccharide, putative prophage genes, restriction modification components, and hypothetical genes. Furthermore, the virulence-associated vacA locus is a pseudogene and the cag pathogenicity island (cagPAI) is not present. However, the genome does contain a gene cluster associated with pathogenicity, including dupA. Our analysis found that with the addition of Sahul64 to the 38 genomes, the core genome content of H. pylori is reduced by approximately 14% (∼170 genes) and the pan-genome has expanded from 2,070 to 2,238 genes. We have identified three putative horizontally acquired regions, including one that is likely to have been acquired from the closely related Helicobacter cetorum prior to speciation. Our results suggest that Sahul64, with the absence of cagPAI, highly divergent cell envelope proteins, and a predicted nontransportable VacA protein, could be more highly adapted to ancient indigenous Australian people but with lower virulence potential compared to other sequenced and cagPAI-positive H. pylori strains.
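
The effect of adding one genome on core and pan-genome size follows from set arithmetic: the core genome is the intersection of the per-genome gene-family sets and the pan-genome is their union. The sketch below shows this with hypothetical toy gene families; how genes are clustered into families, which the authors would have had to decide, is outside its scope.

```python
def core_and_pan(gene_sets):
    """Core = families present in every genome; pan = families present in any genome."""
    sets = [set(g) for g in gene_sets]
    if not sets:
        return set(), set()
    return set.intersection(*sets), set.union(*sets)


# Toy gene-family sets (hypothetical labels, not real H. pylori genes).
existing = [{"a", "b", "c", "d"}, {"a", "b", "c", "e"}]
new_genome = {"a", "b", "f"}

core_before, pan_before = core_and_pan(existing)
core_after, pan_after = core_and_pan(existing + [new_genome])
```

Adding a divergent genome can only shrink the core and grow the pan-genome, which matches the direction of both changes reported above; with the abstract's figures the pan-genome grows by 2,238 - 2,070 = 168 genes.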

  4. Characterization of the complete mitochondrial genome of the king pigeon (Columba livia breed king).

    Science.gov (United States)

    Zhang, Rui-Hua; He, Wen-Xiao; Xu, Tong

    2015-06-01

    The king pigeon is a breed of pigeon developed over many years of selective breeding primarily as a utility breed. In the present work, we report the complete mitochondrial genome sequence of the king pigeon for the first time. The total length of the mitogenome was 17,221 bp, with a base composition of 30.14% A, 24.05% T, 31.82% C, and 13.99% G, and an A-T-rich feature (54.22%) was detected. It harbored 13 protein-coding genes, two ribosomal RNA genes, 22 transfer RNA genes, and one non-coding control region (D-loop region). The arrangement of all genes was identical to that of typical pigeon mitochondrial genomes. The complete mitochondrial genome sequence of the king pigeon will serve as an important data set of germplasm resources for further study.
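
Base composition and A+T content of the kind reported here are simple counts over the sequence. The sketch below is a generic illustration with hypothetical function names and a toy sequence; it ignores ambiguous bases such as N, which is an assumption about how such positions should be handled.

```python
from collections import Counter


def base_composition(seq):
    """Percentage of A, C, G and T among the unambiguous bases of a sequence."""
    counts = Counter(b for b in seq.upper() if b in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no A/C/G/T bases found")
    return {b: 100.0 * counts[b] / total for b in "ACGT"}


def at_content(seq):
    """Combined percentage of A and T."""
    comp = base_composition(seq)
    return comp["A"] + comp["T"]


# Toy sequence (hypothetical); the two N characters are skipped.
toy = "AATCGNN"
comp = base_composition(toy)
at = at_content(toy)
```

Summing the reported A and T percentages gives 30.14 + 24.05 = 54.19%, slightly below the 54.22% stated in the abstract; the abstract does not explain the difference.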

  5. Broad genomic and transcriptional analysis reveals a highly derived genome in dinoflagellate mitochondria

    Directory of Open Access Journals (Sweden)

    Keeling Patrick J

    2007-09-01

    Background: Dinoflagellates comprise an ecologically significant and diverse eukaryotic phylum that is sister to the phylum containing apicomplexan endoparasites. The mitochondrial genome of apicomplexans is uniquely reduced in gene content and size, encoding only three proteins and two ribosomal RNAs (rRNAs) within a highly compacted 6 kb DNA. Dinoflagellate mitochondrial genomes have been comparatively poorly studied: limited available data suggest some similarities with apicomplexan mitochondrial genomes but an even more radical type of genomic organization. Here, we investigate structure, content and expression of dinoflagellate mitochondrial genomes. Results: From two dinoflagellates, Crypthecodinium cohnii and Karlodinium micrum, we generated over 42 kb of mitochondrial genomic data that indicate a reduced gene content paralleling that of mitochondrial genomes in apicomplexans, i.e., only three protein-encoding genes and at least eight conserved components of the highly fragmented large and small subunit rRNAs. Unlike in apicomplexans, dinoflagellate mitochondrial genes occur in multiple copies, often as gene fragments, and in numerous genomic contexts. Analysis of cDNAs suggests several novel aspects of dinoflagellate mitochondrial gene expression. Polycistronic transcripts were found, standard start codons are absent, and oligoadenylation occurs upstream of stop codons, resulting in the absence of termination codons. Transcripts of at least one gene, cox3, are apparently trans-spliced to generate full-length mRNAs. RNA substitutional editing, a process previously identified for mRNAs in dinoflagellate mitochondria, is also implicated in rRNA expression. Conclusion: The dinoflagellate mitochondrial genome shares the same gene complement and fragmentation of rRNA genes with its apicomplexan counterpart. However, it also exhibits several unique characteristics. Most notable are the expansion of gene copy numbers and their arrangements

  6. The isolation and localization of arbitrary restriction fragment length polymorphisms in Southern African populations

    International Nuclear Information System (INIS)

    Conn, V.

    1987-01-01

    The main aim of this study was to contribute to the mapping of the human genome by searching for and characterizing a number of RFLPs (restriction fragment length polymorphisms) in the human genome. The more specific aims of this study were: 1. To isolate single-copy human DNA sequences from a human genomic library. 2. To use these single-copy sequences as DNA probes to search for polymorphic variation among Caucasoid individuals. 3. To show by means of family studies that the RFLPs were inherited in a co-dominant Mendelian fashion. 4. To determine the population frequencies of these RFLPs in Southern African Populations, namely the Bantu-speaking Negroids and the San. 5. To assign these RFLP-detecting DNA sequences to human chromosomes using somatic cell hybrid lines. In this study DNA was labelled with Phosphorus 32

  7. A comparative phylogenetic analysis of full-length mariner elements ...

    Indian Academy of Sciences (India)

    Unknown

    recent study showing non-occurrence of inter-subfamily excisions because of .... length shown in our figure is greater because of the gaps introduced to maintain an ... to test the feasibility of transforming silkmoths with a foreign gene of ...

  8. Virtually full-length subtype F and F/D recombinant HIV-1 from Africa and South America

    NARCIS (Netherlands)

    Laukkanen, T.; Carr, J. K.; Janssens, W.; Liitsola, K.; Gotte, D.; McCutchan, F. E.; Op de Coul, E.; Cornelissen, M.; Heyndrickx, L.; van der Groen, G.; Salminen, M. O.

    2000-01-01

    For reliable classification of HIV-1 strains appropriate reference sequences are needed. The HIV-1 genetic subtype F has a wide geographic spread, causing significant epidemics in South America, Africa, and some regions of Europe. Previously only two full-length sequences of each of the HIV-1

  9. MobilomeFINDER: web-based tools for in silico and experimental discovery of bacterial genomic islands

    Science.gov (United States)

    Ou, Hong-Yu; He, Xinyi; Harrison, Ewan M.; Kulasekara, Bridget R.; Thani, Ali Bin; Kadioglu, Aras; Lory, Stephen; Hinton, Jay C. D.; Barer, Michael R.; Rajakumar, Kumar

    2007-01-01

    MobilomeFINDER (http://mml.sjtu.edu.cn/MobilomeFINDER) is an interactive online tool that facilitates bacterial genomic island or ‘mobile genome’ (mobilome) discovery; it integrates the ArrayOme and tRNAcc software packages. ArrayOme utilizes a microarray-derived comparative genomic hybridization input data set to generate ‘inferred contigs’ produced by merging adjacent genes classified as ‘present’. Collectively these ‘fragments’ represent a hypothetical ‘microarray-visualized genome (MVG)’. ArrayOme permits recognition of discordances between physical genome and MVG sizes, thereby enabling identification of strains rich in microarray-elusive novel genes. Individual tRNAcc tools facilitate automated identification of genomic islands by comparative analysis of the contents and contexts of tRNA sites and other integration hotspots in closely related sequenced genomes. Accessory tools facilitate design of hotspot-flanking primers for in silico and/or wet-science-based interrogation of cognate loci in unsequenced strains and analysis of islands for features suggestive of foreign origins; island-specific and genome-contextual features are tabulated and represented in schematic and graphical forms. To date we have used MobilomeFINDER to analyse several Enterobacteriaceae, Pseudomonas aeruginosa and Streptococcus suis genomes. MobilomeFINDER enables high-throughput island identification and characterization through increased exploitation of emerging sequence data and PCR-based profiling of unsequenced test strains; subsequent targeted yeast recombination-based capture permits full-length sequencing and detailed functional studies of novel genomic islands. PMID:17537813

  10. Characterization and genome analysis of the first facultatively alkaliphilic Thermodesulfovibrio isolated from the deep terrestrial subsurface

    Directory of Open Access Journals (Sweden)

    Yulia Frank

    2016-12-01

    Members of the genus Thermodesulfovibrio belong to the Nitrospirae phylum and all isolates characterized to date are neutrophiles. They have been isolated from terrestrial hot springs and thermophilic methanogenic anaerobic sludges. Their molecular signatures have, however, also been detected in deep subsurface. The purpose of this study was to characterize and analyze the genome of a newly isolated, moderately alkaliphilic Thermodesulfovibrio from a 2 km deep aquifer system in Western Siberia, Russia. The new isolate, designated N1, grows optimally at pH 8.5-9.0 and at 65 °C. It is able to reduce sulfate, thiosulfate or sulfite with a limited range of electron donors such as formate, pyruvate and lactate. Analysis of the 1.93 Mb draft genome of strain N1 revealed that it contains a set of genes for dissimilatory sulfate reduction, including sulfate adenyltransferase, adenosine-5'-phosphosulfate reductase AprAB, membrane-bound electron transfer complex QmoABC, dissimilatory sulfite reductase DsrABC and sulfite reductase-associated electron transfer complex DsrMKJOP. Hydrogen turnover is enabled by soluble cytoplasmic, membrane-linked, and soluble periplasmic hydrogenases and a periplasmic formate dehydrogenase. The use of thiosulfate as an electron acceptor is enabled by a membrane-linked molybdopterin oxidoreductase. The N1 requirement for organic carbon sources corresponds to the lack of the autotrophic C1-fixation pathways. Comparative analysis of the genomes of Thermodesulfovibrio (T. yellowstonii, T. islandicus, T. aggregans, T. thiophilus, and strain N1) revealed a low overall genetic diversity and several adaptive traits. Consistent with an alkaliphilic lifestyle, a multisubunit Na+/H+ antiporter of the Mnh family is encoded in the Thermodesulfovibrio strain N1 genome. Nitrogenase genes were found in T. yellowstonii, T. aggregans, and T. islandicus, nitrate reductase in T. islandicus, and cellulose synthetase in T. aggregans and strain N

  11. Genome Sequences of Gordonia Phages BaxterFox, Kita, Nymphadora, and Yeezy.

    Science.gov (United States)

    Pope, Welkin H; Bandla, Sharanya; Colbert, Alexandra K; Eichinger, Fiona G; Gamburg, Michelle B; Horiates, Stavroula G; Jamison, Jerrica M; Julian, Dana R; Moore, Whitney A; Murthy, Pranav; Powell, Meghan C; Smith, Sydney V; Mezghani, Nadia; Milliken, Katherine A; Thompson, Paige K; Toner, Chelsea L; Ulbrich, Megan C; Furbee, Emily C; Grubb, Sarah R; Warner, Marcie H; Montgomery, Matthew T; Garlena, Rebecca A; Russell, Daniel A; Jacobs-Sera, Deborah; Hatfull, Graham F

    2016-08-11

    Gordonia phages BaxterFox, Kita, Nymphadora, and Yeezy are newly characterized phages of Gordonia terrae, isolated from soil samples in Pittsburgh, Pennsylvania. These phages have genome lengths between 50,346 and 53,717 bp, and encode on average 84 predicted proteins. All have G+C content of 66.6%. Copyright © 2016 Pope et al.

  12. Genome-Wide Analysis of Simple Sequence Repeats in Bitter Gourd (Momordica charantia)

    Directory of Open Access Journals (Sweden)

    Junjie Cui

    2017-06-01

    Bitter gourd (Momordica charantia) is widely cultivated as a vegetable and medicinal herb in many Asian and African countries. After the sequencing of the cucumber (Cucumis sativus), watermelon (Citrullus lanatus), and melon (Cucumis melo) genomes, bitter gourd became the fourth cucurbit species whose whole genome was sequenced. However, a comprehensive analysis of simple sequence repeats (SSRs) in bitter gourd, including a comparison with the three aforementioned cucurbit species, has not yet been published. Here, we identified a total of 188,091 and 167,160 SSR motifs in the genomes of the bitter gourd lines ‘Dali-11’ and ‘OHB3-1,’ respectively. Subsequently, the SSR content, motif lengths, and classified motif types were characterized for the bitter gourd genomes and compared among all the cucurbit genomes. Lastly, a large set of 138,727 unique in silico SSR primer pairs were designed for bitter gourd. Among these, 71 primers were selected, all of which successfully amplified SSRs from the two bitter gourd lines ‘Dali-11’ and ‘K44’. To further examine the utilization of unique SSR primers, 21 SSR markers were used to genotype a collection of 211 bitter gourd lines from all over the world. A model-based clustering method and phylogenetic analysis indicated a clear separation among the geographic groups. The genomic SSR markers developed in this study have considerable potential value in advancing bitter gourd research.
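
    As an illustration of what genome-wide SSR identification involves, the following is a minimal sketch of a perfect-repeat finder based on regular-expression backreferences. It is not the pipeline used in the study: the function name `find_ssrs` and the minimum repeat counts per motif length are hypothetical choices made for this example, and real SSR tools also handle compound and imperfect repeats.

```python
import re

# Hypothetical thresholds (an assumption, not taken from the abstract):
# motif length -> minimum number of tandem copies to call an SSR.
DEFAULT_MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(seq, min_repeats=None):
    """Return sorted (start, motif, copies) tuples for perfect SSRs in seq."""
    if min_repeats is None:
        min_repeats = DEFAULT_MIN_REPEATS
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # A k-base motif followed by at least n-1 further exact copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are repeats of a shorter unit ("AA", "ATAT"),
            # since those runs belong to the shorter motif class.
            if any(motif == motif[:d] * (k // d)
                   for d in range(1, k) if k % d == 0):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // k))
    return sorted(hits)


if __name__ == "__main__":
    demo = "GG" + "AT" * 7 + "CC" + "AAG" * 5 + "T"
    for start, motif, copies in find_ssrs(demo):
        print(start, motif, copies)
```

    Flanking sequence around each reported start position is what primer design for SSR markers would then operate on.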

  13. Sequencing and characterizing the genome of Estrella lausannensis as an undergraduate project: training students and biological insights

    Directory of Open Access Journals (Sweden)

    Claire Bertelli

    2015-02-01

    With the widespread availability of high-throughput sequencing technologies, sequencing projects have become pervasive in the molecular life sciences. The huge bulk of data generated daily must be analyzed further by biologists with skills in bioinformatics and by embedded bioinformaticians, i.e., bioinformaticians integrated in wet lab research groups. Thus, students interested in molecular life sciences must be trained in the main steps of genomics: sequencing, assembly, annotation and analysis. To reach that goal, a practical course has been set up for master students at the University of Lausanne: the "Sequence a genome" class. At the beginning of the academic year, a few bacterial species whose genome is unknown are provided to the students, who sequence and assemble the genome(s) and perform manual annotation. Here, we report the progress of the first class from September 2010 to June 2011 and the results obtained by seven master students who specifically assembled and annotated the genome of Estrella lausannensis, an obligate intracellular bacterium related to Chlamydia. The draft genome of Estrella is composed of 29 scaffolds encompassing 2,819,825 bp that encode 2,233 putative proteins. Estrella also possesses a 9,136 bp plasmid that encodes 14 genes, among which we found an integrase and a toxin/antitoxin module. Like all other members of the Chlamydiales order, Estrella possesses a highly conserved type III secretion system, considered as a key virulence factor. The annotation of the Estrella genome also allowed the characterization of the metabolic abilities of this strictly intracellular bacterium. Altogether, the students provided the scientific community with the Estrella genome sequence and a preliminary understanding of the biology of this recently discovered bacterial genus, while learning to use cutting-edge technologies for sequencing and to perform bioinformatics analyses.

  14. Full-Length High-Temperature Severe Fuel Damage Test No. 5: Final safety analysis

    International Nuclear Information System (INIS)

    Lanning, D.D.; Lombardo, N.J.; Panisko, F.E.

    1993-09-01

    This report presents the final safety analysis for the preparation, conduct, and post-test discharge operation for the Full-Length High Temperature Experiment-5 (FLHT-5) to be conducted in the L-24 position of the National Research Universal (NRU) Reactor at Chalk River Nuclear Laboratories (CRNL), Ontario, Canada. The test is sponsored by an international group organized by the US Nuclear Regulatory Commission. The test is designed and conducted by staff from Pacific Northwest Laboratory with CRNL staff support. The test will study the consequences of loss-of-coolant and the progression of severe fuel damage

  15. Restriction site extension PCR: a novel method for high-throughput characterization of tagged DNA fragments and genome walking.

    Directory of Open Access Journals (Sweden)

    Jiabing Ji

    BACKGROUND: Insertion mutant isolation and characterization are extremely valuable for linking genes to physiological function. Once an insertion mutant phenotype is identified, the challenge is to isolate the responsible gene. Multiple strategies have been employed to isolate unknown genomic DNA that flanks mutagenic insertions; however, all these methods suffer from limitations due to inefficient ligation steps, inclusion of restriction sites within the target DNA, and non-specific product generation. These limitations become close to insurmountable when the goal is to identify insertion sites in a high-throughput manner. METHODOLOGY/PRINCIPAL FINDINGS: We designed a novel strategy called Restriction Site Extension PCR (RSE-PCR) to efficiently conduct large-scale isolation of unknown genomic DNA fragments linked to DNA insertions. The strategy is a modified adaptor-mediated PCR without ligation. An adapter, with complementarity to the 3' overhang of the endonuclease (KpnI, NsiI, PstI, or SacI) restricted DNA fragments, extends the 3' end of the DNA fragments in the first cycle of the primary RSE-PCR. During subsequent PCR cycles and a second semi-nested PCR (secondary RSE-PCR), touchdown and two-step PCR are combined to increase the amplification specificity of target fragments. The efficiency and specificity were demonstrated in our characterization of 37 tex mutants of Arabidopsis. All the steps of RSE-PCR can be executed in a 96-well PCR plate. Finally, RSE-PCR serves as a successful alternative to Genome Walker as demonstrated by gene isolation from maize, a plant with a more complex genome than Arabidopsis. CONCLUSIONS/SIGNIFICANCE: RSE-PCR has high potential application in identifying tagged (T-DNA or transposon) sequence or walking from known DNA toward unknown regions in large-genome plants, with likely application in other organisms as well.

  16. Structural characterization of a novel full-length transcript promoter from Horseradish Latent Virus (HRLV) and its transcriptional regulation by multiple stress responsive transcription factors.

    Science.gov (United States)

    Khan, Ahamed; Shrestha, Ankita; Bhuyan, Kashyap; Maiti, Indu B; Dey, Nrisingha

    2018-01-01

    The promoter fragment described in this study can be employed for strong transgene expression under both biotic and abiotic stress conditions. Plant-infecting Caulimoviruses have evolved multiple regulatory mechanisms to address various environmental stimuli during the course of evolution. One such mechanism involves the retention of discrete stress-responsive cis-elements which are required for their survival and host-specificity. Here we describe the characterization of a novel Caulimoviral promoter isolated from Horseradish Latent Virus (HRLV) and its regulation by multiple stress-responsive transcription factors (TFs), namely DREB1, AREB1 and TGA1a. The activity of the full-length transcript (Flt-) promoter from HRLV (-677 to +283) was investigated in both transient and transgenic assays, where we identified H12 (-427 to +73) as the highest expressing fragment, having ~2.5-fold stronger activity than the CaMV35S promoter. The H12 promoter was highly active and near-constitutive in the vegetative and reproductive parts of both Tobacco and Arabidopsis transgenic plants. Interestingly, H12 contains a distinct cluster of cis-elements, namely a dehydration-responsive element (DRE-core; GCCGAC), an ABA-responsive element (ABRE; ACGTGTC) and an as-1 element (TGACG), which are known to be induced by cold, drought and pathogen/SA, respectively. The specific binding of DREB1, AREB1 and TGA1a to the DRE, ABRE and as-1 elements, respectively, was confirmed by gel-binding assays using H12 promoter-specific probes. Detailed mutational analysis of the H12 promoter suggested that the presence of the DRE-core and as-1 element was indispensable for its activity, which was further confirmed by the transactivation assays. Our studies imply that H12 could be a valuable genetic tool for regulated transgene expression under diverse environmental conditions.
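
    The three cis-element core sequences quoted in this abstract can be located in any promoter sequence with a simple exact-match scan on both strands. The sketch below is illustrative only: the helper names `revcomp` and `scan_cis_elements` and the demo sequence are hypothetical, and the actual H12 sequence is not reproduced here. Exact matching is a simplification, since published element definitions often allow degenerate positions.

```python
# Core motifs as quoted in the abstract.
CIS_ELEMENTS = {
    "DRE-core": "GCCGAC",
    "ABRE": "ACGTGTC",
    "as-1": "TGACG",
}


def revcomp(seq):
    """Reverse complement of an uppercase A/C/G/T string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def scan_cis_elements(seq, elements=CIS_ELEMENTS):
    """Return sorted (position, strand, name) for exact motif matches.

    Positions are 0-based offsets on the given (forward) strand. None of
    the three motifs above is palindromic, so a site is never reported
    twice; a palindromic motif would appear once per strand.
    """
    seq = seq.upper()
    hits = []
    for name, motif in elements.items():
        for strand, pattern in (("+", motif), ("-", revcomp(motif))):
            start = seq.find(pattern)
            while start != -1:
                hits.append((start, strand, name))
                start = seq.find(pattern, start + 1)  # allow overlaps
    return sorted(hits)


if __name__ == "__main__":
    # Made-up sequence for demonstration, not the HRLV promoter.
    demo = "TTGCCGACAAACGTGTCTTCGTCAGG"
    for pos, strand, name in scan_cis_elements(demo):
        print(pos, strand, name)
```

    Mapping hits back to promoter coordinates such as -427 to +73 would only require subtracting the offset of the transcription start site.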

  17. Genomic Characterization of Phenylalanine Ammonia Lyase Gene in Buckwheat.

    Directory of Open Access Journals (Sweden)

    Karthikeyan Thiyagarajan

    The Phenylalanine Ammonia Lyase (PAL) gene, which plays a key role in the biosynthesis of the medicinally important compounds rutin/quercetin, was sequence-characterized for efficient application in genomics. These compounds possess anti-diabetic and anti-cancer properties and are predominantly produced by Fagopyrum spp. In the present study, the PAL gene was sequenced from three Fagopyrum spp. (F. tataricum, F. esculentum and F. dibotrys) and showed the presence of three SNPs and four insertions/deletions at the intra- and interspecific levels. Among them, the potential SNP (position 949 bp, G>C) at a parsimony-informative site was selected and successfully used to identify the zygosity/allelic variation of 16 F. tataricum varieties. Insertion mutations were identified in the coding region, which resulted in a change to a stretch of 39 amino acids in the putative protein. Our study revealed that the autogamous species (F. tataricum) has a lower frequency of observed SNPs than the allogamous species (F. dibotrys and F. esculentum). The identified SNPs in F. tataricum did not result in amino acid changes, while in the other two species they caused both conservative and non-conservative variations. The consistent pattern of SNPs across the species revealed their phylogenetic importance. We found two groups of F. tataricum, one of which was closely related to F. dibotrys. The sequence characterization of the PAL gene reported in the present investigation can be used in the genetic improvement of buckwheat with reference to its medicinal value.

  18. Genomic Diversity and Evolution of the Lyssaviruses

    Science.gov (United States)

    Delmas, Olivier; Holmes, Edward C.; Talbi, Chiraz; Larrous, Florence; Dacheux, Laurent; Bouchier, Christiane; Bourhy, Hervé

    2008-01-01

    Lyssaviruses are RNA viruses with single-strand, negative-sense genomes responsible for rabies-like diseases in mammals. To date, genomic and evolutionary studies have most often utilized partial genome sequences, particularly of the nucleoprotein and glycoprotein genes, with little consideration of genome-scale evolution. Herein, we report the first genomic and evolutionary analysis using complete genome sequences of all recognised lyssavirus genotypes, including 14 new complete genomes of field isolates from 6 genotypes and one genotype that is completely sequenced for the first time. In doing so we significantly increase the extent of genome sequence data available for these important viruses. Our analysis of these genome sequence data reveals that all lyssaviruses have the same genomic organization. A phylogenetic analysis reveals strong geographical structuring, with the greatest genetic diversity in Africa, and an independent origin for the two known genotypes that infect European bats. We also suggest that multiple genotypes may exist within the diversity of viruses currently classified as ‘Lagos Bat’. In sum, we show that rigorous phylogenetic techniques based on full length genome sequence provide the best discriminatory power for genotype classification within the lyssaviruses. PMID:18446239

  19. Full-length VP2 gene analysis of canine parvovirus reveals emergence of newer variants in India.

    Science.gov (United States)

    Nookala, Mangadevi; Mukhopadhyay, Hirak Kumar; Sivaprakasam, Amsaveni; Balasubramanian, Brindhalakshmi; Antony, Prabhakar Xavier; Thanislass, Jacob; Srinivas, Mouttou Vivek; Pillai, Raghavan Madhusoodanan

    2016-12-01

    The canine parvovirus (CPV) infection is a highly contagious and serious enteric disease of dogs with high fatality rate. The present study was taken up to characterize the full-length viral polypeptide 2 (VP2) gene of CPV of Indian origin along with the commercially available vaccines. The faecal samples from parvovirus suspected dogs were collected from various states of India for screening by PCR assay and 66.29% of samples were found positive. Six CPV-2a, three CPV-2b, and one CPV-2c types were identified by sequence analysis. Several unique and existing mutations have been noticed in CPV types analyzed indicating emergence of newer variants of CPV in India. The phylogenetic analysis revealed that all the field CPV types were grouped in different subclades within two main clades, but away from the commercial vaccine strains. CPV-2b and CPV-2c types with unique mutations were found to be establishing in India apart from the prevailing CPV-2a type. Mutations and the positive selection of the mutants were found to be the major mechanism of emergence and evolution of parvovirus. Therefore, the incorporation of local strain in the vaccine formulation may be considered for effective control of CPV infections in India.

  20. Thousands of primer-free, high-quality, full-length SSU rRNA sequences from all domains of life

    DEFF Research Database (Denmark)

    Karst, Soeren M; Dueholm, Morten S; McIlroy, Simon J

    2016-01-01

    Ribosomal RNA (rRNA) genes are the consensus marker for determination of microbial diversity on the planet, invaluable in studies of evolution and, for the past decade, high-throughput sequencing of variable regions of ribosomal RNA genes has become the backbone of most microbial ecology studies...... (SSU) rRNA genes and synthetic long read sequencing by molecular tagging, to generate primer-free, full-length SSU rRNA gene sequences from all domains of life, with a median raw error rate of 0.17%. We generated thousands of full-length SSU rRNA sequences from five well-studied ecosystems (soil, human...... gut, fresh water, anaerobic digestion, and activated sludge) and obtained sequences covering all domains of life and the majority of all described phyla. Interestingly, 30% of all bacterial operational taxonomic units were novel, compared to the SILVA database (less than 97% similarity...

  1. Mapping the space of genomic signatures.

    Directory of Open Access Journals (Sweden)

    Lila Kari

    We propose a computational method to measure and visualize interrelationships among any number of DNA sequences allowing, for example, the examination of hundreds or thousands of complete mitochondrial genomes. An "image distance" is computed for each pair of graphical representations of DNA sequences, and the distances are visualized as a Molecular Distance Map: Each point on the map represents a DNA sequence, and the spatial proximity between any two points reflects the degree of structural similarity between the corresponding sequences. The graphical representation of DNA sequences utilized, Chaos Game Representation (CGR), is genome- and species-specific and can thus act as a genomic signature. Consequently, Molecular Distance Maps could inform species identification, taxonomic classifications and, to a certain extent, evolutionary history. The image distance employed, Structural Dissimilarity Index (DSSIM), implicitly compares the occurrences of oligomers of length up to k (herein k = 9) in DNA sequences. We computed DSSIM distances for more than 5 million pairs of complete mitochondrial genomes, and used Multi-Dimensional Scaling (MDS) to obtain Molecular Distance Maps that visually display the sequence relatedness in various subsets, at different taxonomic levels. This general-purpose method does not require DNA sequence alignment and can thus be used to compare similar or vastly different DNA sequences, genomic or computer-generated, of the same or different lengths. We illustrate potential uses of this approach by applying it to several taxonomic subsets: phylum Vertebrata, (super)kingdom Protista, classes Amphibia-Insecta-Mammalia, class Amphibia, and order Primates. This analysis of an extensive dataset confirms that the oligomer composition of full mtDNA sequences can be a source of taxonomic information. This method also correctly finds the mtDNA sequences most closely related to that of the anatomically modern human (the Neanderthal
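
    To make the Chaos Game Representation concrete, here is a minimal sketch of how a DNA sequence is turned into CGR points and binned into a 2^k x 2^k grid whose cells count k-mer occurrences. The corner assignment (A bottom-left, C top-left, G top-right, T bottom-right) is one common convention and is an assumption here, as are the function names `cgr_points` and `cgr_grid`; the abstract does not specify the authors' implementation, and the DSSIM and MDS steps are not shown.

```python
# Assumed corner convention for the unit square (not stated in the abstract).
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(seq):
    """Return the CGR trajectory: each base moves the current point
    halfway toward that base's corner, starting from the centre."""
    x, y = 0.5, 0.5
    points = []
    for base in seq.upper():
        if base not in CORNERS:
            continue  # simplification: ambiguous bases (e.g. N) are skipped
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2, (y + cy) / 2
        points.append((x, y))
    return points


def cgr_grid(seq, k):
    """Bin CGR points into a 2**k x 2**k grid (rows index y, columns x).

    After at least k bases, the cell containing a point is determined by
    the last k bases, so the grid is a k-mer count table laid out in 2D.
    """
    size = 2 ** k
    grid = [[0] * size for _ in range(size)]
    for i, (x, y) in enumerate(cgr_points(seq)):
        if i + 1 < k:
            continue  # the first k-1 points do not yet encode a full k-mer
        # Clamp guards against floating-point rounding to exactly 1.0
        # on very long single-base runs.
        col = min(int(x * size), size - 1)
        row = min(int(y * size), size - 1)
        grid[row][col] += 1
    return grid


if __name__ == "__main__":
    for row in cgr_grid("ACGTACGTTTGACA", 2):
        print(row)
```

    Two such grids of equal size can be compared as grayscale images, which is the role an image distance plays in the approach described above.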

  2. Genomic characterization of the Guillain-Barre syndrome-associated Campylobacter jejuni ICDCCJ07001 Isolate.

    Directory of Open Access Journals (Sweden)

    Maojun Zhang

    Campylobacter jejuni ICDCCJ07001 (HS:41, ST2993) was isolated from a Guillain-Barré syndrome (GBS) patient during a 36-case GBS outbreak triggered by C. jejuni infections in north China in 2007. Sequence analysis revealed that the ICDCCJ07001 genome consisted of 1,664,840 base pairs (bp) and one tetracycline resistance plasmid of 44,084 bp. The GC content was 59.29% and 1,579 and 37 CDSs were identified on the chromosome and plasmid, respectively. The ICDCCJ07001 genome was compared to C. jejuni subsp. jejuni strains 81-176, 81116, NCTC11168, RM1221 and C. jejuni subsp. doylei 269.97. The length and organization of ICDCCJ07001 was similar to that of NCTC11168, 81-176 and 81-116 except that CMLP1 had a reverse orientation in strain ICDCCJ07001. Comparative genomic analyses were also carried out between GBS-associated C. jejuni strains. Thirteen common genes were present in four GBS-associated strains and 9 genes mapped to the LOS cluster. The ICDCCJ07001_pTet (44 kb) plasmid was mosaic in structure. Thirty-seven predicted CDS in ICDCCJ07001_pTet were homologous to genes present in three virulence-associated plasmids in Campylobacter: 81-176_pTet, pCC31 and 81-176_pVir. Comparative analysis of virulence loci and virulence-associated genes indicated that the LOS biosynthesis loci of ICDCCJ07001 belonged to type A, previously reported to be associated with cases of GBS. The polysaccharide capsular biosynthesis (CPS) loci and the flagella modification (FM) loci of ICDCCJ07001 were similar to corresponding sequences of strain 260.94 of similar serotype as strain ICDCCJ07001. Other virulence-associated genes including cadF, peb1, jlpA, cdt and ciaB were conserved between the C. jejuni strains examined.

  3. Genomic Characterization of DArT Markers Based on High-Density Linkage Analysis and Physical Mapping to the Eucalyptus Genome

    Science.gov (United States)

    Petroli, César D.; Sansaloni, Carolina P.; Carling, Jason; Steane, Dorothy A.; Vaillancourt, René E.; Myburg, Alexander A.; da Silva, Orzenil Bonfim; Pappas, Georgios Joannis; Kilian, Andrzej; Grattapaglia, Dario

    2012-01-01

    genome is yet available to allow such detailed characterization. PMID:22984541

  4. Construction of occluded recombinant baculoviruses containing the full-length cry1Ab and cry1Ac genes from Bacillus thuringiensis

    Directory of Open Access Journals (Sweden)

    B.M. Ribeiro

    1998-06-01

    The administration of baculoviruses to insects for bioassay purposes is carried out, in most cases, by contamination of food surfaces with a known amount of occlusion bodies (OBs). Since per os infection is the natural route of infection, occluded recombinant viruses containing crystal protein genes (cry1Ab and cry1Ac) from Bacillus thuringiensis were constructed for comparison with the baculovirus prototype Autographa californica nucleopolyhedrovirus (AcNPV). The transfer vector pAcUW2B was used for construction of occluded recombinant viruses. The transfer vector containing the crystal protein genes was cotransfected with linearized DNA from a non-occluded recombinant virus. The isolation of recombinant viruses was greatly facilitated by the reduction of background "wild type" virus and the increased proportion of recombinant viruses. Since the recombinant viruses containing full-length and truncated forms of the crystal protein genes did not seem to improve the pathogenicity of the recombinant viruses when compared with the wild type AcNPV, and in order to compare expression levels of the full-length crystal proteins produced by non-occluded and occluded recombinant viruses, the full-length cry1Ab and cry1Ac genes were chosen for construction of occluded recombinant viruses. The recombinant viruses containing full-length and truncated forms of the crystal protein genes did not seem to improve its pathogenicity but the size of the larvae infected with the recombinant viruses was significantly smaller than that of larvae infected with the wild type virus.

  5. Molecular and Biological Characterization of an Isolate of Cucumber mosaic virus from Glycine soja by Generating its Infectious Full-genome cDNA Clones

    Directory of Open Access Journals (Sweden)

    Mi Sa Vo Phan

    2014-06-01

    Molecular and biological characteristics of an isolate of Cucumber mosaic virus (CMV) from Glycine soja (wild soybean), named CMV-209, were examined in this study. Comparison of nucleotide sequences and phylogenetic analyses of CMV-209 with the other CMV strains revealed that CMV-209 belonged to CMV subgroup I. However, CMV-209 showed some genetic distance from the CMV strains assigned to subgroup IA or subgroup IB. Infectious full-genome cDNA clones of CMV-209 were generated under the control of the Cauliflower mosaic virus 35S promoter. Infectivity of the CMV-209 clones was evaluated in Nicotiana benthamiana and various legume species. Our assays revealed that CMV-209 could systemically infect Glycine soja (wild soybean) and Pisum sativum (pea) as well as N. benthamiana, but not the other legume species.

  6. The use of comparative genomic hybridization to characterize genome dynamics and diversity among the serotypes of Shigella

    Directory of Open Access Journals (Sweden)

    Sun Meisheng

    2006-08-01

    Background: Compelling evidence indicates that Shigella species, the etiologic agents of bacillary dysentery, as well as enteroinvasive Escherichia coli, are derived from multiple origins of Escherichia coli and form a single pathovar. To further understand the genome diversity and virulence evolution of Shigella, comparative genomic hybridization microarray analysis was employed to compare the gene content of E. coli K-12 with those of 43 Shigella strains from all lineages. Results: For the 43 strains subjected to CGH microarray analyses, the common backbone of the Shigella genome was estimated to contain more than 1,900 open reading frames (ORFs), with a mean number of 726 undetectable ORFs. The mosaic distribution of absent regions indicated that insertions and/or deletions have led to the highly diversified genomes of pathogenic strains. Conclusion: These results support the hypothesis that by gain and loss of functions, Shigella species became successful human pathogens through convergent evolution from diverse genomic backgrounds. Moreover, we also found many specific differences between different lineages, providing a window into understanding bacterial speciation and taxonomic relationships.

  7. Genomic characterization of recurrent high-grade astroblastoma.

    Science.gov (United States)

    Bale, Tejus A; Abedalthagafi, Malak; Bi, Wenya Linda; Kang, Yun Jee; Merrill, Parker; Dunn, Ian F; Dubuc, Adrian; Charbonneau, Sarah K; Brown, Loreal; Ligon, Azra H; Ramkissoon, Shakti H; Ligon, Keith L

    2016-01-01

    Astroblastomas are rare primary brain tumors, diagnosed based on histologic features. Not currently assigned a WHO grade, they typically display indolent behavior, with occasional variants taking a more aggressive course. We characterized the immunohistochemical characteristics, copy number (high-resolution array comparative genomic hybridization, OncoCopy) and mutational profile (targeted next-generation exome sequencing, OncoPanel) of a cohort of seven biopsies from four patients to identify recurrent genomic events that may help distinguish astroblastomas from other more common high-grade gliomas. We found that tumor histology was variable across patients and between primary and recurrent tumor samples. No common molecular features were identified among the four tumors. Mutations commonly observed in astrocytic tumors (IDH1/2, TP53, ATRX, and PTEN) or ependymoma were not identified. However one case with rapid clinical progression displayed mutations more commonly associated with GBM (NF1(N1054H/K63)*, PIK3CA(R38H) and ERG(A403T)). Conversely, another case, originally classified as glioblastoma with nine-year survival before recurrence, lacked a GBM mutational profile. Other mutations frequently seen in lower grade gliomas (BCOR, BCORL1, ERBB3, MYB, ATM) were also present in several tumors. Copy number changes were variable across tumors. Our findings indicate that astroblastomas have variable growth patterns and morphologic features, posing significant challenges to accurate classification in the absence of diagnostically specific copy number alterations and molecular features. Their histopathologic overlap with glioblastoma will likely confound the observation of long-term GBM "survivors". Further genomic profiling is needed to determine whether these tumors represent a distinct entity and to guide management strategies. Copyright © 2016 Elsevier Inc. All rights reserved.

  8. High-resolution characterization of a hepatocellular carcinoma genome.

    Science.gov (United States)

    Totoki, Yasushi; Tatsuno, Kenji; Yamamoto, Shogo; Arai, Yasuhito; Hosoda, Fumie; Ishikawa, Shumpei; Tsutsumi, Shuichi; Sonoda, Kohtaro; Totsuka, Hirohiko; Shirakihara, Takuya; Sakamoto, Hiromi; Wang, Linghua; Ojima, Hidenori; Shimada, Kazuaki; Kosuge, Tomoo; Okusaka, Takuji; Kato, Kazuto; Kusuda, Jun; Yoshida, Teruhiko; Aburatani, Hiroyuki; Shibata, Tatsuhiro

    2011-05-01

    Hepatocellular carcinoma, one of the most common virus-associated cancers, is the third most frequent cause of cancer-related death worldwide. By massively parallel sequencing of a primary hepatitis C virus-positive hepatocellular carcinoma (36× coverage) and matched lymphocytes (>28× coverage) from the same individual, we identified more than 11,000 somatic substitutions of the tumor genome that showed predominance of T>C/A>G transition and a decrease of the T>C substitution on the transcribed strand, suggesting preferential DNA repair. Gene annotation enrichment analysis of 63 validated non-synonymous substitutions revealed enrichment of phosphoproteins. We further validated 22 chromosomal rearrangements, generating four fusion transcripts that had altered transcriptional regulation (BCORL1-ELF4) or promoter activity. Whole-exome sequencing at a higher sequence depth (>76× coverage) revealed a TSC1 nonsense substitution in a subpopulation of the tumor cells. This first high-resolution characterization of a virus-associated cancer genome identified previously uncharacterized mutation patterns, intra-chromosomal rearrangements and fusion genes, as well as genetic heterogeneity within the tumor.

  9. Mining genome sequencing data to identify the genomic features linked to breast cancer histopathology

    Directory of Open Access Journals (Sweden)

    Zheng Ping

    2014-01-01

    Full Text Available Background: Genetics and genomics have radically altered our understanding of breast cancer progression. However, the genomic basis of various histopathologic features of breast cancer is not yet well-defined. Materials and Methods: The Cancer Genome Atlas (TCGA) is an international database containing a large collection of human cancer genome sequencing data. cBioPortal is a web tool developed for mining these sequencing data. We performed mining of TCGA sequencing data in an attempt to characterize the genomic features correlated with breast cancer histopathology. We first assessed the quality of the TCGA data using a group of genes with known alterations in various cancers. Both genome-wide gene mutation and copy number changes as well as a group of genes with a high frequency of genetic changes were then correlated with various histopathologic features of invasive breast cancer. Results: Validation of TCGA data using a group of genes with known alterations in breast cancer suggests that the TCGA has accurately documented the genomic abnormalities of multiple malignancies. Further analysis of TCGA breast cancer sequencing data shows that accumulation of specific genomic defects is associated with higher tumor grade, larger tumor size and receptor negativity. Distinct groups of genomic changes were found to be associated with the different grades of invasive ductal carcinoma. The mutator role of the TP53 gene was validated by genomic sequencing data of invasive breast cancer and TP53 mutation was found to play a critical role in defining high tumor grade. Conclusions: Data mining of the TCGA genome sequencing data is an innovative and reliable method to help characterize the genomic abnormalities associated with histopathologic features of invasive breast cancer.

  10. Packaging of a unit-length viral genome: the role of nucleotides and the gpD decoration protein in stable nucleocapsid assembly in bacteriophage lambda.

    Science.gov (United States)

    Yang, Qin; Maluf, Nasib Karl; Catalano, Carlos Enrique

    2008-11-28

    The developmental pathways for a variety of eukaryotic and prokaryotic double-stranded DNA viruses include packaging of viral DNA into a preformed procapsid structure, catalyzed by terminase enzymes and fueled by ATP hydrolysis. In most instances, a capsid expansion process accompanies DNA packaging, which significantly increases the volume of the capsid to accommodate the full-length viral genome. "Decoration" proteins add to the surface of the expanded capsid lattice, and the terminase motors tightly package DNA, generating up to approximately 20 atm of internal capsid pressure. Herein we describe biochemical studies on genome packaging using bacteriophage lambda as a model system. Kinetic analysis suggests that the packaging motor possesses at least four ATPase catalytic sites that act cooperatively to effect DNA translocation, and that the motor is highly processive. While not required for DNA translocation into the capsid, the phage lambda capsid decoration protein gpD is essential for the packaging of the penultimate 8-10 kb (15-20%) of the viral genome; virtually no DNA is packaged in the absence of gpD when large DNA substrates are used, most likely due to a loss of capsid structural integrity. Finally, we show that ATP hydrolysis is required to retain the genome in a packaged state subsequent to condensation within the capsid. Presumably, the packaging motor continues to "idle" at the genome end and to maintain a positive pressure towards the packaged state. Surprisingly, ADP, guanosine triphosphate, and the nonhydrolyzable ATP analog 5'-adenylyl-beta,gamma-imidodiphosphate (AMP-PNP) similarly stabilize the packaged viral genome despite the fact that they fail to support genome packaging. In contrast, the poorly hydrolyzed ATP analog ATP-gammaS only partially stabilizes the nucleocapsid, and a DNA is released in "quantized" steps. 
We interpret the ensemble of data to indicate that (i) the viral procapsid possesses a degree of plasticity that is required to

  11. Comprehensive cytological characterization of the Gossypium hirsutum genome based on the development of a set of chromosome cytological markers

    Directory of Open Access Journals (Sweden)

    Wenbo Shan

    2016-08-01

    Full Text Available Cotton is the world's most important natural fiber crop. It is also a model system for studying polyploidization, genomic organization, and genome-size variation. Integrating the cytological characterization of cotton with its genetic map will be essential for understanding its genome structure and evolution, as well as for performing further genetic-map based mapping and cloning. In this study, we isolated a complete set of bacterial artificial chromosome clones anchored to each of the 52 chromosome arms of the tetraploid cotton Gossypium hirsutum. Combining these with telomere and centromere markers, we constructed a standard karyotype for the G. hirsutum inbred line TM-1. We dissected the chromosome arm localizations of the 45S and 5S rDNA and suggest a centromere repositioning event in the homoeologous chromosomes AT09 and DT09. By integrating a systematic karyotype analysis with the genetic linkage map, we observed different genome sizes and chromosomal structures between the subgenomes of the tetraploid cotton and those of its diploid ancestors. Using evidence of conserved coding sequences, we suggest that the different evolutionary paths of non-coding retrotransposons account for most of the variation in size between the subgenomes of tetraploid cotton and its diploid ancestors. These results provide insights into the cotton genome and will facilitate further genome studies in G. hirsutum.

  12. Genome sequence determination and metagenomic characterization of a Dehalococcoides mixed culture grown on cis-1,2-dichloroethene.

    Science.gov (United States)

    Yohda, Masafumi; Yagi, Osami; Takechi, Ayane; Kitajima, Mizuki; Matsuda, Hisashi; Miyamura, Naoaki; Aizawa, Tomoko; Nakajima, Mutsuyasu; Sunairi, Michio; Daiba, Akito; Miyajima, Takashi; Teruya, Morimi; Teruya, Kuniko; Shiroma, Akino; Shimoji, Makiko; Tamotsu, Hinako; Juan, Ayaka; Nakano, Kazuma; Aoyama, Misako; Terabayashi, Yasunobu; Satou, Kazuhito; Hirano, Takashi

    2015-07-01

    A Dehalococcoides-containing bacterial consortium that performed dechlorination of 0.20 mM cis-1,2-dichloroethene to ethene in 14 days was obtained from the sediment mud of a lotus field. To obtain detailed information on the consortium, the metagenome was analyzed using the short-read next-generation sequencer SOLiD 3. Matching the obtained sequence tags with the reference genome sequences indicated that the Dehalococcoides sp. in the consortium was highly homologous to Dehalococcoides mccartyi CBDB1 and BAV1. Sequence comparison with the reference sequence constructed from 16S rRNA gene sequences in a public database showed the presence of Sedimentibacter, Sulfurospirillum, Clostridium, Desulfovibrio, Parabacteroides, Alistipes, Eubacterium, Peptostreptococcus and Proteocatella in addition to Dehalococcoides sp. After further enrichment, the members of the consortium were narrowed down to almost three species. Finally, the full-length circular genome sequence of the Dehalococcoides sp. in the consortium, D. mccartyi IBARAKI, was determined by analyzing the metagenome with the single-molecule DNA sequencer PacBio RS. The accuracy of the sequence was confirmed by matching it to the tag sequences obtained by SOLiD 3. The genome is 1,451,062 nt and the number of CDS is 1566, which includes 3 rRNA genes and 47 tRNA genes. There are twenty-eight RDase genes that are accompanied by the genes for anchor proteins. The genome exhibits significant sequence identity with other Dehalococcoides spp. throughout the genome, but there is a significant difference in the distribution of RDase genes. The combination of a short-read next-generation DNA sequencer and a long-read single-molecule DNA sequencer gives detailed information on a bacterial consortium. Copyright © 2014 The Society for Biotechnology, Japan. Published by Elsevier B.V. All rights reserved.

  13. Targeted isolation, sequence assembly and characterization of two white spruce (Picea glauca) BAC clones for terpenoid synthase and cytochrome P450 genes involved in conifer defence reveal insights into a conifer genome.

    Science.gov (United States)

    Hamberger, Björn; Hall, Dawn; Yuen, Mack; Oddy, Claire; Hamberger, Britta; Keeling, Christopher I; Ritland, Carol; Ritland, Kermit; Bohlmann, Jörg

    2009-08-06

    Conifers are a large group of gymnosperm trees which are separated from the angiosperms by more than 300 million years of independent evolution. Conifer genomes are extremely large and contain considerable amounts of repetitive DNA. Currently, conifer sequence resources exist predominantly as expressed sequence tags (ESTs) and full-length (FL)cDNAs. There is no genome sequence available for a conifer or any other gymnosperm. Conifer defence-related genes often group into large families with closely related members. The goals of this study are to assess the feasibility of targeted isolation and sequence assembly of conifer BAC clones containing specific genes from two large gene families, and to characterize large segments of genomic DNA sequence for the first time from a conifer. We used a PCR-based approach to identify BAC clones for two target genes, a terpene synthase (3-carene synthase; 3CAR) and a cytochrome P450 (CYP720B4) from a non-arrayed genomic BAC library of white spruce (Picea glauca). Shotgun genomic fragments isolated from the BAC clones were sequenced to a depth of 15.6- and 16.0-fold coverage, respectively. Assembly and manual curation yielded sequence scaffolds of 172 kbp (3CAR) and 94 kbp (CYP720B4) long. Inspection of the genomic sequences revealed the intron-exon structures, the putative promoter regions and putative cis-regulatory elements of these genes. Sequences related to transposable elements (TEs), high complexity repeats and simple repeats were prevalent and comprised approximately 40% of the sequenced genomic DNA. An in silico simulation of the effect of sequencing depth on the quality of the sequence assembly provides direction for future efforts of conifer genome sequencing. We report the first targeted cloning, sequencing, assembly, and annotation of large segments of genomic DNA from a conifer. We demonstrate that genomic BAC clones for individual members of multi-member gene families can be isolated in a gene-specific fashion. 

  14. Secretory production of tetrameric native full-length streptavidin with thermostability using Streptomyces lividans as a host.

    Science.gov (United States)

    Noda, Shuhei; Matsumoto, Takuya; Tanaka, Tsutomu; Kondo, Akihiko

    2015-01-13

    Streptavidin is a tetrameric protein derived from Streptomyces avidinii, and has tight and specific biotin binding affinity. Applications of the streptavidin-biotin system have been widely studied. Streptavidin is generally produced using protein expression in Escherichia coli. In the present study, the secretory production of streptavidin was carried out using Streptomyces lividans as a host. In this study, we used the gene encoding native full-length streptavidin, whereas the core region is generally used for streptavidin production in E. coli. Tetrameric streptavidin composed of native full-length streptavidin monomers was successfully secreted in the culture supernatant of S. lividans transformants, and had specific biotin binding affinity as strong as streptavidin produced by E. coli. The amount of Sav using S. lividans was about 9 times higher than using E. coli. Surprisingly, streptavidin produced by S. lividans exhibited affinity to biotin after boiling, despite the fact that tetrameric streptavidin is known to lose its biotin binding ability after brief boiling. We successfully produced a large amount of tetrameric streptavidin as a secretory-form protein with unique thermotolerance.

  15. Draft genome sequence of pectic polysaccharide-degrading moderate thermophilic bacterium Geobacillus thermodenitrificans DSM 101594

    Directory of Open Access Journals (Sweden)

    Raimonda Petkauskaite

    Full Text Available Abstract Geobacillus thermodenitrificans DSM 101594 was isolated as a producer of extracellular thermostable pectic polysaccharide degrading enzymes. The completely sequenced genome was 3.6 Mb in length with GC content of 48.86%. A number of genes encoding enzymes active against the high molecular weight polysaccharides of potential biotechnological importance were identified in the genome.

  16. Genomic Characterization of the Genus Nairovirus (Family Bunyaviridae).

    Science.gov (United States)

    Kuhn, Jens H; Wiley, Michael R; Rodriguez, Sergio E; Bào, Yīmíng; Prieto, Karla; Travassos da Rosa, Amelia P A; Guzman, Hilda; Savji, Nazir; Ladner, Jason T; Tesh, Robert B; Wada, Jiro; Jahrling, Peter B; Bente, Dennis A; Palacios, Gustavo

    2016-06-10

    Nairovirus, one of five bunyaviral genera, includes seven species. Genomic sequence information is limited for members of the Dera Ghazi Khan, Hughes, Qalyub, Sakhalin, and Thiafora nairovirus species. We used next-generation sequencing and historical virus-culture samples to determine 14 complete and nine coding-complete nairoviral genome sequences to further characterize these species. Previously unsequenced viruses include Abu Mina, Clo Mor, Great Saltee, Hughes, Raza, Sakhalin, Soldado, and Tillamook viruses. In addition, we present genomic sequence information on additional isolates of previously sequenced Avalon, Dugbe, Sapphire II, and Zirqa viruses. Finally, we identify Tunis virus, previously thought to be a phlebovirus, as an isolate of Abu Hammad virus. Phylogenetic analyses indicate the need for reassignment of Sapphire II virus to Dera Ghazi Khan nairovirus and reassignment of Hazara, Tofla, and Nairobi sheep disease viruses to novel species. We also propose new species for the Kasokero group (Kasokero, Leopards Hill, Yogue viruses), the Ketarah group (Gossas, Issyk-kul, Keterah/soft tick viruses) and the Burana group (Wēnzhōu tick virus, Huángpí tick virus 1, Tǎchéng tick virus 1). Our analyses emphasize the sister relationship of nairoviruses and arenaviruses, and indicate that several nairo-like viruses (Shāyáng spider virus 1, Xīnzhōu spider virus, Sānxiá water strider virus 1, South Bay virus, Wǔhàn millipede virus 2) require establishment of novel genera in a larger nairovirus-arenavirus supergroup.

  17. Comparative genomic characterization of three Streptococcus parauberis strains in fish pathogen, as assessed by wide-genome analyses.

    Directory of Open Access Journals (Sweden)

    Seong-Won Nho

    Full Text Available Streptococcus parauberis, which is the main causative agent of streptococcosis among olive flounder (Paralichthys olivaceus) in northeast Asia, can be distinctly divided into two groups (type I and type II) by an agglutination test. Here, the whole genome sequences of two Japanese strains (KRS-02083 and KRS-02109) were determined and compared with the previously determined genome of a Korean strain (KCTC 11537). The genomes of S. parauberis are intermediate in size and have lower GC contents than those of other streptococci. We annotated 2,236 and 2,048 genes in KRS-02083 and KRS-02109, respectively. Our results revealed that the three S. parauberis strains contain different genomic insertions and deletions. In particular, the genomes of Korean and Japanese strains encode different factors for sugar utilization; the former encodes the phosphotransferase system (PTS) for sorbose, whereas the latter encodes proteins for lactose hydrolysis. In addition, the KRS-02109 strain, the type II strain, was found to be able to resist phage infection through the clustered regularly interspaced short palindromic repeats (CRISPR)/Cas system, which might contribute to its serological distribution. Thus, our genome-wide association study shows that polymorphisms can affect pathogen responses, providing insight into biological/biochemical pathways and phylogenetic diversity.

  18. RT-PCR and sequence analysis of the full-length fusion protein of Canine Distemper Virus from domestic dogs.

    Science.gov (United States)

    Romanutti, Carina; Gallo Calderón, Marina; Keller, Leticia; Mattion, Nora; La Torre, José

    2016-02-01

    During 2007-2014, 84 out of 236 (35.6%) samples from domestic dogs submitted to our laboratory for diagnostic purposes were positive for Canine Distemper Virus (CDV), as analyzed by RT-PCR amplification of a fragment of the nucleoprotein gene. Fifty-nine of them (70.2%) were from dogs that had been vaccinated against CDV. The full-length gene encoding the Fusion (F) protein of fifteen isolates was sequenced and compared with that of those of other CDVs, including wild-type and vaccine strains. Phylogenetic analysis using the F gene full-length sequences grouped all the Argentinean CDV strains in the SA2 clade. Sequence identity with the Onderstepoort vaccine strain was 89.0-90.6%, and the highest divergence was found in the 135 amino acids corresponding to the F protein signal-peptide, Fsp (64.4-66.7% identity). In contrast, this region was highly conserved among the local strains (94.1-100% identity). One extra putative N-glycosylation site was identified in the F gene of CDV Argentinean strains with respect to the vaccine strain. The present report is the first to analyze full-length F protein sequences of CDV strains circulating in Argentina, and contributes to the knowledge of molecular epidemiology of CDV, which may help in understanding future disease outbreaks. Copyright © 2015 Elsevier B.V. All rights reserved.

  19. Discovery and genomic characterization of a novel ovine partetravirus and a new genotype of bovine partetravirus.

    Directory of Open Access Journals (Sweden)

    Herman Tse

    Full Text Available Partetravirus is a recently described group of animal parvoviruses which include the human partetravirus, bovine partetravirus and porcine partetravirus (previously known as human parvovirus 4, bovine hokovirus and porcine hokovirus, respectively). In this report, we describe the discovery and genomic characterization of partetraviruses in bovine and ovine samples from China. These partetraviruses were detected by PCR in 1.8% of bovine liver samples, 66.7% of ovine liver samples and 71.4% of ovine spleen samples. One of the bovine partetraviruses detected in the present samples is phylogenetically distinct from previously reported bovine partetraviruses and likely represents a novel genotype. The ovine partetravirus is a novel partetravirus and phylogenetically most related to the bovine partetraviruses. The genome organization is conserved amongst these viruses, including the presence of a putative transmembrane protein encoded by an overlapping reading frame in ORF2. Results from the present study provide further support to the classification of partetraviruses as a separate genus in Parvovirinae.

  20. A genome-wide characterization of microRNA genes in maize.

    Directory of Open Access Journals (Sweden)

    Lifang Zhang

    2009-11-01

    Full Text Available MicroRNAs (miRNAs) are small, non-coding RNAs that play essential roles in plant growth, development, and stress response. We conducted a genome-wide survey of maize miRNA genes, characterizing their structure, expression, and evolution. Computational approaches based on homology and secondary structure modeling identified 150 high-confidence genes within 26 miRNA families. For 25 families, expression was verified by deep-sequencing of small RNA libraries that were prepared from an assortment of maize tissues. PCR-RACE amplification of 68 miRNA transcript precursors, representing 18 families conserved across several plant species, showed that splice variation and the use of alternative transcriptional start and stop sites are common within this class of genes. Comparison of sequence variation data from diverse maize inbred lines versus teosinte accessions suggests that the mature miRNAs are under strong purifying selection while the flanking sequences evolve equivalently to other genes. Since maize is derived from an ancient tetraploid, the effect of whole-genome duplication on miRNA evolution was examined. We found that, like protein-coding genes, duplicated miRNA genes underwent extensive gene-loss, with approximately 35% of ancestral sites retained as duplicate homoeologous miRNA genes. This number is higher than that observed with protein-coding genes. A search for putative miRNA targets indicated bias towards genes in regulatory and metabolic pathways. As maize is one of the principal models for plant growth and development, this study will serve as a foundation for future research into the functional roles of miRNA genes.

  1. Arthropod phylogenetics in light of three novel millipede (myriapoda: diplopoda) mitochondrial genomes with comments on the appropriateness of mitochondrial genome sequence data for inferring deep level relationships.

    Science.gov (United States)

    Brewer, Michael S; Swafford, Lynn; Spruill, Chad L; Bond, Jason E

    2013-01-01

    Arthropods are the most diverse group of eukaryotic organisms, but their phylogenetic relationships are poorly understood. Herein, we describe three mitochondrial genomes representing orders of millipedes for which complete genomes had not been characterized. Newly sequenced genomes are combined with existing data to characterize the protein coding regions of myriapods and to attempt to reconstruct the evolutionary relationships within the Myriapoda and Arthropoda. The newly sequenced genomes are similar to previously characterized millipede sequences in terms of synteny and length. Unique translocations occurred within the newly sequenced taxa, including one half of the Appalachioria falcifera genome, which is inverted with respect to other millipede genomes. Across myriapods, amino acid conservation levels are highly dependent on the gene region. Additionally, individual loci varied in the level of amino acid conservation. Overall, most gene regions showed low levels of conservation at many sites. Attempts to reconstruct the evolutionary relationships suffered from questionable relationships and low support values. Analyses of phylogenetic informativeness show the lack of signal deep in the trees (i.e., genes evolve too quickly). As a result, the myriapod tree resembles previously published results but lacks convincing support, and, within the arthropod tree, well established groups were recovered as polyphyletic. The novel genome sequences described herein provide useful genomic information concerning millipede groups that had not been investigated. Taken together with existing sequences, the variety of compositions and evolution of myriapod mitochondrial genomes are shown to be more complex than previously thought. Unfortunately, the use of mitochondrial protein-coding regions in deep arthropod phylogenetics appears problematic, a result consistent with previously published studies. Lack of phylogenetic signal renders the resulting tree topologies as suspect

  2. Structure and function of the first full-length murein peptide ligase (Mpl) cell wall recycling protein.

    Science.gov (United States)

    Das, Debanu; Hervé, Mireille; Feuerhelm, Julie; Farr, Carol L; Chiu, Hsiu-Ju; Elsliger, Marc-André; Knuth, Mark W; Klock, Heath E; Miller, Mitchell D; Godzik, Adam; Lesley, Scott A; Deacon, Ashley M; Mengin-Lecreulx, Dominique; Wilson, Ian A

    2011-03-18

    Bacterial cell walls contain peptidoglycan, an essential polymer made by enzymes in the Mur pathway. These proteins are specific to bacteria, which make them targets for drug discovery. MurC, MurD, MurE and MurF catalyze the synthesis of the peptidoglycan precursor UDP-N-acetylmuramoyl-L-alanyl-γ-D-glutamyl-meso-diaminopimelyl-D-alanyl-D-alanine by the sequential addition of amino acids onto UDP-N-acetylmuramic acid (UDP-MurNAc). MurC-F enzymes have been extensively studied by biochemistry and X-ray crystallography. In gram-negative bacteria, ∼30-60% of the bacterial cell wall is recycled during each generation. Part of this recycling process involves the murein peptide ligase (Mpl), which attaches the breakdown product, the tripeptide L-alanyl-γ-D-glutamyl-meso-diaminopimelate, to UDP-MurNAc. We present the crystal structure at 1.65 Å resolution of a full-length Mpl from the permafrost bacterium Psychrobacter arcticus 273-4 (PaMpl). Although the Mpl structure has similarities to Mur enzymes, it has unique sequence and structure features that are likely related to its role in cell wall recycling, a function that differentiates it from the MurC-F enzymes. We have analyzed the sequence-structure relationships that are unique to Mpl proteins and compared them to MurC-F ligases. We have also characterized the biochemical properties of this enzyme (optimal temperature, pH and magnesium binding profiles and kinetic parameters). Although the structure does not contain any bound substrates, we have identified ∼30 residues that are likely to be important for recognition of the tripeptide and UDP-MurNAc substrates, as well as features that are unique to Psychrobacter Mpl proteins. These results provide the basis for future mutational studies for more extensive function characterization of the Mpl sequence-structure relationships.

  3. Human microcephaly protein RTTN interacts with STIL and is required to build full-length centrioles.

    Science.gov (United States)

    Chen, Hsin-Yi; Wu, Chien-Ting; Tang, Chieh-Ju C; Lin, Yi-Nan; Wang, Won-Jing; Tang, Tang K

    2017-08-15

    Mutations in many centriolar protein-encoding genes cause primary microcephaly. Using super-resolution and electron microscopy, we find that the human microcephaly protein, RTTN, is recruited to the proximal end of the procentriole at early S phase, and is located at the inner luminal walls of centrioles. Further studies demonstrate that RTTN directly interacts with STIL and acts downstream of STIL-mediated centriole assembly. CRISPR/Cas9-mediated RTTN gene knockout in p53-deficient cells induces amplification of primitive procentriole bodies that lack the distal-half centriolar proteins, POC5 and POC1B. Additional analyses show that RTTN serves as an upstream effector of CEP295, which mediates the loading of POC1B and POC5 to the distal-half centrioles. Interestingly, the naturally occurring microcephaly-associated mutant, RTTN (A578P), shows a low affinity for STIL binding and blocks centriole assembly. These findings reveal that RTTN contributes to building full-length centrioles and illuminate the molecular mechanism through which the RTTN (A578P) mutation causes primary microcephaly. Mutations in many centriolar protein-encoding genes cause primary microcephaly. Here the authors show that human microcephaly protein RTTN directly interacts with STIL and acts downstream of STIL-mediated centriole assembly, contributing to building full-length centrioles.

  4. GONOME: measuring correlations between GO terms and genomic positions

    Directory of Open Access Journals (Sweden)

    Bailey Timothy L

    2006-02-01

    Full Text Available Abstract Background: Current methods to find significantly under- and over-represented gene ontology (GO) terms in a set of genes consider the genes as equally probable "balls in a bag", as may be appropriate for transcripts in micro-array data. However, due to the varying length of genes and intergenic regions, that approach is inappropriate for deciding if any GO terms are correlated with a set of genomic positions. Results: We present an algorithm – GONOME – that can determine which GO terms are significantly associated with a set of genomic positions given a genome annotated with (at least) the starts and ends of genes. We show that certain GO terms may appear to be significantly associated with a set of randomly chosen positions in the human genome if gene lengths are not considered, and that these same terms have been reported as significantly over-represented in a number of recent papers. This apparent over-representation disappears when gene lengths are considered, as GONOME does. For example, we show that, when gene length is taken into account, the term "development" is not significantly enriched in genes associated with human CpG islands, in contradiction to a previous report. We further demonstrate the efficacy of GONOME by showing that occurrences of the proteasome-associated control element (PACE) upstream activating sequence in the S. cerevisiae genome associate significantly with appropriate GO terms. An extension of this approach yields a whole-genome motif discovery algorithm that allows identification of many other promoter sequences linked to different types of genes, including a large group of previously unknown motifs significantly associated with the terms 'translation' and 'translational elongation'. Conclusion: GONOME is an algorithm that correctly extracts over-represented GO terms from a set of genomic positions. By explicitly considering gene size, GONOME avoids a systematic bias toward GO terms linked to large genes
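
    The length bias described in the abstract above can be illustrated with a small sketch. This is not the GONOME algorithm itself, only a toy comparison of the two null models the abstract contrasts: drawing genes uniformly ("balls in a bag") versus drawing genomic positions uniformly. The gene names, lengths and term assignments are hypothetical, and intergenic regions are ignored for simplicity.

```python
from fractions import Fraction

# Hypothetical toy annotation: gene name -> (length in bp, set of GO terms).
# None of these values come from the paper; they only illustrate the bias.
GENES = {
    "geneA": (100_000, {"development"}),
    "geneB": (2_000, {"translation"}),
    "geneC": (3_000, {"translation"}),
    "geneD": (5_000, {"metabolism"}),
}


def per_gene_fraction(term):
    """Chance that a uniformly drawn gene carries the term (balls-in-a-bag null)."""
    hits = sum(1 for _, terms in GENES.values() if term in terms)
    return Fraction(hits, len(GENES))


def per_position_fraction(term):
    """Chance that a uniformly drawn genic base lies in a gene carrying the term."""
    total = sum(length for length, _ in GENES.values())
    hits = sum(length for length, terms in GENES.values() if term in terms)
    return Fraction(hits, total)


for term in ("development", "translation"):
    print(term, per_gene_fraction(term), per_position_fraction(term))
```

    In this toy annotation a single long gene carries "development", so random positions land in a "development" gene far more often than the per-gene fraction would suggest. A test that uses the per-gene fraction as its expectation would therefore report spurious enrichment for terms attached to long genes.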

  5. Building a model: developing genomic resources for common milkweed (Asclepias syriaca) with low coverage genome sequencing

    Directory of Open Access Journals (Sweden)

    Weitemier Kevin

    2011-05-01

    Full Text Available Abstract Background Milkweeds (Asclepias L.) have been extensively investigated in diverse areas of evolutionary biology and ecology; however, there are few genetic resources available to facilitate and complement these studies. This study explored how low coverage genome sequencing of the common milkweed (Asclepias syriaca L.) could be useful in characterizing the genome of a plant without prior genomic information and for development of genomic resources as a step toward further developing A. syriaca as a model in ecology and evolution. Results A 0.5× genome of A. syriaca was produced using Illumina sequencing. A virtually complete chloroplast genome of 158,598 bp was assembled, revealing few repeats and loss of three genes: accD, clpP, and ycf1. A nearly complete rDNA cistron (18S-5.8S-26S; 7,541 bp) and 5S rDNA (120 bp) sequence were obtained. Assessment of polymorphism revealed that the rDNA cistron and 5S rDNA had 0.3% and 26.7% polymorphic sites, respectively. A partial mitochondrial genome sequence (130,764 bp), with identical gene content to tobacco, was also assembled. An initial characterization of repeat content indicated that Ty1/copia-like retroelements are the most common repeat type in the milkweed genome. At least one A. syriaca microread hit 88% of Catharanthus roseus (Apocynaceae) unigenes (median coverage of 0.29×) and 66% of single copy orthologs (COSII) in asterids (median coverage of 0.14×). From this partial characterization of the A. syriaca genome, markers for population genetics (microsatellites) and phylogenetics (low-copy nuclear genes) studies were developed. Conclusions The results highlight the promise of next generation sequencing for development of genomic resources for any organism. Low coverage genome sequencing allows characterization of the high copy fraction of the genome and exploration of the low copy fraction of the genome, which facilitate the development of molecular tools for further study of a target species

  6. Contributing to Tumor Molecular Characterization Projects with a Global Impact | Office of Cancer Genomics

    Science.gov (United States)

    My name is Nicholas Griner and I am the Scientific Program Manager for the Cancer Genome Characterization Initiative (CGCI) in the Office of Cancer Genomics (OCG). Until recently, I spent most of my scientific career working in a cancer research laboratory. In my postdoctoral training, my research focused on identifying novel pathways that contribute to both prostate and breast cancers and studying proteins within these pathways that may be targeted with cancer drugs.

  7. Draft genome sequence of the silver pomfret fish, Pampus argenteus.

    Science.gov (United States)

    AlMomin, Sabah; Kumar, Vinod; Al-Amad, Sami; Al-Hussaini, Mohsen; Dashti, Talal; Al-Enezi, Khaznah; Akbar, Abrar

    2016-01-01

    Silver pomfret, Pampus argenteus, is a fish species from coastal waters. Despite its high commercial value, this edible fish has not been sequenced. Hence, its genetic and genomic studies have been limited. We report the first draft genome sequence of the silver pomfret obtained using a Next Generation Sequencing (NGS) technology. We assembled 38.7 Gb of nucleotides into scaffolds of 350 Mb with N50 of about 1.5 kb, using high quality paired end reads. These scaffolds represent 63.7% of the estimated silver pomfret genome length. The newly sequenced and assembled genome has 11.06% repetitive DNA regions, and this percentage is comparable to that of the tilapia genome. The genome analysis predicted 16 322 genes. About 91% of these genes showed homology with known proteins. Many gene clusters were annotated to protein and fatty-acid metabolism pathways that may be important in the context of the meat texture and immune system developmental processes. The reference genome can pave the way for the identification of many other genomic features that could improve breeding and population-management strategies, and it can also help characterize the genetic diversity of P. argenteus.
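
    The N50 statistic quoted for this assembly can be illustrated with a short sketch. It assumes the usual definition (the length of the scaffold at which the running total, taken from longest to shortest, first reaches half of the assembly size); the function name and the toy lengths are hypothetical and are not data from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold lengths (in bp).

    Assumed definition: sort scaffolds from longest to shortest and
    return the length at which the cumulative sum first reaches at
    least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("need at least one scaffold length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy assembly, not taken from the pomfret paper.
toy_scaffolds = [100, 200, 300, 400, 500]
print(n50(toy_scaffolds))
```

    A small N50 relative to the total assembly size, as reported here, indicates that the assembly is split into many short scaffolds.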

  8. Quench propagation study for the BNL-built, full-length, 50mm aperture SSC model dipoles

    International Nuclear Information System (INIS)

    Muratore, J.; Anerella, M.; Cottingham, G.

    1993-01-01

    As part of the program to build and test SSC 50mm aperture prototype dipole magnets, a series of seven full-length dipoles were built and tested at BNL. An important part of the testing program was the study of quench propagation velocity and hot spot temperature over a range of experimental conditions in order to characterize the safety of the conductor during quenches experienced under different circumstances. Such studies are important tools in design, implementation, and verification of quench protection strategies in superconducting accelerator magnets. This investigation was facilitated by artificially inducing quenches under controlled experimental conditions with spot heaters placed at carefully chosen locations on the magnet coils. Such studies were done as part of the 15m-long magnet test program and were performed on five of the magnets in the series. All were equipped with spot heaters on an inner coil, and two of these also had spot heaters on an outer coil. Therefore, in addition to the studies in the inner coils, it was also possible to study quench propagation in the outer coils, where slower quench velocities and higher conductor temperatures are expected than in the inner coils. In spontaneous quenches, where there may be no voltage taps, it is not possible to measure the conductor hot spot temperature. It is straightforward to measure the number of MIITs generated, since only the magnet current and voltage need be measured. The concept of MIITs then becomes a valuable diagnostic tool which can characterize the temperature behavior of a conductor during quench and can be used to determine limits for safe operation of the coil. With spot heaters placed at known locations and closely bracketed by voltage taps, hot spot temperature can be measured. Research such as is described in this paper is therefore important in order to determine the validity of the MIITs approach and to establish a correlation between temperature and MIITs
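
    As a rough illustration of why MIITs are easy to obtain from the current record alone, the sketch below integrates the squared current over time with the trapezoidal rule and expresses the result in units of 10^6 A^2 s. The function name and the sample trace are assumptions made for illustration; they are not taken from the BNL measurements.

```python
def miits(times_s, currents_a):
    """Trapezoidal estimate of the integral of I(t)^2 dt.

    times_s    -- sample times in seconds (increasing)
    currents_a -- magnet current in amperes at each sample time
    Returns the quench integral in MIITs (1 MIIT = 1e6 A^2 s).
    """
    if len(times_s) != len(currents_a) or len(times_s) < 2:
        raise ValueError("need two or more matching time/current samples")
    total = 0.0
    for k in range(1, len(times_s)):
        dt = times_s[k] - times_s[k - 1]
        # Average of I^2 at the two ends of the interval, times its width.
        total += 0.5 * (currents_a[k - 1] ** 2 + currents_a[k] ** 2) * dt
    return total / 1e6


# Hypothetical trace: a constant 1000 A held for one second.
print(miits([0.0, 0.5, 1.0], [1000.0, 1000.0, 1000.0]))
```

    Relating the accumulated MIITs to a hot spot temperature requires the conductor's material properties, which is the correlation the study sets out to establish.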

  9. A draft genome assembly of the army worm, Spodoptera frugiperda.

    Science.gov (United States)

    Kakumani, Pavan Kumar; Malhotra, Pawan; Mukherjee, Sunil K; Bhatnagar, Raj K

    2014-08-01

    Spodoptera is an agriculturally important pest insect and studies in understanding its biology have been limited by the unavailability of its genome. In the present study, the genomic DNA was sequenced and assembled into 37,243 scaffolds totalling 358 Mb with an N50 of 53.7 kb. Based on degree of identity, we could anchor 305 Mb of the genome onto all the 28 chromosomes of Bombyx mori. Repeat elements were identified, which account for 20.28% of the total genome. Further, we predicted 11,595 genes, with an average intron length of 726 bp. The genes were annotated and domain analysis revealed that Sf genes share a significant homology and expression pattern with B. mori, despite differences in KOG gene categories and representation of certain protein families. The present study on the Sf genome would help in the characterization of cellular pathways to understand its biology and in comparative evolutionary studies among lepidopteran family members to help annotate their genomes. Copyright © 2014 Elsevier Inc. All rights reserved.

  10. Complete mitochondrial genome of sublittoral macroalga Rhodymenia pseudopalmata (Rhodymeniales, Rhodophyta).

    Science.gov (United States)

    Kim, Kyeong Mi; Yang, Eun Chan; Yi, Gangman; Yoon, Hwan Su

    2014-08-01

    We sequenced and characterized the first complete mitochondrial genome of the sublittoral red alga Rhodymenia pseudopalmata (Rhodymeniales, Rhodophyta). The mitogenome is 26,166 bp in length with 29.5% GC content. The circular mitogenome contains 47 genes: 24 protein-coding, 2 rRNA and 21 tRNA genes, the latter including two copies each of trnG, trnL, trnM and trnS. There are two cases of gene overlap, found between sdhD and nad4, and between secY and rps12. The R. pseudopalmata mitochondrial genome differs from that of Gracilariopsis lemaneiformis by three missing genes (orf60, rpl20 and trnH).

  11. Construction of the BAC Library of Small Abalone (Haliotis diversicolor) for Gene Screening and Genome Characterization.

    Science.gov (United States)

    Jiang, Likun; You, Weiwei; Zhang, Xiaojun; Xu, Jian; Jiang, Yanliang; Wang, Kai; Zhao, Zixia; Chen, Baohua; Zhao, Yunfeng; Mahboob, Shahid; Al-Ghanim, Khalid A; Ke, Caihuan; Xu, Peng

    2016-02-01

    The small abalone (Haliotis diversicolor) is one of the most important aquaculture species in East Asia. To facilitate gene cloning and characterization, genome analysis, and genetic breeding in this species, we constructed a large-insert bacterial artificial chromosome (BAC) library, which is an important genetic tool for advanced genetics and genomics research. The small abalone BAC library includes 92,610 clones with an average insert size of 120 Kb, equivalent to approximately 7.6× of the small abalone genome. We set up three-dimensional pools and super pools of 18,432 BAC clones for target gene screening using a PCR method. To assess the approach, we screened 12 target genes in these 18,432 BAC clones and identified 16 positive BAC clones. Eight positive BAC clones were then sequenced and assembled with the next generation sequencing platform. The assembled contigs representing these 8 BAC clones spanned 928 Kb of the small abalone genome, providing the first batch of genome sequences for genome evaluation and characterization. The average GC content of the small abalone genome was estimated as 40.33%. A total of 21 protein-coding genes, including 7 target genes, were annotated in the 8 BACs, which proved the feasibility of the PCR screening approach with three-dimensional pools in the small abalone BAC library. One hundred fifty microsatellite loci were also identified from the sequences for marker development in the future. The BAC library and clone pools provide valuable resources and tools for genetic breeding and conservation of H. diversicolor.
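
    The coverage figure in this abstract follows from simple arithmetic, sketched below. The genome size is not stated in the abstract, so it is back-calculated here from the reported clone count, insert size and 7.6× coverage; the probability estimate uses the standard Clarke-Carbon formula, P = 1 - (1 - insert/genome)^N. All function names are hypothetical.

```python
def library_coverage(n_clones, insert_bp, genome_bp):
    """Genome equivalents represented by the library."""
    return n_clones * insert_bp / genome_bp


def implied_genome_size(n_clones, insert_bp, coverage):
    """Genome size (bp) implied by a reported library coverage."""
    return n_clones * insert_bp / coverage


def prob_sequence_present(n_clones, insert_bp, genome_bp):
    """Clarke-Carbon probability that a given locus is in some clone."""
    return 1.0 - (1.0 - insert_bp / genome_bp) ** n_clones


# Figures reported in the abstract: 92,610 clones, 120 kb inserts, 7.6x.
genome_bp = implied_genome_size(92_610, 120_000, 7.6)
print(round(genome_bp / 1e9, 2), "Gb implied genome size")
print(prob_sequence_present(92_610, 120_000, genome_bp))
```

    Under these assumptions the implied genome size is roughly 1.46 Gb, and the full library would be expected to contain almost any single-copy locus.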

  12. Genomic organization, sequence characterization and expression analysis of Tenebrio molitor apolipophorin-III in response to an intracellular pathogen, Listeria monocytogenes.

    Science.gov (United States)

    Noh, Ju Young; Patnaik, Bharat Bhusan; Tindwa, Hamisi; Seo, Gi Won; Kim, Dong Hyun; Patnaik, Hongray Howrelia; Jo, Yong Hun; Lee, Yong Seok; Lee, Bok Luel; Kim, Nam Jung; Han, Yeon Soo

    2014-01-25

    Apolipophorin III (apoLp-III) is a well-known hemolymph protein having a functional role in lipid transport and immune response of insects. We cloned full-length cDNA encoding putative apoLp-III from larvae of the coleopteran beetle, Tenebrio molitor (TmapoLp-III), by identification of clones corresponding to the partial sequence of TmapoLp-III, followed by full-length sequencing with a clone-by-clone primer walking method. The complete cDNA consists of 890 nucleotides, including an ORF encoding 196 amino acid residues. Excluding a putative signal peptide of the first 20 amino acid residues, the 176-residue mature apoLp-III has a calculated molecular mass of 19,146 Da. Genomic sequence analysis with respect to its cDNA showed that TmapoLp-III was organized into four exons interrupted by three introns. Several immune-related transcription factor binding sites were discovered in the putative 5'-flanking region. BLAST and phylogenetic analyses reveal that TmapoLp-III has high sequence identity (88%) with Tribolium castaneum apoLp-III but shares little sequence homologies (molitor. Copyright © 2013 Elsevier B.V. All rights reserved.

  13. Design of Genomic Signatures of Pathogen Identification & Characterization

    Energy Technology Data Exchange (ETDEWEB)

    Slezak, T; Gardner, S; Allen, J; Vitalis, E; Jaing, C

    2010-02-09

    This chapter will address some of the many issues associated with the identification of signatures based on genomic DNA/RNA, which can be used to identify and characterize pathogens for biodefense and microbial forensic goals. For the purposes of this chapter, we define a signature as one or more strings of contiguous genomic DNA or RNA bases that are sufficient to identify a pathogenic target of interest at the desired resolution and which could be instantiated with particular detection chemistry on a particular platform. The target may be a whole organism, an individual functional mechanism (e.g., a toxin gene), or simply a nucleic acid indicative of the organism. The desired resolution will vary with each program's goals but could easily range from family to genus to species to strain to isolate. The resolution may not be taxonomically based but rather pan-mechanistic in nature: detecting virulence or antibiotic-resistance genes shared by multiple microbes. Entire industries exist around different detection chemistries and instrument platforms for identification of pathogens, and we will only briefly mention a few of the techniques that we have used at Lawrence Livermore National Laboratory (LLNL) to support our biosecurity-related work since 2000. Most nucleic acid based detection chemistries involve the ability to isolate and amplify the signature target region(s), combined with a technique to detect the amplification. Genomic signature based identification techniques have the advantage of being precise, highly sensitive and relatively fast in comparison to biochemical typing methods and protein signatures. Classical biochemical typing methods were developed long before knowledge of DNA and resulted in dozens of tests (Gram's stain, differential growth characteristics media, etc.) that could be used to roughly characterize the major known pathogens (of course some are uncultivable). These tests could take many days to complete and precise resolution

  14. Generation and Analysis of Full-length cDNA Sequences from Elephant Shark (Callorhinchus milii)

    KAUST Repository

    Kodzius, Rimantas; Tay, Boon-Hui; Tan, Yue Ying; Brenner, Sydney; Venkatesh, Byrappa

    2009-01-01

    Cartilaginous fishes are the oldest living group of jawed vertebrates and therefore are an important group for understanding the evolution of vertebrate genomes including the human genome. Our laboratory has proposed elephant shark (C. milii) as a

  15. Why the Length of a Quantum String Cannot Be Lorentz Contracted

    Directory of Open Access Journals (Sweden)

    Antonio Aurilia

    2013-01-01

    We propose a quantum gravity-extended form of the classical length contraction law obtained in special relativity. More specifically, the framework of our discussion is the UV self-complete theory of quantum gravity. We show how our results are consistent with (i) the generalized form of the uncertainty principle (GUP), (ii) the so-called hoop conjecture, and (iii) the intriguing notion of “classicalization” of trans-Planckian physics. We argue that there is a physical limit to the Lorentz contraction rule in the form of some minimal universal length determined by quantum gravity, say the Planck length, or any of its current embodiments such as the string length, or the TeV quantum gravity length scale. In the latter case, we determine the critical boost that separates the ordinary “particle phase,” characterized by the Compton wavelength, from the “black hole phase,” characterized by the effective Schwarzschild radius of the colliding system.

  16. Genomic Characterization of Metformin Hepatic Response.

    Directory of Open Access Journals (Sweden)

    Marcelo R Luizon

    2016-11-01

    Metformin is used as a first-line therapy for type 2 diabetes (T2D) and prescribed for numerous other diseases. However, its mechanism of action in the liver has yet to be characterized in a systematic manner. To comprehensively identify genes and regulatory elements associated with metformin treatment, we carried out RNA-seq and ChIP-seq (H3K27ac, H3K27me3) on primary human hepatocytes from the same donor treated with vehicle control, metformin or metformin and compound C, an AMP-activated protein kinase (AMPK) inhibitor (allowing us to identify AMPK-independent pathways). We identified thousands of metformin responsive AMPK-dependent and AMPK-independent differentially expressed genes and regulatory elements. We functionally validated several elements for metformin-induced promoter and enhancer activity. These include an enhancer in an ataxia telangiectasia mutated (ATM) intron that has SNPs in linkage disequilibrium with a metformin treatment response GWAS lead SNP (rs11212617) that showed increased enhancer activity for the associated haplotype. Expression quantitative trait locus (eQTL) liver analysis and CRISPR activation suggest that this enhancer could be regulating ATM, which has a known role in AMPK activation, and potentially also EXPH5 and DDX10, its neighboring genes. Using ChIP-seq and siRNA knockdown, we further show that activating transcription factor 3 (ATF3), our top metformin upregulated AMPK-dependent gene, could have an important role in gluconeogenesis repression. Our findings provide a genome-wide representation of metformin hepatic response, highlight important sequences that could be associated with interindividual variability in glycemic response to metformin and identify novel T2D treatment candidates.

  17. Human-specific HERV-K insertion causes genomic variations in the human genome.

    Directory of Open Access Journals (Sweden)

    Wonseok Shin

    Human endogenous retrovirus (HERV) sequences account for about 8% of the human genome. Through comparative genomics and literature mining, we identified a total of 29 human-specific HERV-K insertions. We characterized them focusing on their structure and flanking sequence. The results showed that four of the human-specific HERV-K insertions deleted human genomic sequences via non-classical insertion mechanisms. Interestingly, two of the human-specific HERV-K insertion loci contained two HERV-K internals and three LTR elements, a pattern which could be explained by LTR-LTR ectopic recombination or template switching. In addition, we conducted a polymorphic test and observed that twelve out of the 29 elements are polymorphic in the human population. In conclusion, human-specific HERV-K elements have inserted into the human genome since the divergence of human and chimpanzee, causing human genomic changes. Thus, we believe that human-specific HERV-K activity has contributed to the genomic divergence between humans and chimpanzees, as well as within the human population.

  18. Thermal significance of fission-track length distributions

    International Nuclear Information System (INIS)

    Crowley, K.D.

    1985-01-01

    The semi-analytical solution of an equation describing the production and shortening of fission tracks in apatite suggests that certain thermal histories have unique length-distribution 'signatures'. Isothermal-heating histories should be characterized by flattened, length-shortened distributions; step-heating histories should be characterized by bimodal track length distributions; and linear-cooling histories should be characterized by negatively skewed, length-shortened distributions. The model formulated here to investigate track length distributions can be used to constrain the thermal histories of natural samples for which unbiased track length data are available - provided that the geologic history of the system of interest can be used to partially constrain one of the unknowns in the model equations, time or temperature. (author)

  19. Genome-wide development and deployment of informative intron-spanning and intron-length polymorphism markers for genomics-assisted breeding applications in chickpea.

    Science.gov (United States)

    Srivastava, Rishi; Bajaj, Deepak; Sayal, Yogesh K; Meher, Prabina K; Upadhyaya, Hari D; Kumar, Rajendra; Tripathi, Shailesh; Bharadwaj, Chellapilla; Rao, Atmakuri R; Parida, Swarup K

    2016-11-01

    The discovery and large-scale genotyping of informative gene-based markers is essential for rapid delineation of genes/QTLs governing stress tolerance and yield component traits in order to drive genetic enhancement in chickpea. Genome-wide, 119169 and 110491 ISM (intron-spanning markers) from 23129 desi and 20386 kabuli protein-coding genes, respectively, and 7454 in silico InDel (insertion-deletion) (1-45-bp)-based ILP (intron-length polymorphism) markers from 3283 genes were developed and structurally and functionally annotated on eight chromosomes and unanchored scaffolds of chickpea. These markers showed a much higher amplification efficiency (83%) and intra-specific polymorphic potential (86%) than other sequence-based genetic markers among desi and kabuli chickpea accessions, even with a cost-effective agarose gel-based assay. The genome-wide physically mapped 1718 ILP markers assayed a wider level of functional genetic diversity (19-81%) and well-defined phylogenetics among domesticated chickpea accessions. The gene-derived 1424 ILP markers were anchored on a high-density (inter-marker distance: 0.65 cM) desi intra-specific genetic linkage map/functional transcript map (ICC 4958×ICC 2263) of chickpea. This reference genetic map identified six major genomic regions harbouring six robust QTLs mapped on five chromosomes, which explained 11-23% seed weight trait variation (7.6-10.5 LOD) in chickpea. The integration of high-resolution QTL mapping with differential expression profiling detected six genes, including one potential serine carboxypeptidase gene, with ILP markers (linked tightly to the major seed weight QTLs) exhibiting seed-specific expression as well as pronounced up-regulation, especially in seeds of the high (ICC 4958) as compared to the low (ICC 2263) seed weight mapping parental accession. The marker information generated in the present study was made publicly accessible through a user-friendly web-resource, "Chickpea ISM-ILP Marker Database

  20. Predicting statistical properties of open reading frames in bacterial genomes.

    Directory of Open Access Journals (Sweden)

    Katharina Mir

    An analytical model based on the statistical properties of Open Reading Frames (ORFs) of eubacterial genomes such as codon composition and sequence length of all reading frames was developed. This new model predicts the average length, maximum length as well as the length distribution of the ORFs of 70 species with GC contents varying between 21% and 74%. Furthermore, the number of annotated genes is predicted with high accordance. However, the ORF length distribution in the five alternative reading frames shows interesting deviations from the predicted distribution. In particular, long ORFs appear more often than expected statistically. The unexpected depletion of stop codons in these alternative open reading frames cannot completely be explained by a biased codon usage in the +1 frame. While it is unknown if the stop codon depletion has a biological function, it could be due to a protein coding capacity of alternative ORFs exerting a selection pressure which prevents the fixation of stop codon mutations. The comparison of the analytical model with bacterial genomes, therefore, leads to a hypothesis suggesting novel gene candidates which can now be investigated in subsequent wet lab experiments.
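
    The link between GC content and expected ORF length can be illustrated with a much simpler null model than the one in the paper. The sketch below assumes independent nucleotides with P(G) = P(C) = GC/2 and P(A) = P(T) = (1 - GC)/2, so that the three stop codons (TAA, TAG, TGA) occur with a probability that depends only on GC content. This is a textbook simplification for illustration, not the authors' codon-composition model, and the function names are hypothetical.

```python
def stop_codon_probability(gc):
    """Probability that a random codon is TAA, TAG or TGA.

    Assumes independent bases with P(G) = P(C) = gc / 2 and
    P(A) = P(T) = (1 - gc) / 2.
    """
    if not 0.0 < gc < 1.0:
        raise ValueError("gc must lie strictly between 0 and 1")
    at = (1.0 - gc) / 2.0  # probability of A, and of T
    g = gc / 2.0           # probability of G, and of C
    # P(TAA) = at^3, P(TAG) = at^2 * g, P(TGA) = at^2 * g
    return at ** 3 + 2.0 * at ** 2 * g


def expected_orf_codons(gc):
    """Mean number of codons up to and including the first stop codon
    (mean of a geometric distribution with success probability p_stop)."""
    return 1.0 / stop_codon_probability(gc)


# The GC range covered by the 70 species in the abstract.
for gc in (0.21, 0.50, 0.74):
    print(gc, expected_orf_codons(gc))
```

    Because all three stop codons are AT-rich, random ORFs are expected to be longer in GC-rich genomes; observed ORFs that are longer still than such a null model predicts are the kind of deviation the abstract discusses.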

  1. Genome characterization of a porcine circovirus type 3 in South China.

    Science.gov (United States)

    Shen, H; Liu, X; Zhang, P; Wang, L; Liu, Y; Zhang, L; Liang, P; Song, C

    2018-02-01

    Porcine circovirus type 3 (PCV3) is a novel circovirus that was associated with porcine dermatitis and nephropathy syndrome, reproductive failure, and multisystemic inflammation. Recently, a PCV3 strain was identified from pyretic and pneumonic piglets in Guangdong province, China. This virus strain was sequenced and designated PCV3-China/GD2016. The complete genome of PCV3-China/GD2016 is 2,000 bp in length and shared 99.1% and 99.1% nucleotide identities with PCV3/29160 and PCV3/2164, respectively. [Corrections added after initial online publication on 13 March 2017: The numbers '98.5%' and '97.4%' has been changed to '99.1%' and '99.1%' in the previous sentence.] Phylogenetic analysis based on the complete genome showed that PCV3-China/GD2016 clustered with the emerging PCV3 and was separated from other viruses in the genus Circovirus. The results of this study suggest that PCV3 is present in pigs in China. It is urgent to investigate the pathogenicity and epidemiology of this novel circovirus in China. © 2017 Blackwell Verlag GmbH.

  2. Complete genome sequencing and evolutionary analysis of Indian isolates of Dengue virus type 2

    Energy Technology Data Exchange (ETDEWEB)

    Dash, Paban Kumar, E-mail: pabandash@rediffmail.com; Sharma, Shashi; Soni, Manisha; Agarwal, Ankita; Parida, Manmohan; Rao, P.V.Lakshmana

    2013-07-05

    Highlights: • Complete genome of Indian DENV-2 was deciphered for the first time in this study. • The recent Indian DENV-2 revealed presence of many unique amino acid residues. • Genotype shift (American to Cosmopolitan) characterizes evolution of DENV-2 in India. • Circulation of a unique clade of DENV-2 in South Asia was identified. -- Abstract: Dengue is the most important arboviral infection of global public health significance. It is now endemic in most parts of South East Asia including India. Though Dengue virus type 2 (DENV-2) is predominantly associated with major outbreaks in India, complete genome information of Indian DENV-2 is not available. In this study, the full-length genome of five DENV-2 isolates (four from 2001 to 2011 and one from 1960) from different parts of India was determined. The complete genome of the Indian DENV-2 was found to be 10,670 bases long with an open reading frame coding for 3391 amino acids. The recent Indian DENV-2 (2001–2011) revealed a nucleotide sequence identity of around 90% and 97% with an older Indian DENV-2 (1960) and closely related Sri Lankan and Chinese DENV-2, respectively. Presence of unique amino acid residues and non-conservative substitutions in critical amino acid residues of major structural and non-structural proteins was observed in recent Indian DENV-2. Selection pressure analysis revealed positive selection in a few amino acid sites of the genes encoding structural and non-structural proteins. The molecular phylogenetic analysis based on comparison of both complete coding region and envelope protein gene with globally diverse DENV-2 viruses classified the recent Indian isolates into a unique South Asian clade within the Cosmopolitan genotype. A shift of genotype from American to Cosmopolitan in the 1970s characterized the evolution of DENV-2 in India. The present study is the first report on complete genome characterization of emerging DENV-2 isolates from India and highlights the circulation of a

  3. Complete genome sequencing and evolutionary analysis of Indian isolates of Dengue virus type 2

    International Nuclear Information System (INIS)

    Dash, Paban Kumar; Sharma, Shashi; Soni, Manisha; Agarwal, Ankita; Parida, Manmohan; Rao, P.V.Lakshmana

    2013-01-01

    Highlights: • Complete genome of Indian DENV-2 was deciphered for the first time in this study. • The recent Indian DENV-2 revealed presence of many unique amino acid residues. • Genotype shift (American to Cosmopolitan) characterizes evolution of DENV-2 in India. • Circulation of a unique clade of DENV-2 in South Asia was identified. -- Abstract: Dengue is the most important arboviral infection of global public health significance. It is now endemic in most parts of South East Asia including India. Though Dengue virus type 2 (DENV-2) is predominantly associated with major outbreaks in India, complete genome information of Indian DENV-2 is not available. In this study, the full-length genome of five DENV-2 isolates (four from 2001 to 2011 and one from 1960) from different parts of India was determined. The complete genome of the Indian DENV-2 was found to be 10,670 bases long with an open reading frame coding for 3391 amino acids. The recent Indian DENV-2 (2001–2011) revealed a nucleotide sequence identity of around 90% and 97% with an older Indian DENV-2 (1960) and closely related Sri Lankan and Chinese DENV-2, respectively. Presence of unique amino acid residues and non-conservative substitutions in critical amino acid residues of major structural and non-structural proteins was observed in recent Indian DENV-2. Selection pressure analysis revealed positive selection in a few amino acid sites of the genes encoding structural and non-structural proteins. The molecular phylogenetic analysis based on comparison of both complete coding region and envelope protein gene with globally diverse DENV-2 viruses classified the recent Indian isolates into a unique South Asian clade within the Cosmopolitan genotype. A shift of genotype from American to Cosmopolitan in the 1970s characterized the evolution of DENV-2 in India. The present study is the first report on complete genome characterization of emerging DENV-2 isolates from India and highlights the circulation of a

  4. Comparative Genomics Analyses Reveal Extensive Chromosome Colinearity and Novel Quantitative Trait Loci in Eucalyptus.

    Directory of Open Access Journals (Sweden)

    Fagen Li

    Dense genetic maps, along with quantitative trait loci (QTLs) detected on such maps, are powerful tools for genomics and molecular breeding studies. In the important woody genus Eucalyptus, the recent release of the E. grandis genome sequence allows for sequence-based genomic comparison and searching for positional candidate genes within QTL regions. Here, dense genetic maps were constructed for E. urophylla and E. tereticornis using genomic simple sequence repeats (SSR), expressed sequence tag (EST) derived SSR, EST-derived cleaved amplified polymorphic sequence (EST-CAPS), and diversity arrays technology (DArT) markers. The E. urophylla and E. tereticornis maps comprised 700 and 585 markers across 11 linkage groups, totaling 1,208.2 and 1,241.4 cM in length, respectively. Extensive synteny and colinearity were observed as compared to three earlier DArT-based eucalypt maps (two maps with E. grandis × E. urophylla and one map of E. globulus) and with the E. grandis genome sequence. Fifty-three QTLs for growth (10-56 months of age) and wood density (56 months) were identified in 22 discrete regions on both maps, in which only one colocalization was found between growth and wood density. Novel QTLs were revealed as compared with those previously detected on DArT-based maps for similar ages in Eucalyptus. Eleven to 585 positional candidate genes were obtained for a 56-month-old QTL through aligning the QTL confidence interval with the E. grandis genome. These results will assist in comparative genomics studies, targeted gene characterization, and marker-assisted selection in Eucalyptus and the related taxa.

  5. Transformation of Cowpea Vigna unguiculata with a Full-Length DNA Copy of Cowpea Mosaic Virus M-RNA

    NARCIS (Netherlands)

    Hille, Jacques; Goldbach, Rob

    1987-01-01

    A full-length DNA copy of the M-RNA of cowpea mosaic virus (CPMV), supplied with either the 35S promoter from cauliflower mosaic virus (CaMV) or the nopaline synthase promoter from Agrobacterium tumefaciens, was introduced into the T-DNA region of a Ti-plasmid-derived gene vector and transferred to

  6. Implementation of Whole Genome Sequencing (WGS) for Identification and Characterization of Shiga Toxin-Producing Escherichia coli (STEC) in the United States

    Directory of Open Access Journals (Sweden)

    Rebecca L Lindsey

    2016-05-01

    Shiga toxin-producing Escherichia coli (STEC) is an important foodborne pathogen capable of causing severe disease in humans. Rapid and accurate identification and characterization techniques are essential during outbreak investigations. Current methods for characterization of STEC are expensive and time-consuming. With the advent of rapid and cheap whole genome sequencing (WGS) benchtop sequencers, the potential exists to replace traditional workflows with WGS. The aim of this study was to validate tools to do reference identification and characterization from WGS for STEC in a single workflow within an easy to use commercially available software platform. Publicly available serotype, virulence, and antimicrobial resistance databases were downloaded from the Center for Genomic Epidemiology (CGE) (www.genomicepidemiology.org) and integrated into a genotyping plug-in with in silico PCR tools to confirm some of the virulence genes detected from WGS data. Additionally, down sampling experiments on the WGS sequence data were performed to determine a threshold for sequence coverage needed to accurately predict serotype and virulence genes using the established workflow. The serotype database was tested on a total of 228 genomes and correctly predicted from WGS for 96.1% of O serogroups and 96.5% of H serogroups identified by conventional testing techniques. A total of 59 genomes were evaluated to determine the threshold of coverage to detect the different WGS targets: 40 were evaluated for serotype and virulence gene detection and 19 for the stx gene subtypes. For serotype, 95% of the O and 100% of the H serogroups were detected at > 40x and ≥ 30x coverage, respectively. For virulence targets and stx gene subtypes, nearly all genes were detected at > 40x, though some targets were 100% detectable from genomes with coverage ≥ 20x. The resistance detection tool was 97% concordant with phenotypic testing results. With isolates sequenced to > 40x
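
    A down sampling experiment of the kind mentioned above can be sketched as follows. The sketch assumes that coverage is estimated as (number of reads × read length) / genome size and that reads are dropped uniformly at random until a target coverage is reached; the function names, read length and genome size are hypothetical and do not describe the software platform used in the study.

```python
import random


def coverage(n_reads, read_len_bp, genome_bp):
    """Approximate average sequencing depth."""
    return n_reads * read_len_bp / genome_bp


def downsample(reads, current_cov, target_cov, seed=0):
    """Randomly keep the fraction of reads needed to hit target_cov.

    Returns a new list; the input is left unchanged. If the target is
    not below the current coverage, all reads are kept.
    """
    reads = list(reads)
    if target_cov >= current_cov:
        return reads
    keep = round(len(reads) * target_cov / current_cov)
    rng = random.Random(seed)  # fixed seed so the subsample is repeatable
    return rng.sample(reads, keep)


# Hypothetical toy data: 1000 reads of 250 bp over a 5 kb target.
reads = ["read_%d" % i for i in range(1000)]
cov = coverage(len(reads), 250, 5_000)
subset = downsample(reads, cov, 20.0)
print(cov, len(subset))
```

    Repeating the genotyping workflow on subsets drawn at several target coverages is one way to locate the depth below which calls start to fail.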

  7. Genomic analysis identifies masqueraders of full-term cerebral palsy.

    Science.gov (United States)

    Takezawa, Yusuke; Kikuchi, Atsuo; Haginoya, Kazuhiro; Niihori, Tetsuya; Numata-Uematsu, Yurika; Inui, Takehiko; Yamamura-Suzuki, Saeko; Miyabayashi, Takuya; Anzai, Mai; Suzuki-Muromoto, Sato; Okubo, Yukimune; Endo, Wakaba; Togashi, Noriko; Kobayashi, Yasuko; Onuma, Akira; Funayama, Ryo; Shirota, Matsuyuki; Nakayama, Keiko; Aoki, Yoko; Kure, Shigeo

    2018-05-01

    Cerebral palsy is a common, heterogeneous neurodevelopmental disorder that causes movement and postural disabilities. Recent studies have suggested genetic diseases can be misdiagnosed as cerebral palsy. We hypothesized that two simple criteria, that is, full-term births and nonspecific brain MRI findings, are keys to extracting masqueraders among cerebral palsy cases due to the following: (1) preterm infants are susceptible to multiple environmental factors and therefore demonstrate an increased risk of cerebral palsy and (2) brain MRI assessment is essential for excluding environmental causes and other particular disorders. A total of 107 patients (all full-term births) without specific findings on brain MRI were identified among 897 patients diagnosed with cerebral palsy who were followed at our center. DNA samples were available for 17 of the 107 cases for trio whole-exome sequencing and array comparative genomic hybridization. We prioritized variants in genes known to be relevant in neurodevelopmental diseases and evaluated their pathogenicity according to the American College of Medical Genetics guidelines. Pathogenic/likely pathogenic candidate variants were identified in 9 of 17 cases (52.9%) within eight genes: CTNNB1, CYP2U1, SPAST, GNAO1, CACNA1A, AMPD2, STXBP1, and SCN2A. Five identified variants had previously been reported. No pathogenic copy number variations were identified. The AMPD2 missense variant and the splice-site variants in CTNNB1 and AMPD2 were validated by in vitro functional experiments. The high rate of detecting causative genetic variants (52.9%) suggests that patients diagnosed with cerebral palsy in full-term births without specific MRI findings may include genetic diseases masquerading as cerebral palsy.

  8. Improving microbial genome annotations in an integrated database context.

    Directory of Open Access Journals (Sweden)

    I-Min A Chen

    Full Text Available Effective comparative analysis of microbial genomes requires a consistent and complete view of biological data. Consistency regards the biological coherence of annotations, while completeness regards the extent and coverage of functional characterization for genomes. We have developed tools that allow scientists to assess and improve the consistency and completeness of microbial genome annotations in the context of the Integrated Microbial Genomes (IMG) family of systems. All publicly available microbial genomes are characterized in IMG using different functional annotation and pathway resources, thus providing a comprehensive framework for identifying and resolving annotation discrepancies. A rule-based system for predicting phenotypes in IMG provides a powerful mechanism for validating functional annotations, whereby the phenotypic traits of an organism are inferred based on the presence of certain metabolic reactions and pathways and compared to experimentally observed phenotypes. The IMG family of systems is available at http://img.jgi.doe.gov/.

  9. MUMmer4: A fast and versatile genome alignment system.

    Directory of Open Access Journals (Sweden)

    Guillaume Marçais

    2018-01-01

    Full Text Available The MUMmer system and the genome sequence aligner nucmer included within it are among the most widely used alignment packages in genomics. Since the last major release of MUMmer version 3 in 2004, it has been applied to many types of problems including aligning whole genome sequences, aligning reads to a reference genome, and comparing different assemblies of the same genome. Despite its broad utility, MUMmer3 has limitations that can make it difficult to use for large genomes and for the very large sequence data sets that are common today. In this paper we describe MUMmer4, a substantially improved version of MUMmer that addresses genome size constraints by changing the 32-bit suffix tree data structure at the core of MUMmer to a 48-bit suffix array, and that offers improved speed through parallel processing of input query sequences. With a theoretical limit on the input size of 141 Tbp, MUMmer4 can now work with input sequences of any biologically realistic length. We show that as a result of these enhancements, the nucmer program in MUMmer4 is easily able to handle alignments of large genomes; we illustrate this with an alignment of the human and chimpanzee genomes, which allows us to compute that the two species are 98% identical across 96% of their length. With the enhancements described here, MUMmer4 can also be used to efficiently align reads to reference genomes, although it is less sensitive and accurate than the dedicated read aligners. The nucmer aligner in MUMmer4 can now be called from scripting languages such as Perl, Python and Ruby. These improvements make MUMmer4 one of the most versatile genome alignment packages available.
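
    To make the suffix array idea mentioned in this record concrete, here is a toy sketch in plain Python. It is not MUMmer4's implementation and does not use its API: the construction is the naive sort-all-suffixes approach, and the function names and the example sequence are hypothetical. It only shows how a sorted list of suffix start positions lets exact matches be located by binary search.

```python
def build_suffix_array(text):
    """Naive suffix array: start positions of all suffixes in sorted order."""
    return sorted(range(len(text)), key=lambda i: text[i:])

def find_occurrences(text, sa, pattern):
    """Return the sorted start positions of pattern in text via binary search."""
    m = len(pattern)
    lo, hi = 0, len(sa)
    while lo < hi:  # first suffix whose m-prefix is >= pattern
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] < pattern:
            lo = mid + 1
        else:
            hi = mid
    start = lo
    hi = len(sa)
    while lo < hi:  # first suffix whose m-prefix is > pattern
        mid = (lo + hi) // 2
        if text[sa[mid]:sa[mid] + m] <= pattern:
            lo = mid + 1
        else:
            hi = mid
    return sorted(sa[start:lo])

reference = "ACGTACGTGA"  # hypothetical reference sequence
sa = build_suffix_array(reference)
hits = find_occurrences(reference, sa, "ACG")
print(hits)
```

    Each entry of the array is a position in the input, so the integer width used for entries bounds the addressable input length, which is the kind of constraint the abstract says the move to a 48-bit structure relaxes.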

  10. Genome-wide identification and characterization of putative lncRNAs in the diamondback moth, Plutella xylostella (L.).

    Science.gov (United States)

    Wang, Yue; Xu, Tingting; He, Weiyi; Shen, Xiujing; Zhao, Qian; Bai, Jianlin; You, Minsheng

    2018-01-01

    Long non-coding RNAs (lncRNAs) are of particular interest because of their contributions to many biological processes. Here, we present the genome-wide identification and characterization of putative lncRNAs in a global insect pest, Plutella xylostella. A total of 8096 lncRNAs were identified and classified into three groups. The average length of exons in lncRNAs was longer than that in coding genes and the GC content was lower than that in mRNAs. Most lncRNAs were flanked by canonical splice sites, similar to mRNAs. Expression profiling identified 114 differentially expressed lncRNAs during diamondback moth (DBM) development and found that the majority were temporally specific. While the biological functions of lncRNAs remain uncharacterized, many are microRNA precursors or competing endogenous RNAs involved in microRNA regulatory pathways. This work provides a valuable resource for further studies on the molecular bases of DBM development and lays the foundation for discovery of lncRNA functions in P. xylostella. Copyright © 2017 Elsevier Inc. All rights reserved.

  11. Conserved elements within the genome of foot-and-mouth disease virus; their influence on viral replication

    DEFF Research Database (Denmark)

    Kjær, Jonas

    -and-mouth disease virus (FMDV) have been identified, e.g. the IRES. Such elements can be crucial for the efficient replication of the genomic RNA. A better understanding of the influence of these elements is required to identify currently unrecognized interactions within the viruses which may be important...... for the development of anti-viral agents. SHAPE analysis of the entire FMDV genome (Poulsen, 2015) has identified three conserved RNA structures within the coding regions for 2B, 3C and 3D (RNA-dependent RNA polymerase) which might have an important role in virus replication. The FMDV 2A peptide, another conserved...... polypeptide. The nature of this “cleavage” has so far not been investigated in the context of the full-length FMDV RNA within cells. The focus of this PhD thesis has been to characterize these elements and their influence on the FMDV replication. In order to fulfil the aims of this thesis a series of studies...

  12. Characterization of long-length, MOCVD-derived REBCO coated conductors.

    Energy Technology Data Exchange (ETDEWEB)

    Miller, D. J.; Maroni, V. A.; Hiller, J. M.; Koritala, R. E.; Chen, Y.; Reeves Black, J. L.; Selvamanickam, V.; SuperPower, Inc.; Development Dimensions International, Inc.

    2009-06-01

    A leading approach to the fabrication of long-length, high-performance REBa2Cu3O7 (REBCO) coated conductor is by metal-organic chemical vapor deposition (MOCVD) of REBCO on buffered templates. Templates are produced by ion beam assisted deposition of textured MgO onto polished metal substrates. The overall performance of MOCVD coated conductors achieved to date is impressive, but further improvement is desired. We have used a coordinated set of characterization techniques to identify the underlying causes for critical current (Ic) performance variations in long-length MOCVD conductors. Using electron microscopy and Raman spectroscopy, we studied tape specimens from specially designed experiments performed in SuperPower's MOCVD manufacturing equipment with its six-track "helix" tape path. We find that in multi-pass depositions used to produce thicker REBCO films, the REBCO phase uniformity and texture quality in the first pass play key roles in pass-to-pass microstructure evolution, with nucleation of second phase particles in the first layer promoting misoriented grains that propagate through subsequent layers. These misoriented grains, many growing in close proximity with second phase particles, present current-blocking obstacles that limit Ic performance. Our results show that achieving more uniform deposition in the very first deposited layer plays a critical role that in turn leads to reduced misoriented grain content and REBCO lattice disorder in the second and subsequent layers of the REBCO film.

  13. Full Waveform Inversion for Reservoir Characterization - A Synthetic Study

    KAUST Repository

    Zabihi Naeini, E.; Kamath, N.; Tsvankin, I.; Alkhalifah, Tariq Ali

    2017-01-01

    Most current reservoir-characterization workflows are based on classic amplitude-variation-with-offset (AVO) inversion techniques. Although these methods have generally served us well over the years, here we examine full-waveform inversion (FWI

  14. Global identification of the full-length transcripts and alternative splicing related to phenolic acid biosynthetic genes in Salvia miltiorrhiza

    Directory of Open Access Journals (Sweden)

    Zhichao Xu

    2016-02-01

    Full Text Available Salvianolic acids are among the main bioactive components in Salvia miltiorrhiza, and their biosynthesis has attracted widespread interest. However, previous studies on the biosynthesis of phenolic acids using next-generation sequencing platforms are limited with regard to the assembly of full-length transcripts. Based on hybrid-seq (next-generation and single-molecule real-time sequencing) of the S. miltiorrhiza root transcriptome, we experimentally identified 15 full-length transcripts and 4 alternative splicing events of enzyme-coding genes involved in the biosynthesis of rosmarinic acid. Moreover, we herein demonstrate that lithospermic acid B accumulates in the phloem and xylem of roots, in agreement with the expression patterns of the identified key genes related to rosmarinic acid biosynthesis. According to co-expression patterns, we predicted that 6 candidate cytochrome P450s and 5 candidate laccases participate in the salvianolic acid pathway. Our results provide a valuable resource for further investigation into the synthetic biology of phenolic acids in S. miltiorrhiza.

  15. The effect of two different renal denervation strategies on blood pressure in resistant hypertension: Comparison of full-length versus proximal renal artery ablation.

    Science.gov (United States)

    Chen, Weijie; Ling, Zhiyu; Du, Huaan; Song, Wenxin; Xu, Yanping; Liu, Zengzhang; Su, Li; Xiao, Peilin; Yuan, Yuelong; Lu, Jiayi; Zhang, Jianhong; Li, Zhifeng; Shao, Jiang; Zhong, Bin; Zhou, Bei; Woo, Kamsang; Yin, Yuehui

    2016-11-01

    Renal denervation (RDN) is used to manage blood pressure (BP) in patients with resistant hypertension (rHT), but effectiveness is still a concern, and the key arterial portion for successful RDN is not clear. The aim of this study was to investigate the efficacy and safety of proximal versus full-length renal artery ablation in patients with rHT. Forty-seven patients with rHT were randomly assigned to receive full-length ablation (n = 23) or proximal ablation (n = 24) of the renal arteries. All lesions were treated with radiofrequency energy via a saline-irrigated catheter. Office BP was measured during 12 months of follow-up and ambulatory BP at baseline and 6 months (n = 15 in each group). Compared with full-length ablation, proximal ablation reduced the number of ablation points in both the right (6.1 ± 0.7 vs. 3.3 ± 0.6, P renal arteries (6.2 ± 0.7 vs. 3.3 ± 0.8, P  0.5). Similar office BPs was reduced by -39.4 ± 11.5/-20.9 ± 7.1 mm Hg at 6 months and -38.2 ± 10.3/-21.5 ± 5.8 mm Hg at 12 months in the full-length group (P efficacy and safety profile compared with full-length RDN, and propose the proximal artery as the key portion for RDN. © 2016 Wiley Periodicals, Inc.

  16. Phenotypic plasticity, QTL mapping and genomic characterization of bud set in black poplar

    Directory of Open Access Journals (Sweden)

    Fabbrini Francesco

    2012-04-01

    Full Text Available Abstract. Background: The genetic control of important adaptive traits, such as bud set, is still poorly understood in most forest tree species. Poplar is an ideal model tree to study bud set because of its indeterminate shoot growth. Thus, a full-sib family derived from an intraspecific cross of P. nigra with 162 clonally replicated progeny was used to assess the phenotypic plasticity and genetic variation of bud set in two sites of contrasting environmental conditions. Results: Six crucial phenological stages of bud set were scored. Night length appeared to be the most important signal triggering the onset of growth cessation. Nevertheless, the effect of other environmental factors, such as temperature, increased during the process. Moreover, a considerable role of genotype × environment (G × E) interaction was found in all phenological stages, with the lowest temperature appearing to influence the sensitivity of the most plastic genotypes. Descriptors of growth cessation and bud onset explained the largest part of phenotypic variation of the entire process. Quantitative trait loci (QTL) for these traits were detected. For the four selected traits (the onset of growth cessation (date2.5), the transition from shoot to bud (date1.5), the duration of bud formation (subproc1) and bud maturation (subproc2)), eight and sixteen QTL were mapped on the maternal and paternal map, respectively. The identified QTL, each one characterized by small or modest effect, highlighted the complex nature of traits involved in the bud set process. Comparison between the map location of QTL and the P. trichocarpa genome sequence allowed the identification of 13 gene models, 67 bud set-related expressional and six functional candidate genes (CGs). These CGs are functionally related to relevant biological processes, environmental sensing, signaling, and cell growth and development. Some strong QTL had no obvious CGs, and hold great promise to identify unknown genes that affect bud set

  17. Genomics Strategies for Germplasm Characterization and the Development of Climate Resilient Crops

    Directory of Open Access Journals (Sweden)

    Robert Henry

    2014-02-01

    Full Text Available Food security requires the development and deployment of crop varieties resilient to climate variation and change. The study of variations in the genome of wild plant populations can be used to guide crop improvement. Genome variation found in wild crop relatives may be directly relevant to the breeding of environmentally adapted and climate resilient crops. Analysis of the genomes of populations growing in contrasting environments will reveal the genes subject to natural selection in adaptation to climate variations. Whole genome sequencing of these populations should define the numbers and types of genes associated with climate adaptation. This strategy is facilitated by recent advances in sequencing technologies. Wild relatives of rice and barley have been used to assess these approaches. This strategy is most easily applied to species for which a high quality reference genome sequence is available and where populations of wild relatives can be found growing in diverse environments or across environmental gradients.

  18. Genetic deletion of muscle RANK or selective inhibition of RANKL is not as effective as full-length OPG-fc in mitigating muscular dystrophy.

    Science.gov (United States)

    Dufresne, Sébastien S; Boulanger-Piette, Antoine; Bossé, Sabrina; Argaw, Anteneh; Hamoudi, Dounia; Marcadet, Laetitia; Gamu, Daniel; Fajardo, Val A; Yagita, Hideo; Penninger, Josef M; Russell Tupling, A; Frenette, Jérôme

    2018-04-24

    Although there is a strong association between osteoporosis and skeletal muscle atrophy/dysfunction, the functional relevance of a particular biological pathway that synchronously regulates bone and skeletal muscle physiopathology is still elusive. Receptor-activator of nuclear factor κB (RANK), its ligand RANKL and the soluble decoy receptor osteoprotegerin (OPG) are the key regulators of osteoclast differentiation and bone remodelling. We thus hypothesized that RANK/RANKL/OPG, which is a key pathway for bone regulation, is involved in Duchenne muscular dystrophy (DMD) physiopathology. Our results show that muscle-specific RANK deletion (mdx-RANK^mko) in dystrophin-deficient mdx mice significantly improves the specific force [54% gain in force] of EDL muscles with no protective effect against eccentric contraction-induced muscle dysfunction. In contrast, full-length OPG-Fc injections restore the force of dystrophic EDL muscles [162% gain in force], protect against eccentric contraction-induced muscle dysfunction ex vivo and significantly improve functional performance on downhill treadmill and post-exercise physical activity. Since OPG serves as a soluble receptor for RANKL and as a decoy receptor for TRAIL, mdx mice were injected with anti-RANKL and anti-TRAIL antibodies to decipher the dual function of OPG. Injections of anti-RANKL and/or anti-TRAIL significantly increase the force of dystrophic EDL muscle [45% and 17% gains in force, respectively]. In agreement, truncated OPG-Fc that contains only RANKL domains produces gains in force production similar to those of anti-RANKL treatments. To corroborate that full-length OPG-Fc also acts independently of the RANK/RANKL pathway, dystrophin/RANK double-deficient mice were treated with full-length OPG-Fc for 10 days. Dystrophic EDL muscles exhibited a significant gain in force relative to untreated dystrophin/RANK double-deficient mice, indicating that the effect of full-length OPG-Fc is in part independent of the RANKL

  19. Genome-Wide Identification, Characterization and Phylogenetic Analysis of ATP-Binding Cassette (ABC) Transporter Genes in Common Carp (Cyprinus carpio).

    Directory of Open Access Journals (Sweden)

    Xiang Liu

    Full Text Available The ATP-binding cassette (ABC) gene family is considered to be one of the largest gene families in all forms of prokaryotic and eukaryotic life. Although the ABC transporter genes have been annotated in some species, detailed information about the ABC superfamily and the evolutionary characterization of ABC genes in common carp (Cyprinus carpio) are still unclear. In this research, we identified 61 ABC transporter genes in the common carp genome. Phylogenetic analysis revealed that they could be classified into seven subfamilies, namely 11 ABCAs, six ABCBs, 19 ABCCs, eight ABCDs, two ABCEs, four ABCFs, and 11 ABCGs. Comparative analysis of the ABC genes in seven vertebrate species, including common carp, showed that at least 10 common carp genes were retained from the third round of whole genome duplication, while 12 duplicated ABC genes may have come from the fourth round of whole genome duplication. Gene losses were also observed for 14 ABC genes. Expression profiles of the 61 ABC genes in six common carp tissues (brain, heart, spleen, kidney, intestine, and gill) revealed extensive functional divergence among the ABC genes. Different copies of some genes had tissue-specific expression patterns, which may indicate some gene function specialization. This study provides essential genomic resources for future studies in common carp.

  20. Genetic Characterization and Classification of Human and Animal Sapoviruses.

    Directory of Open Access Journals (Sweden)

    Tomoichiro Oka

    Full Text Available Sapoviruses (SaVs) are enteric caliciviruses that have been detected in multiple mammalian species, including humans, pigs, mink, dogs, sea lions, chimpanzees, and rats. They show a high level of diversity. A SaV genome commonly encodes seven nonstructural proteins (NSs), including the RNA polymerase protein NS7, and two structural proteins (VP1 and VP2). We classified human and animal SaVs into 15 genogroups (G) based on available VP1 sequences, including three newly characterized genomes from this study. We sequenced the full-length genomes of one new genogroup V (GV), one GVII and one GVIII porcine SaV using long-range RT-PCR including newly designed forward primers located in the conserved motifs of the putative NS3, and also 5' RACE methods. We also determined the 5'- and 3'-ends of sea lion GV SaV and canine GXIII SaV. Although the complete genomic sequences of GIX-GXII and GXV SaVs are unavailable, common features of SaV genomes include: (1) "GTG" at the 5'-end of the genome, and a short (9~14 nt) 5'-untranslated region; and (2) the first five amino acids (M [A/V] S [K/R] P) of the putative NS1 and the five amino acids (FEMEG) surrounding the putative cleavage site between NS7 and VP1 were conserved among the chimpanzee, two of five genogroups of pig (GV and GVIII), sea lion, canine, and human SaVs. In contrast, these two amino acid motifs were clearly different in three genogroups of porcine (GIII, GVI and GVII), and bat SaVs. Our results suggest that several animal SaVs have genetic similarities to human SaVs. However, the ability of SaVs to be transmitted between humans and animals is uncertain.
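
    The two conserved amino acid motifs named in this record can be written as simple patterns. The sketch below is only an illustration of how such a screen might look: the regular expressions encode M [A/V] S [K/R] P at the start of the polyprotein and FEMEG anywhere downstream, the function name is hypothetical, and the toy sequences are invented rather than real sapovirus proteins.

```python
import re

# M [A/V] S [K/R] P at the N-terminus of the putative NS1 (start of the polyprotein).
NS1_START = re.compile(r"^M[AV]S[KR]P")
# FEMEG around the putative NS7/VP1 cleavage site.
CLEAVAGE_SITE = re.compile(r"FEMEG")

def has_conserved_motifs(polyprotein):
    """True if both conserved motifs described in the abstract are present."""
    seq = polyprotein.upper()
    return bool(NS1_START.match(seq)) and bool(CLEAVAGE_SITE.search(seq))

# Invented toy sequences, used only to exercise the patterns.
toy_conserved = "MASKPFKPLLLAAGGFEMEGLGQ"
toy_divergent = "MGSKPFKPLLLAAGGYEMEGLGQ"
print(has_conserved_motifs(toy_conserved), has_conserved_motifs(toy_divergent))
```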

  1. The impacts of drift and selection on genomic evolution in insects

    Directory of Open Access Journals (Sweden)

    K. Jun Tong

    2017-04-01

    Full Text Available Genomes evolve through a combination of mutation, drift, and selection, all of which act heterogeneously across genes and lineages. This leads to differences in branch-length patterns among gene trees. Genes that yield trees with the same branch-length patterns can be grouped together into clusters. Here, we propose a novel phylogenetic approach to explain the factors that influence the number and distribution of these gene-tree clusters. We apply our method to a genomic dataset from insects, an ancient and diverse group of organisms. We find some evidence that when drift is the dominant evolutionary process, each cluster tends to contain a large number of fast-evolving genes. In contrast, strong negative selection leads to many distinct clusters, each of which contains only a few slow-evolving genes. Our work, although preliminary in nature, illustrates the use of phylogenetic methods to shed light on the factors driving rate variation in genomic evolution.

  2. Complete mitochondrial genome of the big-eared horseshoe bat Rhinolophus macrotis (Chiroptera, Rhinolophidae).

    Science.gov (United States)

    Zhang, Lin; Sun, Keping; Feng, Jiang

    2016-11-01

    We sequenced and characterized the complete mitochondrial genome of the big-eared horseshoe bat, Rhinolophus macrotis. The total length of the mitogenome is 16,848 bp, with a base composition of 31.2% A, 25.3% T, 28.8% C and 14.7% G. The mitogenome consists of 13 protein-coding genes, 2 rRNA (12S and 16S rRNA) genes, 22 tRNA genes and 1 control region. It has the same gene arrangement pattern as typical vertebrate mitochondrial genomes. The results will contribute to our understanding of the taxonomic status and evolution of bats in the genus Rhinolophus.
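
    As a small worked check of the base composition figures in this record, the sketch below shows how per-base percentages are computed from a sequence and confirms that the four reported values are mutually consistent. The function name and the ten-base toy sequence are hypothetical; only the four percentages come from the abstract.

```python
def base_composition(seq):
    """Percentage of A, T, C and G among the unambiguous bases of seq."""
    seq = seq.upper()
    total = sum(seq.count(base) for base in "ATCG")
    return {base: round(100 * seq.count(base) / total, 1) for base in "ATCG"}

# Toy sequence, not the R. macrotis mitogenome.
toy = base_composition("AATTCCGGAT")

# Percentages reported in the abstract.
reported = {"A": 31.2, "T": 25.3, "C": 28.8, "G": 14.7}
total_percent = round(sum(reported.values()), 1)      # should be 100.0
at_content = round(reported["A"] + reported["T"], 1)  # A+T share of the genome
print(toy, total_percent, at_content)
```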

  3. Genomic Characterization of Interspecific Hybrids and an Admixture Population Derived from Panicum amarum × P. virgatum

    Directory of Open Access Journals (Sweden)

    Christopher Heffelfinger

    2015-07-01

    Full Text Available Switchgrass (Panicum virgatum L.) and its relatives are regarded as top bioenergy crop candidates; however, one critical barrier is the introduction of useful genetic diversity and the development of new cultivars and hybrids. Combining genomes from related cultivars and species provides an opportunity to introduce new traits. In switchgrass, a breeding advantage would be achieved by combining the genomes of intervarietal ecotypes or interspecific hybrids. The recovery of wide crosses, however, is often tedious and may involve complicated embryo rescue and numerous backcrosses. Here, we demonstrate a straightforward approach to wide crosses involving the use of a selectable transgene for recovery of interspecific [P. virgatum cv. Alamo × P. amarum Ell var or Atlantic Coastal Panicgrass (ACP)] F hybrids followed by backcrossing to generate a nontransgenic admixture population. A nontransgenic herbicide-sensitive (HbS) admixture population of 83 FBC progeny was analyzed by genotyping-by-sequencing (GBS) to characterize local ancestry, parental contribution, and patterns of recombination. These results demonstrate a widely applicable breeding strategy that makes use of transgenic selectable resistance to identify and recover true hybrids.

  4. Full Genome Characterization of Novel DS-1-Like G8P[8] Rotavirus Strains that Have Emerged in Thailand: Reassortment of Bovine and Human Rotavirus Gene Segments in Emerging DS-1-Like Intergenogroup Reassortant Strains.

    Directory of Open Access Journals (Sweden)

    Ratana Tacharoenmuang

    Full Text Available The emergence and rapid spread of unusual DS-1-like intergenogroup reassortant rotavirus strains have been recently reported in Asia, Australia, and Europe. During rotavirus surveillance in Thailand in 2013-2014, novel DS-1-like intergenogroup reassortant strains having G8P[8] genotypes (i.e., strains KKL-17, PCB-79, PCB-84, PCB-85, PCB-103, SKT-107, SWL-12, NP-130, PCB-656, SKT-457, SSKT-269, and SSL-55) were identified in stool samples from hospitalized children with severe diarrhea. In this study, we determined and characterized the complete genomes of these 12 strains (seven strains, KKL-17, PCB-79, PCB-84, PCB-85, PCB-103, SKT-107, and SWL-12, found in 2013 (2013 strains), and five, NP-130, PCB-656, SKT-457, SSKT-269, and SSL-55, in 2014 (2014 strains)). On full genomic analysis, all 12 strains showed a unique genotype constellation comprising a mixture of genogroup 1 and 2 genes: G8-P[8]-I2-R2-C2-M2-A2-N2-T2-E2-H2. With the exception of the G genotype, the unique genotype constellation of the 12 strains (P[8]-I2-R2-C2-M2-A2-N2-T2-E2-H2) was found to be shared with DS-1-like intergenogroup reassortant strains. On phylogenetic analysis, six of the 11 genes of the 2013 strains (VP4, VP2, VP3, NSP1, NSP3, and NSP5) appeared to have originated from DS-1-like intergenogroup reassortant strains, while the remaining four (VP7, VP6, VP1, and NSP2) and one (NSP4) gene appeared to be of bovine and human origin, respectively. Thus, the 2013 strains appeared to be reassortant strains as to DS-1-like intergenogroup reassortant, bovine, bovine-like human, and/or human rotaviruses. On the other hand, five of the 11 genes of the 2014 strains (VP4, VP2, VP3, NSP1, and NSP3) appeared to have originated from DS-1-like intergenogroup reassortant strains, while three (VP7, VP1, and NSP2) and one (NSP4) were assumed to be of bovine and human origin, respectively. Notably, the remaining two genes, VP6 and NSP5, of the 2014 strains appeared to have originated from locally
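
    The genotype constellation string quoted in this record can be unpacked into a per-gene table. The sketch below assumes the widely used convention in which the eleven positions Gx-P[x]-Ix-Rx-Cx-Mx-Ax-Nx-Tx-Ex-Hx correspond to VP7, VP4, VP6, VP1, VP2, VP3, NSP1, NSP2, NSP3, NSP4 and NSP5, in that order. The abstract does not spell out this mapping, and the function name is hypothetical.

```python
# Assumed segment order for the eleven-position genotype constellation.
SEGMENT_ORDER = ["VP7", "VP4", "VP6", "VP1", "VP2", "VP3",
                 "NSP1", "NSP2", "NSP3", "NSP4", "NSP5"]

def parse_constellation(constellation):
    """Map each gene segment to its genotype, e.g. 'VP7' -> 'G8'."""
    genotypes = constellation.split("-")
    if len(genotypes) != len(SEGMENT_ORDER):
        raise ValueError("expected 11 genotypes, got %d" % len(genotypes))
    return dict(zip(SEGMENT_ORDER, genotypes))

# Constellation reported for all 12 Thai strains in the abstract.
strain = parse_constellation("G8-P[8]-I2-R2-C2-M2-A2-N2-T2-E2-H2")
for gene, genotype in strain.items():
    print(gene, genotype)
```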

  5. Genomics-informed isolation and characterization of a symbiotic Nanoarchaeota system from a terrestrial geothermal environment.

    Science.gov (United States)

    Wurch, Louie; Giannone, Richard J; Belisle, Bernard S; Swift, Carolyn; Utturkar, Sagar; Hettich, Robert L; Reysenbach, Anna-Louise; Podar, Mircea

    2016-07-05

    Biological features can be inferred, based on genomic data, for many microbial lineages that remain uncultured. However, cultivation is important for characterizing an organism's physiology and testing its genome-encoded potential. Here we use single-cell genomics to infer cultivation conditions for the isolation of an ectosymbiotic Nanoarchaeota ('Nanopusillus acidilobi') and its host (Acidilobus, a crenarchaeote) from a terrestrial geothermal environment. The cells of 'Nanopusillus' are among the smallest known cellular organisms (100-300 nm). They appear to have a complete genetic information processing machinery, but lack almost all primary biosynthetic functions as well as respiration and ATP synthesis. Genomic and proteomic comparison with its distant relative, the marine Nanoarchaeum equitans, illustrates an ancient, common evolutionary history of adaptation of the Nanoarchaeota to ectosymbiosis, so far unique among the Archaea.

  6. Genome-wide microsatellite characterization and marker development in the sequenced Brassica crop species.

    Science.gov (United States)

    Shi, Jiaqin; Huang, Shunmou; Zhan, Jiepeng; Yu, Jingyin; Wang, Xinfa; Hua, Wei; Liu, Shengyi; Liu, Guihua; Wang, Hanzhong

    2014-02-01

    Although much research has been conducted, the pattern of microsatellite distribution has remained ambiguous, and the development/utilization of microsatellite markers has still been limited/inefficient in Brassica, due to the lack of genome sequences. In view of this, we conducted genome-wide microsatellite characterization and marker development in three recently sequenced Brassica crops: Brassica rapa, Brassica oleracea and Brassica napus. The analysed microsatellite characteristics of these Brassica species were highly similar or almost identical, which suggests that the pattern of microsatellite distribution is likely conservative in Brassica. The genomic distribution of microsatellites was highly non-uniform and positively or negatively correlated with genes or transposable elements, respectively. Of the total of 115 869, 185 662 and 356 522 simple sequence repeat (SSR) markers developed with high frequencies (408.2, 343.8 and 356.2 per Mb or one every 2.45, 2.91 and 2.81 kb, respectively), most represented new SSR markers, the majority had determined physical positions, and a large number were genic or putative single-locus SSR markers. We also constructed a comprehensive database for the newly developed SSR markers, which was integrated with public Brassica SSR markers and annotated genome components. The genome-wide SSR markers developed in this study provide a useful tool to extend the annotated genome resources of sequenced Brassica species to genetic study/breeding in different Brassica species.
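
    The two ways this record expresses marker frequency (markers per Mb and one marker every so many kb) are reciprocal, which a short calculation confirms. The sketch assumes the three figures are listed in the same order as the species named earlier in the abstract (B. rapa, B. oleracea, B. napus); the variable names are hypothetical.

```python
# SSR marker densities from the abstract, in markers per Mb.
# Species order is assumed to follow the order in which the abstract names them.
densities_per_mb = {
    "Brassica rapa": 408.2,
    "Brassica oleracea": 343.8,
    "Brassica napus": 356.2,
}

# 1 Mb = 1000 kb, so the mean spacing in kb is 1000 / (markers per Mb).
spacing_kb = {species: round(1000 / density, 2)
              for species, density in densities_per_mb.items()}

for species, kb in spacing_kb.items():
    print(species, kb)
```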

  7. Organizational heterogeneity of vertebrate genomes.

    Science.gov (United States)

    Frenkel, Svetlana; Kirzhner, Valery; Korol, Abraham

    2012-01-01

    Genomes of higher eukaryotes are mosaics of segments with various structural, functional, and evolutionary properties. The availability of whole-genome sequences allows the investigation of their structure as "texts" using different statistical and computational methods. One such method, referred to as Compositional Spectra (CS) analysis, is based on scoring the occurrences of fixed-length oligonucleotides (k-mers) in the target DNA sequence. CS analysis allows generating species- or region-specific characteristics of the genome, regardless of their length and the presence of coding DNA. In this study, we consider the heterogeneity of vertebrate genomes as a joint effect of regional variation in sequence organization superimposed on the differences in nucleotide composition. We estimated compositional and organizational heterogeneity of genome and chromosome sequences separately and found that both heterogeneity types vary widely among genomes as well as among chromosomes in all investigated taxonomic groups. The high correspondence of heterogeneity scores obtained on three genome fractions, coding, repetitive, and the remaining part of the noncoding DNA (the genome dark matter--GDM) allows the assumption that CS-heterogeneity may have functional relevance to genome regulation. Of special interest for such interpretation is the fact that natural GDM sequences display the highest deviation from the corresponding reshuffled sequences.
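
    The k-mer scoring at the heart of the approach described in this record can be sketched in a few lines. This is a simplified illustration that counts exact, overlapping k-mers and compares two spectra with a sum of absolute differences; the published Compositional Spectra method may differ in its word set, mismatch handling, and distance measure. The function names and example sequences are hypothetical.

```python
from collections import Counter
from itertools import product

def compositional_spectrum(seq, k):
    """Relative frequency of every k-mer over ACGT in seq (overlapping windows)."""
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[kmer] for kmer in kmers)  # ignores windows with other symbols
    return {kmer: (counts[kmer] / total if total else 0.0) for kmer in kmers}

def spectrum_distance(a, b):
    """Sum of absolute frequency differences between two spectra of the same k."""
    return sum(abs(a[kmer] - b[kmer]) for kmer in a)

# Two short invented sequences with different composition.
spec_a = compositional_spectrum("ACGTACGT", 2)
spec_b = compositional_spectrum("AAAAAAAA", 2)
print(spec_a["AC"], spectrum_distance(spec_a, spec_b))
```

    Computing such spectra on windows along a chromosome and comparing them pairwise is one way regional heterogeneity of the kind discussed in the abstract could be scored.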

  8. The Whole Genome Assembly and Comparative Genomic Research of Thellungiella parvula (Extremophile Crucifer) Mitochondrion

    Directory of Open Access Journals (Sweden)

    Xuelin Wang

    2016-01-01

    The complete nucleotide sequence of the mitochondrial (mt) genome of the extremophile species Thellungiella parvula (T. parvula) has been determined, with a length of 255,773 bp. The T. parvula mt genome is a circular sequence and contains 32 protein-coding genes, 19 tRNA genes, and three ribosomal RNA genes, with 11.5% coding sequence. The base composition of 27.5% A, 27.5% T, 22.7% C, and 22.3% G shows a slight AT bias of 55%. Fifty-three repeats were identified in the mitochondrial genome of T. parvula, including 24 direct repeats, 28 tandem repeats (TRs), and one palindromic repeat. Furthermore, a total of 199 perfect microsatellites with a high A/T content (83.1%) were mined through simple sequence repeat (SSR) analysis, and they were distributed unevenly within this mitochondrial genome. We also analyzed the evolution of other plant mitochondrial genomes in general, providing clues for understanding the evolution of organelle genomes in plants. Compared with other Brassicaceae species, T. parvula is related to Arabidopsis thaliana, whose low-temperature resistance has been well documented. This study will provide important genetic tools for research on other Brassicaceae species and for improving yields of economically important plants.
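    The composition figures in this abstract can be cross-checked with simple arithmetic. The sketch below uses only the percentages and genome length quoted above; the coding length it derives is an approximation, since the 11.5% figure is itself rounded.

```python
# Base composition (percent) and genome length as quoted in the abstract.
composition = {"A": 27.5, "T": 27.5, "C": 22.7, "G": 22.3}
genome_length_bp = 255_773
coding_fraction_percent = 11.5

# AT content is the sum of the A and T percentages.
at_content = composition["A"] + composition["T"]

# The four percentages should account for the whole genome.
total = round(sum(composition.values()), 1)

# Approximate number of coding base pairs implied by the coding percentage.
coding_bp = round(genome_length_bp * coding_fraction_percent / 100)

print(at_content, total, coding_bp)
```

    The quoted values are internally consistent: A plus T gives the stated 55% AT content, and the four bases sum to 100%.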

  9. Full splitting of the first zero-field steps in the I-V curve of Josephson junctions of intermediate length

    International Nuclear Information System (INIS)

    Hansen, J.B.; Divin, Y.Y.; Mygind, J.

    1986-01-01

    We report on the observation of full splitting of the first zero-field steps in the I-V curves of Josephson transmission lines of intermediate length $L \approx (3\text{--}5)\lambda_J$, where $\lambda_J$ is the Josephson penetration length. We study in detail how this splitting of the step into two branches depends on the temperature of the junction and on a weak applied magnetic field. We relate the splitting to excitations in the junctions whose behavior is described by the perturbed sine-Gordon equation.

  10. Genome-wide linkage scan for maximum and length-dependent knee muscle strength in young men: significant evidence for linkage at chromosome 14q24.3.

    Science.gov (United States)

    De Mars, G; Windelinckx, A; Huygens, W; Peeters, M W; Beunen, G P; Aerssens, J; Vlietinck, R; Thomis, M A I

    2008-05-01

    Maintenance of high muscular fitness is positively related to bone health, functionality in daily life and increasing insulin sensitivity, and negatively related to falls and fractures, morbidity and mortality. Heritability of muscle strength phenotypes ranges between 31% and 95%, but little is known about the identity of the genes underlying this complex trait. As a first attempt, this genome-wide linkage study aimed to identify chromosomal regions linked to muscle and bone cross-sectional area, isometric knee flexion and extension torque, and torque-length relationship for knee flexors and extensors. In total, 283 informative male siblings (17-36 years old), belonging to 105 families, were used to conduct a genome-wide SNP-based multipoint linkage analysis. The strongest evidence for linkage was found for the torque-length relationship of the knee flexors at 14q24.3 (LOD = 4.09; $p < 10^{-5}$). Suggestive evidence for linkage was found at 14q32.2 (LOD = 3.00; p = 0.005) for muscle and bone cross-sectional area, at 2p24.2 (LOD = 2.57; p = 0.01) for isometric knee torque at 30 degrees flexion, at 1q21.3, 2p23.3 and 18q11.2 (LOD = 2.33, 2.69 and 2.21; $p < 10^{-4}$ for all) for the torque-length relationship of the knee extensors and at 18p11.31 (LOD = 2.39; p = 0.0004) for muscle-mass adjusted isometric knee extension torque. We conclude that many small contributing genes rather than a few important genes are involved in causing variation in different underlying phenotypes of muscle strength. Furthermore, some overlap in promising genomic regions was identified among different strength phenotypes.

  11. The complete chloroplast genome sequence of Aconitum coreanum and Aconitum carmichaelii and comparative analysis with other Aconitum species.

    Directory of Open Access Journals (Sweden)

    Inkyu Park

    Aconitum species (belonging to the Ranunculaceae) are well-known herbaceous medicinal ingredients and have great economic value in Asian countries. However, there are still limited genomic resources available for Aconitum species. In this study, we sequenced the chloroplast (cp) genomes of two Aconitum species, A. coreanum and A. carmichaelii, using the MiSeq platform. The two Aconitum chloroplast genomes were 155,880 and 157,040 bp in length, respectively, and exhibited LSC and SSC regions separated by a pair of inverted repeat regions. Both cp genomes had 38% GC content and contained 131 unique functional genes including 86 protein-coding genes, eight ribosomal RNA genes, and 37 transfer RNA genes. The gene order, content, and orientation of the two Aconitum cp genomes exhibited the general structure of angiosperms, and were similar to those of other Aconitum species. Comparison of the cp genome structure and gene order with that of other Aconitum species revealed general contraction and expansion of the inverted repeat regions and single copy boundary regions. Divergent regions were also identified. In phylogenetic analysis, the position of Aconitum species among the Ranunculaceae was determined with cp genomes of other families in the Ranunculales. We obtained a barcoding target sequence in a divergent region, ndhC-trnV, and successfully developed a SCAR (sequence characterized amplified region) marker for discrimination of A. coreanum. Our results provide useful genetic information and a specific barcode for discrimination of Aconitum species.

  12. Potency of full-length MGF to induce maximal activation of the IGF-I R Is similar to recombinant human IGF-I at high equimolar concentrations

    NARCIS (Netherlands)

    J.A.M.J.L. Janssen (Joseph); L.J. Hofland (Leo); C.J. Strasburger; E.S.R.D. Van Dungen (Elisabeth S.R. Den); M. Thevis (Mario)

    2016-01-01

    Aims: To compare full-length mechano growth factor (full-length MGF) with human recombinant insulin-like growth factor-I (IGF-I) and human recombinant insulin (HI) in their ability to activate the human IGF-I receptor (IGF-IR), the human insulin receptor (IR-A) and the human insulin

  13. Human social genomics.

    Directory of Open Access Journals (Sweden)

    Steven W Cole

    2014-08-01

    A growing literature in human social genomics has begun to analyze how everyday life circumstances influence human gene expression. Social-environmental conditions such as urbanity, low socioeconomic status, social isolation, social threat, and low or unstable social status have been found to associate with differential expression of hundreds of gene transcripts in leukocytes and diseased tissues such as metastatic cancers. In leukocytes, diverse types of social adversity evoke a common conserved transcriptional response to adversity (CTRA) characterized by increased expression of proinflammatory genes and decreased expression of genes involved in innate antiviral responses and antibody synthesis. Mechanistic analyses have mapped the neural "social signal transduction" pathways that stimulate CTRA gene expression in response to social threat and may contribute to social gradients in health. Research has also begun to analyze the functional genomics of optimal health and thriving. Two emerging opportunities now stand to revolutionize our understanding of the everyday life of the human genome: network genomics analyses examining how systems-level capabilities emerge from groups of individual socially sensitive genomes and near-real-time transcriptional biofeedback to empirically optimize individual well-being in the context of the unique genetic, geographic, historical, developmental, and social contexts that jointly shape the transcriptional realization of our innate human genomic potential for thriving.

  14. Comparison of methods for genomic localization of gene trap sequences

    Directory of Open Access Journals (Sweden)

    Ferrin Thomas E

    2006-09-01

    Background: Gene knockouts in a model organism such as mouse provide a valuable resource for the study of basic biology and human disease. Determining which gene has been inactivated by an untargeted gene trapping event poses a challenging annotation problem because gene trap sequence tags, which represent sequence near the vector insertion site of a trapped gene, are typically short and often contain unresolved residues. To understand better the localization of these sequences on the mouse genome, we compared stand-alone versions of the alignment programs BLAT, SSAHA, and MegaBLAST. A set of 3,369 sequence tags was aligned to build 34 of the mouse genome using default parameters for each algorithm. Known genome coordinates for the cognate set of full-length genes (1,659 sequences) were used to evaluate localization results. Results: In general, all three programs performed well in terms of localizing sequences to a general region of the genome, with only relatively subtle errors identified for a small proportion of the sequence tags. However, large differences in performance were noted with regard to correctly identifying exon boundaries. BLAT correctly identified the vast majority of exon boundaries, while SSAHA and MegaBLAST missed the majority of exon boundaries. SSAHA consistently reported the fewest false positives and is the fastest algorithm. MegaBLAST was comparable to BLAT in speed, but was the most susceptible to localizing sequence tags incorrectly to pseudogenes. Conclusion: The differences in performance for sequence tags and full-length reference sequences were surprisingly small. Characteristic variations in localization results for each program were noted that affect the localization of sequence at exon boundaries, in particular.

  15. Nine Loci for Ocular Axial Length Identified through Genome-wide Association Studies, Including Shared Loci with Refractive Error

    Science.gov (United States)

    Cheng, Ching-Yu; Schache, Maria; Ikram, M. Kamran; Young, Terri L.; Guggenheim, Jeremy A.; Vitart, Veronique; MacGregor, Stuart; Verhoeven, Virginie J.M.; Barathi, Veluchamy A.; Liao, Jiemin; Hysi, Pirro G.; Bailey-Wilson, Joan E.; St. Pourcain, Beate; Kemp, John P.; McMahon, George; Timpson, Nicholas J.; Evans, David M.; Montgomery, Grant W.; Mishra, Aniket; Wang, Ya Xing; Wang, Jie Jin; Rochtchina, Elena; Polasek, Ozren; Wright, Alan F.; Amin, Najaf; van Leeuwen, Elisabeth M.; Wilson, James F.; Pennell, Craig E.; van Duijn, Cornelia M.; de Jong, Paulus T.V.M.; Vingerling, Johannes R.; Zhou, Xin; Chen, Peng; Li, Ruoying; Tay, Wan-Ting; Zheng, Yingfeng; Chew, Merwyn; Rahi, Jugnoo S.; Hysi, Pirro G.; Yoshimura, Nagahisa; Yamashiro, Kenji; Miyake, Masahiro; Delcourt, Cécile; Maubaret, Cecilia; Williams, Cathy; Guggenheim, Jeremy A.; Northstone, Kate; Ring, Susan M.; Davey-Smith, George; Craig, Jamie E.; Burdon, Kathryn P.; Fogarty, Rhys D.; Iyengar, Sudha K.; Igo, Robert P.; Chew, Emily; Janmahasathian, Sarayut; Iyengar, Sudha K.; Igo, Robert P.; Chew, Emily; Janmahasathian, Sarayut; Stambolian, Dwight; Wilson, Joan E. Bailey; MacGregor, Stuart; Lu, Yi; Jonas, Jost B.; Xu, Liang; Saw, Seang-Mei; Baird, Paul N.; Rochtchina, Elena; Mitchell, Paul; Wang, Jie Jin; Jonas, Jost B.; Nangia, Vinay; Hayward, Caroline; Wright, Alan F.; Vitart, Veronique; Polasek, Ozren; Campbell, Harry; Vitart, Veronique; Rudan, Igor; Vatavuk, Zoran; Vitart, Veronique; Paterson, Andrew D.; Hosseini, S. Mohsen; Iyengar, Sudha K.; Igo, Robert P.; Fondran, Jeremy R.; Young, Terri L.; Feng, Sheng; Verhoeven, Virginie J.M.; Klaver, Caroline C.; van Duijn, Cornelia M.; Metspalu, Andres; Haller, Toomas; Mihailov, Evelin; Pärssinen, Olavi; Wedenoja, Juho; Wilson, Joan E. 
Bailey; Wojciechowski, Robert; Baird, Paul N.; Schache, Maria; Pfeiffer, Norbert; Höhn, René; Pang, Chi Pui; Chen, Peng; Meitinger, Thomas; Oexle, Konrad; Wegner, Aharon; Yoshimura, Nagahisa; Yamashiro, Kenji; Miyake, Masahiro; Pärssinen, Olavi; Yip, Shea Ping; Ho, Daniel W.H.; Pirastu, Mario; Murgia, Federico; Portas, Laura; Biino, Genevra; Wilson, James F.; Fleck, Brian; Vitart, Veronique; Stambolian, Dwight; Wilson, Joan E. Bailey; Hewitt, Alex W.; Ang, Wei; Verhoeven, Virginie J.M.; Klaver, Caroline C.; van Duijn, Cornelia M.; Saw, Seang-Mei; Wong, Tien-Yin; Teo, Yik-Ying; Fan, Qiao; Cheng, Ching-Yu; Zhou, Xin; Ikram, M. Kamran; Saw, Seang-Mei; Teo, Yik-Ying; Fan, Qiao; Cheng, Ching-Yu; Zhou, Xin; Ikram, M. Kamran; Saw, Seang-Mei; Wong, Tien-Yin; Teo, Yik-Ying; Fan, Qiao; Cheng, Ching-Yu; Zhou, Xin; Ikram, M. Kamran; Saw, Seang-Mei; Wong, Tien-Yin; Teo, Yik-Ying; Fan, Qiao; Cheng, Ching-Yu; Zhou, Xin; Ikram, M. Kamran; Saw, Seang-Mei; Tai, E-Shyong; Teo, Yik-Ying; Fan, Qiao; Cheng, Ching-Yu; Zhou, Xin; Ikram, M. Kamran; Saw, Seang-Mei; Teo, Yik-Ying; Fan, Qiao; Cheng, Ching-Yu; Zhou, Xin; Ikram, M. Kamran; Mackey, David A.; MacGregor, Stuart; Hammond, Christopher J.; Hysi, Pirro G.; Deangelis, Margaret M.; Morrison, Margaux; Zhou, Xiangtian; Chen, Wei; Paterson, Andrew D.; Hosseini, S. 
Mohsen; Mizuki, Nobuhisa; Meguro, Akira; Lehtimäki, Terho; Mäkelä, Kari-Matti; Raitakari, Olli; Kähönen, Mika; Burdon, Kathryn P.; Craig, Jamie E.; Iyengar, Sudha K.; Igo, Robert P.; Lass, Jonathan H.; Reinhart, William; Belin, Michael W.; Schultze, Robert L.; Morason, Todd; Sugar, Alan; Mian, Shahzad; Soong, Hunson Kaz; Colby, Kathryn; Jurkunas, Ula; Yee, Richard; Vital, Mark; Alfonso, Eduardo; Karp, Carol; Lee, Yunhee; Yoo, Sonia; Hammersmith, Kristin; Cohen, Elisabeth; Laibson, Peter; Rapuano, Christopher; Ayres, Brandon; Croasdale, Christopher; Caudill, James; Patel, Sanjay; Baratz, Keith; Bourne, William; Maguire, Leo; Sugar, Joel; Tu, Elmer; Djalilian, Ali; Mootha, Vinod; McCulley, James; Bowman, Wayne; Cavanaugh, H. Dwight; Verity, Steven; Verdier, David; Renucci, Ann; Oliva, Matt; Rotkis, Walter; Hardten, David R.; Fahmy, Ahmad; Brown, Marlene; Reeves, Sherman; Davis, Elizabeth A.; Lindstrom, Richard; Hauswirth, Scott; Hamilton, Stephen; Lee, W. Barry; Price, Francis; Price, Marianne; Kelly, Kathleen; Peters, Faye; Shaughnessy, Michael; Steinemann, Thomas; Dupps, B.J.; Meisler, David M.; Mifflin, Mark; Olson, Randal; Aldave, Anthony; Holland, Gary; Mondino, Bartly J.; Rosenwasser, George; Gorovoy, Mark; Dunn, Steven P.; Heidemann, David G.; Terry, Mark; Shamie, Neda; Rosenfeld, Steven I.; Suedekum, Brandon; Hwang, David; Stone, Donald; Chodosh, James; Galentine, Paul G.; Bardenstein, David; Goddard, Katrina; Chin, Hemin; Mannis, Mark; Varma, Rohit; Borecki, Ingrid; Chew, Emily Y.; Haller, Toomas; Mihailov, Evelin; Metspalu, Andres; Wedenoja, Juho; Simpson, Claire L.; Wojciechowski, Robert; Höhn, René; Mirshahi, Alireza; Zeller, Tanja; Pfeiffer, Norbert; Lackner, Karl J.; Donnelly, Peter; Barroso, Ines; Blackwell, Jenefer M.; Bramon, Elvira; Brown, Matthew A.; Casas, Juan P.; Corvin, Aiden; Deloukas, Panos; Duncanson, Audrey; Jankowski, Janusz; Markus, Hugh S.; Mathew, Christopher G.; Palmer, Colin N.A.; Plomin, Robert; Rautanen, Anna; Sawcer, Stephen J.; 
Trembath, Richard C.; Viswanathan, Ananth C.; Wood, Nicholas W.; Spencer, Chris C.A.; Band, Gavin; Bellenguez, Céline; Freeman, Colin; Hellenthal, Garrett; Giannoulatou, Eleni; Pirinen, Matti; Pearson, Richard; Strange, Amy; Su, Zhan; Vukcevic, Damjan; Donnelly, Peter; Langford, Cordelia; Hunt, Sarah E.; Edkins, Sarah; Gwilliam, Rhian; Blackburn, Hannah; Bumpstead, Suzannah J.; Dronov, Serge; Gillman, Matthew; Gray, Emma; Hammond, Naomi; Jayakumar, Alagurevathi; McCann, Owen T.; Liddle, Jennifer; Potter, Simon C.; Ravindrarajah, Radhi; Ricketts, Michelle; Waller, Matthew; Weston, Paul; Widaa, Sara; Whittaker, Pamela; Barroso, Ines; Deloukas, Panos; Mathew, Christopher G.; Blackwell, Jenefer M.; Brown, Matthew A.; Corvin, Aiden; Spencer, Chris C.A.; Bettecken, Thomas; Meitinger, Thomas; Oexle, Konrad; Pirastu, Mario; Portas, Laura; Nag, Abhishek; Williams, Katie M.; Yonova-Doing, Ekaterina; Klein, Ronald; Klein, Barbara E.; Hosseini, S. Mohsen; Paterson, Andrew D.; Genuth, S.; Nathan, D.M.; Zinman, B.; Crofford, O.; Crandall, J.; Reid, M.; Brown-Friday, J.; Engel, S.; Sheindlin, J.; Martinez, H.; Shamoon, H.; Engel, H.; Phillips, M.; Gubitosi-Klug, R.; Mayer, L.; Pendegast, S.; Zegarra, H.; Miller, D.; Singerman, L.; Smith-Brewer, S.; Novak, M.; Quin, J.; Dahms, W.; Genuth, Saul; Palmert, M.; Brillon, D.; Lackaye, M.E.; Kiss, S.; Chan, R.; Reppucci, V.; Lee, T.; Heinemann, M.; Whitehouse, F.; Kruger, D.; Jones, J.K.; McLellan, M.; Carey, J.D.; Angus, E.; Thomas, A.; Galprin, A.; Bergenstal, R.; Johnson, M.; Spencer, M.; Morgan, K.; Etzwiler, D.; Kendall, D.; Aiello, Lloyd Paul; Golden, E.; Jacobson, A.; Beaser, R.; Ganda, O.; Hamdy, O.; Wolpert, H.; Sharuk, G.; Arrigg, P.; Schlossman, D.; Rosenzwieg, J.; Rand, L.; Nathan, D.M.; Larkin, M.; Ong, M.; Godine, J.; Cagliero, E.; Lou, P.; Folino, K.; Fritz, S.; Crowell, S.; Hansen, K.; Gauthier-Kelly, C.; Service, J.; Ziegler, G.; Luttrell, L.; Caulder, S.; Lopes-Virella, M.; Colwell, J.; Soule, J.; Fernandes, J.; 
Hermayer, K.; Kwon, S.; Brabham, M.; Blevins, A.; Parker, J.; Lee, D.; Patel, N.; Pittman, C.; Lindsey, P.; Bracey, M.; Lee, K.; Nutaitis, M.; Farr, A.; Elsing, S.; Thompson, T.; Selby, J.; Lyons, T.; Yacoub-Wasef, S.; Szpiech, M.; Wood, D.; Mayfield, R.; Molitch, M.; Schaefer, B.; Jampol, L.; Lyon, A.; Gill, M.; Strugula, Z.; Kaminski, L.; Mirza, R.; Simjanoski, E.; Ryan, D.; Kolterman, O.; Lorenzi, G.; Goldbaum, M.; Sivitz, W.; Bayless, M.; Counts, D.; Johnsonbaugh, S.; Hebdon, M.; Salemi, P.; Liss, R.; Donner, T.; Gordon, J.; Hemady, R.; Kowarski, A.; Ostrowski, D.; Steidl, S.; Jones, B.; Herman, W.H.; Martin, C.L.; Pop-Busui, R.; Sarma, A.; Albers, J.; Feldman, E.; Kim, K.; Elner, S.; Comer, G.; Gardner, T.; Hackel, R.; Prusak, R.; Goings, L.; Smith, A.; Gothrup, J.; Titus, P.; Lee, J.; Brandle, M.; Prosser, L.; Greene, D.A.; Stevens, M.J.; Vine, A.K.; Bantle, J.; Wimmergren, N.; Cochrane, A.; Olsen, T.; Steuer, E.; Rath, P.; Rogness, B.; Hainsworth, D.; Goldstein, D.; Hitt, S.; Giangiacomo, J.; Schade, D.S.; Canady, J.L.; Chapin, J.E.; Ketai, L.H.; Braunstein, C.S.; Bourne, P.A.; Schwartz, S.; Brucker, A.; Maschak-Carey, B.J.; Baker, L.; Orchard, T.; Silvers, N.; Ryan, C.; Songer, T.; Doft, B.; Olson, S.; Bergren, R.L.; Lobes, L.; Rath, P. 
Paczan; Becker, D.; Rubinstein, D.; Conrad, P.W.; Yalamanchi, S.; Drash, A.; Morrison, A.; Bernal, M.L.; Vaccaro-Kish, J.; Malone, J.; Pavan, P.R.; Grove, N.; Iyer, M.N.; Burrows, A.F.; Tanaka, E.A.; Gstalder, R.; Dagogo-Jack, S.; Wigley, C.; Ricks, H.; Kitabchi, A.; Murphy, M.B.; Moser, S.; Meyer, D.; Iannacone, A.; Chaum, E.; Yoser, S.; Bryer-Ash, M.; Schussler, S.; Lambeth, H.; Raskin, P.; Strowig, S.; Zinman, B.; Barnie, A.; Devenyi, R.; Mandelcorn, M.; Brent, M.; Rogers, S.; Gordon, A.; Palmer, J.; Catton, S.; Brunzell, J.; Wessells, H.; de Boer, I.H.; Hokanson, J.; Purnell, J.; Ginsberg, J.; Kinyoun, J.; Deeb, S.; Weiss, M.; Meekins, G.; Distad, J.; Van Ottingham, L.; Dupre, J.; Harth, J.; Nicolle, D.; Driscoll, M.; Mahon, J.; Canny, C.; May, M.; Lipps, J.; Agarwal, A.; Adkins, T.; Survant, L.; Pate, R.L.; Munn, G.E.; Lorenz, R.; Feman, S.; White, N.; Levandoski, L.; Boniuk, I.; Grand, G.; Thomas, M.; Joseph, D.D.; Blinder, K.; Shah, G.; Boniuk; Burgess; Santiago, J.; Tamborlane, W.; Gatcomb, P.; Stoessel, K.; Taylor, K.; Goldstein, J.; Novella, S.; Mojibian, H.; Cornfeld, D.; Lima, J.; Bluemke, D.; Turkbey, E.; van der Geest, R.J.; Liu, C.; Malayeri, A.; Jain, A.; Miao, C.; Chahal, H.; Jarboe, R.; Maynard, J.; Gubitosi-Klug, R.; Quin, J.; Gaston, P.; Palmert, M.; Trail, R.; Dahms, W.; Lachin, J.; Cleary, P.; Backlund, J.; Sun, W.; Braffett, B.; Klumpp, K.; Chan, K.; Diminick, L.; Rosenberg, D.; Petty, B.; Determan, A.; Kenny, D.; Rutledge, B.; Younes, Naji; Dews, L.; Hawkins, M.; Cowie, C.; Fradkin, J.; Siebert, C.; Eastman, R.; Danis, R.; Gangaputra, S.; Neill, S.; Davis, M.; Hubbard, L.; Wabers, H.; Burger, M.; Dingledine, J.; Gama, V.; Sussman, R.; Steffes, M.; Bucksa, J.; Nowicki, M.; Chavers, B.; O’Leary, D.; Polak, J.; Harrington, A.; Funk, L.; Crow, R.; Gloeb, B.; Thomas, S.; O’Donnell, C.; Soliman, E.; Zhang, Z.M.; Prineas, R.; Campbell, C.; Ryan, C.; Sandstrom, D.; Williams, T.; Geckle, M.; Cupelli, E.; Thoma, F.; Burzuk, B.; Woodfill, T.; Low, P.; 
Sommer, C.; Nickander, K.; Budoff, M.; Detrano, R.; Wong, N.; Fox, M.; Kim, L.; Oudiz, R.; Weir, G.; Espeland, M.; Manolio, T.; Rand, L.; Singer, D.; Stern, M.; Boulton, A.E.; Clark, C.; D’Agostino, R.; Lopes-Virella, M.; Garvey, W.T.; Lyons, T.J.; Jenkins, A.; Virella, G.; Jaffa, A.; Carter, Rickey; Lackland, D.; Brabham, M.; McGee, D.; Zheng, D.; Mayfield, R.K.; Boright, A.; Bull, S.; Sun, L.; Scherer, S.; Zinman, B.; Natarajan, R.; Miao, F.; Zhang, L.; Chen, Z.; Nathan, D.M.; Makela, Kari-Matti; Lehtimaki, Terho; Kahonen, Mika; Raitakari, Olli; Yoshimura, Nagahisa; Matsuda, Fumihiko; Chen, Li Jia; Pang, Chi Pui; Yip, Shea Ping; Yap, Maurice K.H.; Meguro, Akira; Mizuki, Nobuhisa; Inoko, Hidetoshi; Foster, Paul J.; Zhao, Jing Hua; Vithana, Eranga; Tai, E-Shyong; Fan, Qiao; Xu, Liang; Campbell, Harry; Fleck, Brian; Rudan, Igor; Aung, Tin; Hofman, Albert; Uitterlinden, André G.; Bencic, Goran; Khor, Chiea-Chuen; Forward, Hannah; Pärssinen, Olavi; Mitchell, Paul; Rivadeneira, Fernando; Hewitt, Alex W.; Williams, Cathy; Oostra, Ben A.; Teo, Yik-Ying; Hammond, Christopher J.; Stambolian, Dwight; Mackey, David A.; Klaver, Caroline C.W.; Wong, Tien-Yin; Saw, Seang-Mei; Baird, Paul N.

    2013-01-01

    Refractive errors are common eye disorders of public health importance worldwide. Ocular axial length (AL) is the major determinant of refraction and thus of myopia and hyperopia. We conducted a meta-analysis of genome-wide association studies for AL, combining 12,531 Europeans and 8,216 Asians. We identified eight genome-wide significant loci for AL (RSPO1, C3orf26, LAMA2, GJD2, ZNRF3, CD55, MIP, and ALPPL2) and confirmed one previously reported AL locus (ZC3H11B). Of the nine loci, five (LAMA2, GJD2, CD55, ALPPL2, and ZC3H11B) were associated with refraction in 18 independent cohorts (n = 23,591). Differential gene expression was observed for these loci in minus-lens-induced myopia mouse experiments and human ocular tissues. Two of the AL genes, RSPO1 and ZNRF3, are involved in Wnt signaling, a pathway playing a major role in the regulation of eyeball size. This study provides evidence of shared genes between AL and refraction, but importantly also suggests that these traits may have unique pathways. PMID:24144296

  16. The Most Developmentally Truncated Fishes Show Extensive Hox Gene Loss and Miniaturized Genomes

    Science.gov (United States)

    Malmstrøm, Martin; Britz, Ralf; Matschiner, Michael; Tørresen, Ole K; Hadiaty, Renny Kurnia; Yaakob, Norsham; Tan, Heok Hui; Jakobsen, Kjetill Sigurd; Salzburger, Walter; Rüber, Lukas

    2018-01-01

    The world’s smallest fishes belong to the genus Paedocypris. These miniature fishes are endemic to an extreme habitat: the peat swamp forests in Southeast Asia, characterized by highly acidic blackwater. This threatened habitat is home to a large array of fishes, including a number of miniaturized but also developmentally truncated species. Especially the genus Paedocypris is characterized by profound, organism-wide developmental truncation, resulting in sexually mature individuals of <8 mm in length with a larval phenotype. Here, we report on evolutionary simplification in the genomes of two species of the dwarf minnow genus Paedocypris using whole-genome sequencing. The two species feature unprecedented Hox gene loss and genome reduction in association with their massive developmental truncation. We also show how other genes involved in the development of musculature, nervous system, and skeleton have been lost in Paedocypris, mirroring its highly progenetic phenotype. Further, our analyses suggest two mechanisms responsible for the genome streamlining in Paedocypris in relation to other Cypriniformes: severe intron shortening and reduced repeat content. As the first report on the genomic sequence of a vertebrate species with organism-wide developmental truncation, the results of our work enhance our understanding of genome evolution and how genotypes are translated to phenotypes. In addition, as a naturally simplified system closely related to zebrafish, Paedocypris provides novel insights into vertebrate development. PMID:29684203

  17. The Most Developmentally Truncated Fishes Show Extensive Hox Gene Loss and Miniaturized Genomes.

    Science.gov (United States)

    Malmstrøm, Martin; Britz, Ralf; Matschiner, Michael; Tørresen, Ole K; Hadiaty, Renny Kurnia; Yaakob, Norsham; Tan, Heok Hui; Jakobsen, Kjetill Sigurd; Salzburger, Walter; Rüber, Lukas

    2018-04-01

    The world's smallest fishes belong to the genus Paedocypris. These miniature fishes are endemic to an extreme habitat: the peat swamp forests in Southeast Asia, characterized by highly acidic blackwater. This threatened habitat is home to a large array of fishes, including a number of miniaturized but also developmentally truncated species. Especially the genus Paedocypris is characterized by profound, organism-wide developmental truncation, resulting in sexually mature individuals of <8 mm in length with a larval phenotype. Here, we report on evolutionary simplification in the genomes of two species of the dwarf minnow genus Paedocypris using whole-genome sequencing. The two species feature unprecedented Hox gene loss and genome reduction in association with their massive developmental truncation. We also show how other genes involved in the development of musculature, nervous system, and skeleton have been lost in Paedocypris, mirroring its highly progenetic phenotype. Further, our analyses suggest two mechanisms responsible for the genome streamlining in Paedocypris in relation to other Cypriniformes: severe intron shortening and reduced repeat content. As the first report on the genomic sequence of a vertebrate species with organism-wide developmental truncation, the results of our work enhance our understanding of genome evolution and how genotypes are translated to phenotypes. In addition, as a naturally simplified system closely related to zebrafish, Paedocypris provides novel insights into vertebrate development.

  18. Copper Coordination in the Full-Length, Recombinant Prion Protein

    Science.gov (United States)

    Burns, Colin S.; Aronoff-Spencer, Eliah; Legname, Giuseppe; Prusiner, Stanley B.; Antholine, William E.; Gerfen, Gary J.; Peisach, Jack; Millhauser, Glenn L.

    2010-01-01

    The prion protein (PrP) binds divalent copper at physiologically relevant conditions and is believed to participate in copper regulation or act as a copper-dependent enzyme. Ongoing studies aim at determining the molecular features of the copper binding sites. The emerging consensus is that most copper binds in the octarepeat domain, which is composed of four or more copies of the fundamental sequence PHGGGWGQ. Previous work from our laboratory using PrP-derived peptides, in conjunction with EPR and X-ray crystallography, demonstrated that the HGGGW segment constitutes the fundamental binding unit in the octarepeat domain [Burns et al. (2002) Biochemistry 41, 3991–4001; Aronoff-Spencer et al. (2000) Biochemistry 39, 13760–13771]. Copper coordination arises from the His imidazole and sequential deprotonated glycine amides. In this present work, recombinant, full-length Syrian hamster PrP is investigated using EPR methodologies. Four copper ions are taken up in the octarepeat domain, which supports previous findings. However, quantification studies reveal a fifth binding site in the flexible region between the octarepeats and the PrP globular C-terminal domain. A series of PrP peptide constructs show that this site involves His96 in the PrP(92–96) segment GGGTH. Further examination by X-band EPR, S-band EPR, and electron spin–echo envelope spectroscopy, demonstrates coordination by the His96 imidazole and the glycine preceding the threonine. The copper affinity for this type of binding site is highly pH dependent, and EPR studies here show that recombinant PrP loses its affinity for copper below pH 6.0. These studies seem to provide a complete profile of the copper binding sites in PrP and support the hypothesis that PrP function is related to its ability to bind copper in a pH-dependent fashion. PMID:12779334

  19. Deciphering the hybridisation history leading to the Lager lineage based on the mosaic genomes of Saccharomyces bayanus strains NBRC1948 and CBS380.

    Directory of Open Access Journals (Sweden)

    Huu-Vang Nguyen

    Saccharomyces bayanus is a yeast species described as one of the two parents of the hybrid brewing yeast S. pastorianus. Strains CBS380(T) and NBRC1948 have been retained successively as pure-line representatives of S. bayanus. In the present study, sequence analyses confirmed and upgraded our previous finding: S. bayanus type strain CBS380(T) harbours a mosaic genome. The genome of strain NBRC1948 was also revealed to be mosaic. Both genomes were characterized by amplification and sequencing of different markers, including genes involved in maltotriose utilization or genes detected by array-CGH mapping. Sequence comparisons with public Saccharomyces spp. nucleotide sequences revealed that the CBS380(T) and NBRC1948 genomes are composed of: a predominant non-cerevisiae genetic background belonging to S. uvarum, a second unidentified species provisionally named S. lagerae, and several introgressed S. cerevisiae fragments. The largest cerevisiae-introgressed DNA common to both genomes totals 70 kb in length and is distributed in three contigs, cA, cB and cC. These vary in terms of length and presence of MAL31 or MTY1 (maltotriose-transporter gene). In NBRC1948, two additional cerevisiae-contigs, cD and cE, totaling 12 kb in length, as well as several smaller cerevisiae fragments were identified. All of these contigs were partially detected in the genomes of S. pastorianus lager strains CBS1503 (S. monacensis) and CBS1513 (S. carlsbergensis), explaining the noticeable common ability of S. bayanus and S. pastorianus to metabolize maltotriose. NBRC1948 was shown to be inter-fertile with S. uvarum CBS7001. The cross involving these two strains produced F1 segregants resembling the strains CBS380(T) or NRRLY-1551. This demonstrates that these S. bayanus strains were the offspring of a cross between S. uvarum and a strain similar to NBRC1948. Phylogenies established with selected cerevisiae and non-cerevisiae genes allowed us to decipher the complex hybridisation

  20. Identification of full-length transmitted/founder viruses and their progeny in primary HIV-1 infection

    Energy Technology Data Exchange (ETDEWEB)

    Korber, Bette [Los Alamos National Laboratory; Hraber, Peter [Los Alamos National Laboratory; Giorgi, Elena [Los Alamos National Laboratory; Bhattacharya, T [Los Alamos National Laboratory

    2009-01-01

    Identification of transmitted/founder virus genomes and their progeny is a novel strategy for probing the molecular basis of HIV-1 transmission and for evaluating the genetic imprint of viral and host factors that act to constrain or facilitate virus replication. Here, we show in a cohort of twelve acutely infected subjects (9 clade B; 3 clade C) that complete genomic sequences of transmitted/founder viruses could be inferred using single genome amplification of plasma viral RNA, direct amplicon sequencing, and a model of random virus evolution. This allowed for the precise identification, chemical synthesis, molecular cloning, and biological analysis of those viruses actually responsible for productive clinical infection and for a comprehensive mapping of sequential viral genomes and proteomes for mutations that are necessary or incidental to the establishment of HIV-1 persistence. Transmitted/founder viruses were CD4 and CCR5 tropic, replicated preferentially in activated primary T-lymphocytes but not monocyte-derived macrophages, and were effectively shielded from most heterologous or broadly neutralizing antibodies. By 3 months of infection, the evolving viral quasispecies in three subjects showed mutational fixation at only 2-5 discrete genomic loci. By 6-12 months, mutational fixation was evident at 18-27 genomic loci. Some, but not all, of these mutations were attributable to virus escape from cytotoxic T-lymphocytes or neutralizing antibodies, suggesting that other viral or host factors may influence early HIV-1 fitness.

  1. LRSim: A Linked-Reads Simulator Generating Insights for Better Genome Partitioning

    Directory of Open Access Journals (Sweden)

    Ruibang Luo

    Linked-read sequencing, using highly-multiplexed genome partitioning and barcoding, can span hundreds of kilobases to improve de novo assembly, haplotype phasing, and other applications. Based on our analysis of 14 datasets, we introduce LRSim, which simulates linked-reads by emulating the library preparation and sequencing process with fine control over variants, linked-read characteristics, and the short-read profile. From the phasing and assembly of multiple datasets, we derive recommendations on coverage, fragment length, and partitioning when sequencing genomes of different sizes and complexities. These optimizations improve results by orders of magnitude, and enable the development of novel methods. LRSim is available at https://github.com/aquaskyline/LRSIM. Keywords: Linked-read, Molecular barcoding, Reads partitioning, Phasing, Reads simulation, Genome assembly, 10X Genomics

  2. Characterization of Toll-like receptor 3 gene in large yellow croaker, Pseudosciaena crocea.

    Science.gov (United States)

    Huang, Xue-Na; Wang, Zhi-Yong; Yao, Cui-Luan

    2011-07-01

    Toll-like receptor 3 (TLR3) plays an important role in innate immune responses. In this report, the full-length cDNA sequence and genomic structure of Pseudosciaena crocea TLR3 (PcTLR3) were identified and characterized. The full-length cDNA of PcTLR3 was of 3384 bp, including a 5'-terminal untranslated region (UTR) of 65 bp, a 3'-terminal UTR of 589 bp and an open reading frame (ORF) of 2730 bp encoding a polypeptide of 909 amino acid residues. The full-length genome sequence of PcTLR3 was composed of 5721 nucleotides, including five exons and four introns. The putative PcTLR3 protein contained a signal peptide sequence, 16 leucine-rich repeat (LRR) motifs, a transmembrane region and a Toll/interleukin-1 receptor (TIR) domain. Quantitative real-time reverse transcription PCR analysis revealed a broad expression of PcTLR3 in most tissues, with the predominant expression in liver, then intestine, and the weakest expression in blood cells. The expression of PcTLR3 after injection with poly inosinic:cytidylic (I:C) and Vibrio parahemolyticus was tested in spleen, blood cells and liver. The results indicated that PcTLR3 transcripts could be induced in the three tissues by injection with poly I:C. The highest expression was in the blood cells with 43.5 times (at 6h) greater expression than in the control (pparahemolyticus challenge, a moderate up-regulation and down-regulation of PcTLR3 was found in blood cells and liver, respectively. Our results suggested that PcTLR3 might play an important role in fish's defense against both viral and bacterial infection. Copyright © 2011 Elsevier Ltd. All rights reserved.
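
    The lengths quoted in this abstract can be cross-checked with simple arithmetic. The sketch below is only an illustration: the function names are hypothetical, and it assumes the ORF length includes a single stop codon, which the abstract does not state explicitly.

```python
# Figures quoted in the abstract above (all in base pairs).
UTR5_BP = 65
ORF_BP = 2730
UTR3_BP = 589


def cdna_length(utr5: int, orf: int, utr3: int) -> int:
    """Total cDNA length as 5' UTR + ORF + 3' UTR."""
    return utr5 + orf + utr3


def encoded_residues(orf: int) -> int:
    """Residues encoded by an ORF, assuming it ends with one stop codon."""
    if orf % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return orf // 3 - 1  # the stop codon encodes no residue


if __name__ == "__main__":
    print(cdna_length(UTR5_BP, ORF_BP, UTR3_BP))  # compare with the reported 3384 bp
    print(encoded_residues(ORF_BP))               # compare with the reported 909 residues
```

    Under that assumption, both values agree with the figures reported for PcTLR3.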

  3. Beam test of a full-length prototype of the BESIII drift chamber with the readout electronics

    International Nuclear Information System (INIS)

    Qin, Z.H.; Chen, Y.B.; Sheng, H.Y.; Wu, L.H.; Liu, J.B.; Zhuang, B.A.; Jiang, X.S.; Zhao, Y.B.; Zhu, K.J.; Yan, Z.K.; Chen, C.; Xu, M.H.; Wang, L.; Ma, X.Y.; Tang, X.; Liu, R.G.; Jin, Y.; Zhu, Q.M.; Zhang, G.F.; Wu, Z.; Li, R.Y.; Zhao, P.P.; Dai, H.L.; Li, X.P.; Li, J.

    2007-01-01

    A full-length prototype of the BESIII drift chamber together with its readout electronics was built and a beam test was performed. Two different methods, namely 'single-threshold method' and 'double-threshold method' for timing measurement, were studied. Test results show that the BESIII drift chamber and its readout electronics can reach their design specifications. The 'double-threshold method' results in a better timing accuracy and noise suppression capabilities as compared with the 'single-threshold method'.

  4. Genome wide SSR high density genetic map construction from an interspecific cross of Gossypium hirsutum × Gossypium tomentosum

    Directory of Open Access Journals (Sweden)

    Muhammad Kashif Riaz eKhan

    2016-04-01

    A high density genetic map was constructed using an F2 population derived from an interspecific cross of G. hirsutum x G. tomentosum. The map consisted of 3,093 marker loci distributed across all the 26 chromosomes and covered 4,365.3 cM of cotton genome with an average inter-marker distance of 1.48 cM. The maximum length of chromosome was 218.38 cM and the minimum was 122.09 cM with an average length of 167.90 cM. The A sub-genome covers more genetic distance (2,189.01 cM, with an average inter-locus distance of 1.53 cM) than the D sub-genome, which covers 2,176.29 cM with an average distance of 1.43 cM. There were 716 distorted loci in the map, accounting for 23.14%, and most distorted loci were distributed on the D sub-genome (25.06%), more than on the A sub-genome (21.23%). In our map, 49 segregation hotspots (SDR) were distributed across the genome, with more on the D sub-genome than on the A sub-genome. Two post-polyploidization reciprocal translocations of A2/A3 and A4/A5 were suggested by 7 pairs of duplicate loci. The map constructed through these studies is one of the three densest genetic maps in cotton; however, this is the first dense genome-wide SSR interspecific genetic map between G. hirsutum and G. tomentosum.
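
    Several of the summary figures in this abstract follow from one another, and a short sketch can make the relationships explicit. The helper names below are hypothetical and the inputs are only the numbers quoted above; nothing here reproduces the authors' own software.

```python
# Values quoted in the abstract above.
A_SUBGENOME_CM = 2189.01
D_SUBGENOME_CM = 2176.29
CHROMOSOMES = 26
TOTAL_LOCI = 3093
DISTORTED_LOCI = 716


def total_map_length(a_cm: float, d_cm: float) -> float:
    """Total genetic map length as the sum of both sub-genomes (cM)."""
    return a_cm + d_cm


def mean_chromosome_length(total_cm: float, n_chromosomes: int) -> float:
    """Average linkage-group length (cM)."""
    return total_cm / n_chromosomes


def percent(part: int, whole: int) -> float:
    """Share of `part` in `whole`, in percent."""
    return 100.0 * part / whole


if __name__ == "__main__":
    total = total_map_length(A_SUBGENOME_CM, D_SUBGENOME_CM)
    print(round(total, 2))                                       # total map length in cM
    print(round(mean_chromosome_length(total, CHROMOSOMES), 2))  # mean chromosome length in cM
    print(percent(DISTORTED_LOCI, TOTAL_LOCI))                   # share of distorted loci, about 23.1%
```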

  5. Virtual Northern analysis of the human genome.

    Directory of Open Access Journals (Sweden)

    Evan H Hurowitz

    2007-05-01

    We applied the Virtual Northern technique to human brain mRNA to systematically measure human mRNA transcript lengths on a genome-wide scale. We used separation by gel electrophoresis followed by hybridization to cDNA microarrays to measure 8,774 mRNA transcript lengths representing at least 6,238 genes at high (>90%) confidence. By comparing these transcript lengths to the Refseq and H-Invitational full-length cDNA databases, we found that nearly half of our measurements appeared to represent novel transcript variants. Comparison of length measurements determined by hybridization to different cDNAs derived from the same gene identified clones that potentially correspond to alternative transcript variants. We observed a close linear relationship between ORF and mRNA lengths in human mRNAs, identical in form to the relationship we had previously identified in yeast. Some functional classes of protein are encoded by mRNAs whose untranslated regions (UTRs) tend to be longer or shorter than average; these functional classes were similar in both human and yeast. Human transcript diversity is extensive and largely unannotated. Our length dataset can be used as a new criterion for judging the completeness of cDNAs and annotating mRNA sequences. Similar relationships between the lengths of the UTRs in human and yeast mRNAs and the functions of the proteins they encode suggest that UTR sequences serve an important regulatory role among eukaryotes.

  6. Integration of Genome-Wide TF Binding and Gene Expression Data to Characterize Gene Regulatory Networks in Plant Development.

    Science.gov (United States)

    Chen, Dijun; Kaufmann, Kerstin

    2017-01-01

    Key transcription factors (TFs) controlling the morphogenesis of flowers and leaves have been identified in the model plant Arabidopsis thaliana. Recent genome-wide approaches based on chromatin immunoprecipitation (ChIP) followed by high-throughput DNA sequencing (ChIP-seq) enable systematic identification of genome-wide TF binding sites (TFBSs) of these regulators. Here, we describe a computational pipeline for analyzing ChIP-seq data to identify TFBSs and to characterize gene regulatory networks (GRNs) with applications to the regulatory studies of flower development. In particular, we provide step-by-step instructions on how to download, analyze, visualize, and integrate genome-wide data in order to construct GRNs for beginners of bioinformatics. The practical guide presented here is ready to apply to other similar ChIP-seq datasets to characterize GRNs of interest.

  7. A novel life cycle modeling system for Ebola virus shows a genome length-dependent role of VP24 in virus infectivity.

    Science.gov (United States)

    Watt, Ari; Moukambi, Felicien; Banadyga, Logan; Groseth, Allison; Callison, Julie; Herwig, Astrid; Ebihara, Hideki; Feldmann, Heinz; Hoenen, Thomas

    2014-09-01

    Work with infectious Ebola viruses is restricted to biosafety level 4 (BSL4) laboratories, presenting a significant barrier for studying these viruses. Life cycle modeling systems, including minigenome systems and transcription- and replication-competent virus-like particle (trVLP) systems, allow modeling of the virus life cycle under BSL2 conditions; however, all current systems model only certain aspects of the virus life cycle, rely on plasmid-based viral protein expression, and have been used to model only single infectious cycles. We have developed a novel life cycle modeling system allowing continuous passaging of infectious trVLPs containing a tetracistronic minigenome that encodes a reporter and the viral proteins VP40, VP24, and GP1,2. This system is ideally suited for studying morphogenesis, budding, and entry, in addition to genome replication and transcription. Importantly, the specific infectivity of trVLPs in this system was ∼ 500-fold higher than that in previous systems. Using this system for functional studies of VP24, we showed that, contrary to previous reports, VP24 only very modestly inhibits genome replication and transcription when expressed in a regulated fashion, which we confirmed using infectious Ebola viruses. Interestingly, we also discovered a genome length-dependent effect of VP24 on particle infectivity, which was previously undetected due to the short length of monocistronic minigenomes and which is due at least partially to a previously unknown function of VP24 in RNA packaging. Based on our findings, we propose a model for the function of VP24 that reconciles all currently available data regarding the role of VP24 in nucleocapsid assembly as well as genome replication and transcription. Ebola viruses cause severe hemorrhagic fevers in humans, with no countermeasures currently being available, and must be studied in maximum-containment laboratories. Only a few of these laboratories exist worldwide, limiting our ability to study

  8. Genome BLAST distance phylogenies inferred from whole plastid and whole mitochondrion genome sequences

    Directory of Open Access Journals (Sweden)

    Holland Barbara R

    2006-07-01

    Background: Phylogenetic methods which do not rely on multiple sequence alignments are important tools in inferring trees directly from completely sequenced genomes. Here, we extend the recently described Genome BLAST Distance Phylogeny (GBDP) strategy to compute phylogenetic trees from all completely sequenced plastid genomes currently available and from a selection of mitochondrial genomes representing the major eukaryotic lineages. BLASTN, TBLASTX, or combinations of both are used to locate high-scoring segment pairs (HSPs) between two sequences from which pairwise similarities and distances are computed in different ways resulting in a total of 96 GBDP variants. The suitability of these distance formulae for phylogeny reconstruction is directly estimated by computing a recently described measure of "treelikeness", the so-called δ value, from the respective distance matrices. Additionally, we compare the trees inferred from these matrices using UPGMA, NJ, BIONJ, FastME, or STC, respectively, with the NCBI taxonomy tree of the taxa under study. Results: Our results indicate that, at this taxonomic level, plastid genomes are much more valuable for inferring phylogenies than are mitochondrial genomes, and that distances based on breakpoints are of little use. Distances based on the proportion of "matched" HSP length to average genome length were best for tree estimation. Additionally we found that using TBLASTX instead of BLASTN and, particularly, combining TBLASTX and BLASTN leads to a small but significant increase in accuracy. Other factors do not significantly affect the phylogenetic outcome. The BIONJ algorithm results in phylogenies most in accordance with the current NCBI taxonomy, with NJ and FastME performing insignificantly worse, and STC performing as well if applied to high quality distance matrices. δ values are found to be a reliable predictor of phylogenetic accuracy. Conclusion: Using the most treelike distance matrices, as
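
    One way to read the best-performing distance family mentioned in this abstract, the proportion of "matched" HSP length relative to average genome length, is sketched below. This is a minimal illustration under stated assumptions: the function names, the half-open interval representation of HSPs, and the exact form d = 1 - matched / average length are all hypothetical and are not taken from the published GBDP formulae.

```python
from typing import Iterable, Tuple

Interval = Tuple[int, int]  # half-open [start, end) coordinates on one genome


def covered_length(hsps: Iterable[Interval]) -> int:
    """Number of positions covered by at least one HSP (overlaps counted once)."""
    total = 0
    current_start = current_end = None
    for start, end in sorted(hsps):
        if current_end is None or start > current_end:
            # Close the previous merged interval, if any, and open a new one.
            if current_end is not None:
                total += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlapping or adjacent HSP: extend the current merged interval.
            current_end = max(current_end, end)
    if current_end is not None:
        total += current_end - current_start
    return total


def matched_proportion_distance(
    hsps_on_x: Iterable[Interval],
    hsps_on_y: Iterable[Interval],
    len_x: int,
    len_y: int,
) -> float:
    """Assumed distance: 1 - (mean matched length) / (mean genome length)."""
    matched = (covered_length(hsps_on_x) + covered_length(hsps_on_y)) / 2
    average_length = (len_x + len_y) / 2
    return 1.0 - matched / average_length
```

    Merging overlapping HSPs before summing keeps regions hit several times from being counted more than once, so the distance stays between 0 and 1 in this sketch.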

  9. Hyb-Seq: Combining Target Enrichment and Genome Skimming for Plant Phylogenomics

    Directory of Open Access Journals (Sweden)

    Kevin Weitemier

    2014-08-01

    Premise of the study: Hyb-Seq, the combination of target enrichment and genome skimming, allows simultaneous data collection for low-copy nuclear genes and high-copy genomic targets for plant systematics and evolution studies. Methods and Results: Genome and transcriptome assemblies for milkweed (Asclepias syriaca) were used to design enrichment probes for 3385 exons from 768 genes (>1.6 Mbp) followed by Illumina sequencing of enriched libraries. Hyb-Seq of 12 individuals (10 Asclepias species and two related genera) resulted in at least partial assembly of 92.6% of exons and 99.7% of genes and an average assembly length >2 Mbp. Importantly, complete plastomes and nuclear ribosomal DNA cistrons were assembled using off-target reads. Phylogenomic analyses demonstrated signal conflict between genomes. Conclusions: The Hyb-Seq approach enables targeted sequencing of thousands of low-copy nuclear exons and flanking regions, as well as genome skimming of high-copy repeats and organellar genomes, to efficiently produce genome-scale data sets for phylogenomics.

  10. Analysis of the Complete Mitochondrial Genome Sequence of the Diploid Cotton Gossypium raimondii by Comparative Genomics Approaches

    Directory of Open Access Journals (Sweden)

    Changwei Bi

    2016-01-01

    Cotton is one of the most important economic crops, the primary source of natural fiber, and an important protein source for animal feed. The complete nuclear and chloroplast (cp) genome sequences of G. raimondii are already available, but the mitochondrial genome sequence is not. Here, we assembled the complete mitochondrial (mt) DNA sequence of G. raimondii into a circular genome of 676,078 bp and performed comparative analyses with other higher plants. The genome contains 39 protein-coding genes, 6 rRNA genes, and 25 tRNA genes. We also identified four larger repeats (63.9 kb, 10.6 kb, 9.1 kb, and 2.5 kb) in this mt genome, which may be active in intramolecular recombination in the evolution of cotton. Strikingly, nearly all of the G. raimondii mt genome has been transferred to the nucleus on Chr1, and the transfer event must be very recent. Phylogenetic analysis reveals that G. raimondii, as a member of Malvaceae, is much closer to another cotton (G. barbadense) than other rosids, and the clade formed by two Gossypium species is sister to Brassicales. The G. raimondii mt genome may provide a crucial foundation for evolutionary analysis, molecular biology, and cytoplasmic male sterility in cotton and other higher plants.

  11. Sequencing and characterization of the complete mitochondrial genome of Japanese Swellshark (Cephalloscyllium umbratile).

    Science.gov (United States)

    Zhu, Ke-Cheng; Liang, Yin-Yin; Wu, Na; Guo, Hua-Yang; Zhang, Nan; Jiang, Shi-Gui; Zhang, Dian-Chang

    2017-11-10

    To further comprehend the genome features of Cephalloscyllium umbratile (Carcharhiniformes), an endangered species, the complete mitochondrial DNA (mtDNA) was sequenced and annotated for the first time. The full-length mtDNA of C. umbratile was 16,697 bp and contained ribosomal RNA (rRNA) genes, 13 protein-coding genes (PCGs), 23 transfer RNA (tRNA) genes, and a major non-coding control region. Each PCG was initiated by an authoritative ATN codon, except for COX1, which was initiated by a GTG codon. Seven of 13 PCGs had a typical TAA termination codon, while others terminated with a single T or TA. Moreover, the relative synonymous codon usage of the 13 PCGs was consistent with that of other published Carcharhiniformes. All tRNA genes had typical clover-leaf secondary structures, except for tRNA-Ser (GCT), which lacked the dihydrouridine 'DHU' arm. Furthermore, the analysis of the average Ka/Ks in the 13 PCGs of three Carcharhiniformes species indicated a strong purifying selection within this group. In addition, phylogenetic analysis revealed that C. umbratile was closely related to Glyphis glyphis and Glyphis garricki. Our data supply a useful resource for further studies on genetic diversity and population structure of C. umbratile.

  12. The Complete Chloroplast and Mitochondrial Genome Sequences of Boea hygrometrica: Insights into the Evolution of Plant Organellar Genomes

    Science.gov (United States)

    Wang, Xumin; Deng, Xin; Zhang, Xiaowei; Hu, Songnian; Yu, Jun

    2012-01-01

    The complete nucleotide sequences of the chloroplast (cp) and mitochondrial (mt) genomes of the resurrection plant Boea hygrometrica (Bh, Gesneriaceae) have been determined with the lengths of 153,493 bp and 510,519 bp, respectively. The smaller chloroplast genome contains more genes (147) with a 72% coding sequence, and the larger mitochondrial genome has fewer genes (65) with a coding fraction of 12%. Similar to other seed plants, the Bh cp genome has a typical quadripartite organization with a conserved gene in each region. The Bh mt genome has three recombinant sequence repeats of 222 bp, 843 bp, and 1474 bp in length, which divide the genome into a single master circle (MC) and four isomeric molecules. Compared to other angiosperms, one remarkable feature of the Bh mt genome is the frequent transfer of genetic material from the cp genome during recent Bh evolution. We also analyzed organellar genome evolution in general regarding genome features as well as compositional dynamics of sequence and gene structure/organization, providing clues for the understanding of the evolution of organellar genomes in plants. The cp-derived sequences including tRNAs found in angiosperm mt genomes support the conclusion that frequent gene transfer events may have begun early in the land plant lineage. PMID:22291979

  13. Genome characterization of the selected long- and short-sleep mouse lines.

    Science.gov (United States)

    Dowell, Robin; Odell, Aaron; Richmond, Phillip; Malmer, Daniel; Halper-Stromberg, Eitan; Bennett, Beth; Larson, Colin; Leach, Sonia; Radcliffe, Richard A

    2016-12-01

    The Inbred Long- and Short-Sleep (ILS, ISS) mouse lines were selected for differences in acute ethanol sensitivity using the loss of righting response (LORR) as the selection trait. The lines show an over tenfold difference in LORR and, along with a recombinant inbred panel derived from them (the LXS), have been widely used to dissect the genetic underpinnings of acute ethanol sensitivity. Here we have sequenced the genomes of the ILS and ISS to investigate the DNA variants that contribute to their sensitivity difference. We identified ~2.7 million high-confidence SNPs and small indels and ~7000 structural variants between the lines; variants were found to occur in 6382 annotated genes. Using a hidden Markov model, we were able to reconstruct the genome-wide ancestry patterns of the eight inbred progenitor strains from which the ILS and ISS were derived, and found that quantitative trait loci that have been mapped for LORR were slightly enriched for DNA variants. Finally, by mapping and quantifying RNA-seq reads from the ILS and ISS to their strain-specific genomes rather than to the reference genome, we found a substantial improvement in a differential expression analysis between the lines. This work will help in identifying and characterizing the DNA sequence variants that contribute to the difference in ethanol sensitivity between the ILS and ISS and will also aid in accurate quantification of RNA-seq data generated from the LXS RIs.

  14. The carcinogenic liver fluke, Clonorchis sinensis: new assembly, reannotation and analysis of the genome and characterization of tissue transcriptomes.

    Directory of Open Access Journals (Sweden)

    Yan Huang

    Clonorchis sinensis (C. sinensis), an important food-borne parasite that inhabits the intrahepatic bile duct and causes clonorchiasis, is of interest to both the public health field and the scientific research community. To learn more about the migration, parasitism and pathogenesis of C. sinensis at the molecular level, the present study developed an upgraded genomic assembly and annotation by sequencing paired-end and mate-paired libraries. We also performed transcriptome sequence analyses on multiple C. sinensis tissues (sucker, muscle, ovary and testis). Genes encoding molecules involved in responses to stimuli and muscle-related development were abundantly expressed in the oral sucker. Compared with other species, genes encoding molecules that facilitate the recognition and transport of cholesterol were observed in high copy numbers in the genome and were highly expressed in the oral sucker. Genes encoding transporters for fatty acids, glucose, amino acids and oxygen were also highly expressed, along with other molecules involved in metabolizing these substrates. All genes involved in energy metabolism pathways, including the β-oxidation of fatty acids, the citrate cycle, oxidative phosphorylation, and fumarate reduction, were expressed in the adults. Finally, we also provide valuable insights into the mechanism underlying the process of pathogenesis by characterizing the secretome of C. sinensis. The characterization and elaborate analysis of the upgraded genome and the tissue transcriptomes not only form a detailed and fundamental C. sinensis resource but also provide novel insights into the physiology and pathogenesis of C. sinensis. We anticipate that this work will aid the development of innovative strategies for the prevention and control of clonorchiasis.

  15. Studies of nontarget-mediated distribution of human full-length IgG1 antibody and its FAb fragment in cardiovascular and metabolic-related tissues.

    Science.gov (United States)

    Davidsson, Pia; Söderling, Ann-Sofi; Svensson, Lena; Ahnmark, Andrea; Flodin, Christine; Wanag, Ewa; Screpanti-Sundqvist, Valentina; Gennemark, Peter

    2015-05-01

    Tissue distribution and pharmacokinetics (PK) of full-length nontargeted antibody and its antigen-binding fragment (FAb) were evaluated for a range of tissues primarily of interest for cardiovascular and metabolic diseases. Mice were intravenously injected with a dose of 10 mg/kg of either human IgG1 or its FAb fragment; perfused tissues were collected at a range of time points over 3 weeks for the human IgG1 antibody and 1 week for the human FAb antibody. Tissues were homogenized and antibody concentrations were measured by specific immunoassays on the Gyros system. Exposure in terms of maximum concentration (Cmax) and area under the curve was assessed for all nine tissues. Tissue exposure of full-length antibody relative to plasma exposure was found to be between 1% and 10%, except for brain (0.2%). Relative concentrations of FAb antibody were the same, except for kidney tissue, where the antibody concentration was found to be ten times higher than in plasma. However, the absolute tissue uptake of full-length IgG was significantly higher than the absolute tissue uptake of the FAb antibody. This study provides a reference PK state for full-length whole and FAb antibodies in tissues related to cardiovascular and metabolic diseases that do not include antigen or antibody binding. © 2015 Wiley Periodicals, Inc. and the American Pharmacists Association.
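
    The exposure metrics named in this abstract, Cmax and area under the curve (AUC), can be illustrated with a short sketch. The abstract does not say how AUC was computed; the linear trapezoidal rule used here is an assumption, and the function names are hypothetical.

```python
from typing import Sequence


def c_max(concentrations: Sequence[float]) -> float:
    """Maximum observed concentration."""
    return max(concentrations)


def auc_trapezoid(times: Sequence[float], concentrations: Sequence[float]) -> float:
    """Area under the concentration-time curve by the linear trapezoidal rule."""
    if len(times) != len(concentrations) or len(times) < 2:
        raise ValueError("need at least two paired time/concentration points")
    area = 0.0
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        if dt <= 0:
            raise ValueError("time points must be strictly increasing")
        # Each segment contributes the area of a trapezoid.
        area += dt * (concentrations[i] + concentrations[i - 1]) / 2
    return area


def relative_exposure_percent(tissue_auc: float, plasma_auc: float) -> float:
    """Tissue exposure expressed as a percentage of plasma exposure."""
    return 100.0 * tissue_auc / plasma_auc
```

    With these helpers, a statement such as "tissue exposure relative to plasma exposure was between 1% and 10%" corresponds to `relative_exposure_percent` applied to the tissue and plasma AUC values for the same time window.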

  16. Genome wide characterization of simple sequence repeats in watermelon genome and their application in comparative mapping and genetic diversity analysis.

    Science.gov (United States)

    Zhu, Huayu; Song, Pengyao; Koo, Dal-Hoe; Guo, Luqin; Li, Yanman; Sun, Shouru; Weng, Yiqun; Yang, Luming

    2016-08-05

    Microsatellite markers are one of the most informative and versatile DNA-based markers used in plant genetic research, but their development has traditionally been difficult and costly. Whole genome sequencing with next-generation sequencing (NGS) technologies provides large amounts of sequence data to develop numerous microsatellite markers at whole genome scale. SSR markers have great advantage in cross-species comparisons and allow investigation of karyotype and genome evolution through highly efficient computation approaches such as in silico PCR. Here we described genome wide development and characterization of SSR markers in the watermelon (Citrullus lanatus) genome, which were then used in comparative analysis with two other important crop species in the Cucurbitaceae family: cucumber (Cucumis sativus L.) and melon (Cucumis melo L.). We further applied these markers in evaluating the genetic diversity and population structure in watermelon germplasm collections. A total of 39,523 microsatellite loci were identified from the watermelon draft genome with an overall density of 111 SSRs/Mbp, and 32,869 SSR primers were designed with suitable flanking sequences. The dinucleotide SSRs were the most common type, representing 34.09% of the total SSR loci, and the AT-rich motifs were the most abundant in all nucleotide repeat types. In silico PCR analysis identified 832 and 925 SSR markers with each having a single amplicon in the cucumber and melon draft genome, respectively. Comparative analysis with these cross-species SSR markers revealed complicated mosaic patterns of syntenic blocks among the genomes of three species. In addition, genetic diversity analysis of 134 watermelon accessions with 32 highly informative SSR loci placed these lines into two groups with all accessions of C. lanatus var. citroides and three accessions of C. colocynthis clustered in one group and all accessions of C. lanatus var. lanatus and the remaining accessions of C. colocynthis
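
    Genome-wide SSR identification of the kind described in this abstract is typically a pattern search over the assembled sequence. The sketch below shows the idea with regular expressions; it is not the tool used in the study, and the minimum repeat counts in `MIN_REPEATS` as well as the function names are hypothetical choices made for illustration.

```python
import re
from typing import List, Tuple

# Hypothetical minimum repeat counts per motif length; not taken from the study.
MIN_REPEATS = {2: 6, 3: 5, 4: 4}


def is_primitive(motif: str) -> bool:
    """True if the motif is not itself a repetition of a shorter unit."""
    return motif not in (motif + motif)[1:-1]


def find_ssrs(sequence: str) -> List[Tuple[int, str, int]]:
    """Return (start, motif, repeat_count) for perfect SSRs in a DNA sequence."""
    sequence = sequence.upper()
    hits = []
    for motif_len, min_rep in MIN_REPEATS.items():
        # A motif followed by at least (min_rep - 1) further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for match in pattern.finditer(sequence):
            motif = match.group(1)
            if not is_primitive(motif):
                continue  # e.g. skip AAAA or ATAT, which are shorter motifs repeated
            hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return sorted(hits)


def ssr_density_per_mbp(n_loci: int, genome_bp: int) -> float:
    """SSR loci per megabase of sequence."""
    return n_loci / (genome_bp / 1_000_000)


if __name__ == "__main__":
    demo = "GGG" + "AT" * 7 + "TTT" + "AGC" * 5 + "TT"
    print(find_ssrs(demo))
```

    The primitive-motif check avoids reporting the same locus twice, once as a dinucleotide repeat and again as a tetranucleotide repeat built from the same unit.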

  17. Signals Involved in Regulation of Hepatitis C Virus RNA Genome Translation and Replication

    Directory of Open Access Journals (Sweden)

    Michael Niepmann

    2018-03-01

    Hepatitis C virus (HCV) preferentially replicates in the human liver and frequently causes chronic infection, often leading to cirrhosis and liver cancer. HCV is an enveloped virus classified in the genus Hepacivirus in the family Flaviviridae and has a single-stranded RNA genome of positive orientation. The HCV RNA genome is translated and replicated in the cytoplasm. Translation is controlled by the Internal Ribosome Entry Site (IRES) in the 5′ untranslated region (5′ UTR), while also downstream elements like the cis-replication element (CRE) in the coding region and the 3′ UTR are involved in translation regulation. The cis-elements controlling replication of the viral RNA genome are located mainly in the 5′- and 3′-UTRs at the genome ends but also in the protein coding region, and in part these signals overlap with the signals controlling RNA translation. Many long-range RNA–RNA interactions (LRIs) are predicted between different regions of the HCV RNA genome, and several such LRIs are actually involved in HCV translation and replication regulation. A number of RNA cis-elements recruit cellular RNA-binding proteins that are involved in the regulation of HCV translation and replication. In addition, the liver-specific microRNA-122 (miR-122) binds to two target sites at the 5′ end of the viral RNA genome as well as to at least three additional target sites in the coding region and the 3′ UTR. It is involved in the regulation of HCV RNA stability, translation and replication, thereby largely contributing to the hepatotropism of HCV. However, we are still far from completely understanding all interactions that regulate HCV RNA genome translation, stability, replication and encapsidation. In particular, many conclusions on the function of cis-elements in HCV replication have been obtained using full-length HCV genomes or near-full-length replicon systems. These include both genome ends, making it difficult to decide if a cis-element in

  18. Genome organization, instabilities, stem cells, and cancer

    Directory of Open Access Journals (Sweden)

    Senthil Kumar Pazhanisamy

    2009-01-01

    It is now widely recognized that advances in exploring genome organization provide remarkable insights into the induction and progression of chromosome abnormalities. Much of what we know about how mutations evolve and consequently transform into genome instabilities has been characterized in the context of the spatial organization of chromatin. Nevertheless, many underlying concepts concerning the impact of chromatin organization on the perpetuation of multiple mutations and on the propagation of chromosomal aberrations remain to be investigated in detail. The genesis of genome instabilities from the accumulation of multiple mutations that drive tumorigenesis is increasingly becoming a focal theme in cancer studies. This review focuses on how structural alterations evolve to give rise to a variety of genome instabilities that are manifested at the nucleotide, gene or sub-chromosomal, and whole-chromosome level of the genome. Here we explore an underlying connection between genome instability and cancer in the light of genome architecture. This review is limited to studies directed towards spatial organizational aspects of the origin and propagation of aberrations into genetically unstable tumors.

  19. Five Complete Chloroplast Genome Sequences from Diospyros: Genome Organization and Comparative Analysis.

    Directory of Open Access Journals (Sweden)

    Jianmin Fu

    Diospyros is the largest genus in Ebenaceae, comprising more than 500 species with remarkable economic value, especially Diospyros kaki Thunb., which has traditionally been an important food resource in China, Korea, and Japan. Complete chloroplast (cp) genomes from D. kaki, D. lotus L., D. oleifera Cheng., D. glaucifolia Metc., and Diospyros 'Jinzaoshi' were sequenced using Illumina sequencing technology. This is the first cp genome reported in Ebenaceae. The cp genome sequences of Diospyros ranged from 157,300 to 157,784 bp in length, presenting a typical quadripartite structure with two inverted repeats each separated by one large and one small single-copy region. For each cp genome, 134 genes were annotated, including 80 protein-coding, 31 tRNA, and 4 rRNA unique genes. In all, 179 repeats and 283 single sequence repeats were identified. Four hypervariable regions, namely, the intergenic regions of trnQ_rps16, trnV_ndhC, and psbD_trnT, and the intron of ndhA, were identified in the Diospyros genomes. Phylogenetic analyses based on the whole cp genome, protein-coding, and intergenic and intron sequences indicated that D. oleifera is closely related to D. kaki and could be used as a model plant for future research on D. kaki; to our knowledge, this is proposed for the first time. Further, these analyses, together with two large deletions (301 and 140 bp) in the cp genome of D. 'Jinzaoshi', support its placement as a new species in Diospyros. Both maximum parsimony and likelihood analyses for 19 taxa indicated the basal position of Ericales in asterids and suggested that Ebenaceae is monophyletic in Ericales.

  20. Causes of genome instability

    DEFF Research Database (Denmark)

    Langie, Sabine A S; Koppen, Gudrun; Desaulniers, Daniel

    2015-01-01

    Genome instability is a prerequisite for the development of cancer. It occurs when genome maintenance systems fail to safeguard the genome's integrity, whether as a consequence of inherited defects or induced via exposure to environmental agents (chemicals, biological agents and radiation). Thus... function, chromosome segregation, telomere length). The purpose of this review is to describe the crucial aspects of genome instability, to outline the ways in which environmental chemicals can affect this cancer hallmark and to identify candidate chemicals for further study. The overall aim is to make...

  1. The (in)complete organelle genome: exploring the use and nonuse of available technologies for characterizing mitochondrial and plastid chromosomes.

    Science.gov (United States)

    Sanitá Lima, Matheus; Woods, Laura C; Cartwright, Matthew W; Smith, David Roy

    2016-11-01

    Not long ago, scientists paid dearly in time, money and skill for every nucleotide that they sequenced. Today, DNA sequencing technologies epitomize the slogan 'faster, easier, cheaper and more', and in many ways, sequencing an entire genome has become routine, even for the smallest laboratory groups. This is especially true for mitochondrial and plastid genomes. Given their relatively small sizes and high copy numbers per cell, organelle DNAs are currently among the most highly sequenced kind of chromosome. But accurately characterizing an organelle genome and the information it encodes can require much more than DNA sequencing and bioinformatics analyses. Organelle genomes can be surprisingly complex and can exhibit convoluted and unconventional modes of gene expression. Unravelling this complexity can demand a wide assortment of experiments, from pulsed-field gel electrophoresis to Southern and Northern blots to RNA analyses. Here, we show that it is exactly these types of 'complementary' analyses that are often lacking from contemporary organelle genome papers, particularly short 'genome announcement' articles. Consequently, crucial and interesting features of organelle chromosomes are going undescribed, which could ultimately lead to a poor understanding and even a misrepresentation of these genomes and the genes they express. High-throughput sequencing and bioinformatics have made it easy to sequence and assemble entire chromosomes, but they should not be used as a substitute for or at the expense of other types of genomic characterization methods. © 2016 The Authors. Molecular Ecology Resources Published by John Wiley & Sons Ltd.

  2. Molecular characterization of a nuclear topoisomerase II from Nicotiana tabacum that functionally complements a temperature-sensitive topoisomerase II yeast mutant.

    Science.gov (United States)

    Singh, B N; Mudgil, Yashwanti; Sopory, S K; Reddy, M K

    2003-07-01

    We have successfully expressed enzymatically active plant topoisomerase II in Escherichia coli for the first time, which has enabled its biochemical characterization. Using a PCR-based strategy, we obtained a full-length cDNA and the corresponding genomic clone of tobacco topoisomerase II. The genomic clone has 18 exons interrupted by 17 introns. Most of the 5' and 3' splice junctions follow the typical canonical consensus dinucleotide sequence GU-AG present in other plant introns. The position of introns and phasing with respect to primary amino acid sequence in tobacco TopII and Arabidopsis TopII are highly conserved, suggesting that the two genes are evolved from the common ancestral type II topoisomerase gene. The cDNA encodes a polypeptide of 1482 amino acids. The primary amino acid sequence shows a striking sequence similarity, preserving all the structural domains that are conserved among eukaryotic type II topoisomerases in an identical spatial order. We have expressed the full-length polypeptide in E. coli and purified the recombinant protein to homogeneity. The full-length polypeptide relaxed supercoiled DNA and decatenated the catenated DNA in a Mg(2+)- and ATP-dependent manner, and this activity was inhibited by 4'-(9-acridinylamino)-3'-methoxymethanesulfonanilide (m-AMSA). The immunofluorescence and confocal microscopic studies, with antibodies developed against the N-terminal region of tobacco recombinant topoisomerase II, established the nuclear localization of topoisomerase II in tobacco BY2 cells. The regulated expression of tobacco topoisomerase II gene under the GAL1 promoter functionally complemented a temperature-sensitive TopII(ts) yeast mutant.

  3. Characterizing Protein Interactions Employing a Genome-Wide siRNA Cellular Phenotyping Screen

    Science.gov (United States)

    Suratanee, Apichat; Schaefer, Martin H.; Betts, Matthew J.; Soons, Zita; Mannsperger, Heiko; Harder, Nathalie; Oswald, Marcus; Gipp, Markus; Ramminger, Ellen; Marcus, Guillermo; Männer, Reinhard; Rohr, Karl; Wanker, Erich; Russell, Robert B.; Andrade-Navarro, Miguel A.; Eils, Roland; König, Rainer

    2014-01-01

    Characterizing the activating and inhibiting effect of protein-protein interactions (PPI) is fundamental to gain insight into the complex signaling system of a human cell. A plethora of methods has been suggested to infer PPI from data on a large scale, but none of them is able to characterize the effect of this interaction. Here, we present a novel computational development that employs mitotic phenotypes of a genome-wide RNAi knockdown screen and enables identifying the activating and inhibiting effects of PPIs. Exemplarily, we applied our technique to a knockdown screen of HeLa cells cultivated at standard conditions. Using a machine learning approach, we obtained high accuracy (82% AUC of the receiver operating characteristics) by cross-validation using 6,870 known activating and inhibiting PPIs as gold standard. We predicted de novo unknown activating and inhibiting effects for 1,954 PPIs in HeLa cells covering the ten major signaling pathways of the Kyoto Encyclopedia of Genes and Genomes, and made these predictions publicly available in a database. We finally demonstrate that the predicted effects can be used to cluster knockdown genes of similar biological processes in coherent subgroups. The characterization of the activating or inhibiting effect of individual PPIs opens up new perspectives for the interpretation of large datasets of PPIs and thus considerably increases the value of PPIs as an integrated resource for studying the detailed function of signaling pathways of the cellular system of interest. PMID:25255318

  4. Characterization of a Genomic Signature of Pregnancy in the Breast

    Science.gov (United States)

    Belitskaya-Lévy, Ilana; Zeleniuch-Jacquotte, Anne; Russo, Jose; Russo, Irma H.; Bordás, Pal; Åhman, Janet; Afanasyeva, Yelena; Johansson, Robert; Lenner, Per; Li, Xiaochun; de Cicco, Ricardo López; Peri, Suraj; Ross, Eric; Russo, Patricia A.; Santucci-Pereira, Julia; Sheriff, Fathima S.; Slifker, Michael; Hallmans, Göran; Toniolo, Paolo; Arslan, Alan A.

    2012-01-01

    The objective of the current study was to comprehensively compare the genomic profiles in the breast of parous and nulliparous postmenopausal women to identify genes that permanently change their expression following pregnancy. The study was designed as a two-phase approach. In the discovery phase, we compared breast genomic profiles of 37 parous with 18 nulliparous postmenopausal women. In the validation phase, confirmation of the genomic patterns observed in the discovery phase was sought in an independent set of 30 parous and 22 nulliparous postmenopausal women. RNA was hybridized to Affymetrix HG_U133 Plus 2.0 oligonucleotide arrays containing probes to 54,675 transcripts; scanned and the images analyzed using Affymetrix GCOS software. Surrogate variable analysis, logistic regression and significance analysis for microarrays were used to identify statistically significant differences in expression of genes. The False Discovery Rate (FDR) approach was used to control for multiple comparisons. We found that 208 genes (305 probe sets) were differentially expressed between parous and nulliparous women in both discovery and validation phases of the study at a FDR of 10% and with at least a 1.25-fold change. These genes are involved in regulation of transcription, centrosome organization, RNA splicing, cell cycle control, adhesion and differentiation. The results provide persuasive evidence that full-term pregnancy induces long-term genomic changes in the breast. The genomic signature of pregnancy could be used as an intermediate marker to assess potential chemopreventive interventions with hormones mimicking the effects of pregnancy for prevention of breast cancer. PMID:21622728
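
    The gene selection described above (an FDR threshold of 10% combined with a fold change of at least 1.25) can be illustrated with a minimal sketch. It assumes the Benjamini-Hochberg step-up procedure as the FDR method, which the abstract does not specify, and the function names `benjamini_hochberg` and `select_genes` are hypothetical. It is not the pipeline used in the study.

```python
def benjamini_hochberg(p_values, fdr=0.10):
    """Return a list of booleans: True where the hypothesis is rejected.

    Assumption: Benjamini-Hochberg is used here only as a common FDR
    procedure; the abstract does not name the exact method.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Largest rank k such that p_(k) <= (k / m) * fdr.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            max_k = rank
    rejected = [False] * m
    for idx in order[:max_k]:
        rejected[idx] = True
    return rejected


def select_genes(p_values, fold_changes, fdr=0.10, min_fold=1.25):
    """Indices of genes passing both the FDR and the fold-change filter.

    A fold change counts in either direction: up (>= min_fold) or
    down (<= 1 / min_fold).
    """
    passed = benjamini_hochberg(p_values, fdr)
    return [
        i
        for i, fc in enumerate(fold_changes)
        if passed[i] and (fc >= min_fold or fc <= 1.0 / min_fold)
    ]


if __name__ == "__main__":
    # Illustrative numbers only, not data from the study.
    print(select_genes([0.001, 0.02, 0.04, 0.5], [1.5, 1.1, 0.7, 2.0]))
```

    Treating a fold change symmetrically (up or down) is a design choice of this sketch; the abstract only states "at least a 1.25-fold change".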

  5. Genome-wide mapping of autonomous promoter activity in human cells.

    Science.gov (United States)

    van Arensbergen, Joris; FitzPatrick, Vincent D; de Haas, Marcel; Pagie, Ludo; Sluimer, Jasper; Bussemaker, Harmen J; van Steensel, Bas

    2017-02-01

    Previous methods to systematically characterize sequence-intrinsic activity of promoters have been limited by relatively low throughput and the length of the sequences that could be tested. Here we present 'survey of regulatory elements' (SuRE), a method that assays more than 10^8 DNA fragments, each 0.2-2 kb in size, for their ability to drive transcription autonomously. In SuRE, a plasmid library of random genomic fragments upstream of a 20-bp barcode is constructed, and decoded by paired-end sequencing. This library is used to transfect cells, and barcodes in transcribed RNA are quantified by high-throughput sequencing. When applied to the human genome, we achieve 55-fold genome coverage, allowing us to map autonomous promoter activity genome-wide in K562 cells. By computational modeling we delineate subregions within promoters that are relevant for their activity. We show that antisense promoter transcription is generally dependent on the sense core promoter sequences, and that most enhancers and several families of repetitive elements act as autonomous transcription initiation sites.

  6. Is length an appropriate estimator to characterize pulmonary alveolar capillaries? A critical evaluation in the human lung

    DEFF Research Database (Denmark)

    Mühlfeld, Christian; Weibel, Ewald R.; Hahn, Ute

    2010-01-01

    Stereological estimations of total capillary length have been used to characterize changes in the alveolar capillary network (ACN) during developmental processes or pathophysiological conditions. Here, we analyzed whether length estimations are appropriate to describe the 3D nature of the ACN. Semi...... resulted in a mean of 2,746 km (SD: 722 km). Because of the geometry of the ACN both approaches carry an unpredictable bias. The bias incurred by the design-based approach is proportional to the ratio between radius and length of the capillary segments in the ACN, the number of branching points...... and the winding of the capillaries. The model-based approach is biased because of the real noncylindrical shape of capillaries and the network structure. In conclusion, the estimation of the total length of capillaries in the ACN cannot be recommended as the geometry of the ACN does not fulfill the requirements...

  7. Genome-wide analysis of EgEVE_1, a transcriptionally active endogenous viral element associated to small RNAs in Eucalyptus genomes

    Directory of Open Access Journals (Sweden)

    Helena Sanches Marcon

    2017-02-01

    Endogenous viral elements (EVEs) are the result of heritable horizontal gene transfer from viruses to hosts. In recent years, several EVE integration events have been reported in plants, owing to the exponentially growing availability of sequenced genomes. Eucalyptus grandis is a forest tree species with a sequenced genome that is poorly studied in terms of evolution and mobile genetic element composition. Here we report the characterization of E. grandis endogenous viral element 1 (EgEVE_1), a transcriptionally active EVE with a size of 5,664 bp. Phylogenetic analysis and genomic distribution demonstrated that EgEVE_1 is a newly described member of the Caulimoviridae family, distinct from the recently characterized plant Florendoviruses. Genomic distribution of EgEVE_1 and Florendovirus is also distinct. EgEVE_1 qPCR quantification in Eucalyptus urophylla suggests that this genome has more EgEVE_1 copies than E. grandis. EgEVE_1 transcriptional activity was demonstrated by RT-qPCR in five Eucalyptus species and one intrageneric hybrid. We also found that Eucalyptus EVEs can generate small RNAs (sRNAs) that might be involved in de novo DNA methylation and virus resistance. Our data suggest that EVE families in Eucalyptus have distinct properties, and we provide the first comparative analysis of EVEs in Eucalyptus genomes.

  8. In Depth Characterization of Repetitive DNA in 23 Plant Genomes Reveals Sources of Genome Size Variation in the Legume Tribe Fabeae.

    Directory of Open Access Journals (Sweden)

    Jiří Macas

    The differential accumulation and elimination of repetitive DNA are key drivers of genome size variation in flowering plants, yet there have been few studies which have analysed how different types of repeats in related species contribute to genome size evolution within a phylogenetic context. This question is addressed here by conducting large-scale comparative analysis of repeats in 23 species from four genera of the monophyletic legume tribe Fabeae, representing a 7.6-fold variation in genome size. Phylogenetic analysis and genome size reconstruction revealed that this diversity arose from genome size expansions and contractions in different lineages during the evolution of Fabeae. Employing a combination of low-pass genome sequencing with novel bioinformatic approaches resulted in identification and quantification of repeats making up 55-83% of the investigated genomes. In turn, this enabled an analysis of how each major repeat type contributed to the genome size variation encountered. Differential accumulation of repetitive DNA was found to account for 85% of the genome size differences between the species, and most (57%) of this variation was found to be driven by a single lineage of Ty3/gypsy LTR-retrotransposons, the Ogre elements. Although the amounts of several other lineages of LTR-retrotransposons and the total amount of satellite DNA were also positively correlated with genome size, their contributions to genome size variation were much smaller (up to 6%). Repeat analysis within a phylogenetic framework also revealed profound differences in the extent of sequence conservation between different repeat types across Fabeae. In addition to these findings, the study has provided a proof of concept for the approach combining recent developments in sequencing and bioinformatics to perform comparative analyses of repetitive DNAs in a large number of non-model species without the need to assemble their genomes.

  9. Complete genome sequences of cowpea polerovirus 1 and cowpea polerovirus 2 infecting cowpea plants in Burkina Faso.

    Science.gov (United States)

    Palanga, Essowè; Martin, Darren P; Galzi, Serge; Zabré, Jean; Bouda, Zakaria; Neya, James Bouma; Sawadogo, Mahamadou; Traore, Oumar; Peterschmitt, Michel; Roumagnac, Philippe; Filloux, Denis

    2017-07-01

    The full-length genome sequences of two novel poleroviruses found infecting cowpea plants, cowpea polerovirus 1 (CPPV1) and cowpea polerovirus 2 (CPPV2), were determined using overlapping RT-PCR and RACE-PCR. Whereas the 5845-nt CPPV1 genome was most similar to chickpea chlorotic stunt virus (73% identity), the 5945-nt CPPV2 genome was most similar to phasey bean mild yellow virus (86% identity). The CPPV1 and CPPV2 genomes both have a typical polerovirus genome organization. Phylogenetic analysis of the inferred P1-P2 and P3 amino acid sequences confirmed that CPPV1 and CPPV2 are indeed poleroviruses. Four apparently unique recombination events were detected within a dataset of 12 full polerovirus genome sequences, including two events in the CPPV2 genome. Based on the current species demarcation criteria for the family Luteoviridae, we tentatively propose that CPPV1 and CPPV2 should be considered members of novel polerovirus species.
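
    Identity figures such as the 73% and 86% quoted above are typically derived from a pairwise alignment. The sketch below shows one common convention, assuming two already-aligned sequences of equal length and ignoring columns that contain a gap; the abstract does not state which alignment tool or identity definition was used, and the function name `percent_identity` is hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over gap-free columns of a pairwise alignment.

    Assumption: columns with a gap in either sequence are skipped.
    Other conventions (e.g. dividing by full alignment length) exist
    and give different values.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # skip columns containing a gap
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy alignment, not real polerovirus sequence.
    print(percent_identity("ACGT-A", "ACCTTA"))
```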

  10. Genome characterization and population genetic structure of the zoonotic pathogen, Streptococcus canis

    Directory of Open Access Journals (Sweden)

    Richards Vincent P

    2012-12-01

    Background: Streptococcus canis is an important opportunistic pathogen of dogs and cats that can also infect a wide range of additional mammals including cows where it can cause mastitis. It is also an emerging human pathogen. Results: Here we provide characterization of the first genome sequence for this species, strain FSL S3-227 (milk isolate from a cow with an intra-mammary infection). A diverse array of putative virulence factors was encoded by the S. canis FSL S3-227 genome. Approximately 75% of these gene sequences were homologous to known Streptococcal virulence factors involved in invasion, evasion, and colonization. Present in the genome are multiple potentially mobile genetic elements (MGEs) [plasmid, phage, integrative conjugative element (ICE)] and comparison to other species provided convincing evidence for lateral gene transfer (LGT) between S. canis and two additional bovine mastitis causing pathogens (Streptococcus agalactiae, and Streptococcus dysgalactiae subsp. dysgalactiae), with this transfer possibly contributing to host adaptation. Population structure among isolates obtained from Europe and USA [bovine = 56, canine = 26, and feline = 1] was explored. Ribotyping of all isolates and multi locus sequence typing (MLST) of a subset of the isolates (n = 45) detected significant differentiation between bovine and canine isolates (Fisher exact test: P = 0.0000 [ribotypes], P = 0.0030 [sequence types]), suggesting possible host adaptation of some genotypes. Concurrently, the ancestral clonal complex (54% of isolates) occurred in many tissue types, all hosts, and all geographic locations suggesting the possibility of a wide and diverse niche. Conclusion: This study provides evidence highlighting the importance of LGT in the evolution of the bacteria S. canis, specifically, its possible role in host adaptation and acquisition of virulence factors. Furthermore, recent LGT detected between S. canis and human

  11. Caspase 3 inactivates biologically active full length interleukin-33 as a classical cytokine but does not prohibit nuclear translocation

    International Nuclear Information System (INIS)

    Ali, Shafaqat; Nguyen, Dang Quan; Falk, Werner; Martin, Michael Uwe

    2010-01-01

    IL-33 is a member of the IL-1 family of cytokines with dual function which either activates cells via the IL-33 receptor in a paracrine fashion or translocates to the nucleus to regulate gene transcription in an intracrine manner. We show that full length murine IL-33 is active as a cytokine and that it is not processed by caspase 1 to mature IL-33 but instead cleaved by caspase 3 at aa175 to yield two products which are both unable to bind to the IL-33 receptor. Full length IL-33 and its N-terminal caspase 3 breakdown product, however, translocate to the nucleus. Finally, bioactive IL-33 is not released by cells constitutively or after activation. This suggests that IL-33 is not a classical cytokine but exerts its function in the nucleus of intact cells and only activates other cells via its receptor as an alarm mediator after destruction of the producing cell.

  12. Characterization of sida golden mottle virus isolated from Sida santaremensis Monteiro in Florida.

    Science.gov (United States)

    Al-Aqeel, H A; Iqbal, Zafar; Polston, J E

    2018-06-21

    The genome of sida golden mottle virus (SiGMoV) (GU997691 and GU997692) isolated from Sida santaremensis Monteiro in Manatee County, Florida, was sequenced and characterized. SiGMoV was determined to be a bipartite virus belonging to the genus Begomovirus with a genome organization typical of the New World viruses in the genus. SiGMoV DNA-A had the highest identity scores (89%) and showed the closest evolutionary relationships to sida golden mosaic Buckup virus (SiGMBuV) (JX162591 and HQ008338). However, SiGMoV DNA-B had the highest identity scores (93%) and showed the closest evolutionary relationship to corchorus yellow spot virus (DQ875869), SiGMBuV (JX162592) and sida golden mosaic Florida virus (SiGMFlV) (HE806443). There was extensive recombination in the SiGMoV DNA-A and much less in DNA-B. Full-length clones of SiGMoV were infectious and were able to infect and cause symptoms in several plant species.

  13. Full-Length Fibronectin Drives Fibroblast Accumulation at the Surface of Collagen Microtissues during Cell-Induced Tissue Morphogenesis.

    Directory of Open Access Journals (Sweden)

    Jasper Foolen

    Generating and maintaining gradients of cell density and extracellular matrix (ECM) components is a prerequisite for the development of functionality of healthy tissue. Therefore, gaining insights into the drivers of spatial organization of cells and the role of ECM during tissue morphogenesis is vital. In a 3D model system of tissue morphogenesis, a fibronectin-FRET sensor recently revealed the existence of two separate fibronectin populations with different conformations in microtissues, i.e. 'compact and adsorbed to collagen' versus 'extended and fibrillar' fibronectin that does not colocalize with the collagen scaffold. Here we asked how the presence of fibronectin might drive this cell-induced tissue morphogenesis, more specifically the formation of gradients in cell density and ECM composition. Microtissues were engineered in a high-throughput model system containing rectangular microarrays of 12 posts, which constrained fibroblast-populated collagen gels, remodeled by the contractile cells into trampoline-shaped microtissues. Fibronectin's contribution during the tissue maturation process was assessed using fibronectin-knockout mouse embryonic fibroblasts (Fn-/- MEFs) and floxed equivalents (Fnf/f MEFs), in fibronectin-depleted growth medium with and without exogenously added plasma fibronectin (full-length, or various fragments). In the absence of full-length fibronectin, Fn-/- MEFs remained homogeneously distributed throughout the cell-contracted collagen gels. In contrast, in the presence of full-length fibronectin, both cell types produced shell-like tissues with a predominantly cell-free compacted collagen core and a peripheral surface layer rich in cells. Single cell assays then revealed that Fn-/- MEFs applied lower total strain energy on nanopillar arrays coated with either fibronectin or vitronectin when compared to Fnf/f MEFs, but that the presence of exogenously added plasma fibronectin rescued their contractility. While collagen

  14. Controlling the optical path length in turbid media using differential path-length spectroscopy: fiber diameter dependence

    NARCIS (Netherlands)

    Kaspers, O. P.; Sterenborg, H. J. C. M.; Amelink, A.

    2008-01-01

    We have characterized the path length for the differential path-length spectroscopy (DPS) fiber optic geometry for a wide range of optical properties and for fiber diameters ranging from 200 µm to 1000 µm. Phantom measurements show that the path length is nearly constant for scattering

  15. First full length sequences of the S gene of European isolates reveal further diversity among turkey coronaviruses.

    OpenAIRE

    2011-01-01

    An increasing incidence of enteric disorders clinically evocative of the poult enteritis complex has been observed in turkeys in France since 2003. Using a newly designed real-time RT-PCR assay specific for the nucleocapsid (N) gene of infectious bronchitis virus (IBV) and turkey coronaviruses (TCoV), coronaviruses were identified in 37% of the intestinal samples collected from diseased turkey flocks. The full length Spike (S) gene of these viruses was amplified, cloned a...

  16. The Complete Chloroplast Genome of Catha edulis: A Comparative Analysis of Genome Features with Related Species

    Directory of Open Access Journals (Sweden)

    Cuihua Gu

    2018-02-01

    Qat (Catha edulis, Celastraceae) is a woody evergreen species with great economic and cultural importance. It is cultivated for its stimulant alkaloids cathine and cathinone in East Africa and southwest Arabia. However, genome information, especially DNA sequence resources, for C. edulis is limited, hindering studies regarding interspecific and intraspecific relationships. Herein, the complete chloroplast (cp) genome of Catha edulis is reported. This genome is 157,960 bp in length with 37% GC content and is structurally arranged into two 26,577 bp inverted repeats and two single-copy areas. The sizes of the small single-copy and the large single-copy regions were 18,491 bp and 86,315 bp, respectively. The C. edulis cp genome consists of 129 coding genes including 37 transfer RNA (tRNA) genes, 8 ribosomal RNA (rRNA) genes, and 84 protein coding genes. Of these genes, 112 are single copy genes and 17 are duplicated in the two inverted repeat regions, comprising seven tRNAs, four rRNAs, and six protein coding genes. The phylogenetic relationships resolved from the cp genome of qat and 32 other species confirm the monophyly of Celastraceae. The cp genomes of C. edulis, Euonymus japonicus and seven Celastraceae species lack the rps16 intron, which indicates that an intron loss took place in an ancestor of this family. The cp genome of C. edulis provides a highly valuable genetic resource for further phylogenomic research, barcoding and cp transformation in Celastraceae.
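
    The region sizes and gene counts reported above can be checked for internal consistency with simple arithmetic: a quadripartite chloroplast genome is the large single-copy region plus the small single-copy region plus two copies of the inverted repeat. The sketch below uses only the numbers quoted in the abstract; the function name `quadripartite_length` is hypothetical.

```python
def quadripartite_length(lsc, ssc, ir):
    """Total length of a quadripartite genome: LSC + SSC + 2 x IR."""
    return lsc + ssc + 2 * ir


# Region sizes (bp) as quoted in the abstract.
total_bp = quadripartite_length(lsc=86_315, ssc=18_491, ir=26_577)

# Gene counts as quoted in the abstract.
genes = {"tRNA": 37, "rRNA": 8, "protein_coding": 84}
duplicated = {"tRNA": 7, "rRNA": 4, "protein_coding": 6}

if __name__ == "__main__":
    print(total_bp)                   # compare with the reported 157,960 bp
    print(sum(genes.values()))        # compare with the reported 129 genes
    print(sum(duplicated.values()))   # compare with the reported 17 duplicated genes
```

    With these inputs the regions sum to 157,960 bp, the gene categories to 129, and the duplicated genes to 17, matching the totals stated in the abstract.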

  17. Single virus genomics: a new tool for virus discovery.

    Directory of Open Access Journals (Sweden)

    Lisa Zeigler Allen

    Whole genome amplification and sequencing of single microbial cells has significantly influenced genomics and microbial ecology by facilitating direct recovery of reference genome data. However, viral genomics continues to suffer due to difficulties related to the isolation and characterization of uncultivated viruses. We report here on a new approach called 'Single Virus Genomics', which enabled the isolation and complete genome sequencing of the first single virus particle. A mixed assemblage comprising two known viruses, E. coli bacteriophages lambda and T4, was sorted using flow cytometric methods and subsequently immobilized in an agarose matrix. Genome amplification was then achieved in situ via multiple displacement amplification (MDA). The complete lambda phage genome was recovered with an average depth of coverage of approximately 437X. The isolation and genome sequencing of uncultivated viruses using Single Virus Genomics approaches will enable researchers to address questions about viral diversity, evolution, adaptation and ecology that were previously unattainable.

  18. Genomic and Phenotypic Characterization of Yeast Biosensor for Deep-space Radiation

    Science.gov (United States)

    Marina, Diana B.; Santa Maria, Sergio; Bhattacharya, Sharmila

    2016-01-01

    The BioSentinel mission was selected to launch as a secondary payload onboard NASA Exploration Mission 1 (EM-1) in 2018. In BioSentinel, the budding yeast Saccharomyces cerevisiae will be used as a biosensor to measure the long-term impact of deep-space radiation to living organisms. In the 4U-payload, desiccated yeast cells from different strains will be stored inside microfluidic cards equipped with 3-color LED optical detection system to monitor cell growth and metabolic activity. At different times throughout the 12-month mission, these cards will be filled with liquid yeast growth media to rehydrate and grow the desiccated cells. The growth and metabolic rates of wild-type and radiation-sensitive strains in deep-space radiation environment will be compared to the rates measured in the ground- and microgravity-control units. These rates will also be correlated with measurements obtained from onboard physical dosimeters. In our preliminary long-term desiccation study, we found that air-drying yeast cells in 10% trehalose is the best method of cell preservation in order to survive the entire 18-month mission duration (6-month pre-launch plus 12-month full-mission periods). However, our study also revealed that desiccated yeast cells have decreasing viability over time when stored in payload-like environment. This suggests that the yeast biosensor will have different population of cells at different time points during the long-term mission. In this study, we are characterizing genomic and phenotypic changes in our yeast biosensor due to long-term storage and desiccation. For each yeast strain that will be part of the biosensor, several clones were reisolated after long-term storage by desiccation. These clones were compared to their respective original isolate in terms of genomic composition, desiccation tolerance and radiation sensitivity. 
    Interestingly, clones from a radiation-sensitive mutant have better desiccation tolerance compared to their original isolate

  19. Identification and characterization of a new bocavirus species in gorillas.

    Directory of Open Access Journals (Sweden)

    Amit Kapoor

    2010-07-01

    A novel parvovirus, provisionally named Gorilla Bocavirus species 1 (GBoV1), was identified in four stool samples from Western gorillas (Gorilla gorilla) with acute enteritis. The complete genomic sequence of the new parvovirus revealed three open reading frames (ORFs) with an organization similar to that of known bocaviruses. Phylogenetic analysis using complete capsid and non structural (NS) gene sequences suggested that the new parvovirus is most closely related to human bocaviruses (HBoV). However, the NS ORF is more similar in length to the NS ORF found in canine minute virus and bovine parvovirus than in HBoV. Comparative genetic analysis using GBoV and HBoV genomes enabled characterization of unique splice donor and acceptor sites that appear to be highly conserved among all four HBoV species, and provided evidence for expression of two different NS proteins in all primate bocaviruses. GBoV is the first non-human primate bocavirus identified and provides new insights into the genetic diversity and evolution of this highly prevalent and recently discovered group of parvoviruses.

  20. Universal internucleotide statistics in full genomes: a footprint of the DNA structure and packaging?

    Directory of Open Access Journals (Sweden)

    Mikhail I Bogachev

    Uncovering the fundamental laws that govern the complex DNA structural organization remains challenging and is largely based upon reconstructions from the primary nucleotide sequences. Here we investigate the distributions of the internucleotide intervals and their persistence properties in complete genomes of various organisms from Archaea and Bacteria to H. sapiens aiming to reveal the manifestation of the universal DNA architecture. We find that in all considered organisms the internucleotide interval distributions exhibit the same q-exponential form. While in prokaryotes a single q-exponential function makes the best fit, in eukaryotes the PDF contains additionally a second q-exponential, which in the human genome makes a perfect approximation over nearly 10 decades. We suggest that this functional form is a footprint of the heterogeneous DNA structure, where the first q-exponential reflects the universal helical pitch that appears both in pro- and eukaryotic DNA, while the second q-exponential is a specific marker of the large-scale eukaryotic DNA organization.
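
    The quantity analysed above, the distribution of internucleotide intervals, can be sketched as follows. The sketch assumes that an interval is the distance between consecutive occurrences of the same nucleotide along one strand, which the abstract does not define explicitly; the function names are hypothetical, and the fitting of the functional form reported in the abstract is not shown.

```python
from collections import Counter


def internucleotide_intervals(sequence, nucleotide):
    """Distances between consecutive occurrences of one nucleotide.

    Assumption: an 'internucleotide interval' is the difference in
    position between successive occurrences of the same base.
    """
    target = nucleotide.upper()
    positions = [i for i, base in enumerate(sequence.upper()) if base == target]
    return [b - a for a, b in zip(positions, positions[1:])]


def interval_distribution(sequence, nucleotide):
    """Empirical probability of each interval length (dict: length -> probability)."""
    intervals = internucleotide_intervals(sequence, nucleotide)
    if not intervals:
        return {}
    counts = Counter(intervals)
    total = len(intervals)
    return {length: count / total for length, count in sorted(counts.items())}


if __name__ == "__main__":
    # Toy sequence, not genomic data.
    print(internucleotide_intervals("ACGATTAGA", "A"))
    print(interval_distribution("ACGATTAGA", "A"))
```

    On a real genome the resulting empirical distribution would be the input to a curve fit; for whole chromosomes a streaming implementation would be preferable to building the full position list in memory.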

  1. Molecular characterization and diversity of a novel non-autonomous mutator-like transposon family in brassica

    International Nuclear Information System (INIS)

    Nouroz, F.

    2015-01-01

    Transposable elements (TEs) are capable of mobilizing from one genomic location to another, with changes in their copy numbers. Mutator-like elements (MULEs) are DNA transposons characterized by 9 bp target site duplications (TSDs), with high variability in sequence and length, and include non-conserved terminal inverted repeats (TIRs). We identified and characterized a family of Mutator-like elements designated as Shahroz. The structural and molecular analyses revealed that the family had a small number of mostly defective non-autonomous MULEs and has shown limited activity in the evolutionary history of the Brassica A-genome. The Shahroz elements range in size from 2734 to 3160 bp including 76 bp imperfect TIRs and 9 bp variable TSDs. The individual copies showed high homology (52-99%) over their entire lengths. The study revealed that the elements are few in number but active in Brassica rapa genomes, and PCR amplification revealed their specificity and amplification in A-genome-containing diploid and polyploid Brassica. The phylogenetic analysis of Brassica MULEs with other plant Mutator elements revealed that no correlation exists between Brassica MULEs and other elements, suggesting a separate line of evolution. Analyzing the regions flanking the insertions revealed that the insertions showed a preference for AT-rich regions. The detailed study of these insertions revealed that, although few in number and small in size, they have played a role in Brassica genome evolution by their mobilization. (author)

  2. Targeted genomic enrichment and sequencing of CyHV-3 from carp tissues confirms low nucleotide diversity and mixed genotype infections

    Directory of Open Access Journals (Sweden)

    Saliha Hammoumi

    2016-09-01

    Koi herpesvirus disease (KHVD) is an emerging disease that causes mass mortality in koi and common carp, Cyprinus carpio L. Its causative agent is Cyprinid herpesvirus 3 (CyHV-3), also known as koi herpesvirus (KHV). Although data on the pathogenesis of this deadly virus are relatively abundant in the literature, little is still known about its genomic diversity and about the molecular mechanisms that lead to such high virulence. In this context, we developed a new strategy for sequencing full-length CyHV-3 genomes directly from infected fish tissues. Total genomic DNA extracted from carp gill tissue was specifically enriched with CyHV-3 sequences through hybridization to a set of nearly 2 million overlapping probes designed to cover the entire genome length, using the KHV-J sequence (GenBank accession number AP008984) as reference. Applied to 7 CyHV-3 specimens from Poland and Indonesia, this targeted genomic enrichment enabled recovery of the full genomes with >99.9% reference coverage. The enrichment rate was directly correlated to the estimated number of viral copies contained in the DNA extracts used for library preparation, which varied between ∼5000 and ∼2×10⁷. The average sequencing depth was >200 for all samples, thus allowing the search for variants with high confidence. Sequence analyses highlighted a significant proportion of intra-specimen sequence heterogeneity, suggesting the presence of mixed infections in all investigated fish. They also showed that inter-specimen genetic diversity at the genome scale was very low (>99.95% sequence identity). By enabling full genome comparisons directly from infected fish tissues, this new method will be valuable to trace outbreaks rapidly and at a reasonable cost, and in turn to understand the transmission routes of CyHV-3.

  3. Full genome sequencing and genetic characterization of Eubenangee viruses identify Pata virus as a distinct species within the genus Orbivirus.

    Directory of Open Access Journals (Sweden)

    Manjunatha N Belaganahalli

    Eubenangee virus has previously been identified as the cause of Tammar sudden death syndrome (TSDS). Eubenangee virus (EUBV), Tilligery virus (TILV), Pata virus (PATAV) and Ngoupe virus (NGOV) are currently all classified within the Eubenangee virus species of the genus Orbivirus, family Reoviridae. Full genome sequencing confirmed that EUBV and TILV (both of which are from Australia) show high levels of amino acid (aa) sequence identity (>92%) in the conserved polymerase VP1(Pol), sub-core VP3(T2) and outer core VP7(T13) proteins, and are therefore appropriately classified within the same virus species. However, they show much lower aa identity levels in their larger outer-capsid protein VP2 (<53%), consistent with membership of two different serotypes, EUBV-1 and EUBV-2 respectively. In contrast, PATAV showed significantly lower levels of aa sequence identity with either EUBV or TILV (with <71% in VP1(Pol) and VP3(T2), and <57% aa identity in VP7(T13)), consistent with membership of a distinct virus species. A proposal has therefore been sent to the Reoviridae Study Group of ICTV to recognise 'Pata virus' as a new Orbivirus species, with the PATAV isolate as serotype 1 (PATAV-1). Amongst the other orbiviruses, PATAV shows closest relationships to Epizootic Haemorrhagic Disease virus (EHDV), with 80.7%, 72.4% and 66.9% aa identity in VP3(T2), VP1(Pol) and VP7(T13), respectively. Although Ngoupe virus was not available for these studies, like PATAV it was isolated in Central Africa, and therefore seems likely to also belong to the new species, possibly as a distinct 'type'. The data presented will facilitate diagnostic assay design and the identification of additional isolates of these viruses.

  4. Genome-wide identification, characterization and phylogenetic analysis of 50 catfish ATP-binding cassette (ABC) transporter genes.

    Science.gov (United States)

    Liu, Shikai; Li, Qi; Liu, Zhanjiang

    2013-01-01

    Although a large set of full-length transcripts was recently assembled in catfish, annotation of large gene families, especially those with duplications, is still a great challenge. Most often, complexities in annotation cause mis-identification and thereby much confusion in the scientific literature. As such, detailed phylogenetic analysis and/or orthology analysis are required for annotation of genes belonging to gene families. The ATP-binding cassette (ABC) transporter gene superfamily is a large gene family that encodes membrane proteins that transport a diverse set of substrates across membranes, playing important roles in protecting organisms from diverse environments. In this work, we identified a set of 50 ABC transporters in the catfish genome. Phylogenetic analysis allowed their identification and annotation into seven subfamilies, including 9 ABCA genes, 12 ABCB genes, 12 ABCC genes, 5 ABCD genes, 2 ABCE genes, 4 ABCF genes and 6 ABCG genes. Most ABC transporters are conserved among vertebrates, though cases of recent gene duplications and gene losses do exist. Gene duplications in catfish were found for ABCA1, ABCB3, ABCB6, ABCC5, ABCD3, ABCE1, ABCF2 and ABCG2. The whole set of catfish ABC transporters provides the essential genomic resources for future biochemical, toxicological and physiological studies of ABC drug efflux transporters. The establishment of orthologies should allow functional inferences with the information from model species, though the function of lineage-specific genes can be distinct because of specific living environments with different selection pressures.

  5. Llama immunization with full-length VAR2CSA generates cross-reactive and inhibitory single-domain antibodies against the DBL1X domain.

    Science.gov (United States)

    Nunes-Silva, Sofia; Gangnard, Stéphane; Vidal, Marta; Vuchelen, Anneleen; Dechavanne, Sebastien; Chan, Sherwin; Pardon, Els; Steyaert, Jan; Ramboarina, Stephanie; Chêne, Arnaud; Gamain, Benoît

    2014-12-09

    VAR2CSA stands today as the leading vaccine candidate aiming to protect future pregnant women living in malaria-endemic areas against the severe clinical outcomes of pregnancy-associated malaria (PAM). The rational design of an efficient VAR2CSA-based vaccine relies on a profound understanding of the molecular interactions associated with P. falciparum infected erythrocyte sequestration in the placenta. Following immunization of a llama with the full-length VAR2CSA recombinant protein, we have expressed and characterized a panel of 19 nanobodies able to recognize the recombinant VAR2CSA as well as the surface of erythrocytes infected with parasites originating from different parts of the world. Domain mapping revealed that a large majority of nanobodies targeted DBL1X, whereas a few of them were directed towards DBL4ε, DBL5ε and DBL6ε. One nanobody targeting DBL1X was able to recognize the native VAR2CSA protein of the three parasite lines tested. Furthermore, four nanobodies targeting DBL1X reproducibly inhibited CSA adhesion of erythrocytes infected with the homologous NF54-CSA parasite strain, providing evidence that the DBL1X domain is part of, or close to, the CSA binding site. These nanobodies could serve as useful tools to identify conserved epitopes shared between different variants and to characterize the interactions between VAR2CSA and CSA.

  6. Fast and robust methods for full genome sequencing of Porcine Reproductive and Respiratory Syndrome Virus (PRRSV) Type 1 and Type 2

    DEFF Research Database (Denmark)

    Kvisgaard, Lise Kirstine; Hjulsager, Charlotte Kristiane; Fahnøe, Ulrik

    . In the present study, fast and robust methods for long range RT-PCR amplification and subsequent next generation sequencing (NGS) of PRRSV Type 1 and Type 2 viruses were developed and validated on nine Type 1 and nine Type 2 PRRSV viruses. The methods were shown to generate robust and reliable sequences both...... on primary material and cell culture adapted viruses and the protocols were shown to perform well on all three NGS platforms tested (Roche 454 FLX, Illumina HiSeq 2000, and Ion Torrent PGM™ Sequencer). To complete the sequences at the 5’ end, 5’ Rapid Amplification of cDNA Ends (5’ RACE) was conducted...... followed by cycle sequencing of clones. The genome lengths were determined to be 14,876-15,098 and 15,342-15,408 nucleotides long for the Type 1 and Type 2 strains, respectively. These methods will greatly facilitate the generation of more complete genome PRRSV sequences globally which in turn may lead...

  7. Characterization of apparently balanced chromosomal rearrangements from the developmental genome anatomy project.

    Science.gov (United States)

    Higgins, Anne W; Alkuraya, Fowzan S; Bosco, Amy F; Brown, Kerry K; Bruns, Gail A P; Donovan, Diana J; Eisenman, Robert; Fan, Yanli; Farra, Chantal G; Ferguson, Heather L; Gusella, James F; Harris, David J; Herrick, Steven R; Kelly, Chantal; Kim, Hyung-Goo; Kishikawa, Shotaro; Korf, Bruce R; Kulkarni, Shashikant; Lally, Eric; Leach, Natalia T; Lemyre, Emma; Lewis, Janine; Ligon, Azra H; Lu, Weining; Maas, Richard L; MacDonald, Marcy E; Moore, Steven D P; Peters, Roxanna E; Quade, Bradley J; Quintero-Rivera, Fabiola; Saadi, Irfan; Shen, Yiping; Shendure, Jay; Williamson, Robin E; Morton, Cynthia C

    2008-03-01

    Apparently balanced chromosomal rearrangements in individuals with major congenital anomalies represent natural experiments of gene disruption and dysregulation. These individuals can be studied to identify novel genes critical in human development and to annotate further the function of known genes. Identification and characterization of these genes is the goal of the Developmental Genome Anatomy Project (DGAP). DGAP is a multidisciplinary effort that leverages the recent advances resulting from the Human Genome Project to increase our understanding of birth defects and the process of human development. Clinically significant phenotypes of individuals enrolled in DGAP are varied and, in most cases, involve multiple organ systems. Study of these individuals' chromosomal rearrangements has resulted in the mapping of 77 breakpoints from 40 chromosomal rearrangements by FISH with BACs and fosmids, array CGH, Southern-blot hybridization, MLPA, RT-PCR, and suppression PCR. Eighteen chromosomal breakpoints have been cloned and sequenced. Unsuspected genomic imbalances and cryptic rearrangements were detected, but less frequently than has been reported previously. Chromosomal rearrangements, both balanced and unbalanced, in individuals with multiple congenital anomalies continue to be a valuable resource for gene discovery and annotation.

  8. Identification and characterization of insect-specific proteins by genome data analysis

    Directory of Open Access Journals (Sweden)

    Clark Terry

    2007-04-01

    Background: Insects constitute the vast majority of known species and are important for biodiversity, agriculture, and human health. It is likely that the successful adaptation of the Insecta clade depends on specific components in its proteome that give rise to specialized features. However, proteome determination is an intensive undertaking. Here we present results from a computational method that uses genome analysis to characterize insect and eukaryote proteomes as an approximation complementary to experimental approaches. Results: Homologs in common to Drosophila melanogaster, Anopheles gambiae, Bombyx mori, Tribolium castaneum, and Apis mellifera were compared to the complete genomes of three non-insect eukaryotes (the opisthokonts Homo sapiens, Caenorhabditis elegans and Saccharomyces cerevisiae). This operation identified 154 groups of orthologous proteins in Drosophila as insect-specific homologs; 466 groups were determined to be common to eukaryotes (represented by the three opisthokonts). ESTs from the hemimetabolous insect Locusta migratoria were also considered in order to approximate their corresponding genes in the insect-specific homologs. Stress and stimulus response proteins were found to constitute a higher fraction in the insect-specific homologs than in the homologs common to eukaryotes. Conclusion: The significant representation of stress response and stimulus response proteins among proteins determined to be insect-specific, along with specific cuticle and pheromone/odorant binding proteins, suggests that communication and adaptation to environments may distinguish insect evolution relative to other eukaryotes. The tendency for low Ka/Ks ratios in the insect-specific protein set suggests purifying selection pressure. The generally larger number of paralogs in the insect-specific proteins may indicate adaptation to environment changes. Instances in our insect-specific protein set have been arrived at through

  9. Identifying elemental genomic track types and representing them uniformly

    Directory of Open Access Journals (Sweden)

    Gundersen Sveinung

    2011-12-01

    Background: With the recent advances and availability of various high-throughput sequencing technologies, data on many molecular aspects, such as gene regulation, chromatin dynamics, and the three-dimensional organization of DNA, are rapidly being generated in an increasing number of laboratories. The variation in biological context, and the increasingly dispersed mode of data generation, imply a need for precise, interoperable and flexible representations of genomic features through formats that are easy to parse. A host of alternative formats are currently available and in use, complicating analysis and tool development. The issue of whether and how the multitude of formats reflects varying underlying characteristics of data has to our knowledge not previously been systematically treated. Results: We here identify intrinsic distinctions between genomic features, and argue that the distinctions imply that a certain variation in the representation of features as genomic tracks is warranted. Four core informational properties of tracks are discussed: gaps, lengths, values and interconnections. From this we delineate fifteen generic track types. Based on the track type distinctions, we characterize major existing representational formats and find that the track types are not adequately supported by any single format. We also find, in contrast to the XML formats, that none of the existing tabular formats are conveniently extendable to support all track types. We thus propose two unified formats for track data, an improved XML format, BioXSD 1.1, and a new tabular format, GTrack 1.0. Conclusions: The defined track types are shown to capture relevant distinctions between genomic annotation tracks, resulting in varying representational needs and analysis possibilities. The proposed formats, GTrack 1.0 and BioXSD 1.1, cater to the identified track distinctions and emphasize preciseness, flexibility and parsing convenience.
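
    As a small arithmetic aside on the record above: four yes/no informational properties admit 2^4 − 1 = 15 non-empty combinations, which equals the fifteen track types the abstract mentions. The sketch below merely enumerates those combinations. Treating each combination as one track type is an assumption made for illustration; the abstract does not state how the fifteen types are derived, and the names used here are hypothetical.

```python
from itertools import combinations

# The four core informational properties named in the abstract.
PROPERTIES = ("gaps", "lengths", "values", "interconnections")


def nonempty_property_subsets(props=PROPERTIES):
    """List every non-empty combination of the given properties.
    Assumption for illustration: one combination per candidate track type."""
    return [
        combo
        for size in range(1, len(props) + 1)
        for combo in combinations(props, size)
    ]


subsets = nonempty_property_subsets()
print(len(subsets))          # number of non-empty combinations
for combo in subsets[:4]:    # the four single-property combinations
    print(combo)
```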

  10. Synonymous Codon Usage Analysis of Thirty Two Mycobacteriophage Genomes

    Directory of Open Access Journals (Sweden)

    Sameer Hassan

    2009-01-01

    Synonymous codon usage of protein-coding genes of thirty-two completely sequenced mycobacteriophage genomes was studied using multivariate statistical analysis. One of the major factors influencing codon usage is identified to be compositional bias. Codons ending with either C or G are preferred in highly expressed genes, among which C-ending codons are highly preferred over G-ending codons. A strong negative correlation between the effective number of codons (Nc) and GC3s content was also observed, showing that codon usage was affected by gene nucleotide composition. Translational selection is also identified to play a role in shaping codon usage, operating at the level of translational accuracy. A high level of heterogeneity is seen among and between the genomes. Gene length is also identified to influence codon usage in 11 out of 32 phage genomes. Mycobacteriophage Cooper is identified to be the most highly biased genome, with better translation efficiency, comparing well with the host-specific tRNA genes.
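
    The GC3s statistic mentioned in the record above is commonly defined as the G+C fraction at the third position of synonymously variable codons, that is, excluding Met, Trp and stop codons. The sketch below follows that common definition under the standard genetic code; it is an illustration, not the software used in the study, and the function name is hypothetical.

```python
# Standard genetic code, bases ordered T, C, A, G at each codon position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def gc3s(cds: str) -> float:
    """G+C fraction at third codon positions, counting only codons whose
    amino acid has synonymous alternatives (Met, Trp and stops skipped).
    Returns NaN when no such codon is present."""
    cds = cds.upper()
    counted = gc = 0
    for i in range(0, len(cds) - len(cds) % 3, 3):
        codon = cds[i:i + 3]
        amino = CODON_TABLE.get(codon)   # None for codons with ambiguous bases
        if amino is None or amino in "MW*":
            continue
        counted += 1
        gc += codon[2] in "GC"
    return gc / counted if counted else float("nan")


# ATG (Met), TGG (Trp) and TAA (stop) are skipped; GCC counts as GC, GCT does not.
print(gc3s("ATGGCCGCTTGGTAA"))
```

    Computing this value per gene and correlating it with Nc across a genome would reproduce the kind of comparison the abstract describes; Nc itself needs a longer calculation and is not sketched here.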

  11. Repetitive DNA in the pea (Pisum sativum L.) genome: comprehensive characterization using 454 sequencing and comparison to soybean and Medicago truncatula

    Directory of Open Access Journals (Sweden)

    Navrátilová Alice

    2007-11-01

    Background: Extraordinary size variation of higher plant nuclear genomes is in large part caused by differences in accumulation of repetitive DNA. This makes repetitive DNA of great interest for studying the molecular mechanisms shaping architecture and function of complex plant genomes. However, due to methodological constraints of conventional cloning and sequencing, a global description of repeat composition is available for only a very limited number of higher plants. In order to provide further data required for investigating evolutionary patterns of repeated DNA within and between species, we used a novel approach based on massive parallel sequencing which allowed a comprehensive repeat characterization in our model species, garden pea (Pisum sativum). Results: Analysis of 33.3 Mb sequence data resulted in quantification and partial sequence reconstruction of major repeat families occurring in the pea genome with at least thousands of copies. Our results showed that the pea genome is dominated by LTR-retrotransposons, estimated at 140,000 copies/1C. Ty3/gypsy elements are less diverse and accumulated to higher copy numbers than Ty1/copia. This is in part due to a large population of Ogre-like retrotransposons which alone make up over 20% of the genome. In addition to numerous types of mobile elements, we have discovered a set of novel satellite repeats and two additional variants of telomeric sequences. Comparative genome analysis revealed that there are only a few repeat sequences conserved between pea and soybean genomes. On the other hand, all major families of pea mobile elements are well represented in M. truncatula. Conclusion: We have demonstrated that even in a species with a relatively large genome like pea, where a single 454-sequencing run provided only 0.77% coverage, the generated sequences were sufficient to reconstruct and analyze major repeat families corresponding to a total of 35–48% of the genome. These data
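
    The coverage figures in the record above can be cross-checked with simple arithmetic: if 33.3 Mb of sequence corresponds to 0.77% coverage, the implied genome size is the ratio of the two. The sketch below works this through, assuming that coverage means total sequenced bases divided by haploid genome size; the function and variable names are hypothetical.

```python
def implied_genome_size_mb(sequenced_mb: float, coverage_fraction: float) -> float:
    """Genome size implied by a sequencing yield and the coverage it gave.
    Assumes coverage = sequenced bases / genome size."""
    if coverage_fraction <= 0:
        raise ValueError("coverage_fraction must be positive")
    return sequenced_mb / coverage_fraction


genome_mb = implied_genome_size_mb(33.3, 0.0077)   # roughly 4.3 Gb
repeat_low_mb = 0.35 * genome_mb                   # lower bound of the 35-48% range
repeat_high_mb = 0.48 * genome_mb                  # upper bound of the 35-48% range
print(round(genome_mb), round(repeat_low_mb), round(repeat_high_mb))
```

    Under this assumption, the reconstructed repeat families would account for roughly 1.5 to 2.1 Gb of sequence.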

  12. Three-dimensional Structure of a Viral Genome-delivery Portal Vertex

    Energy Technology Data Exchange (ETDEWEB)

    A Olia; P Prevelige Jr.; J Johnson; G Cingolani

    2011-12-31

    DNA viruses such as bacteriophages and herpesviruses deliver their genome into and out of the capsid through large proteinaceous assemblies, known as portal proteins. Here, we report two snapshots of the dodecameric portal protein of bacteriophage P22. The 3.25-Å-resolution structure of the portal-protein core bound to 12 copies of gene product 4 (gp4) reveals a ~1.1-MDa assembly formed by 24 proteins. Unexpectedly, a lower-resolution structure of the full-length portal protein unveils the unique topology of the C-terminal domain, which forms a ~200-Å-long α-helical barrel. This domain inserts deeply into the virion and is highly conserved in the Podoviridae family. We propose that the barrel domain facilitates genome spooling onto the interior surface of the capsid during genome packaging and, in analogy to a rifle barrel, increases the accuracy of genome ejection into the host cell.

  13. Full length articles published in BJOMS during 2010-11--an analysis by sub-specialty and study type.

    Science.gov (United States)

    Arakeri, Gururaj; Colbert, Serryth; Rosenbaum, Gavin; Brennan, Peter A

    2012-12-01

    Full length articles such as prospective and retrospective studies, case series, laboratory-based research and reviews form the majority of papers published in the British Journal of Oral and Maxillofacial Surgery (BJOMS). We were interested to evaluate the breakdown of these types of articles both by sub-specialty and the type of study as well as the proportion that are written by UK colleagues compared to overseas authors over a 2 year period (2010-11). A total of 191 full length articles across all sub-specialties of our discipline were published, with 107 papers (56%) coming from UK authors. There were proportionately more oncology papers arising from the UK than overseas (60 and 30% of total respectively) while the opposite was found for cleft/deformity studies (10% and 22%). There was only one laboratory-based study published from the UK compared with 27 papers from overseas. The number of quality papers being submitted to the Journal continues to increase, and the type of article being published between UK and overseas probably reflects different practices and case-loads amongst colleagues. The relatively few UK laboratory based studies published in BJOMS compared to overseas authors are most likely due to authors seeking the most prestigious journals possible for their work. Copyright © 2012 The British Association of Oral and Maxillofacial Surgeons. Published by Elsevier Ltd. All rights reserved.

  14. The Organelle Genomes of Hassawi Rice (Oryza sativa L.) and Its Hybrid in Saudi Arabia: Genome Variation, Rearrangement, and Origins

    Science.gov (United States)

    Zhang, Tongwu; Hu, Songnian; Zhang, Guangyu; Pan, Linlin; Zhang, Xiaowei; Al-Mssallem, Ibrahim S.; Yu, Jun

    2012-01-01

    Hassawi rice (Oryza sativa L.) is a landrace adapted to the climate of Saudi Arabia, characterized by its strong resistance to soil salinity and drought. Using high quality sequencing reads extracted from raw data of a whole genome sequencing project, we assembled both chloroplast (cp) and mitochondrial (mt) genomes of the wild-type Hassawi rice (Hassawi-1) and its dwarf hybrid (Hassawi-2). We discovered 16 InDels (insertions and deletions) but no SNPs (single nucleotide polymorphisms) between the two Hassawi cp genomes. We identified 48 InDels and 26 SNPs in the two Hassawi mt genomes and a new type of sequence variation, termed reverse complementary variation (RCV), in the rice cp genomes. Two and four RCVs were identified in Hassawi-1 when compared to 93–11 (indica) and Nipponbare (japonica), respectively. Microsatellite sequence analysis showed that there are more SSRs in the genic regions of both cp and mt genomes in the Hassawi rice than in the other rice varieties. There are also large repeats in the Hassawi mt genomes, the longest being 96,168 bp and 96,165 bp in Hassawi-1 and Hassawi-2, respectively. We believe that frequent DNA rearrangement in the Hassawi mt and cp genomes indicates ongoing dynamic processes to reach genetic stability under strong environmental pressures. Based on sequence variation analysis and the breeding history, we suggest that both Hassawi-1 and Hassawi-2 originated from the Indonesian variety Peta, since genetic diversity between the two Hassawi cultivars is very low, although the historic origin of the wild-type Hassawi rice is unknown. PMID:22870184

  15. Insights from the complete chloroplast genome into the evolution of Sesamum indicum L.

    Directory of Open Access Journals (Sweden)

    Haiyang Zhang

    Sesame (Sesamum indicum L.) is one of the oldest oilseed crops. In order to investigate the evolutionary characters according to the Sesame Genome Project, apart from sequencing its nuclear genome, we sequenced the complete chloroplast genome of S. indicum cv. Yuzhi 11 (white seeded) using Illumina and 454 sequencing. The chloroplast genomes of S. indicum and 18 other higher plants were then compared. The chloroplast genome of cv. Yuzhi 11 contains 153,338 bp and a total of 114 unique genes (KC569603). The number of chloroplast genes in sesame is the same as that in Nicotiana tabacum, Vitis vinifera and Platanus occidentalis. The variation in the length of the large single-copy (LSC) regions and inverted repeats (IR) in sesame compared to 18 other higher plant species was the main contributor to size variation in the cp genome in these species. The 77 functional chloroplast genes, except for ycf1 and ycf2, were highly conserved. The deletion of the cp ycf1 gene sequence in cp genomes may be due either to its transfer to the nuclear genome, as has occurred in sesame, or direct deletion, as has occurred in Panax ginseng and Cucumis sativus. The sesame ycf2 gene is only 5,721 bp in length and has lost about 1,179 bp. Nucleotides 1-585 of ycf2 when queried in BLAST had hits in the sesame draft genome. Five repeats (R10, R12, R13, R14 and R17) were unique to the sesame chloroplast genome. We also found that IR contraction/expansion in the cp genome alters its rate of evolution. Chloroplast genes and repeats display the signature of convergent evolution in sesame and other species. These findings provide a foundation for further investigation of cp genome evolution in Sesamum and other higher plants.

  16. Transcriptome mining, functional characterization, and phylogeny of a large terpene synthase gene family in spruce (Picea spp.)

    Directory of Open Access Journals (Sweden)

    Dullat Harpreet K

    2011-03-01

    Background: In conifers, terpene synthases (TPSs) of the gymnosperm-specific TPS-d subfamily form a diverse array of mono-, sesqui-, and diterpenoid compounds, which are components of the oleoresin secretions and volatile emissions. These compounds contribute to defence against herbivores and pathogens and perhaps also protect against abiotic stress. Results: The availability of extensive transcriptome resources in the form of expressed sequence tags (ESTs) and full-length cDNAs in several spruce (Picea) species allowed us to estimate that a conifer genome contains at least 69 unique and transcriptionally active TPS genes. This number is comparable to the number of TPSs found in any of the sequenced and well-annotated angiosperm genomes. We functionally characterized a total of 21 spruce TPSs: 12 from Sitka spruce (P. sitchensis), 5 from white spruce (P. glauca), and 4 from hybrid white spruce (P. glauca × P. engelmannii), which included 15 monoterpene synthases, 4 sesquiterpene synthases, and 2 diterpene synthases. Conclusions: The functional diversity of these characterized TPSs parallels the diversity of terpenoids found in the oleoresin and volatile emissions of Sitka spruce and provides a context for understanding this chemical diversity at the molecular and mechanistic levels. The comparative characterization of Sitka spruce and Norway spruce diterpene synthases revealed the natural occurrence of TPS sequence variants between closely related spruce species, confirming a previous prediction from site-directed mutagenesis and modelling.

  17. Context based computational analysis and characterization of ARS consensus sequences (ACS) of Saccharomyces cerevisiae genome

    Directory of Open Access Journals (Sweden)

    Vinod Kumar Singh

    2016-09-01

    Genome-wide experimental studies in Saccharomyces cerevisiae reveal that an autonomous replicating sequence (ARS) requires an essential consensus sequence (ACS) for replication activity. Computational studies identified thousands of ACS-like patterns in the genome. However, only a few hundred of these sites act as replicating sites and the rest are considered dormant or evolving sites. In a bid to understand the sequence makeup of replication sites, a content and context-based analysis was performed on a set of replicating ACS sequences that bind to the origin-recognition complex (ORC), denoted ORC-ACS, and non-replicating ACS sequences (nrACS) that are not bound by ORC. In this study, DNA properties such as base composition, correlation, sequence-dependent thermodynamic and DNA structural profiles, and their positions have been considered for characterizing ORC-ACS and nrACS. Analysis reveals that ORC-ACS show marked differences in nucleotide composition and context features in their vicinity compared to nrACS. Interestingly, an A-rich motif was also discovered in ORC-ACS sequences within the nucleosome-free region. Profound differences in conformational features, such as DNA helical twist, inclination angle and stacking energy, were observed between ORC-ACS and nrACS. The distribution of ACS motifs in the non-coding segments shows that ORC-ACS are located farther from the adjacent gene start position than nrACS, thereby enabling an accessible environment for ORC proteins. Our attempt is novel in considering the contextual view of ACS and its flanking region along with nucleosome positioning in the S. cerevisiae genome and may be useful for any computational prediction scheme.

  18. What can we learn about lyssavirus genomes using 454 sequencing?

    Science.gov (United States)

    Höper, Dirk; Finke, Stefan; Freuling, Conrad M; Hoffmann, Bernd; Beer, Martin

    2012-01-01

    The main task of individual project number four, "Whole genome sequencing, virus-host adaptation, and molecular epidemiological analyses of lyssaviruses", within the network "Lyssaviruses--a potential re-emerging public health threat" is to provide high quality complete genome sequences from lyssaviruses. These sequences are analysed in depth with regard to the diversity of the viral populations, covering both quasi-species and so-called defective interfering RNAs. Moreover, the sequence data will facilitate further epidemiological analyses, will provide insight into the evolution of lyssaviruses and will be the basis for the design of novel nucleic acid based diagnostics. The first results presented here indicate not only that high quality full-length lyssavirus genome sequences can be generated, but also that efficient analysis of the viral population becomes feasible.

  19. The rearranged mitochondrial genome of Leptopilina boulardi (Hymenoptera: Figitidae), a parasitoid wasp of Drosophila

    Directory of Open Access Journals (Sweden)

    Daniel S. Oliveira

    The partial mitochondrial genome sequence of Leptopilina boulardi (Hymenoptera: Figitidae) was characterized. Illumina sequencing yielded 35,999,679 reads, of which 102,482 were used in the assembly. The length of the sequenced region of this partial mitochondrial genome is 15,417 bp, consisting of 13 protein-coding, two rRNA, and 21 tRNA genes (the trnaM failed to be sequenced) and a partial A+T-rich region. All protein-coding genes start with ATN codons. Eleven protein-coding genes have TAA stop codons, whereas ND6 and COII end with TA and T nucleotides, respectively. The gene pattern revealed extensive rearrangements compared to the typical pattern generally observed in insects. These rearrangements involve two protein-coding and two ribosomal genes, along with 16 tRNA genes. This gene order is different from the pattern described for Ibalia leucospoides (Ibaliidae, Cynipoidea), suggesting that this particular gene order can be variable among Cynipoidea superfamily members. A maximum likelihood phylogenetic analysis of the main groups of Apocrita was performed using the amino acid sequences of 13 protein-coding genes, showing monophyly for the Cynipoidea superfamily within the Hymenoptera phylogeny.

  20. Analyses of charophyte chloroplast genomes help characterize the ancestral chloroplast genome of land plants.

    Science.gov (United States)

    Civaň, Peter; Foster, Peter G; Embley, Martin T; Séneca, Ana; Cox, Cymon J

    2014-04-01

    Despite the significance of the relationships between embryophytes and their charophyte algal ancestors in deciphering the origin and evolutionary success of land plants, few chloroplast genomes of the charophyte algae have been reconstructed to date. Here, we present new data for three chloroplast genomes of the freshwater charophytes Klebsormidium flaccidum (Klebsormidiophyceae), Mesotaenium endlicherianum (Zygnematophyceae), and Roya anglica (Zygnematophyceae). The chloroplast genome of Klebsormidium has a quadripartite organization with exceptionally large inverted repeat (IR) regions and, uniquely among streptophytes, has lost the rrn5 and rrn4.5 genes from the ribosomal RNA (rRNA) gene cluster operon. The chloroplast genome of Roya differs from other zygnematophycean chloroplasts, including the newly sequenced Mesotaenium, by having a quadripartite structure that is typical of other streptophytes. On the basis of the improbability of the novel gain of IR regions, we infer that the quadripartite structure has likely been lost independently in at least three zygnematophycean lineages, although the absence of the usual rRNA operonic synteny in the IR regions of Roya may indicate their de novo origin. Significantly, all zygnematophycean chloroplast genomes have undergone substantial genomic rearrangement, which may be the result of ancient retroelement activity evidenced by the presence of integrase-like and reverse transcriptase-like elements in the Roya chloroplast genome. Our results corroborate the close phylogenetic relationship between Zygnematophyceae and land plants and identify 89 protein-coding genes and 22 introns present in the chloroplast genome at the time of the evolutionary transition of plants to land, all of which can be found in the chloroplast genomes of extant charophytes.

  1. Patient-Specific Instruments Based on Knee Joint Computed Tomography and Full-Length Lower Extremity Radiography in Total Knee Replacement

    Directory of Open Access Journals (Sweden)

    Hua Tian

    2018-01-01

    Conclusions: The use of PSIs based on knee joint CT and standing full-length lower extremity radiography in TKR resulted in acceptable alignment compared with the use of conventional instruments, although the marginal advantage was not statistically significant. Surgical time and clinical results were also similar between the two groups. However, the PSI group had less postoperative drainage.

  2. Genome-Wide Characterization of Simple Sequence Repeat (SSR) Loci in Chinese Jujube and Jujube SSR Primer Transferability

    Science.gov (United States)

    Xiao, Jing; Zhao, Jin; Liu, Mengjun; Liu, Ping; Dai, Li; Zhao, Zhihui

    2015-01-01

    Chinese jujube (Ziziphus jujuba), an economically important species in the Rhamnaceae family, is a popular fruit tree in Asia. Here, we surveyed and characterized simple sequence repeats (SSRs) in the jujube genome. A total of 436,676 SSR loci were identified, with an average distance of 0.93 Kb between the loci. A large proportion of the SSRs included mononucleotide, dinucleotide and trinucleotide repeat motifs, which accounted for 64.87%, 24.40%, and 8.74% of all repeats, respectively. Among the mononucleotide repeats, A/T was the most common, whereas AT/TA was the most common dinucleotide repeat. A total of 30,565 primer pairs were successfully designed and screened using a series of criteria. Moreover, 725 of 1,000 randomly selected primer pairs were effective among 6 cultivars, and 511 of these primer pairs were polymorphic. Sequencing the amplicons of two SSRs across three jujube cultivars revealed variations in the repeats. Transferability tests of the jujube SSR primers showed that 35 of 64 SSRs could be transferred across family boundaries. Using jujube SSR primers, clustering analysis results from 15 species were highly consistent with the Angiosperm Phylogeny Group (APGIII) System. The genome-wide characterization of SSRs in Chinese jujube is very valuable for whole-genome characterization and marker-assisted selection in jujube breeding. In addition, the transferability of jujube SSR primers could provide a solid foundation for their further utilization. PMID:26000739
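    The SSR survey described in this record can be illustrated with a minimal Python sketch that scans a sequence for perfect mono-, di- and trinucleotide repeats. The function name find_ssrs and the minimum repeat counts are assumptions chosen for illustration; the abstract does not state which tool or thresholds the authors used.

```python
import re

# Hypothetical minimum repeat counts per motif length (illustrative only;
# the abstract does not give the thresholds actually applied).
MIN_REPEATS = {1: 10, 2: 6, 3: 5}


def find_ssrs(sequence, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs."""
    sequence = sequence.upper()
    hits = []
    for motif_len, min_count in sorted(min_repeats.items()):
        # ([ACGT]{k}) captures one motif; \1{n,} demands n further tandem copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_count - 1))
        for match in pattern.finditer(sequence):
            motif = match.group(1)
            # Skip motifs that are just a single base repeated (e.g. "AA", "AAA");
            # those runs are already reported as mononucleotide repeats.
            if motif_len > 1 and len(set(motif)) == 1:
                continue
            hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return sorted(hits)


if __name__ == "__main__":
    demo = "GC" + "A" * 12 + "CG" + "AT" * 7 + "GGC" + "CAG" * 5 + "T"
    for start, motif, count in find_ssrs(demo):
        print(start, motif, count)
```

    Real SSR surveys usually also handle longer motifs, compound repeats and reverse-complement motif classes (for example counting A and T runs together, as the A/T category above suggests); this sketch leaves those out.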

  3. Tracing Monotreme Venom Evolution in the Genomics Era

    Directory of Open Access Journals (Sweden)

    Camilla M. Whittington

    2014-04-01

    Full Text Available The monotremes (platypuses and echidnas) represent one of only four extant venomous mammalian lineages. Until recently, monotreme venom was poorly understood. However, the availability of the platypus genome and increasingly sophisticated genomic tools has allowed us to characterize platypus toxins, and provides a means of reconstructing the evolutionary history of monotreme venom. Here we review the physiology of platypus and echidna crural (venom) systems as well as pharmacological and genomic studies of monotreme toxins. Further, we synthesize current ideas about the evolution of the venom system, which in the platypus is likely to have been retained from a venomous ancestor, whilst being lost in the echidnas. We also outline several research directions and outstanding questions that would be productive to address in future research. An improved characterization of mammalian venoms will not only yield new toxins with potential therapeutic uses, but will also aid in our understanding of the way that this unusual trait evolves.

  4. Genomic organization, sequence divergence, and recombination of feline immunodeficiency virus from lions in the wild

    Directory of Open Access Journals (Sweden)

    Sondgeroth Kerry

    2008-02-01

    Full Text Available Background: Feline immunodeficiency virus (FIV) naturally infects multiple species of cat and is related to human immunodeficiency virus in humans. FIV infection causes AIDS-like disease and mortality in the domestic cat (Felis catus) and serves as a natural model for HIV infection in humans. In African lions (Panthera leo) and other exotic felid species, disease etiology introduced by FIV infection is less clear, but recent studies indicate that FIV causes moderate to severe CD4 depletion. Results: In this study, comparative genomic methods are used to evaluate the full proviral genome of two geographically distinct FIV subtypes isolated from free-ranging lions. Genome organization of FIVPle subtype B (9891 bp) from lions in the Serengeti National Park in Tanzania and FIVPle subtype E (9899 bp) isolated from lions in the Okavango Delta in Botswana both resemble FIV genome sequences from puma, Pallas cat and domestic cat across 5' LTR, gag, pol, vif, orfA, env, rev and 3' LTR regions. Comparative analyses of available full-length FIV consisting of subtypes A, B and C from FIVFca, Pallas cat FIVOma and two puma FIVPco subtypes A and B recapitulate the species-specific monophyly of FIV marked by high levels of genetic diversity both within and between species. Across all FIVPle gene regions except env, lion subtypes B and E are monophyletic, and marginally more similar to Pallas cat FIVOma than to other FIV. Sequence analyses indicate the SU and TM regions of env vary substantially between subtypes, with FIVPle subtype E more related to domestic cat FIVFca than to FIVPle subtype B and FIVOma, likely reflecting recombination between strains in the wild. Conclusion: This study demonstrates the necessity of whole-genome analysis to complement population/gene-based studies, which are of limited utility in uncovering complex events such as recombination that may lead to functional differences in virulence and pathogenicity. These full-length lion

  5. The Past, Present, and Future of Human Centromere Genomics

    Directory of Open Access Journals (Sweden)

    Megan E. Aldrup-MacDonald

    2014-01-01

    Full Text Available The centromere is the chromosomal locus essential for chromosome inheritance and genome stability. Human centromeres are located at repetitive alpha satellite DNA arrays that compose approximately 5% of the genome. Contiguous alpha satellite DNA sequence is absent from the assembled reference genome, limiting current understanding of centromere organization and function. Here, we review the progress in centromere genomics spanning the discovery of the sequence to its molecular characterization and the work done during the Human Genome Project era to elucidate alpha satellite structure and sequence variation. We discuss exciting recent advances in alpha satellite sequence assembly that have provided important insight into the abundance and complex organization of this sequence on human chromosomes. In light of these new findings, we offer perspectives for future studies of human centromere assembly and function.

  6. Targeted identification of short interspersed nuclear element families shows their widespread existence and extreme heterogeneity in plant genomes.

    Science.gov (United States)

    Wenke, Torsten; Döbel, Thomas; Sörensen, Thomas Rosleff; Junghans, Holger; Weisshaar, Bernd; Schmidt, Thomas

    2011-09-01

    Short interspersed nuclear elements (SINEs) are non-long terminal repeat retrotransposons that are highly abundant, heterogeneous, and mostly not annotated in eukaryotic genomes. We developed a tool designated SINE-Finder for the targeted discovery of tRNA-derived SINEs. We analyzed sequence data of 16 plant genomes, including 13 angiosperms and three gymnosperms and identified 17,829 full-length and truncated SINEs falling into 31 families showing the widespread occurrence of SINEs in higher plants. The investigation focused on potato (Solanum tuberosum), resulting in the detection of seven different SolS SINE families consisting of 1489 full-length and 870 5' truncated copies. Consensus sequences of full-length members range in size from 106 to 244 bp depending on the SINE family. SolS SINEs populated related species and evolved separately, which led to some distinct subfamilies. Solanaceae SINEs are dispersed along chromosomes and distributed without clustering but with preferred integration into short A-rich motifs. They emerged more than 23 million years ago and were species specifically amplified during the radiation of potato, tomato (Solanum lycopersicum), and tobacco (Nicotiana tabacum). We show that tobacco TS retrotransposons are composite SINEs consisting of the 3' end of a long interspersed nuclear element integrated downstream of a nonhomologous SINE family followed by successful colonization of the genome. We propose an evolutionary scenario for the formation of TS as a spontaneous event, which could be typical for the emergence of SINE families.

  7. Unleashing the genome of Brassica rapa

    Directory of Open Access Journals (Sweden)

    Haibao Tang

    2012-07-01

    Full Text Available The completion and release of the Brassica rapa genome is of great benefit to researchers of the Brassicas, Arabidopsis, and genome evolution. While its lineage is closely related to the model organism Arabidopsis thaliana, the Brassicas experienced a whole genome triplication subsequent to their divergence. This event contemporaneously created three copies of its ancestral genome, which had diploidized through the process of homeologous gene loss known as fractionation. By the fractionation of homeologous gene content and genetic regulatory binding sites, Brassica’s genome is well placed to use comparative genomic techniques to identify syntenic regions, homeologous gene duplications, and putative regulatory sequences. Here, we use the comparative genomics platform CoGe to perform several different genomic analyses with which to study structural changes of its genome and dynamics of various genetic elements. Starting with whole genome comparisons, the Brassica paleohexaploidy is characterized, syntenic regions with Arabidopsis thaliana are identified, and the TOC1 gene in the circadian rhythm pathway from Arabidopsis thaliana is used to find duplicated orthologs in Brassica rapa. These TOC1 genes are further analyzed to identify conserved noncoding sequences that contain cis-acting regulatory elements and promoter sequences previously implicated in circadian rhythmicity. Each 'cookbook style' analysis includes a step-by-step walkthrough with links to CoGe to quickly reproduce each step of the analytical process.

  8. Full genome sequences are key to disclose RHDV2 emergence in the Macaronesian islands.

    Science.gov (United States)

    Lopes, Ana M; Blanco-Aguiar, Jose; Martín-Alonso, Aaron; Leitão, Manuel; Foronda, Pilar; Mendes, Marco; Gonçalves, David; Abrantes, Joana; Esteves, Pedro J

    2018-02-01

    A recent publication by Carvalho et al. in "Virus Genes" (June 2017) reported the presence of the new variant of rabbit hemorrhagic disease virus (RHDV2) in the two larger islands of the archipelago of Madeira. Based on the capsid protein sequence, the authors suggested that the high sequence identity, along with the short time spanning between outbreaks, points to dissemination from Porto Santo to Madeira. By including information of the full RHDV2 genome of strains from Azores, Madeira, and the Canary Islands, we confirm the results obtained by Carvalho et al., but further show that several subtypes of RHDV2 circulate in these islands: non-recombinant RHDV2 in the Canary Islands, G1/RHDV2 in Azores, Porto Santo and Madeira, and NP/RHDV2 also in Madeira. Here we conclude that RHDV2 has been independently introduced in these archipelagos, and that in Madeira at least two independent introductions must have occurred. We provide additional information on the dynamics of RHDV2 in the Macaronesian archipelagos of Azores, Madeira, and the Canary Islands and highlight the importance of analyzing RHDV2 complete genome.

  9. Assembly of viral genomes from metagenomes

    Directory of Open Access Journals (Sweden)

    Saskia L Smits

    2014-12-01

    Full Text Available Viral infections remain a serious global health issue. Metagenomic approaches are increasingly used in the detection of novel viral pathogens but also to generate complete genomes of uncultivated viruses. In silico identification of complete viral genomes from sequence data would allow rapid phylogenetic characterization of these new viruses. Often, however, complete viral genomes are not recovered, but rather several distinct contigs derived from a single entity, some of which have no sequence homology to any known proteins. De novo assembly of single viruses from a metagenome is challenging, not only because of the lack of a reference genome, but also because of intrapopulation variation and uneven or insufficient coverage. Here we explored different assembly algorithms, remote homology searches, genome-specific sequence motifs, k-mer frequency ranking, and coverage profile binning to detect and obtain viral target genomes from metagenomes. All methods were tested on 454-generated sequencing datasets containing three recently described RNA viruses with a relatively large genome which were divergent to previously known viruses from the viral families Rhabdoviridae and Coronaviridae. Depending on specific characteristics of the target virus and the metagenomic community, different assembly and in silico gap closure strategies were successful in obtaining near complete viral genomes.
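    One of the approaches named in this record, k-mer frequency ranking, can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names, the choice of Euclidean distance and the idea of ranking contigs against a seed contig are hypothetical, since the abstract does not describe how the authors implemented the ranking.

```python
import math
from collections import Counter
from itertools import product


def kmer_profile(seq, k=4):
    """Relative frequency of every A/C/G/T k-mer, in a fixed order."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    # Windows containing other symbols (e.g. N) are ignored in the total.
    total = sum(counts[m] for m in kmers) or 1
    return [counts[m] / total for m in kmers]


def rank_contigs_by_kmer_similarity(seed, contigs, k=4):
    """Return contig names ordered from most to least similar to the seed.

    `contigs` maps a contig name to its sequence; similarity is the
    Euclidean distance between k-mer frequency profiles (an assumption).
    """
    seed_profile = kmer_profile(seed, k)

    def distance(name):
        return math.dist(seed_profile, kmer_profile(contigs[name], k))

    return sorted(contigs, key=distance)


if __name__ == "__main__":
    contigs = {"different": "GGGCCC" * 10, "similar": "ACGT" * 15}
    print(rank_contigs_by_kmer_similarity("ACGT" * 20, contigs, k=2))
```

    Contigs that rank close to a known viral seed would then be candidates for belonging to the same genome, to be confirmed by other evidence such as coverage.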

  10. Effect of genomic long-range correlations on DNA persistence length: from theory to single molecule experiments.

    Science.gov (United States)

    Moukhtar, Julien; Faivre-Moskalenko, Cendrine; Milani, Pascale; Audit, Benjamin; Vaillant, Cedric; Fontaine, Emeline; Mongelard, Fabien; Lavorel, Guillaume; St-Jean, Philippe; Bouvet, Philippe; Argoul, Françoise; Arneodo, Alain

    2010-04-22

    Sequence dependency of DNA intrinsic bending properties has been emphasized as a possible key ingredient to in vivo chromatin organization. We use atomic force microscopy (AFM) in air and liquid to image intrinsically straight (synthetic), uncorrelated (hepatitis C RNA virus) and persistent long-range correlated (human) DNA fragments in various ionic conditions such that the molecules freely equilibrate on the mica surface before being captured in a particular conformation. 2D thermodynamic equilibrium is experimentally verified by a detailed statistical analysis of the Gaussian nature of the DNA bend angle fluctuations. We show that the worm-like chain (WLC) model, commonly used to describe the average conformation of long semiflexible polymers, reproduces remarkably well the persistence length estimates for the first two molecules as consistently obtained from (i) mean square end-to-end distance measurement and (ii) mean projection of the end-to-end vector on the initial orientation. Whatever the operating conditions (air or liquid, concentration of metal cations Mg(2+) and/or Ni(2+)), the persistence length found for the uncorrelated viral DNA underestimates the value obtained for the straight DNA. We show that this systematic difference is the signature of the presence of an uncorrelated structural intrinsic disorder in the hepatitis C virus (HCV) DNA fragment that superimposes on local curvatures induced by thermal fluctuations and that only the entropic disorder depends upon experimental conditions. In contrast, the WLC model fails to describe the human DNA conformations. We use a mean-field extension of the WLC model to account for the presence of long-range correlations (LRC) in the intrinsic curvature disorder of human genomic DNA: the stronger the LRC, the smaller the persistence length. The comparison of AFM imaging of human DNA with LRC DNA simulations confirms that the rather small mean square end-to-end distance observed, particularly for G
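    For orientation, the mean square end-to-end distance that the worm-like chain model predicts for a chain equilibrated in two dimensions can be computed with the short Python sketch below. It uses the standard textbook 2D expression, <R^2> = 4PL[1 - (2P/L)(1 - exp(-L/2P))]; the abstract does not give the exact formula or fitting procedure the authors used, and the function name and the example lengths are assumptions.

```python
import math


def wlc_mean_square_end_to_end_2d(contour_length, persistence_length):
    """<R^2> of a worm-like chain equilibrated on a 2D surface.

    Both lengths must be in the same unit; the result is in that unit squared.
    """
    x = contour_length / (2.0 * persistence_length)
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for small x.
    return 4.0 * persistence_length * contour_length * (1.0 + math.expm1(-x) / x)


if __name__ == "__main__":
    # Hypothetical values: a 700 nm fragment with a 50 nm persistence length.
    print(wlc_mean_square_end_to_end_2d(700.0, 50.0))
```

    The expression tends to L^2 for a chain much shorter than its persistence length (a rigid rod) and to 4PL for a much longer one (a 2D random coil), which is why a smaller measured <R^2> at fixed contour length translates into a smaller apparent persistence length.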

  11. Characterization of the complete mitochondrial genomes of Nematodirus oiratianus and Nematodirus spathiger of small ruminants.

    Science.gov (United States)

    Zhao, Guang-Hui; Jia, Yan-Qing; Cheng, Wen-Yu; Zhao, Wen; Bian, Qing-Qing; Liu, Guo-Hua

    2014-07-11

    Nematodirus spp. are among the most common nematodes of ruminants worldwide. N. oiratianus and N. spathiger are distributed worldwide as highly prevalent gastrointestinal nematodes, which cause emerging health problems and economic losses. Accurate identification of Nematodirus species is essential to develop effective control strategies for Nematodirus infection in ruminants. Mitochondrial DNA (mtDNA) could provide powerful genetic markers for identifying these closely related species and resolving phylogenetic relationships at different taxonomic levels. In the present study, the complete mitochondrial (mt) genomes of N. oiratianus and N. spathiger from small ruminants in China were obtained using Long-range PCR and sequencing. The complete mt genomes of N. oiratianus and N. spathiger were 13,765 bp and 13,519 bp in length, respectively. Both mt genomes were circular and consisted of 36 genes, including 12 genes encoding proteins, 2 genes encoding rRNA, and 22 genes encoding tRNA. Phylogenetic analyses based on the concatenated amino acid sequence data of all 12 protein-coding genes by Bayesian inference (BI), Maximum likelihood (ML) and Maximum parsimony (MP) showed that the two Nematodirus species (Molineidae) were closely related to Dictyocaulidae. The availability of the complete mtDNA sequences of N. oiratianus and N. spathiger not only provides new mtDNA sources for a better understanding of nematode mt genomics and phylogeny, but also provides novel and useful genetic markers for studying diagnosis, population genetics and molecular epidemiology of Nematodirus spp. in small ruminants.

  12. The complete chloroplast genome sequence of Hibiscus syriacus.

    Science.gov (United States)

    Kwon, Hae-Yun; Kim, Joon-Hyeok; Kim, Sea-Hyun; Park, Ji-Min; Lee, Hyoshin

    2016-09-01

    The complete chloroplast genome sequence of Hibiscus syriacus L. is presented in this study. The genome is composed of 161 019 bp in length, with a typical circular structure containing a pair of inverted repeats of 25 745 bp of length separated by a large single-copy region and a small single-copy region of 89 698 bp and 19 831 bp of length, respectively. The overall GC content is 36.8%. One hundred and fourteen genes were annotated, including 81 protein-coding genes, 4 ribosomal RNA genes and 29 transfer RNA genes.
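    The figures in this record can be cross-checked with a few lines of Python: a quadripartite chloroplast genome consists of one large single-copy region, one small single-copy region and two copies of the inverted repeat, so the region lengths should add up to the reported total. The variable and function names are illustrative; the numbers are taken from the abstract.

```python
# Region lengths in bp, as reported in the abstract.
LSC = 89_698            # large single-copy region
SSC = 19_831            # small single-copy region
IR = 25_745             # one inverted repeat; the genome carries a pair
REPORTED_TOTAL = 161_019

# Annotated gene counts, as reported in the abstract.
PROTEIN_CODING, RRNA, TRNA = 81, 4, 29
REPORTED_GENES = 114


def quadripartite_length(lsc, ssc, ir):
    """Total length of a circular genome with LSC, SSC and two IR copies."""
    return lsc + ssc + 2 * ir


total = quadripartite_length(LSC, SSC, IR)
genes = PROTEIN_CODING + RRNA + TRNA

print(total, total == REPORTED_TOTAL)
print(genes, genes == REPORTED_GENES)
```

    Both sums agree with the totals stated in the abstract (161 019 bp and 114 genes).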

  13. Identification and Characterization of Microsatellite Markers Derived from the Whole Genome Analysis of Taenia solium.

    Science.gov (United States)

    Pajuelo, Mónica J; Eguiluz, María; Dahlstrom, Eric; Requena, David; Guzmán, Frank; Ramirez, Manuel; Sheen, Patricia; Frace, Michael; Sammons, Scott; Cama, Vitaliano; Anzick, Sarah; Bruno, Dan; Mahanty, Siddhartha; Wilkins, Patricia; Nash, Theodore; Gonzalez, Armando; García, Héctor H; Gilman, Robert H; Porcella, Steve; Zimic, Mirko

    2015-12-01

    Infections with Taenia solium are the most common cause of adult acquired seizures worldwide, and are the leading cause of epilepsy in developing countries. A better understanding of the genetic diversity of T. solium will improve parasite diagnostics and transmission pathways in endemic areas thereby facilitating the design of future control measures and interventions. Microsatellite markers are useful genome features, which enable strain typing and identification in complex pathogen genomes. Here we describe microsatellite identification and characterization in T. solium, providing information that will assist in global efforts to control this important pathogen. For genome sequencing, T. solium cysts and proglottids were collected from Huancayo and Puno in Peru, respectively. Using next generation sequencing (NGS) and de novo assembly, we assembled two draft genomes and one hybrid genome. Microsatellite sequences were identified and 36 of them were selected for further analysis. Twenty T. solium isolates were collected from Tumbes in the northern region, and twenty from Puno in the southern region of Peru. The size-polymorphism of the selected microsatellites was determined with multi-capillary electrophoresis. We analyzed the association between microsatellite polymorphism and the geographic origin of the samples. The predicted size of the hybrid (proglottid genome combined with cyst genome) T. solium genome was 111 MB with a GC content of 42.54%. A total of 7,979 contigs (>1,000 nt) were obtained. We identified 9,129 microsatellites in the Puno-proglottid genome and 9,936 in the Huancayo-cyst genome, with 5 or more repeats, ranging from mono- to hexa-nucleotide. Seven microsatellites were polymorphic and 29 were monomorphic within the analyzed isolates. T. solium tapeworms were classified into two genetic groups that correlated with the North/South geographic origin of the parasites. The availability of draft genomes for T. solium represents a significant step

  14. Discovery and annotation of small proteins using genomics, proteomics and computational approaches

    Energy Technology Data Exchange (ETDEWEB)

    Yang, Xiaohan; Tschaplinski, Timothy J.; Hurst, Gregory B.; Jawdy, Sara; Abraham, Paul E.; Lankford, Patricia K.; Adams, Rachel M.; Shah, Manesh B.; Hettich, Robert L.; Lindquist, Erika; Kalluri, Udaya C.; Gunter, Lee E.; Pennacchio, Christa; Tuskan, Gerald A.

    2011-03-02

    Small proteins (10–200 amino acids [aa] in length) encoded by short open reading frames (sORF) play important regulatory roles in various biological processes, including tumor progression, stress response, flowering, and hormone signaling. However, ab initio discovery of small proteins has been relatively overlooked. Recent advances in deep transcriptome sequencing make it possible to efficiently identify sORFs at the genome level. In this study, we obtained 2.6 million expressed sequence tag (EST) reads from Populus deltoides leaf transcriptome and reconstructed full-length transcripts from the EST sequences. We identified an initial set of 12,852 sORFs encoding proteins of 10–200 aa in length. Three computational approaches were then used to enrich for bona fide protein-coding sORFs from the initial sORF set: (1) coding potential prediction, (2) evolutionary conservation between P. deltoides and other plant species, and (3) gene family clustering within P. deltoides. As a result, a high-confidence sORF candidate set containing 1469 genes was obtained. Analysis of the protein domains, non-protein-coding RNA motifs, sequence length distribution, and protein mass spectrometry data supported this high-confidence sORF set. In the high-confidence sORF candidate set, known protein domains were identified in 1282 genes (higher-confidence sORF candidate set), out of which 611 genes, designated as highest-confidence candidate sORF set, were supported by proteomics data. Of the 611 highest-confidence candidate sORF genes, 56 were new to the current Populus genome annotation. This study not only demonstrates that there are potential sORF candidates to be annotated in sequenced genomes, but also presents an efficient strategy for discovery of sORFs in species with no genome annotation yet available.
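    The first step described in this record, collecting an initial set of sORFs that encode proteins of 10 to 200 aa, can be sketched in Python as a six-frame scan of a transcript for ATG-to-stop open reading frames. The function names and the simple first-ATG rule are assumptions for illustration; the abstract does not specify how the authors called ORFs, and the three enrichment steps that follow (coding potential, conservation, gene family clustering) are not covered here.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case A/C/G/T sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_sorfs(transcript, min_aa=10, max_aa=200):
    """Return (strand, start, end, n_aa) for ORFs encoding min_aa..max_aa residues.

    Coordinates are 0-based on the scanned strand; `end` is exclusive and
    includes the stop codon. n_aa counts the start methionine but not the stop.
    """
    transcript = transcript.upper()
    results = []
    for strand, seq in (("+", transcript), ("-", reverse_complement(transcript))):
        for frame in range(3):
            start = None
            for pos in range(frame, len(seq) - 2, 3):
                codon = seq[pos:pos + 3]
                if start is None and codon == "ATG":
                    start = pos  # first in-frame ATG opens the ORF
                elif start is not None and codon in STOP_CODONS:
                    n_aa = (pos - start) // 3
                    if min_aa <= n_aa <= max_aa:
                        results.append((strand, start, pos + 3, n_aa))
                    start = None  # look for the next ORF in this frame
    return results


if __name__ == "__main__":
    demo = "CC" + "ATG" + "GCT" * 11 + "TAA" + "GG"
    print(find_sorfs(demo))
```

    ORFs that run off the end of the transcript without a stop codon are discarded in this sketch; a real pipeline would have to decide how to treat such partial ORFs.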

  15. Full-genome analysis of a canine pneumovirus causing acute respiratory disease in dogs, Italy.

    Directory of Open Access Journals (Sweden)

    Nicola Decaro

    Full Text Available An outbreak of canine infectious respiratory disease (CIRD) associated with canine pneumovirus (CnPnV) infection is reported. The outbreak occurred in a shelter of the Apulia region and involved 37 out of 350 dogs that displayed cough and/or nasal discharge with no evidence of fever. The full-genomic characterisation showed that the causative agent (strain Bari/100-12) was closely related to CnPnVs that have been recently isolated in the USA, as well as to murine pneumovirus, which is responsible for respiratory disease in mice. The present study represents a useful contribution to the knowledge of the pathogenic potential of CnPnV and its association with CIRD in dogs. Further studies will elucidate the pathogenicity and epidemiology of this novel pneumovirus, thus addressing the eventual need for specific vaccines.

  16. BACHD rats expressing full-length mutant huntingtin exhibit differences in social behavior compared to wild-type littermates.

    Directory of Open Access Journals (Sweden)

    Giuseppe Manfré

    Full Text Available Huntington disease (HD) is a devastating inherited neurodegenerative disorder characterized by progressive motor, cognitive, and psychiatric symptoms without any cure to slow down or stop the progress of the disease. The BACHD rat model for HD carrying the human full-length mutant huntingtin protein (mHTT) with 97 polyQ repeats has been recently established as a promising model which reproduces several HD-like features. While motor and cognitive functions have been characterized in BACHD rats, little is known about their social phenotype. This study focuses especially on social behavior since evidence for social disturbances exists in human patients. Our objective was to compare social behavior in BACHD and wild-type (WT) rats at different ages, using two different measures of sociability. Animals were tested longitudinally at the age of 2, 4 and 8 months in the social interaction test to examine different parameters of sociability. A separate cohort of 7-month-old rats was tested in the three chamber social test to measure both sociability and social novelty. Gene expression analyses in 8-month-old animals were performed by real time qRT-PCR to evaluate a potential involvement of D1 and D2 dopaminergic receptors and the contribution of Brain-derived neurotrophic factor (BDNF) to the observed behavioral alterations. In the social interaction test, BACHD rats showed age-dependent changes in behaviour when they were re-introduced to their cagemate after a 24-hour period of individual housing. The time spent on nape attacks increased with aging. Furthermore, a significantly higher level of pinning at 2 months of age was shown in the BACHD rats compared to wild-types, followed by a reduction at 4 and 8 months. On the other hand, BACHD rats exhibited a decreased active social behaviour compared to wild-types, reflected by genotype-effects on approaching, following and social nose contact. In the three chamber social test, BACHD rats seemed to show a mild

  17. Characterization of canine osteosarcoma by array comparative genomic hybridization and RT-qPCR: signatures of genomic imbalance in canine osteosarcoma parallel the human counterpart.

    Science.gov (United States)

    Angstadt, Andrea Y; Motsinger-Reif, Alison; Thomas, Rachael; Kisseberth, William C; Guillermo Couto, C; Duval, Dawn L; Nielsen, Dahlia M; Modiano, Jaime F; Breen, Matthew

    2011-11-01

    Osteosarcoma (OS) is the most commonly diagnosed malignant bone tumor in humans and dogs, characterized in both species by extremely complex karyotypes exhibiting high frequencies of genomic imbalance. Evaluation of genomic signatures in human OS using array comparative genomic hybridization (aCGH) has assisted in uncovering genetic mechanisms that result in disease phenotype. Previous low-resolution (10-20 Mb) aCGH analysis of canine OS identified a wide range of recurrent DNA copy number aberrations, indicating extensive genomic instability. In this study, we profiled 123 canine OS tumors by 1 Mb-resolution aCGH to generate a dataset for direct comparison with current data for human OS, concluding that several high frequency aberrations in canine and human OS are orthologous. To ensure complete coverage of gene annotation, we identified the human refseq genes that map to these orthologous aberrant dog regions and found several candidate genes warranting evaluation for OS involvement. Specifically, subsequent FISH and qRT-PCR analysis of RUNX2, TUSC3, and PTEN indicated that expression levels correlated with genomic copy number status, showcasing RUNX2 as an OS associated gene and TUSC3 as a possible tumor suppressor candidate. Together these data demonstrate the ability of genomic comparative oncology to identify genetic aberrations which may be important for OS progression. Large scale screening of genomic imbalance in canine OS further validates the use of the dog as a suitable model for human cancers, supporting the idea that dysregulation discovered in canine cancers will provide an avenue for complementary study in human counterparts. Copyright © 2011 Wiley-Liss, Inc.

  18. Comprehensive characterization of genomic instability in pluripotent stem cells and their derived neuroprogenitor cell lines

    Directory of Open Access Journals (Sweden)

    Nestor Luis Lopez Corrales

    2012-12-01

    Full Text Available The genomic integrity of two human pluripotent stem cells and their derived neuroprogenitor cell lines was studied, applying a combination of high-resolution genetic methodologies. The usefulness of combining array-comparative genomic hybridization (aCGH) and multiplex fluorescence in situ hybridization (M-FISH) techniques should be delineated to exclude/detect a maximum of possible genomic structural aberrations. Interestingly, partly different genomic imbalances at chromosomal and subchromosomal levels were detected in pluripotent stem cells and their derivatives. Some of the copy number variations were inherited from the original cell line, whereas other modifications were presumably acquired during the differentiation and manipulation procedures. These results underline the necessity to study both pluripotent stem cells and their differentiated progeny by as many approaches as possible in order to assess their genomic stability before using them in clinical therapies.

  19. Draft genome sequence of the intestinal parasite Blastocystis subtype 4-isolate WR1

    Directory of Open Access Journals (Sweden)

    Ivan Wawrzyniak

    2015-06-01

    Full Text Available The intestinal protistan parasite Blastocystis is characterized by an extensive genetic variability with 17 subtypes (ST1–ST17) described to date. Only the whole genome of a human ST7 isolate was previously sequenced. Here we report the draft genome sequence of Blastocystis ST4-WR1 isolated from a laboratory rodent in Singapore.

  20. From Conventional to Next Generation Sequencing of Epstein-Barr Virus Genomes

    Directory of Open Access Journals (Sweden)

    Hin Kwok

    2016-02-01

    Full Text Available Genomic sequences of Epstein–Barr virus (EBV) have been of interest because the virus is associated with cancers, such as nasopharyngeal carcinoma, and conditions such as infectious mononucleosis. The progress of whole-genome EBV sequencing has been limited by the inefficiency and cost of the first-generation sequencing technology. With the advancement of next-generation sequencing (NGS) and target enrichment strategies, an increasing number of EBV genomes have been published. These genomes were sequenced using different approaches, either with or without EBV DNA enrichment. This review provides an overview of the EBV genomes published to date, and a description of the sequencing technology and bioinformatic analyses employed in generating these sequences. We further explored ways through which the quality of sequencing data can be improved, such as using DNA oligos for capture hybridization, and longer insert size and read length in the sequencing runs. These advances will enable large-scale genomic sequencing of EBV which will facilitate a better understanding of the genetic variations of EBV in different geographic regions and discovery of potentially pathogenic variants in specific diseases.