WorldWideScience

Sample records for human dna sequence

  1. DNA Sequencing

    Science.gov (United States)

    Tabor, Stanley; Richardson, Charles C.

    1995-04-25

    A method for sequencing a strand of DNA, including the steps of: providing the strand of DNA; annealing the strand with a primer able to hybridize to the strand to give an annealed mixture; incubating the mixture with four deoxyribonucleoside triphosphates, a DNA polymerase, and at least three deoxyribonucleoside triphosphates in different amounts, under conditions favoring primer extension to form nucleic acid fragments complementary to the DNA to be sequenced; labelling the nucleic acid fragments; separating them and determining the position of the deoxyribonucleoside triphosphates by differences in the intensity of the labels, thereby to determine the DNA sequence.

  2. The DNA sequence, annotation and analysis of human chromosome 3

    DEFF Research Database (Denmark)

    Muzny, Donna M; Scherer, Steven E; Kaul, Rajinder

    2006-01-01

    After the completion of a draft human genome sequence, the International Human Genome Sequencing Consortium has proceeded to finish and annotate each of the 24 chromosomes comprising the human genome. Here we describe the sequencing and analysis of human chromosome 3, one of the largest human chr...

  3. The DNA sequence of the human X chromosome.

    Science.gov (United States)

    Ross, Mark T; Grafham, Darren V; Coffey, Alison J; Scherer, Steven; McLay, Kirsten; Muzny, Donna; Platzer, Matthias; Howell, Gareth R; Burrows, Christine; Bird, Christine P; Frankish, Adam; Lovell, Frances L; Howe, Kevin L; Ashurst, Jennifer L; Fulton, Robert S; Sudbrak, Ralf; Wen, Gaiping; Jones, Matthew C; Hurles, Matthew E; Andrews, T Daniel; Scott, Carol E; Searle, Stephen; Ramser, Juliane; Whittaker, Adam; Deadman, Rebecca; Carter, Nigel P; Hunt, Sarah E; Chen, Rui; Cree, Andrew; Gunaratne, Preethi; Havlak, Paul; Hodgson, Anne; Metzker, Michael L; Richards, Stephen; Scott, Graham; Steffen, David; Sodergren, Erica; Wheeler, David A; Worley, Kim C; Ainscough, Rachael; Ambrose, Kerrie D; Ansari-Lari, M Ali; Aradhya, Swaroop; Ashwell, Robert I S; Babbage, Anne K; Bagguley, Claire L; Ballabio, Andrea; Banerjee, Ruby; Barker, Gary E; Barlow, Karen F; Barrett, Ian P; Bates, Karen N; Beare, David M; Beasley, Helen; Beasley, Oliver; Beck, Alfred; Bethel, Graeme; Blechschmidt, Karin; Brady, Nicola; Bray-Allen, Sarah; Bridgeman, Anne M; Brown, Andrew J; Brown, Mary J; Bonnin, David; Bruford, Elspeth A; Buhay, Christian; Burch, Paula; Burford, Deborah; Burgess, Joanne; Burrill, Wayne; Burton, John; Bye, Jackie M; Carder, Carol; Carrel, Laura; Chako, Joseph; Chapman, Joanne C; Chavez, Dean; Chen, Ellson; Chen, Guan; Chen, Yuan; Chen, Zhijian; Chinault, Craig; Ciccodicola, Alfredo; Clark, Sue Y; Clarke, Graham; Clee, Chris M; Clegg, Sheila; Clerc-Blankenburg, Kerstin; Clifford, Karen; Cobley, Vicky; Cole, Charlotte G; Conquer, Jen S; Corby, Nicole; Connor, Richard E; David, Robert; Davies, Joy; Davis, Clay; Davis, John; Delgado, Oliver; Deshazo, Denise; Dhami, Pawandeep; Ding, Yan; Dinh, Huyen; Dodsworth, Steve; Draper, Heather; Dugan-Rocha, Shannon; Dunham, Andrew; Dunn, Matthew; Durbin, K James; Dutta, Ireena; Eades, Tamsin; Ellwood, Matthew; Emery-Cohen, Alexandra; Errington, Helen; Evans, Kathryn L; Faulkner, Louisa; Francis, Fiona; Frankland, John; Fraser, Audrey E; Galgoczy, Petra; Gilbert, James; Gill, Rachel; Glöckner, Gernot; Gregory, Simon G; Gribble, Susan; Griffiths, Coline; Grocock, Russell; Gu, Yanghong; Gwilliam, Rhian; Hamilton, Cerissa; Hart, Elizabeth A; Hawes, Alicia; Heath, Paul D; Heitmann, Katja; Hennig, Steffen; Hernandez, Judith; Hinzmann, Bernd; Ho, Sarah; Hoffs, Michael; Howden, Phillip J; Huckle, Elizabeth J; Hume, Jennifer; Hunt, Paul J; Hunt, Adrienne R; Isherwood, Judith; Jacob, Leni; Johnson, David; Jones, Sally; de Jong, Pieter J; Joseph, Shirin S; Keenan, Stephen; Kelly, Susan; Kershaw, Joanne K; Khan, Ziad; Kioschis, Petra; Klages, Sven; Knights, Andrew J; Kosiura, Anna; Kovar-Smith, Christie; Laird, Gavin K; Langford, Cordelia; Lawlor, Stephanie; Leversha, Margaret; Lewis, Lora; Liu, Wen; Lloyd, Christine; Lloyd, David M; Loulseged, Hermela; Loveland, Jane E; Lovell, Jamieson D; Lozado, Ryan; Lu, Jing; Lyne, Rachael; Ma, Jie; Maheshwari, Manjula; Matthews, Lucy H; McDowall, Jennifer; McLaren, Stuart; McMurray, Amanda; Meidl, Patrick; Meitinger, Thomas; Milne, Sarah; Miner, George; Mistry, Shailesh L; Morgan, Margaret; Morris, Sidney; Müller, Ines; Mullikin, James C; Nguyen, Ngoc; Nordsiek, Gabriele; Nyakatura, Gerald; O'Dell, Christopher N; Okwuonu, Geoffery; Palmer, Sophie; Pandian, Richard; Parker, David; Parrish, Julia; Pasternak, Shiran; Patel, Dina; Pearce, Alex V; Pearson, Danita M; Pelan, Sarah E; Perez, Lesette; Porter, Keith M; Ramsey, Yvonne; Reichwald, Kathrin; Rhodes, Susan; Ridler, Kerry A; Schlessinger, David; Schueler, Mary G; Sehra, Harminder K; Shaw-Smith, Charles; Shen, Hua; Sheridan, Elizabeth M; Shownkeen, Ratna; Skuce, Carl D; Smith, Michelle L; Sotheran, Elizabeth C; Steingruber, Helen E; Steward, Charles A; Storey, Roy; Swann, R Mark; Swarbreck, David; Tabor, Paul E; Taudien, Stefan; Taylor, Tineace; Teague, Brian; Thomas, Karen; Thorpe, Andrea; Timms, Kirsten; Tracey, Alan; Trevanion, Steve; Tromans, Anthony C; d'Urso, Michele; Verduzco, Daniel; Villasana, Donna; Waldron, Lenee; Wall, Melanie; Wang, Qiaoyan; Warren, James; Warry, Georgina L; Wei, Xuehong; West, Anthony; Whitehead, Siobhan L; Whiteley, Mathew N; Wilkinson, Jane E; Willey, David L; Williams, Gabrielle; Williams, Leanne; Williamson, Angela; Williamson, Helen; Wilming, Laurens; Woodmansey, Rebecca L; Wray, Paul W; Yen, Jennifer; Zhang, Jingkun; Zhou, Jianling; Zoghbi, Huda; Zorilla, Sara; Buck, David; Reinhardt, Richard; Poustka, Annemarie; Rosenthal, André; Lehrach, Hans; Meindl, Alfons; Minx, Patrick J; Hillier, Ladeana W; Willard, Huntington F; Wilson, Richard K; Waterston, Robert H; Rice, Catherine M; Vaudin, Mark; Coulson, Alan; Nelson, David L; Weinstock, George; Sulston, John E; Durbin, Richard; Hubbard, Tim; Gibbs, Richard A; Beck, Stephan; Rogers, Jane; Bentley, David R

    2005-03-17

    The human X chromosome has a unique biology that was shaped by its evolution as the sex chromosome shared by males and females. We have determined 99.3% of the euchromatic sequence of the X chromosome. Our analysis illustrates the autosomal origin of the mammalian sex chromosomes, the stepwise process that led to the progressive loss of recombination between X and Y, and the extent of subsequent degradation of the Y chromosome. LINE1 repeat elements cover one-third of the X chromosome, with a distribution that is consistent with their proposed role as way stations in the process of X-chromosome inactivation. We found 1,098 genes in the sequence, of which 99 encode proteins expressed in testis and in various tumour types. A disproportionately high number of mendelian diseases are documented for the X chromosome. Of this number, 168 have been explained by mutations in 113 X-linked genes, which in many cases were characterized with the aid of the DNA sequence.

  4. HMGA1a recognition candidate DNA sequences in humans.

    Directory of Open Access Journals (Sweden)

    Takayuki Manabe

    High mobility group protein A1a (HMGA1a) acts as an architectural transcription factor and influences a diverse array of normal biological processes. It binds AT-rich sequences, and previous reports have demonstrated HMGA1a binding to the authentic promoters of various genes. However, the precise sequences that HMGA1a binds to remain to be clarified. Therefore, in this study, we searched for the sequences with the highest affinity for human HMGA1a using an existing SELEX method, and then compared the identified sequences with known human promoter sequences. Based on our results, we propose the sequences "-(G/A)-G-(A/T)-(A/T)-A-T-T-T-" as HMGA1a-binding candidate sequences. Furthermore, these candidate sequences bound native human HMGA1a from SK-N-SH cells. When the candidate sequences were analyzed by performing FASTA searches against all known human promoter sequences, each one hit 500-900 sequences. Some of the extracted genes have already been proven or suggested to have HMGA1a-binding promoters. The candidate sequences presented here represent important information for research into the various roles of HMGA1a, including cell differentiation, death, growth, proliferation, and the pathogenesis of cancer.
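    As a rough illustration of how such a candidate motif could be scanned for in a promoter sequence, the sketch below reads the proposed pattern as (G/A)-G-(A/T)-(A/T)-A-T-T-T and expresses it as a regular expression. The function name, the example sequence, and the forward-strand-only search are assumptions made for this illustration; the abstract itself reports FASTA searches, not regex matching.

```python
import re

# Assumed reading of the candidate motif: (G/A)-G-(A/T)-(A/T)-A-T-T-T.
# The lookahead lets overlapping occurrences be reported as well.
HMGA1A_CANDIDATE = re.compile(r"(?=([GA]G[AT][AT]ATTT))")


def find_candidate_sites(sequence):
    """Return (0-based start, matched 8-mer) pairs on the forward strand only."""
    sequence = sequence.upper()
    return [(m.start(), m.group(1)) for m in HMGA1A_CANDIDATE.finditer(sequence)]


# Hypothetical sequence, made up for this example.
example = "ccGGATATTTgcAGTTATTTaa"
sites = find_candidate_sites(example)
print(sites)
```

    A fuller scan would also search the reverse complement, since a binding site can lie on either strand.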

  5. The DNA sequence and biology of human chromosome 19.

    Science.gov (United States)

    Grimwood, Jane; Gordon, Laurie A; Olsen, Anne; Terry, Astrid; Schmutz, Jeremy; Lamerdin, Jane; Hellsten, Uffe; Goodstein, David; Couronne, Olivier; Tran-Gyamfi, Mary; Aerts, Andrea; Altherr, Michael; Ashworth, Linda; Bajorek, Eva; Black, Stacey; Branscomb, Elbert; Caenepeel, Sean; Carrano, Anthony; Caoile, Chenier; Chan, Yee Man; Christensen, Mari; Cleland, Catherine A; Copeland, Alex; Dalin, Eileen; Dehal, Paramvir; Denys, Mirian; Detter, John C; Escobar, Julio; Flowers, Dave; Fotopulos, Dea; Garcia, Carmen; Georgescu, Anca M; Glavina, Tijana; Gomez, Maria; Gonzales, Eidelyn; Groza, Matthew; Hammon, Nancy; Hawkins, Trevor; Haydu, Lauren; Ho, Isaac; Huang, Wayne; Israni, Sanjay; Jett, Jamie; Kadner, Kristen; Kimball, Heather; Kobayashi, Arthur; Larionov, Vladimer; Leem, Sun-Hee; Lopez, Frederick; Lou, Yunian; Lowry, Steve; Malfatti, Stephanie; Martinez, Diego; McCready, Paula; Medina, Catherine; Morgan, Jenna; Nelson, Kathryn; Nolan, Matt; Ovcharenko, Ivan; Pitluck, Sam; Pollard, Martin; Popkie, Anthony P; Predki, Paul; Quan, Glenda; Ramirez, Lucia; Rash, Sam; Retterer, James; Rodriguez, Alex; Rogers, Stephanine; Salamov, Asaf; Salazar, Angelica; She, Xinwei; Smith, Doug; Slezak, Tom; Solovyev, Victor; Thayer, Nina; Tice, Hope; Tsai, Ming; Ustaszewska, Anna; Vo, Nu; Wagner, Mark; Wheeler, Jeremy; Wu, Kevin; Xie, Gary; Yang, Joan; Dubchak, Inna; Furey, Terrence S; DeJong, Pieter; Dickson, Mark; Gordon, David; Eichler, Evan E; Pennacchio, Len A; Richardson, Paul; Stubbs, Lisa; Rokhsar, Daniel S; Myers, Richard M; Rubin, Edward M; Lucas, Susan M

    2004-04-01

    Chromosome 19 has the highest gene density of all human chromosomes, more than double the genome-wide average. The large clustered gene families, corresponding high G + C content, CpG islands and density of repetitive DNA indicate a chromosome rich in biological and evolutionary significance. Here we describe 55.8 million base pairs of highly accurate finished sequence representing 99.9% of the euchromatin portion of the chromosome. Manual curation of gene loci reveals 1,461 protein-coding genes and 321 pseudogenes. Among these are genes directly implicated in mendelian disorders, including familial hypercholesterolaemia and insulin-resistant diabetes. Nearly one-quarter of these genes belong to tandemly arranged families, encompassing more than 25% of the chromosome. Comparative analyses show a fascinating picture of conservation and divergence, revealing large blocks of gene orthology with rodents, scattered regions with more recent gene family expansions and deletions, and segments of coding and non-coding conservation with the distant fish species Takifugu.

  6. The DNA sequence and biology of human chromosome 19

    Energy Technology Data Exchange (ETDEWEB)

    Grimwood, J; Gordon, L A; Olsen, A; Terry, A; Schmutz, J; Lamerdin, J; Hellsten, U; Goodstein, D; Couronne, O; Tran-Gyamfi, M

    2004-04-06

    Chromosome 19 has the highest gene density of all human chromosomes, more than double the genome-wide average. The large clustered gene families, corresponding high GC content, CpG islands and density of repetitive DNA indicate a chromosome rich in biological and evolutionary significance. Here we describe 55.8 million base pairs of highly accurate finished sequence representing 99.9% of the euchromatin portion of the chromosome. Manual curation of gene loci reveals 1,461 protein-coding genes and 321 pseudogenes. Among these are genes directly implicated in Mendelian disorders, including familial hypercholesterolemia and insulin-resistant diabetes. Nearly one quarter of these genes belong to tandemly arranged families, encompassing more than 25% of the chromosome. Comparative analyses show a fascinating picture of conservation and divergence, revealing large blocks of gene orthology with rodents, scattered regions with more recent gene family expansions and deletions, and segments of coding and non-coding conservation with the distant fish species Takifugu.

  7. The DNA sequence and biology of human chromosome 19

    Energy Technology Data Exchange (ETDEWEB)

    Grimwood, Jane; Gordon, Laurie A.; Olsen, Anne; Terry, Astrid; Schmutz, Jeremy; Lamerdin, Jane; Hellsten, Uffe; Goodstein, David; Couronne, Olivier; Tran-Gyamfi, Mary; Aerts, Andrea; Altherr, Michael; Ashworth, Linda; Bajorek, Eva; Black, Stacey; Branscomb, Elbert; Caenepeel, Sean; Carrano, Anthony; Caoile, Chenier; Chan, Yee Man; Christensen, Mari; Cleland, Catherine A.; Copeland, Alex; Dalin, Eileen; Dehal, Paramvir; Denys, Mirian; Detter, John C.; Escobar, Julio; Flowers, Dave; Fotopulos, Dea; Garcia, Carmen; Georgescu, Anca M.; Glavina, Tijana; Gomez, Maria; Gonzales, Eldelyn; Groza, Matthew; Hammon, Nancy; Hawkins, Trevor; Haydu, Lauren; Ho, Issac; Huang, Wayne; Israni, Sanjay; Jett, Jamie; Kadner, Kristen; Kimball, Heather; Kobayashi, Arthur; Larionov, Vladimer; Leem, Sun-Hee; Lopez, Frederick; Lou, Yunian; Lowry, Steve; Malfatti, Stephanie; Martinez, Diego; McCready, Paula; Medina, Catherine; Morgan, Jenna; Nelson, Kathryn; Nolan, Matt; Ovcharenko, Ivan; Pitluck, Sam; Pollard, Martin; Popkie, Anthony P.; Predki, Paul; Quan, Glenda; Ramirez, Lucia; Rash, Sam; Retterer, James; Rodriguez, Alex; Rogers, Stephanine; Salamov, Asaf; Salazar, Angelica; She, Xinwei; Smith, Doug; Slezak, Tom; Solovyev, Victor; Thayer, Nina; Tice, Hope; Tsai, Ming; Ustaszewska, Anna; Vo, Nu; Wagner, Mark; Wheeler, Jeremy; Wu, Kevin; Xie, Gary; Yang, Joan; Dubchak, Inna; Furey, Terrence S.; DeJong, Pieter; Dickson, Mark; Gordon, David; Eichler, Evan E.; Pennacchio, Len A.; Richardson, Paul; Stubbs, Lisa; Rokhsar, Daniel S.; Myers, Richard M.; Rubin, Edward M.; Lucas, Susan M.

    2003-09-15

    Chromosome 19 has the highest gene density of all human chromosomes, more than double the genome-wide average. The large clustered gene families, corresponding high G + C content, CpG islands and density of repetitive DNA indicate a chromosome rich in biological and evolutionary significance. Here we describe 55.8 million base pairs of highly accurate finished sequence representing 99.9 percent of the euchromatin portion of the chromosome. Manual curation of gene loci reveals 1,461 protein-coding genes and 321 pseudogenes. Among these are genes directly implicated in mendelian disorders, including familial hypercholesterolaemia and insulin-resistant diabetes. Nearly one-quarter of these genes belong to tandemly arranged families, encompassing more than 25 percent of the chromosome. Comparative analyses show a fascinating picture of conservation and divergence, revealing large blocks of gene orthology with rodents, scattered regions with more recent gene family expansions and deletions, and segments of coding and non-coding conservation with the distant fish species Takifugu.

  8. Genome-Wide Prediction of DNA Methylation Using DNA Composition and Sequence Complexity in Human

    Science.gov (United States)

    Wu, Chengchao; Yao, Shixin; Li, Xinghao; Chen, Chujia; Hu, Xuehai

    2017-01-01

    DNA methylation plays a significant role in transcriptional regulation by repressing activity. Change of the DNA methylation level is an important factor affecting the expression of target genes and downstream phenotypes. Because current experimental technologies can only assay a small proportion of CpG sites in the human genome, it is urgent to develop reliable computational models for predicting genome-wide DNA methylation. Here, we proposed a novel algorithm that accurately extracted sequence complexity features (seven features) and developed a support-vector-machine-based prediction model with integration of the reported DNA composition features (trinucleotide frequency and GC content, 65 features) by utilizing the methylation profiles of embryonic stem cells in human. The prediction results from 22 human chromosomes with size-varied windows showed that the 600-bp window achieved the best average accuracy of 94.7%. Moreover, comparisons with two existing methods further showed the superiority of our model, and cross-species predictions on mouse data also demonstrated that our model has certain generalization ability. Finally, a statistical test of the experimental data and the predicted data on functional regions annotated by ChromHMM found that six out of 10 regions were consistent, which implies reliable prediction of unassayed CpG sites. Accordingly, we believe that our novel model will be useful and reliable in predicting DNA methylation. PMID:28212312
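    The 65 DNA composition features mentioned above (trinucleotide frequency plus GC content) can be pictured as 64 trinucleotide frequencies and one GC fraction per window. The sketch below is a minimal illustration under that assumption; the function name, the normalization by the number of valid trinucleotides, and the handling of non-ACGT characters are choices made here, not details taken from the paper, and the seven sequence complexity features are not reproduced.

```python
from itertools import product

# All 64 trinucleotides in a fixed (lexicographic) order.
TRINUCLEOTIDES = ["".join(p) for p in product("ACGT", repeat=3)]


def composition_features(window):
    """Return 64 trinucleotide frequencies followed by the GC fraction."""
    window = window.upper()
    counts = dict.fromkeys(TRINUCLEOTIDES, 0)
    total = 0
    # Slide a 3-base window; trinucleotides containing N or other symbols are skipped.
    for i in range(len(window) - 2):
        tri = window[i:i + 3]
        if tri in counts:
            counts[tri] += 1
            total += 1
    freqs = [counts[t] / total if total else 0.0 for t in TRINUCLEOTIDES]
    acgt = sum(window.count(base) for base in "ACGT")
    gc = (window.count("G") + window.count("C")) / acgt if acgt else 0.0
    return freqs + [gc]


features = composition_features("ACGACG")
print(len(features))
```

    In a setting like the one described, one such vector would be computed per window (for example 600 bp) and passed to a classifier such as a support vector machine.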

  9. The finished DNA sequence of human chromosome 12.

    Science.gov (United States)

    Scherer, Steven E; Muzny, Donna M; Buhay, Christian J; Chen, Rui; Cree, Andrew; Ding, Yan; Dugan-Rocha, Shannon; Gill, Rachel; Gunaratne, Preethi; Harris, R Alan; Hawes, Alicia C; Hernandez, Judith; Hodgson, Anne V; Hume, Jennifer; Jackson, Andrew; Khan, Ziad Mohid; Kovar-Smith, Christie; Lewis, Lora R; Lozado, Ryan J; Metzker, Michael L; Milosavljevic, Aleksandar; Miner, George R; Montgomery, Kate T; Morgan, Margaret B; Nazareth, Lynne V; Scott, Graham; Sodergren, Erica; Song, Xing-Zhi; Steffen, David; Lovering, Ruth C; Wheeler, David A; Worley, Kim C; Yuan, Yi; Zhang, Zhengdong; Adams, Charles Q; Ansari-Lari, M Ali; Ayele, Mulu; Brown, Mary J; Chen, Guan; Chen, Zhijian; Clerc-Blankenburg, Kerstin P; Davis, Clay; Delgado, Oliver; Dinh, Huyen H; Draper, Heather; Gonzalez-Garay, Manuel L; Havlak, Paul; Jackson, Laronda R; Jacob, Leni S; Kelly, Susan H; Li, Li; Li, Zhangwan; Liu, Jing; Liu, Wen; Lu, Jing; Maheshwari, Manjula; Nguyen, Bao-Viet; Okwuonu, Geoffrey O; Pasternak, Shiran; Perez, Lesette M; Plopper, Farah J H; Santibanez, Jireh; Shen, Hua; Tabor, Paul E; Verduzco, Daniel; Waldron, Lenee; Wang, Qiaoyan; Williams, Gabrielle A; Zhang, Jingkun; Zhou, Jianling; Allen, Carlana C; Amin, Anita G; Anyalebechi, Vivian; Bailey, Michael; Barbaria, Joseph A; Bimage, Kesha E; Bryant, Nathaniel P; Burch, Paula E; Burkett, Carrie E; Burrell, Kevin L; Calderon, Eliana; Cardenas, Veronica; Carter, Kelvin; Casias, Kristal; Cavazos, Iracema; Cavazos, Sandra R; Ceasar, Heather; Chacko, Joseph; Chan, Sheryl N; Chavez, Dean; Christopoulos, Constantine; Chu, Joseph; Cockrell, Raynard; Cox, Caroline D; Dang, Michelle; Dathorne, Stephanie R; David, Robert; Davis, Candi Mon'Et; Davy-Carroll, Latarsha; Deshazo, Denise R; Donlin, Jeremy E; D'Souza, Lisa; Eaves, Kristy A; Egan, Amy; Emery-Cohen, Alexandra J; Escotto, Michael; Flagg, Nicole; Forbes, Lisa D; Gabisi, Abdul M; Garza, Melissa; Hamilton, Cerissa; Henderson, Nicholas; Hernandez, Omar; Hines, Sandra; Hogues, Marilyn E; 
Huang, Mei; Idlebird, DeVincent G; Johnson, Rudy; Jolivet, Angela; Jones, Sally; Kagan, Ryan; King, Laquisha M; Leal, Belita; Lebow, Heather; Lee, Sandra; LeVan, Jaclyn M; Lewis, Lakeshia C; London, Pamela; Lorensuhewa, Lorna M; Loulseged, Hermela; Lovett, Demetria A; Lucier, Alice; Lucier, Raymond L; Ma, Jie; Madu, Renita C; Mapua, Patricia; Martindale, Ashley D; Martinez, Evangelina; Massey, Elizabeth; Mawhiney, Samantha; Meador, Michael G; Mendez, Sylvia; Mercado, Christian; Mercado, Iracema C; Merritt, Christina E; Miner, Zachary L; Minja, Emmanuel; Mitchell, Teresa; Mohabbat, Farida; Mohabbat, Khatera; Montgomery, Baize; Moore, Niki; Morris, Sidney; Munidasa, Mala; Ngo, Robin N; Nguyen, Ngoc B; Nickerson, Elizabeth; Nwaokelemeh, Ogechi O; Nwokenkwo, Stanley; Obregon, Melissa; Oguh, Maryann; Oragunye, Njideka; Oviedo, Rodolfo J; Parish, Bridgette J; Parker, David N; Parrish, Julia; Parks, Kenya L; Paul, Heidie A; Payton, Brett A; Perez, Agapito; Perrin, William; Pickens, Adam; Primus, Eltrick L; Pu, Ling-Ling; Puazo, Maria; Quiles, Miyo M; Quiroz, Juana B; Rabata, Dina; Reeves, Kacy; Ruiz, San Juana; Shao, Hongmei; Sisson, Ida; Sonaike, Titilola; Sorelle, Richard P; Sutton, Angelica E; Svatek, Amanda F; Svetz, Leah Anne; Tamerisa, Kavitha S; Taylor, Tineace R; Teague, Brian; Thomas, Nicole; Thorn, Rachel D; Trejos, Zulma Y; Trevino, Brenda K; Ukegbu, Ogechi N; Urban, Jeremy B; Vasquez, Lydia I; Vera, Virginia A; Villasana, Donna M; Wang, Ling; Ward-Moore, Stephanie; Warren, James T; Wei, Xuehong; White, Flower; Williamson, Angela L; Wleczyk, Regina; Wooden, Hailey S; Wooden, Steven H; Yen, Jennifer; Yoon, Lillienne; Yoon, Vivienne; Zorrilla, Sara E; Nelson, David; Kucherlapati, Raju; Weinstock, George; Gibbs, Richard A

    2006-03-16

    Human chromosome 12 contains more than 1,400 coding genes and 487 loci that have been directly implicated in human disease. The q arm of chromosome 12 contains one of the largest blocks of linkage disequilibrium found in the human genome. Here we present the finished sequence of human chromosome 12, which has been finished to high quality and spans approximately 132 megabases, representing approximately 4.5% of the human genome. Alignment of the human chromosome 12 sequence across vertebrates reveals the origin of individual segments in chicken, and a unique history of rearrangement through rodent and primate lineages. The rate of base substitutions in recent evolutionary history shows an overall slowing in hominids compared with primates and rodents.

  10. The DNA Sequence And Comparative Analysis Of Human Chromosome 5

    Energy Technology Data Exchange (ETDEWEB)

    Schmutz, Jeremy; Martin, Joel; Terry, Astrid; Couronne, Olivier; Grimwood, Jane; Lowry, Steve; Gordon, Laurie A.; Scott, Duncan; Xie, Gary; Huang, Wayne; Hellsten, Uffe; Tran-Gyamfi, Mary; She, Xinwei; Prabhakar, Shyam; Aerts, Andrea; Altherr, Michael; Bajorek, Eva; Black, Stacey; Branscomb, Elbert; Caoile, Chenier; Challacombe, Jean F.; Chan, Yee Man; Denys, Mirian; Detter, John C.; Escobar, Julio; Flowers, Dave; Fotopulos, Dea; Glavina, Tijana; Gomez, Maria; Gonzales, Eidelyn; Goodstein, David; Grigoriev, Igor; Groza, Matthew; Hammon, Nancy; Hawkins, Trevor; Haydu, Lauren; Israni, Sanjay; Jett, Jamie; Kadner, Kristen; Kimball, Heather; Kobayashi, Arthur; Lopez, Frederick; Lou, Yunian; Martinez, Diego; Medina, Catherine; Morgan, Jenna; Nandkeshwar, Richard; Noonan, James P.; Pitluck, Sam; Pollard, Martin; Predki, Paul; Priest, James; Ramirez, Lucia; Retterer, James; Rodriguez, Alex; Rogers, Stephanie; Salamov, Asaf; Salazar, Angelica; Thayer, Nina; Tice, Hope; Tsai, Ming; Ustaszewska, Anna; Vo, Nu; Wheeler, Jeremy; Wu, Kevin; Yang, Joan; Dickson, Mark; Cheng, Jan-Fang; Eichler, Evan E.; Olsen, Anne; Pennacchio, Len A.; Rokhsar, Daniel S.; Richardson, Paul; Lucas, Susan M.; Myers, Richard M.; Rubin, Edward M.

    2004-08-01

    Chromosome 5 is one of the largest human chromosomes and contains numerous intrachromosomal duplications, yet it has one of the lowest gene densities. This is partially explained by numerous gene-poor regions that display a remarkable degree of noncoding conservation with non-mammalian vertebrates, suggesting that they are functionally constrained. In total, we compiled 177.7 million base pairs of highly accurate finished sequence containing 923 manually curated protein-coding genes including the protocadherin and interleukin gene families. We also completely sequenced versions of the large chromosome-5-specific internal duplications. These duplications are very recent evolutionary events and probably have a mechanistic role in human physiological variation, as deletions in these regions are the cause of debilitating disorders including spinal muscular atrophy.

  11. cDNA cloning, sequence analysis, and chromosomal localization of the gene for human carnitine palmitoyltransferase

    Energy Technology Data Exchange (ETDEWEB)

    Finocchiaro, G.; Taroni, F.; Martin, A.L.; Colombo, I.; Tarelli, G.T.; DiDonato, S. (Istituto Nazionale Neurologico C. Besta, Milan (Italy)); Rocchi, M. (Istituto G. Gaslini, Genoa (Italy))

    1991-01-15

    The authors have cloned and sequenced a cDNA encoding human liver carnitine palmitoyltransferase (CPTase), an inner mitochondrial membrane enzyme that plays a major role in the fatty acid oxidation pathway. Mixed oligonucleotide primers whose sequences were deduced from one tryptic peptide obtained from purified CPTase were used in a polymerase chain reaction, allowing the amplification of a 0.12-kilobase fragment of human genomic DNA encoding such a peptide. A 60-base-pair (bp) oligonucleotide synthesized on the basis of the sequence from this fragment was used for the screening of a cDNA library from human liver and hybridized to a cDNA insert of 2255 bp. This cDNA contains an open reading frame of 1974 bp that encodes a protein of 658 amino acid residues including 25 residues of an NH2-terminal leader peptide. The assignment of this open reading frame to human liver CPTase is confirmed by matches to seven different amino acid sequences of tryptic peptides derived from pure human CPTase and by the 82.2% homology with the amino acid sequence of rat CPTase. The NH2-terminal region of CPTase contains a leucine-proline motif that is shared by carnitine acetyl- and octanoyltransferases and by choline acetyltransferase. The gene encoding CPTase was assigned to human chromosome 1, region 1q12-1pter, by hybridization of CPTase cDNA with a DNA panel of 19 human-hamster somatic cell hybrids.
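    The lengths quoted in this abstract are mutually consistent, which a few lines of arithmetic make explicit. The sketch below only restates figures given above (2255 bp insert, 1974 bp open reading frame, 658 residues, 25-residue leader); the variable and function names are illustrative, and the reading that the stated reading frame length excludes a stop codon is an inference from 1974 = 3 x 658, not a statement from the source.

```python
CDNA_INSERT_BP = 2255      # cDNA insert length from the abstract
ORF_BP = 1974              # open reading frame length from the abstract
PRECURSOR_RESIDUES = 658   # encoded protein, including the leader peptide
LEADER_RESIDUES = 25       # NH2-terminal leader peptide


def codons_in(orf_bp):
    """Number of whole codons in a reading frame of the given length."""
    if orf_bp % 3 != 0:
        raise ValueError("reading frame length is not a multiple of 3")
    return orf_bp // 3


# 1974 / 3 = 658, so the quoted length matches the residue count exactly.
codons = codons_in(ORF_BP)
# Residues left after removing the leader peptide.
mature_residues = PRECURSOR_RESIDUES - LEADER_RESIDUES
# Bases of the insert that lie outside the reading frame.
flanking_bp = CDNA_INSERT_BP - ORF_BP
print(codons, mature_residues, flanking_bp)
```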

  12. Human mitochondrial DNA complete amplification and sequencing: a new validated primer set that prevents nuclear DNA sequences of mitochondrial origin co-amplification.

    Science.gov (United States)

    Ramos, Amanda; Santos, Cristina; Alvarez, Luis; Nogués, Ramon; Aluja, Maria Pilar

    2009-05-01

    To date, there are no published primers to amplify the entire mitochondrial DNA (mtDNA) that completely prevent the amplification of nuclear DNA (nDNA) sequences of mitochondrial origin. The main goal of this work was to design, validate and describe a set of primers, to specifically amplify and sequence the complete human mtDNA, allowing the correct interpretation of mtDNA heteroplasmy in healthy and pathological samples. Validation was performed using two different approaches: (i) Basic Local Alignment Search Tool and (ii) amplification using isolated nDNA obtained from sperm cells by differential lyses. During the validation process, two mtDNA regions, with high similarity with nDNA, represent the major problematic areas for primer design. One of these could represent a non-published nuclear DNA sequence of mitochondrial origin. For two of the initially designed fragments, the amplification results reveal PCR artifacts that can be attributed to the poor quality of the DNA. After the validation, nine overlapping primer pairs to perform mtDNA amplification and 22 additional internal primers for mtDNA sequencing were obtained. These primers could be a useful tool in future projects that deal with mtDNA complete sequencing and heteroplasmy detection, since they represent a set of primers that have been tested for the non-amplification of nDNA.

  13. The DNA sequence of the human X chromosome

    OpenAIRE

    Ross, Mark T.; Grafham, Darren V.; Coffey, Alison J.; Scherer, Steven; McLay, Kirsten; Muzny, Donna; Platzer, Matthias; Howell, Gareth R.; Burrows, Christine; Bird, Christine P.; Frankish, Adam; Lovell, Frances L.; Howe, Kevin L.; Ashurst, Jennifer L.; Fulton, Robert S.

    2005-01-01

    The human X chromosome has a unique biology that was shaped by its evolution as the sex chromosome shared by males and females. We have determined 99.3% of the euchromatic sequence of the X chromosome. Our analysis illustrates the autosomal origin of the mammalian sex chromosomes, the stepwise process that led to the progressive loss of recombination between X and Y, and the extent of subsequent degradation of the Y chromosome. LINE1 repeat elements cover one-third of the X chromosome, with a...

  14. A human cellular sequence implicated in trk oncogene activation is DNA damage inducible

    Energy Technology Data Exchange (ETDEWEB)

    Ben-Ishai, R.; Scharf, R.; Sharon, R.; Kapten, I. (Technion-Israel Institute of Technology, Haifa (Israel))

    1990-08-01

    Xeroderma pigmentosum cells, which are deficient in the repair of UV light-induced DNA damage, have been used to clone DNA-damage-inducible transcripts in human cells. The cDNA clone designated pC-5 hybridizes on RNA gel blots to a 1-kilobase transcript, which is moderately abundant in nontreated cells and whose synthesis is enhanced in human cells following UV irradiation or treatment with several other DNA-damaging agents. UV-enhanced transcription of C-5 RNA is transient and occurs at lower fluences and to a greater extent in DNA-repair-deficient than in DNA-repair-proficient cells. Southern blot analysis indicates that the C-5 gene belongs to a multigene family. A cDNA clone containing the complete coding sequence of C-5 was isolated. Sequence analysis revealed that it is homologous to a human cellular sequence encoding the amino-terminal activating sequence of the trk-2h chimeric oncogene. The presence of DNA-damage-responsive sequences at the 5' end of a chimeric oncogene could result in enhanced expression of the oncogene in response to carcinogens.

  15. Detection of head-to-tail DNA sequences of human bocavirus in clinical samples.

    Directory of Open Access Journals (Sweden)

    Jessica Lüsebrink

    Parvoviruses are single-stranded DNA viruses that replicate by a so-called "rolling-hairpin" mechanism, a variant of the rolling-circle replication known for bacteriophages like φX174. The replication intermediates of parvoviruses are thus concatemers with a head-to-head or tail-to-tail structure. Surprisingly, in the case of the novel human bocavirus, neither head-to-head nor tail-to-tail DNA sequences were detected in clinical isolates; in contrast, head-to-tail DNA sequences were identified by PCR and sequencing. The head-to-tail sequences were linked by a novel sequence of 54 bp, of which 20 bp also occur as conserved structures of the palindromic ends of parvovirus MVC, which in turn is a close relative of human bocavirus.

  16. Recombinant human MDM2 oncoprotein shows sequence composition selectivity for binding to both RNA and DNA.

    Science.gov (United States)

    Challen, Christine; Anderson, John J; Chrzanowska-Lightowlers, Zofia M A; Lightowlers, Robert N; Lunec, John

    2012-03-01

    MDM2 is a 90 kDa nucleo-phosphoprotein that binds p53 and other proteins contributing to its oncogenic properties. Its structure includes an amino-proximal p53 binding site, a central acidic domain and a carboxy region which incorporates Zinc and Ring Finger domains suggestive of nucleic acid binding or transcription factor function. It has previously been reported that a baculovirus-expressed MDM2 protein binds RNA in a sequence-specific manner through the Ring Finger domain; however, its ability to bind DNA has yet to be examined. We report here that a bacterially expressed human MDM2 protein binds both DNA and the previously defined RNA consensus sequence. DNA binding appears selective and involves the carboxy-terminal domain of the molecule. RNA binding is inhibited by an MDM2-specific antibody, which recognises an epitope within the carboxy region of the protein. Selection cloning and sequence analysis of MDM2 DNA binding sequences, unlike RNA binding sequences, revealed no obvious DNA binding consensus sequence, but preferential binding to oligopurine:pyrimidine-rich stretches. Our results suggest that the observed preferential DNA binding may occur through the Zinc Finger or in a charge-charge interaction through the Ring Finger, thereby implying potentially different mechanisms for DNA and RNA MDM2 binding.

  17. Characterization of human chromosomal DNA sequences which replicate autonomously in Saccharomyces cerevisiae.

    Science.gov (United States)

    Montiel, J F; Norbury, C J; Tuite, M F; Dobson, M J; Mills, J S; Kingsman, A J; Kingsman, S M

    1984-01-01

    We have characterised two restriction fragments, isolated from a "shotgun" collection of human DNA, which function as autonomously replicating sequences (ARSs) in Saccharomyces cerevisiae. Functional domains of these fragments have been defined by subcloning and exonuclease (BAL 31) deletion analysis. Both fragments contain two spatially distinct domains. One is essential for high frequency transformation and is termed the Replication Sequence (RS) domain; the other, termed the Replication Enhancer (RE) domain, has no inherent replication competence but is essential for ensuring maximum function of the RS domain. The nucleotide sequence of these domains reveals several conserved sequences, one of which is strikingly similar to the yeast ARS consensus sequence. PMID:6320114

  18. Molecular cloning and nucleotide sequence of cDNA for human liver arginase

    Energy Technology Data Exchange (ETDEWEB)

    Haraguchi, Y.; Takiguchi, M.; Amaya, Y.; Kawamoto, S.; Matsuda, I.; Mori, M.

    1987-01-01

    Arginase (EC 3.5.3.1) catalyzes the last step of the urea cycle in the liver of ureotelic animals. Inherited deficiency of the enzyme results in argininemia, an autosomal recessive disorder characterized by hyperammonemia. To facilitate investigation of the enzyme and gene structures and to elucidate the nature of the mutation in argininemia, the authors isolated cDNA clones for human liver arginase. Oligo(dT)-primed and random primer human liver cDNA libraries in lambda gt11 were screened using isolated rat arginase cDNA as a probe. Two of the positive clones, designated lambda hARG6 and lambda hARG109, contained an overlapping cDNA sequence with an open reading frame encoding a polypeptide of 322 amino acid residues (predicted M_r, 34,732), a 5'-untranslated sequence of 56 base pairs, a 3'-untranslated sequence of 423 base pairs, and a poly(A) segment. Arginase activity was detected in Escherichia coli cells transformed with the plasmid carrying lambda hARG6 cDNA insert. RNA gel blot analysis of human liver RNA showed a single mRNA of 1.6 kilobases. The predicted amino acid sequence of human liver arginase is 87% and 41% identical with those of the rat liver and yeast enzymes, respectively. There are several highly conserved segments among the human, rat, and yeast enzymes.
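
    The lengths quoted in this abstract can be checked against the 1.6-kilobase mRNA with a little arithmetic. The sketch below is only an illustration: it assumes the 322 residues exclude the stop codon, and the constant and function names are hypothetical.

```python
# Figures quoted in the abstract above.
UTR5_BP = 56          # 5'-untranslated sequence
UTR3_BP = 423         # 3'-untranslated sequence
ORF_RESIDUES = 322    # amino acids encoded by the open reading frame

# Assumption (not stated in the abstract): the residue count excludes the
# stop codon, so one extra codon is added to the coding region.
STOP_CODON_BP = 3


def cdna_length_without_poly_a() -> int:
    """Return the cDNA length in base pairs, excluding the poly(A) segment."""
    coding_bp = ORF_RESIDUES * 3 + STOP_CODON_BP
    return UTR5_BP + coding_bp + UTR3_BP


if __name__ == "__main__":
    print(f"cDNA without poly(A): {cdna_length_without_poly_a()} bp")
```

    Under these assumptions the total is about 1.45 kb; the remaining difference from the 1.6-kilobase estimate would be consistent with a poly(A) tail and the limited resolution of gel-based sizing.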

  19. Discovery of human inversion polymorphisms by comparative analysis of human and chimpanzee DNA sequence assemblies.

    Directory of Open Access Journals (Sweden)

    2005-10-01

    With a draft genome-sequence assembly for the chimpanzee available, it is now possible to perform genome-wide analyses to identify, at a submicroscopic level, structural rearrangements that have occurred between chimpanzees and humans. The goal of this study was to investigate chromosomal regions that are inverted between the chimpanzee and human genomes. Using the net alignments for the builds of the human and chimpanzee genome assemblies, we identified a total of 1,576 putative regions of inverted orientation, covering more than 154 mega-bases of DNA. The DNA segments are distributed throughout the genome and range from 23 base pairs to 62 mega-bases in length. For the 66 inversions more than 25 kilobases (kb) in length, 75% were flanked on one or both sides by (often unrelated) segmental duplications. Using PCR and fluorescence in situ hybridization we experimentally validated 23 of 27 (85%) semi-randomly chosen regions; the largest novel inversion confirmed was 4.3 mega-bases at human Chromosome 7p14. Gorilla was used as an out-group to assign ancestral status to the variants. All experimentally validated inversion regions were then assayed against a panel of human samples and three of the 23 (13%) regions were found to be polymorphic in the human genome. These polymorphic inversions include 730 kb (at 7p22), 13 kb (at 7q11), and 1 kb (at 16q24) fragments with a 5%, 30%, and 48% minor allele frequency, respectively. Our results suggest that inversions are an important source of variation in primate genome evolution. The finding of at least three novel inversion polymorphisms in humans indicates this type of structural variation may be a more common feature of our genome than previously realized.

  20. Identification and complete sequencing of novel human transcripts through the use of mouse orthologs and testis cDNA sequences

    DEFF Research Database (Denmark)

    Ferreira, Elisa N; Pires, Lilian C; Parmigiani, Raphael B;

    2004-01-01

    The correct identification of all human genes, and their derived transcripts, has not yet been achieved, and it remains one of the major aims of the worldwide genomics community. Computational programs suggest the existence of 30,000 to 40,000 human genes. However, definitive gene identification can only be achieved by experimental approaches. We used two distinct methodologies, one based on the alignment of mouse orthologous sequences to the human genome, and another based on the construction of a high-quality human testis cDNA library, in an attempt to identify new human transcripts within...

  1. A comprehensive assay for targeted multiplex amplification of human DNA sequences.

    Science.gov (United States)

    Krishnakumar, Sujatha; Zheng, Jianbiao; Wilhelmy, Julie; Faham, Malek; Mindrinos, Michael; Davis, Ronald

    2008-07-01

    We developed a robust and reproducible methodology to amplify human sequences in parallel for use in downstream multiplexed sequence analyses. We call the methodology SMART (Spacer Multiplex Amplification Reaction), and it is based, in part, on padlock probe technology. As a proof of principle, we used SMART technology to simultaneously amplify 485 human exons ranging from 100 to 500 bp from human genomic DNA. In multiple repetitions, >90% of the targets were successfully amplified with a high degree of uniformity, with 70% of targets falling within a 10-fold range and all products falling within a 100-fold range of each other in abundance. We used long padlock probes (LPPs) >300 bases in length for the assay, and the increased length of these probes allowed for the capture of human sequences up to 500 bp in length, which is optimal for capturing most human exons. To engineer the LPPs, we developed a method that generates ssDNA molecules with precise ends, using an appropriately designed dsDNA template. The template has appropriate restriction sites engineered into it that can be digested to generate nucleotide overhangs that are suitable for lambda exonuclease digestion, producing a single-stranded probe from dsDNA. The SMART technology is flexible and can be easily adapted to multiplex tens of thousands of target sequences in a single reaction.

  2. Complete amino acid sequence of human intestinal aminopeptidase N as deduced from cloned cDNA

    DEFF Research Database (Denmark)

    Cowell, G M; Kønigshøfer, E; Danielsen, E M

    1988-01-01

    The complete primary structure (967 amino acids) of an intestinal human aminopeptidase N (EC 3.4.11.2) was deduced from the sequence of a cDNA clone. Aminopeptidase N is anchored to the microvillar membrane via an uncleaved signal for membrane insertion. A domain constituting amino acid 250...

  3. DNA sequencing by CE.

    Science.gov (United States)

    Karger, Barry L; Guttman, András

    2009-06-01

    Sequencing of human and other genomes has been at the center of interest in the biomedical field over the past several decades and is now leading toward an era of personalized medicine. During this time, DNA-sequencing methods have evolved from the labor-intensive slab gel electrophoresis, through automated multiCE systems using fluorophore labeling with multispectral imaging, to the "next-generation" technologies of cyclic-array, hybridization based, nanopore and single molecule sequencing. Deciphering the genetic blueprint and follow-up confirmatory sequencing of Homo sapiens and other genomes were only possible with the advent of modern sequencing technologies that were a result of step-by-step advances with a contribution of academics, medical personnel and instrument companies. While next-generation sequencing is moving ahead at breakneck speed, the multicapillary electrophoretic systems played an essential role in the sequencing of the Human Genome, the foundation of the field of genomics. In this prospective, we wish to overview the role of CE in DNA sequencing, based in part on several of our articles in this journal.

  4. Nucleotide sequence of cloned cDNA for human pancreatic kallikrein.

    Science.gov (United States)

    Fukushima, D; Kitamura, N; Nakanishi, S

    1985-12-31

    Cloned cDNA sequences for human pancreatic kallikrein have been isolated and determined by molecular cloning and sequence analysis. The identity between human pancreatic and urinary kallikreins is indicated by the complete coincidence between the amino acid sequence deduced from the cloned cDNA sequence and that reported partially for urinary kallikrein. The active enzyme form of the human pancreatic kallikrein consists of 238 amino acids and is preceded by a signal peptide and a profragment of 24 amino acids. A sequence comparison of this with other mammalian kallikreins indicates that key amino acid residues required for both serine protease activity and kallikrein-like cleavage specificity are retained in the human sequence, and residues corresponding to some external loops of the kallikrein diverge from other kallikreins. Analyses by RNA blot hybridization, primer extension, and S1 nuclease mapping indicate that the pancreatic kallikrein mRNA is also expressed in the kidney and sublingual gland, suggesting the active synthesis of urinary kallikrein in these tissues. Furthermore, the tissue-specific regulation of the expression of the members of the human kallikrein gene family has been discussed.

  5. Timing of human protein evolution as revealed by massively parallel capture of Neandertal nuclear DNA sequences

    Science.gov (United States)

    Burbano, Hernán A.; Hodges, Emily; Green, Richard E.; Briggs, Adrian W.; Krause, Johannes; Meyer, Matthias; Good, Jeffrey M.; Maricic, Tomislav; Johnson, Philipp L.F.; Xuan, Zhenyu; Rooks, Michelle; Bhattacharjee, Arindam; Brizuela, Leonardo; Albert, Frank W.; de la Rasilla, Marco; Fortea, Javier; Rosas, Antonio; Lachmann, Michael; Hannon, Gregory J.; Pääbo, Svante

    2010-01-01

    Whole genome shotgun sequencing is now possible for extinct organisms, as well as the targeted capture of specific regions. However, targeted resequencing of megabase sized parts of nuclear genomes has yet to be demonstrated for ancient DNA. Here we show that hybridization capture on microarrays can be used to generate large scale targeted data from Neandertal DNA even in the presence of ~99.8% microbial DNA. It is thus now possible to generate high quality data from large regions of the nuclear genome from Neandertals and other extinct organisms. Using this approach we have sequenced ~14,000 protein coding positions that have been inferred to have changed on the human lineage since the last common ancestor shared with chimpanzees. We identify 88 amino acid substitutions that have become fixed in all humans since the divergence from the Neandertals. PMID:20448179

  6. Sequence and transcription analysis of the human cytomegalovirus DNA polymerase gene

    Energy Technology Data Exchange (ETDEWEB)

    Kouzarides, T.; Bankier, A.T.; Satchwell, S.C.; Weston, K.; Tomlinson, P.; Barrell, B.G.

    1987-01-01

    DNA sequence analysis has revealed that the gene coding for the human cytomegalovirus (HCMV) DNA polymerase is present within the long unique region of the virus genome. Identification is based on extensive amino acid homology between the predicted HCMV open reading frame HFLF2 and the DNA polymerase of herpes simplex virus type 1. The authors present here a 5280 base-pair DNA sequence containing the HCMV pol gene, along with the analysis of transcripts encoded within this region. Since HCMV pol also shows homology to the predicted Epstein-Barr virus pol, they were able to analyze the extent of homology between the DNA polymerases of three distantly related herpes viruses, HCMV, Epstein-Barr virus, and herpes simplex virus. The comparison shows that these DNA polymerases exhibit considerable amino acid homology and highlights a number of highly conserved regions; two such regions show homology to sequences within the adenovirus type 2 DNA polymerase. The HCMV pol gene is flanked by open reading frames with homology to those of other herpes viruses; upstream, there is a reading frame homologous to the glycoprotein B gene of herpes simplex virus type I and Epstein-Barr virus, and downstream there is a reading frame homologous to BFLF2 of Epstein-Barr virus.

  7. Sequence characterization of a human embryonic craniofacial cDNA library

    Energy Technology Data Exchange (ETDEWEB)

    Padanilam, B.J.; Barsel, S.; Solursh, M. [and others]

    1994-09-01

    Broad-based sequencing approaches for the characterization of human cDNA libraries have proven successful in identifying large numbers of novel genes of specific tissue or developmental stages. To pursue our interests in human craniofacial development, we have made use of both subtracted and unsubtracted cDNA libraries constructed from embryonic craniofacial tissue obtained from pooled samples at 42-54 days gestation. Single-pass sequencing was carried out using an ABI automated sequencer and T3 or T7 primers. Sequences were characterized using BLAST and GRAIL, and the identified homologous sequences grouped according to gene class and family. Four genes have been mapped using repeat sequence elements identified in the clones. Using primers developed from sequence data, other genes are being mapped using a panel of somatic cell hybrids. To date, a total of 786 sequences have been returned with 35% identifying no homologies, and 35% with strong homologies to previously identified genes. A number of genes previously identified to play a role in human embryonic development have been returned from the sequence comparisons providing evidence that the library is representative of this tissue and stage of development. Previous characterization of the library has also identified a number of novel embryonically expressed human homeobox genes. Genes felt to be of special relevance based on their homology to characterized genes known to play a role in development or that are members of novel classes but with high scores on GRAIL searches are being characterized using whole mount in situ hybridization with mouse embryos. Characterization of the library with respect to chromosomal mapping, gene types and make-up, and embryonic expression patterns will be presented.

  8. The genome-wide DNA sequence specificity of the anti-tumour drug bleomycin in human cells.

    Science.gov (United States)

    Murray, Vincent; Chen, Jon K; Tanaka, Mark M

    2016-07-01

    The cancer chemotherapeutic agent, bleomycin, cleaves DNA at specific sites. For the first time, the genome-wide DNA sequence specificity of bleomycin breakage was determined in human cells. Utilising Illumina next-generation DNA sequencing techniques, over 200 million bleomycin cleavage sites were examined to elucidate the bleomycin genome-wide DNA selectivity. The genome-wide bleomycin cleavage data were analysed by four different methods to determine the cellular DNA sequence specificity of bleomycin strand breakage. For the most highly cleaved DNA sequences, the preferred site of bleomycin breakage was at 5'-GT* dinucleotide sequences (where the asterisk indicates the bleomycin cleavage site), with lesser cleavage at 5'-GC* dinucleotides. This investigation also determined longer bleomycin cleavage sequences, with preferred cleavage at 5'-GT*A and 5'-TGT* trinucleotide sequences, and 5'-TGT*A tetranucleotides. For cellular DNA, the hexanucleotide DNA sequence 5'-RTGT*AY (where R is a purine and Y is a pyrimidine) was the most highly cleaved DNA sequence. It was striking that alternating purine-pyrimidine sequences were highly cleaved by bleomycin. The highest intensity cleavage sites in cellular and purified DNA were very similar although there were some minor differences. Statistical nucleotide frequency analysis indicated a G nucleotide was present at the -3 position (relative to the cleavage site) in cellular DNA but was absent in purified DNA.
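
    The hexanucleotide 5'-RTGT*AY described above can be located in a sequence with a short pattern search. This is a minimal sketch rather than the authors' analysis pipeline: the function name is hypothetical, only the strand supplied is scanned, and the reported index is the T marked by the asterisk.

```python
import re

# R = purine (A or G), Y = pyrimidine (C or T).
# The lookahead lets the search report overlapping hits, should any occur.
MOTIF = re.compile(r"(?=([AG]TGTA[CT]))")


def find_rtgtay_sites(seq: str) -> list[tuple[int, str]]:
    """Return (0-based index of the asterisk-marked T, matched hexamer) pairs."""
    seq = seq.upper()
    return [(m.start() + 3, m.group(1)) for m in MOTIF.finditer(seq)]


if __name__ == "__main__":
    for position, hexamer in find_rtgtay_sites("CCATGTACGGGTGTATTT"):
        print(position, hexamer)
```

    Scanning the complementary strand as well would require running the same search on the reverse complement of the input.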

  9. Combined sequencing of mRNA and DNA from human embryonic stem cells.

    Science.gov (United States)

    Mertes, Florian; Kuhl, Heiner; Wruck, Wasco; Lehrach, Hans; Adjaye, James

    2016-06-01

    Combined transcriptome and whole genome sequencing of the same ultra-low input sample down to single cells is a rapidly evolving approach for the analysis of rare cells. Besides stem cells, rare cells originating from tissues like tumor or biopsies, circulating tumor cells and cells from early embryonic development are under investigation. Herein we describe a universal method applicable for the analysis of minute amounts of sample material (150 to 200 cells) derived from sub-colony structures from human embryonic stem cells. The protocol comprises the combined isolation and separate amplification of poly(A) mRNA and whole genome DNA followed by next generation sequencing. Here we present a detailed description of the method developed and an overview of the results obtained for RNA and whole genome sequencing of human embryonic stem cells. Sequencing data are available in the Gene Expression Omnibus (GEO) database under accession number GSE69471.

  10. Locus Reference Genomic sequences: An improved basis for describing human DNA variants

    KAUST Repository

    Dalgleish, Raymond

    2010-04-15

    As our knowledge of the complexity of gene architecture grows, and we increase our understanding of the subtleties of gene expression, the process of accurately describing disease-causing gene variants has become increasingly problematic. In part, this is due to current reference DNA sequence formats that do not fully meet present needs. Here we present the Locus Reference Genomic (LRG) sequence format, which has been designed for the specific purpose of gene variant reporting. The format builds on the successful National Center for Biotechnology Information (NCBI) RefSeqGene project and provides a single-file record containing a uniquely stable reference DNA sequence along with all relevant transcript and protein sequences essential to the description of gene variants. In principle, LRGs can be created for any organism, not just human. In addition, we recognize the need to respect legacy numbering systems for exons and amino acids and the LRG format takes account of these. We hope that widespread adoption of LRGs - which will be created and maintained by the NCBI and the European Bioinformatics Institute (EBI) - along with consistent use of the Human Genome Variation Society (HGVS)-approved variant nomenclature will reduce errors in the reporting of variants in the literature and improve communication about variants affecting human health. Further information can be found on the LRG web site (http://www.lrg-sequence.org). 2010 Dalgleish et al.; licensee BioMed Central Ltd.

  11. Sequence polymorphism of human mitochondrial DNA control region in Chinese Dongxiang unrelated individuals

    Institute of Scientific and Technical Information of China (English)

    LIU Xin-she; CHEN Teng; LI Sheng-bin

    2004-01-01

    Objective: To investigate the mitochondrial DNA sequence polymorphism in the Chinese Dongxiang ethnic group and to provide basic data for ethnic origin investigation and forensic purposes. Methods: Genomic DNA was extracted from the whole blood of 100 unrelated individuals of the Chinese Dongxiang ethnic group by the standard Chelex-100 method. The sequence polymorphism was determined by PCR amplification and direct sequencing. Results: Eighty-two polymorphic sites were identified in mtDNA D-loop region 16091-16418 np, and 88 haplotypes were found. The genetic diversity was calculated to be 0.9969, and the genetic identity was 0.0132. Conclusion: There are some particular polymorphic sites in the Chinese Dongxiang ethnic group, and these sites provide an important basis to investigate the origin of Dongxiang and the relationship between Dongxiang and other ethnic groups. The result also suggested that sequence polymorphism from 16091-16418 np in the human mitochondrial DNA control region can be a useful tool for forensic identity.
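
    Diversity and identity figures of this kind are commonly derived from haplotype frequencies. The sketch below assumes the usual definitions (identity as the sum of squared haplotype frequencies, diversity as the unbiased estimator n/(n-1) times one minus that sum); the abstract does not state which formulas were used, and the function name and sample data are hypothetical.

```python
from collections import Counter


def haplotype_stats(haplotypes):
    """Return (genetic diversity, genetic identity) for a list of haplotype labels.

    Assumed definitions, not confirmed by the abstract:
      identity  = sum of squared haplotype frequencies
      diversity = n / (n - 1) * (1 - identity)
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two individuals are required")
    counts = Counter(haplotypes)
    identity = sum((c / n) ** 2 for c in counts.values())
    diversity = n / (n - 1) * (1 - identity)
    return diversity, identity


if __name__ == "__main__":
    # Hypothetical toy sample: four individuals, three distinct haplotypes.
    diversity, identity = haplotype_stats(["H1", "H1", "H2", "H3"])
    print(f"diversity={diversity:.4f} identity={identity:.4f}")
```

    With 100 individuals carrying 88 haplotypes, most haplotypes occur once, which is why values close to 1 for diversity and close to 0 for identity are expected under these definitions.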

  12. Frequency of Epstein-Barr virus DNA sequences in human gliomas

    Directory of Open Access Journals (Sweden)

    Renata Fragelli Fonseca

    CONTEXT AND OBJECTIVE: The Epstein-Barr virus (EBV) is the most common cause of infectious mononucleosis and is also associated with several human tumors, including Burkitt's lymphoma, Hodgkin's lymphoma, some cases of gastric carcinoma and nasopharyngeal carcinoma, among other neoplasms. The aim of this study was to screen 75 primary gliomas for the presence of specific EBV DNA sequences by means of the polymerase chain reaction (PCR), with confirmation by direct sequencing. DESIGN AND SETTING: Prevalence study on EBV molecular genetics at a molecular pathology laboratory in a university hospital and at an applied genetics laboratory in a national institution. METHODS: A total of 75 primary glioma biopsies and 6 others from other tumors from the central nervous system were obtained. The tissues were immediately frozen for subsequent DNA extraction by means of traditional methods using proteinase K digestion and extraction with a phenol-chloroform-isoamyl alcohol mixture. DNA was precipitated with ethanol, resuspended in buffer and stored. The PCRs were carried out using primers for amplification of the EBV BamM region. Positive and negative controls were added to each reaction. The PCR products were used for direct sequencing for confirmation. RESULTS: The viral sequences were positive in 11/75 (14.7%) of our samples. CONCLUSION: The prevalence of EBV DNA was 11/75 (14.7%) in our glioma collection. Further molecular and epidemiological studies are needed to establish the possible role played by EBV in the tumorigenesis of gliomas.

  13. Human secreted carbonic anhydrase: cDNA cloning, nucleotide sequence, and hybridization histochemistry

    Energy Technology Data Exchange (ETDEWEB)

    Aldred, P.; Fu, Ping; Barrett, G.; Penschow, J.D.; Wright, R.D.; Coghlan, J.P.; Fernley, R.T. (The Howard Florey Institute of Experimental Physiology and Medicine, Parkville, Victoria (Australia))

    1991-01-01

    Complementary DNA clones coding for the human secreted carbonic anhydrase isozyme (CA VI) have been isolated and their nucleotide sequences determined. These clones identify a 1.45-kb mRNA that is present in high levels in parotid submandibular salivary glands but absent in other tissues such as the sublingual gland, kidney, liver, and prostate gland. Hybridization histochemistry of human salivary glands shows mRNA for CA VI located in the acinar cells of these glands. The cDNA clones encode a protein of 308 amino acids that includes a 17 amino acid leader sequence typical of secreted proteins. The mature protein has 291 amino acids compared to 259 or 260 for the cytoplasmic isozymes, with most of the extra amino acids present as a carboxyl terminal extension. In comparison, sheep CA VI has a 45 amino acid extension. Overall the human CA VI protein has a sequence identity of 35% with human CA II, while residues involved in the active site of the enzymes have been conserved. The human and sheep secreted carbonic anhydrases have a sequence identity of 72%. This includes the two cysteine residues that are known to be involved in an intramolecular disulfide bond in the sheep CA VI. The enzyme is known to be glycosylated and three potential N-glycosylation sites (Asn-X-Thr/Ser) have been identified. Two of these are known to be glycosylated in sheep CA VI. Southern analysis of human DNA indicates that there is only one gene coding for CA VI.

  14. Non-B DNA-forming sequences and WRN deficiency independently increase the frequency of base substitution in human cells

    DEFF Research Database (Denmark)

    Bacolla, Albino; Wang, Guliang; Jain, Aklank

    2011-01-01

    determined non-B DNA-induced mutation frequencies and spectra in human U2OS osteosarcoma cells and assessed the role of WRN in isogenic knockdown (WRN-KD) cells using a supF gene mutation reporter system flanked by triplex- or Z-DNA-forming sequences. Although both non-B DNA and WRN-KD served to increase...

  15. DNA sequences encoding erythropoietin

    Energy Technology Data Exchange (ETDEWEB)

    Lin, F.K.

    1987-10-27

    A purified and isolated DNA sequence is described consisting essentially of a DNA sequence encoding a polypeptide having an amino acid sequence sufficiently duplicative of that of erythropoietin to allow possession of the biological property of causing bone marrow cells to increase production of reticulocytes and red blood cells, and to increase hemoglobin synthesis or iron uptake.

  16. p21WAF1/CIP1 gene DNA sequencing and its expression in human osteosarcoma

    Institute of Scientific and Technical Information of China (English)

    廖威明; 张春林; 李佛保; 曾炳芳; 曾益新

    2004-01-01

    Background: Mutation and expression change of p21WAF1/CIP1 may play a role in the growth of osteosarcoma. This study was to investigate the expression of the p21WAF1/CIP1 gene in human osteosarcoma, p21WAF1/CIP1 gene DNA sequence change and their relationships with the phenotype and clinical prognosis. Methods: The p21WAF1/CIP1 gene in 10 normal people and the tumours of 45 osteosarcoma patients was examined using polymerase chain reaction-single strand conformation polymorphism (PCR-SSCP) with silver staining. The PCR product with an abnormal strand was sequenced directly. The p21WAF1/CIP1 gene mRNA and P21 protein of 45 cases of osteosarcoma were investigated by using in situ hybridization and immunohistochemistry, respectively. Results: The occurrence of P21 protein in osteosarcoma was 17.78% (8/45), and p21WAF1/CIP1 mRNA expression in osteosarcoma was 42.22% (19/45). DNA sequencing of the amplified products showed that in p21WAF1/CIP1 gene exon 3 of 36 cases of human osteosarcoma, there were 17 cases (47.22%) with C→T at position 609; DNA sequence analysis of 10 normal blood samples yielded 8 cases (80.00%) with C→T at the same position. Conclusions: Along with the increase of malignancy, the expression of p21WAF1/CIP1 mRNA and P21 protein in osteosarcoma tends to decrease. It is uncommon for the p21WAF1/CIP1 gene mutation to occur in human osteosarcoma. As a result, the possible existence of tumour subtypes of p21WAF1/CIP1 gene mutation should be investigated. Our research leads to the location of p21WAF1/CIP1 gene polymorphism of Chinese osteosarcoma patients, which can provide a basis for further research.

  17. HLA DNA sequence variation among human populations: molecular signatures of demographic and selective events.

    Directory of Open Access Journals (Sweden)

    Stéphane Buhler

    Molecular differences between HLA alleles vary up to 57 nucleotides within the peptide binding coding region of human Major Histocompatibility Complex (MHC) genes, but it is still unclear whether this variation results from a stochastic process or from selective constraints related to functional differences among HLA molecules. Although HLA alleles are generally treated as equidistant molecular units in population genetic studies, DNA sequence diversity among populations is also crucial to interpret the observed HLA polymorphism. In this study, we used a large dataset of 2,062 DNA sequences defined for the different HLA alleles to analyze nucleotide diversity of seven HLA genes in 23,500 individuals of about 200 populations spread worldwide. We first analyzed the HLA molecular structure and diversity of these populations in relation to geographic variation and we further investigated possible departures from selective neutrality through Tajima's tests and mismatch distributions. All results were compared to those obtained by classical approaches applied to HLA allele frequencies. Our study shows that the global patterns of HLA nucleotide diversity among populations are significantly correlated to geography, although in some specific cases the molecular information reveals unexpected genetic relationships. At all loci except HLA-DPB1, populations have accumulated a high proportion of very divergent alleles, suggesting an advantage of heterozygotes expressing molecularly distant HLA molecules (asymmetric overdominant selection model). However, both different intensities of selection and unequal levels of gene conversion may explain the heterogeneous mismatch distributions observed among the loci. Also, distinctive patterns of sequence divergence observed at the HLA-DPB1 locus suggest current neutrality but old selective pressures on this gene.
We conclude that HLA DNA sequences advantageously complement HLA allele frequencies as a source of data used
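
    The Tajima's test mentioned in this abstract compares two estimates of nucleotide diversity. The sketch below computes the standard D statistic for a set of aligned sequences; it is an illustration under simplifying assumptions (equal-length sequences, no gaps or ambiguity codes, at least four sequences), not the study's actual procedure, and the function name is hypothetical.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Return Tajima's D for aligned sequences, or None if no site is polymorphic."""
    n = len(seqs)
    if n < 4:
        raise ValueError("at least four sequences are required")
    if any(len(s) != len(seqs[0]) for s in seqs):
        raise ValueError("sequences must be aligned to equal length")

    # S: number of segregating (polymorphic) alignment columns.
    s_sites = sum(1 for column in zip(*seqs) if len(set(column)) > 1)
    if s_sites == 0:
        return None

    # pi: mean number of pairwise differences.
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)

    # Constants from Tajima's variance formula.
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    return (pi - s_sites / a1) / sqrt(e1 * s_sites + e2 * s_sites * (s_sites - 1))


if __name__ == "__main__":
    # Hypothetical toy alignments: rare variants only, then two balanced groups.
    print(tajimas_d(["AAAA", "AAAT", "AATA", "ATAA"]))
    print(tajimas_d(["AA", "AA", "TT", "TT"]))
```

    An excess of rare variants gives a negative D, while an excess of intermediate-frequency variants, as expected under balancing selection, gives a positive D.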

  18. Isolation, Mapping, DNA Sequence and RFLPs Studies of Random Single-Copy DNA Segments on Human X Chromosome

    Institute of Scientific and Technical Information of China (English)

    谭骏; 邱信芳; 薛京伦; 朱锡华; 纪贤文; 张冬梅; 秦世真

    1994-01-01

    Using the total human/mouse DNA as the probe, screening has been carried out three times with in situ plaque hybridization to obtain the single-copy DNA sequence from the human X chromosome genomic library. The effective rate of screening is 1.45%. DNAs from clones containing single-copy inserts have been analyzed by a panel of hybrid cells with or without human X chromosome. Three segments, designated DXFD52, 73, and 75, are mapped to the X chromosome. DXFD52 has been precisely localized on Xq12-q13 with in situ chromosomal hybridization. DXFD52 has been partially sequenced. The results indicate that DXFD52 is a newly isolated single-copy segment on the X chromosome. Great progress in the RFLPs study with DXFD52 has been achieved in the population of Chongqing, Sichuan Province. The results show that the DXFD52 can be used to detect the RFLP with Hind Ⅲ, Bgl Ⅱ, and Hinf Ⅰ. DXFD52 will be a potential "landmark" for the construction of the complete linkage map of human genome and the analysis of genomic s

  19. DNA sequencing conference, 2

    Energy Technology Data Exchange (ETDEWEB)

    Cook-Deegan, R.M. [Georgetown Univ., Kennedy Inst. of Ethics, Washington, DC (United States); Venter, J.C. [National Inst. of Neurological Disorders and Strokes, Bethesda, MD (United States); Gilbert, W. [Harvard Univ., Cambridge, MA (United States); Mulligan, J. [Stanford Univ., CA (United States); Mansfield, B.K. [Oak Ridge National Lab., TN (United States)

    1991-06-19

    This conference focused on DNA sequencing, genetic linkage mapping, physical mapping, informatics and bioethics. Several were used to study this sequencing and mapping. This article also discusses computer hardware and software aiding in the mapping of genes.

  20. Dragon polya spotter: Predictor of poly(A) motifs within human genomic DNA sequences

    KAUST Repository

    Kalkatawi, Manal Matoq Saeed

    2011-11-15

    Motivation: Recognition of poly(A) signals in mRNA is relatively straightforward due to the presence of an easily recognizable polyadenylic acid tail. However, the task of identifying poly(A) motifs in the primary genomic DNA sequence that correspond to poly(A) signals in mRNA is a far more challenging problem. Recognition of poly(A) signals is important for better gene annotation and understanding of the gene regulation mechanisms. In this work, we present one such poly(A) motif prediction method based on properties of human genomic DNA sequence surrounding a poly(A) motif. These properties include thermodynamic, physico-chemical and statistical characteristics. For predictions, we developed Artificial Neural Network and Random Forest models. These models are trained to recognize the 12 most common poly(A) motifs in human DNA. Our predictors are available as a free web-based tool accessible at http://cbrc.kaust.edu.sa/dps. Compared with other reported predictors, our models achieve higher sensitivity and specificity and furthermore provide a consistent level of accuracy for the 12 poly(A) motif variants. The Author(s) 2011. Published by Oxford University Press. All rights reserved.

  1. Stem-loop structures of the repetitive DNA sequences located at human centromeres

    Energy Technology Data Exchange (ETDEWEB)

    Gupta, G.; Garcia, A.E.; Ratliff, R.; Moyzis, R.K. [Los Alamos National Lab., NM (United States)]; Catasti, P.; Hong, Lin; Yau, P. [California Univ., Davis, CA (United States). Dept. of Biological Chemistry]; Bradbury, E.M. [Los Alamos National Lab., NM (United States); California Univ., Davis, CA (United States). Dept. of Biological Chemistry]

    1993-09-01

    The presence of the highly conserved repetitive DNA sequences in the human centromeres argues for a special role of these sequences in their biological functions - most likely achieved by the formation of unusual structures. This prompted us to carry out quantitative one- and two-dimensional nuclear magnetic resonance (1D/2D NMR) spectroscopy to determine the structural properties of the human centromeric repeats, d(AATGG)n·d(CCATT)n. The studies on centromeric DNAs reveal that the complementary sequence, d(AATGG)n·d(CCATT)n, adopts the usual Watson-Crick B-DNA duplex and the pyrimidine-rich d(CCATT)n strand is essentially a random coil. However, the purine-rich d(AATGG)n strand is shown to adopt unusual stem-loop structures for repeat lengths n = 2, 3, 4, and 6. In addition to normal Watson-Crick A·T pairs, the stem-loop structures are stabilized by mismatch A·G and G·G pairs in the stem and G-G-A stacking in the loop. Stem-loop structures of d(AATGG)n are independently verified by gel electrophoresis and nuclease digestion studies. Thermal melting studies show that the DNA repeats, d(AATGG)n, are as stable as the corresponding Watson-Crick duplex d(AATGG)n·d(CCATT)n. Therefore, the sequence d(AATGG)n can, indeed, nucleate a stem-loop structure at little free-energy cost and if, during mitosis, they are located on the chromosome surface they can provide specific recognition sites for kinetochore function.

  2. Rarity of DNA sequence alterations in the promoter region of the human androgen receptor gene

    Directory of Open Access Journals (Sweden)

    D.F. Cabral

    2004-12-01

    The human androgen receptor (AR) gene promoter lies in a GC-rich region containing two principal sites of transcription initiation and a putative Sp1 protein-binding site, without typical "TATA" and "CAAT" boxes. It has been suggested that mutations within the 5' untranslated region (5'UTR) may contribute to the development of prostate cancer by changing the rates of gene transcription and/or translation. In order to investigate this question, the aim of the present study was to search for the presence of mutations or polymorphisms at the AR-5'UTR in 92 prostate cancer patients, where histological diagnosis of adenocarcinoma was established in specimens obtained from transurethral resection or after prostatectomy. The AR-5'UTR was amplified by PCR from genomic DNA samples of the patients and of 100 healthy male blood donors, included as controls. Conformation-sensitive gel electrophoresis was used for DNA sequence alteration screening. Only one band shift was detected in one individual from the blood donor group. Sequencing revealed a new single nucleotide deletion (T) in the most conserved portion of the promoter region at position +36 downstream from the transcription initiation site I. Although the effect of this specific mutation remains unknown, its rarity reveals the high degree of sequence conservation of the human androgen promoter region. Moreover, the absence of detectable variation within the critical 5'UTR in prostate cancer patients indicates a low probability of its involvement in prostate cancer etiology.

  3. Vulvar carcinomas: search for sequences homologous to human papillomavirus and herpes simplex virus DNA.

    Science.gov (United States)

    Pilotti, S; Rotola, A; D'Amato, L; Di Luca, D; Shah, K V; Cassai, E; Rilke, F

    1990-07-01

    Ten cases of intraepithelial carcinoma, five with Bowenoid features and five with early invasion, and ten cases of invasive vulvar carcinoma were examined by in situ hybridization and Southern blot analysis using DNA probes for human papillomavirus (HPV) types 6, 11, 16, 18 and 31. HPV DNA was detected in 90% of the intraepithelial cases and in 10% of the invasive cases. All positive cases showed the presence of DNA of HPV type 16. The cases with intraepithelial lesions revealed a strong correlation between the presence of HPV type 16 DNA, cigarette smoking habit, other potential cofactors such as herpes simplex (HSV) DNA sequences and the use of contraceptive drugs, and clinicopathologic features of Bowen's type in situ squamous cell carcinoma. Similar associations were not observed among the cases with invasive disease. While HPV-16 is associated with differentiated Bowenoid type vulvar intraepithelial neoplasia, which appears to be the most common form of early carcinoma of the vulva, the same association was not seen with respect to advanced vulvar invasive squamous cell carcinoma.

  4. Large-scale oscillation of structure-related DNA sequence features in human chromosome 21

    Science.gov (United States)

    Li, Wentian; Miramontes, Pedro

    2006-08-01

    Human chromosome 21 is the only chromosome in the human genome that exhibits oscillation of the (G+C) content with a cycle length of hundreds of kilobases (kb) (approximately 500 kb near the right telomere). We aim to establish the existence of a similar periodicity in structure-related sequence features in order to relate this (G+C)% oscillation to other biological phenomena. The following quantities are shown to oscillate with the same 500 kb periodicity in human chromosome 21: binding energy calculated by two sets of dinucleotide-based thermodynamic parameters, AA/TT and AAA/TTT bi- and tri-nucleotide density, 5'-TA-3' dinucleotide density, and signal for 10- or 11-base periodicity of AA/TT or AAA/TTT. These intrinsic quantities are related to structural features of the double helix of DNA molecules, such as base-pair binding, untwisting or unwinding, stiffness, and a putative tendency for nucleosome formation.
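
    A (G+C) profile of the kind whose oscillation is described above is usually obtained by sliding a window along the sequence. The following is a minimal sketch under stated assumptions, not the authors' procedure: the function name `gc_content_windows` and its `window` and `step` parameters are hypothetical, and ambiguous bases such as N are simply excluded from the denominator.

```python
def gc_content_windows(seq: str, window: int, step: int) -> list[float]:
    """Return the G+C fraction of each window of length `window`.

    Windows start every `step` bases; a trailing partial window is dropped.
    Bases other than A, C, G, T (for example N) are ignored when counting.
    """
    seq = seq.upper()
    fractions = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        counted = sum(chunk.count(base) for base in "ACGT")
        if counted == 0:
            # Window made up entirely of ambiguous bases: no defined fraction.
            fractions.append(float("nan"))
        else:
            gc = chunk.count("G") + chunk.count("C")
            fractions.append(gc / counted)
    return fractions
```

    On real chromosome-scale data the window would be tens of kilobases, and a periodicity near 500 kb would then be looked for in the resulting series, for example with a Fourier transform; that analysis step is not shown.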

  5. Automated DNA Sequencing System

    Energy Technology Data Exchange (ETDEWEB)

    Armstrong, G.A.; Ekkebus, C.P.; Hauser, L.J.; Kress, R.L.; Mural, R.J.

    1999-04-25

    Oak Ridge National Laboratory (ORNL) is developing a core DNA sequencing facility to support biological research endeavors at ORNL and to conduct basic sequencing automation research. This facility is novel because its development is based on existing standard biology laboratory equipment; thus, the development process is of interest to the many small laboratories trying to use automation to control costs and increase throughput. Before automation, biology laboratory personnel purified DNA, completed cycle sequencing, and prepared 96-well sample plates with commercially available hardware designed specifically for each step in the process. Following purification and thermal cycling, an automated sequencing machine was used for the sequencing. A technician handled all movement of the 96-well sample plates between machines. To automate the process, ORNL is adding a CRS Robotics A-465 arm, ABI 377 sequencing machine, automated centrifuge, automated refrigerator, and possibly an automated SpeedVac. The entire system will be integrated with one central controller that will direct each machine and the robot. The goal of this system is to completely automate the sequencing procedure from bacterial cell samples through ready-to-be-sequenced DNA and ultimately to completed sequence. The system will be flexible and will accommodate different chemistries than existing automated sequencing lines. The system will be expanded in the future to include colony picking and/or actual sequencing. This discrete-event DNA sequencing system will demonstrate that smaller sequencing labs can achieve cost-effective automation as the laboratory grows.

  6. Sequence to Medical Phenotypes: A Framework for Interpretation of Human Whole Genome DNA Sequence Data.

    Directory of Open Access Journals (Sweden)

    Frederick E Dewey

    2015-10-01

    High throughput sequencing has facilitated a precipitous drop in the cost of genomic sequencing, prompting predictions of a revolution in medicine via genetic personalization of diagnostic and therapeutic strategies. There are significant barriers to realizing this goal that are related to the difficult task of interpreting personal genetic variation. A comprehensive, widely accessible application for interpretation of whole genome sequence data is needed. Here, we present a series of methods for identification of genetic variants and genotypes with clinical associations, phasing genetic data and using Mendelian inheritance for quality control, and providing predictive genetic information about risk for rare disease phenotypes and response to pharmacological therapy in single individuals and father-mother-child trios. We demonstrate application of these methods for disease and drug response prognostication in whole genome sequence data from twelve unrelated adults, and for disease gene discovery in one father-mother-child trio with apparently simplex congenital ventricular arrhythmia. In doing so we identify clinically actionable inherited disease risk and drug response genotypes in pre-symptomatic individuals. We also nominate a new candidate gene in congenital arrhythmia, ATP2B4, and provide experimental evidence of a regulatory role for variants discovered using this framework.

  7. Evolution of DNA sequencing

    National Research Council Canada - National Science Library

    Tipu, Hamid Nawaz; Shabbir, Ambreen

    2015-01-01

    Sanger and coworkers first introduced DNA sequencing in the 1970s. It principally relied on termination of the growing nucleotide chain when a dideoxythymidine triphosphate (ddTTP) was inserted...

  8. Gomphid DNA sequence data

    Data.gov (United States)

    U.S. Environmental Protection Agency — DNA sequence data for several genetic loci. This dataset is not publicly accessible because: It's already publicly available on GenBank. It can be accessed through...

  9. Mitochondrial DNA variant discovery and evaluation in human Cardiomyopathies through next-generation sequencing.

    Directory of Open Access Journals (Sweden)

    Michael V Zaragoza

    Mutations in mitochondrial DNA (mtDNA) may cause maternally-inherited cardiomyopathy and heart failure. In homoplasmy all mtDNA copies contain the mutation. In heteroplasmy there is a mixture of normal and mutant copies of mtDNA. The clinical phenotype of an affected individual depends on the type of genetic defect and the ratios of mutant and normal mtDNA in affected tissues. We aimed at determining the sensitivity of next-generation sequencing compared to Sanger sequencing for mutation detection in patients with mitochondrial cardiomyopathy. We studied 18 patients with mitochondrial cardiomyopathy and two with suspected mitochondrial disease. We "shotgun" sequenced PCR-amplified mtDNA and multiplexed using a single run on Roche's 454 Genome Sequencer. By mapping to the reference sequence, we obtained 1,300x average coverage per case and identified high-confidence variants. By comparing these to >400 mtDNA substitution variants detected by Sanger, we found 98% concordance in variant detection. Simulation studies showed that >95% of the homoplasmic variants were detected at a minimum sequence coverage of 20x while heteroplasmic variants required >200x coverage. Several Sanger "misses" were detected by 454 sequencing. These included the novel heteroplasmic 7501T>C in tRNA serine 1 in a patient with sudden cardiac death. These results support a potential role of next-generation sequencing in the discovery of novel mtDNA variants with heteroplasmy below the level reliably detected with Sanger sequencing. We hope that this will assist in the identification of mtDNA mutations and key genetic determinants for cardiomyopathy and mitochondrial disease.
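
    Why heteroplasmic variants need far deeper coverage than homoplasmic ones can be illustrated with a simple binomial model. This is a sketch under stated assumptions, not the simulation reported in the abstract: reads are treated as independent draws, sequencing error is ignored, and the function name `detection_probability` and the `min_reads` calling threshold are hypothetical.

```python
from math import comb


def detection_probability(coverage: int, variant_fraction: float, min_reads: int) -> float:
    """Probability of seeing at least `min_reads` variant reads at one site.

    The number of variant reads is modelled as
    Binomial(coverage, variant_fraction). Sequencing errors are ignored.
    """
    p_too_few = sum(
        comb(coverage, k)
        * variant_fraction ** k
        * (1.0 - variant_fraction) ** (coverage - k)
        for k in range(min_reads)
    )
    return 1.0 - p_too_few
```

    Under this model a homoplasmic variant (fraction 1.0) is always seen once coverage reaches the read threshold, whereas a variant present in a few percent of mtDNA copies only becomes reliably detectable as coverage grows, which is the qualitative pattern the abstract reports.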

  10. DNA sequence-dependent fluorescence of doxorubicin for turn-on detection of biothiols in human serum.

    Science.gov (United States)

    Chen, Xing; Jiang, Guimei; Wang, Zhili; Hong, Shanni; Zhang, Yuanyuan; Guo, Yahui; Cheng, Hui; Wang, Jine; Pei, Renjun

    2016-01-01

    Doxorubicin (Dox) is a DNA-targeting anthracycline antibiotic active against a wide spectrum of cancers. The interaction between Dox and double-stranded DNA (dsDNA) was used to load Dox using DNA duplexes as carriers. More importantly, the interesting DNA sequence-dependent fluorescence response of Dox could be exploited in the design of efficient Dox release systems and efficient fluorescence sensors. In this work, we demonstrated that separate introduction of G and C bases into T-rich single-stranded DNA (ssDNA) sequences afforded the best discrimination of Dox binding between dsDNA and ssDNA. For the first time, we successfully utilized this interesting DNA sequence-dependent fluorescence response of Dox as a signal transduction mechanism for the sensitive detection of biothiols in human serum. Cysteine, homocysteine, and glutathione were detected at as low as 26 nM, 37 nM, and 29 nM, respectively. The biosensors exhibited not only good selectivity, stability, and sensitivity in aqueous solutions but also a sensitive response in human serum, demonstrating their potential for diagnosis.

  11. Sequence space coverage, entropy of genomes and the potential to detect non-human DNA in human samples

    Directory of Open Access Journals (Sweden)

    Maley Carlo C

    2008-10-01

    Background: Genomes store information for building and maintaining organisms. Complete sequencing of many genomes provides the opportunity to study and compare global information properties of those genomes. Results: We have analyzed aspects of the information content of Homo sapiens, Mus musculus, Drosophila melanogaster, Caenorhabditis elegans, Arabidopsis thaliana, Saccharomyces cerevisiae, and Escherichia coli (K-12) genomes. Virtually all possible (> 98%) 12 bp oligomers appear in vertebrate genomes while 98% to D. melanogaster (12–17 bp), C. elegans (11–17 bp), A. thaliana (11–17 bp), S. cerevisiae (10–16 bp) and E. coli (9–15 bp). Frequencies of unique oligomers in the genomes follow similar patterns. We identified a set of 2.6 M 15-mers that are more than 1 nucleotide different from all 15-mers in the human genome and so could be used as probes to detect microbes in human samples. In a human sample, these probes would detect 100% of the 433 currently fully sequenced prokaryotes and 75% of the 3065 fully sequenced viruses. The human genome is significantly more compact in sequence space than a random genome. We identified the most frequent 5- to 20-mers in the human genome, which may prove useful as PCR primers. We also identified a bacterium, Anaeromyxobacter dehalogenans, which has an exceptionally low diversity of oligomers given the size of its genome and its GC content. The entropy of coding regions in the human genome is significantly higher than non-coding regions and chromosomes. However, chromosomes 1, 2, 9, 12 and 14 have a relatively high proportion of coding DNA without high entropy, and chromosome 20 is the opposite with a low frequency of coding regions but relatively high entropy. Conclusion: Measures of the frequency of oligomers are useful for designing PCR assays and for identifying chromosomes and organisms with hidden structure that had not been previously recognized. This information may be used to detect
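
    The two quantities this abstract relies on, the fraction of all possible k-mers that occur in a sequence and the entropy of the k-mer frequency distribution, can be computed as in the sketch below. It is an illustration only and assumes one particular definition of each measure (single strand, overlapping k-mers, Shannon entropy in bits); the function names are hypothetical and the paper's exact definitions may differ.

```python
from collections import Counter
from math import log2


def kmer_counts(seq: str, k: int) -> Counter:
    """Count overlapping k-mers, skipping any that contain non-ACGT bases."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts


def sequence_space_coverage(counts: Counter, k: int) -> float:
    """Fraction of the 4**k possible k-mers that occur at least once."""
    return len(counts) / 4 ** k


def shannon_entropy(counts: Counter) -> float:
    """Shannon entropy (bits) of the observed k-mer frequency distribution."""
    total = sum(counts.values())
    return -sum((c / total) * log2(c / total) for c in counts.values())
```

    For genome-scale input with k around 12 to 15, a plain `Counter` becomes memory-hungry; a bit array indexed by the 2-bit encoding of each k-mer is a common alternative when only presence or absence is needed.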

  12. Oral focal epithelial hyperplasia: report of 3 cases with human papillomavirus DNA sequencing analysis.

    Science.gov (United States)

    Gültekin, S E; Tokman Yildirim, Benay; Sarisoy, S

    2011-01-01

    Focal epithelial hyperplasia (FEH), or Heck's disease, is a benign proliferative viral infection of the oral mucosa that is related to human papillomavirus (HPV), mainly subtypes 13 and 32. Although this condition is known to exist in numerous populations and ethnic groups, the reported cases among Caucasians are relatively rare. It presents as asymptomatic papules or nodules on the oral mucosa, gingiva, tongue, and lips. Histopathologically, it is characterized by parakeratosis, epithelial hyperplasia, focal acanthosis, fusion, and horizontal outgrowth of epithelial ridges and the cells named mitozoids. The purpose of this case report was to present 3 cases of focal epithelial hyperplasia in a pediatric age group. Histopathological and clinical features of cases are discussed and DNA sequencing analysis is reported in which HPV 13, HPV 32, and HPV 11 genomes are detected.

  13. More comprehensive forensic genetic marker analyses for accurate human remains identification using massively parallel DNA sequencing.

    Science.gov (United States)

    Ambers, Angie D; Churchill, Jennifer D; King, Jonathan L; Stoljarova, Monika; Gill-King, Harrell; Assidi, Mourad; Abu-Elmagd, Muhammad; Buhmeida, Abdelbaset; Al-Qahtani, Mohammed; Budowle, Bruce

    2016-10-17

    Although the primary objective of forensic DNA analyses of unidentified human remains is positive identification, cases involving historical or archaeological skeletal remains often lack reference samples for comparison. Massively parallel sequencing (MPS) offers an opportunity to provide biometric data in such cases, and these cases provide valuable data on the feasibility of applying MPS for characterization of modern forensic casework samples. In this study, MPS was used to characterize 140-year-old human skeletal remains discovered at a historical site in Deadwood, South Dakota, United States. The remains were in an unmarked grave and there were no records or other metadata available regarding the identity of the individual. Due to the high throughput of MPS, a variety of biometric markers could be typed using a single sample. Using MPS and suitable forensic genetic markers, more relevant information could be obtained from a limited quantity and quality sample. Results were obtained for 25/26 Y-STRs, 34/34 Y SNPs, 166/166 ancestry-informative SNPs, 24/24 phenotype-informative SNPs, 102/102 human identity SNPs, 27/29 autosomal STRs (plus amelogenin), and 4/8 X-STRs (as well as ten regions of mtDNA). The Y-chromosome (Y-STR, Y-SNP) and mtDNA profiles of the unidentified skeletal remains are consistent with the R1b and H1 haplogroups, respectively. Both of these haplogroups are the most common haplogroups in Western Europe. Ancestry-informative SNP analysis also supported European ancestry. The genetic results are consistent with anthropological findings that the remains belong to a male of European ancestry (Caucasian). Phenotype-informative SNP data provided strong support that the individual had light red hair and brown eyes. This study is among the first to genetically characterize historical human remains with forensic genetic marker kits specifically designed for MPS. 
    The outcome demonstrates that substantially more genetic information can be obtained from...

  14. Poly(A) motif prediction using spectral latent features from human DNA sequences

    KAUST Repository

    Xie, Bo

    2013-06-21

    Motivation: Polyadenylation is the addition of a poly(A) tail to an RNA molecule. Identifying DNA sequence motifs that signal the addition of poly(A) tails is essential to improved genome annotation and better understanding of the regulatory mechanisms and stability of mRNA. Existing poly(A) motif predictors demonstrate that information extracted from the surrounding nucleotide sequences of candidate poly(A) motifs can differentiate true motifs from the false ones to a great extent. A variety of sophisticated features has been explored, including sequential, structural, statistical, thermodynamic and evolutionary properties. However, most of these methods involve extensive manual feature engineering, which can be time-consuming and can require in-depth domain knowledge. Results: We propose a novel machine-learning method for poly(A) motif prediction by marrying generative learning (hidden Markov models) and discriminative learning (support vector machines). Generative learning provides a rich palette on which the uncertainty and diversity of sequence information can be handled, while discriminative learning allows the performance of the classification task to be directly optimized. Here, we used hidden Markov models for fitting the DNA sequence dynamics, and developed an efficient spectral algorithm for extracting latent variable information from these models. These spectral latent features were then fed into support vector machines to fine-tune the classification performance. We evaluated our proposed method on a comprehensive human poly(A) dataset that consists of 14 740 samples from 12 of the most abundant variants of human poly(A) motifs. Compared with one of the previous state-of-the-art methods in the literature (the random forest model with expert-crafted features), our method reduces the average error rate, false-negative rate and false-positive rate by 26, 15 and 35%, respectively. Meanwhile, our method makes approximately 30% fewer error predictions relative to the other...

  15. Intergenic DNA sequences from the human X chromosome reveal high rates of global gene flow

    Directory of Open Access Journals (Sweden)

    Wall Jeffrey D

    2008-11-01

    Background: Despite intensive efforts devoted to collecting human polymorphism data, little is known about the role of gene flow in the ancestry of human populations. This is partly because most analyses have applied one of two simple models of population structure, the island model or the splitting model, which make unrealistic biological assumptions. Results: Here, we analyze 98 kb of DNA sequence from 20 independently evolving intergenic regions on the X chromosome in a sample of 90 humans from six globally diverse populations. We employ an isolation-with-migration (IM) model, which assumes that populations split and subsequently exchange migrants, to independently estimate effective population sizes and migration rates. While the maximum effective size of modern humans is estimated at ~10,000, individual populations vary substantially in size, with African populations tending to be larger (2,300–9,000) than non-African populations (300–3,300). We estimate mean rates of bidirectional gene flow at 4.8 × 10^-4/generation. Bidirectional migration rates are ~5-fold higher among non-African populations (1.5 × 10^-3) than among African populations (2.7 × 10^-4). Interestingly, because effective sizes and migration rates are inversely related in African and non-African populations, population migration rates are similar within Africa and Eurasia (e.g., global mean Nm = 2.4). Conclusion: We conclude that gene flow has played an important role in structuring global human populations and that migration rates should be incorporated as critical parameters in models of human demography.
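
    The population migration rate Nm mentioned above is the product of effective population size N and per-generation migration rate m, so large populations with low m and small populations with high m can end up with similar values. The arithmetic can be sketched as follows. Pairing the endpoints of the reported size ranges with the reported mean rates is an assumption made here purely for illustration; the study estimated these parameters jointly per population pair, and the function name is hypothetical.

```python
def population_migration_rate(effective_size: float, migration_rate: float) -> float:
    """Nm: expected number of migrant gene copies per generation."""
    return effective_size * migration_rate


# Figures taken from the abstract; combining them this way is an assumption.
AFRICAN_SIZES = (2300, 9000)
AFRICAN_RATE = 2.7e-4
NON_AFRICAN_SIZES = (300, 3300)
NON_AFRICAN_RATE = 1.5e-3

african_nm = [population_migration_rate(n, AFRICAN_RATE) for n in AFRICAN_SIZES]
non_african_nm = [population_migration_rate(n, NON_AFRICAN_RATE) for n in NON_AFRICAN_SIZES]

print("African Nm range:", african_nm)          # roughly 0.62 to 2.43
print("Non-African Nm range:", non_african_nm)  # roughly 0.45 to 4.95
```

    Both ranges overlap and bracket values of order 1, which is consistent with the abstract's statement that population migration rates are similar even though N and m individually differ several-fold.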

  16. Open chromatin encoded in DNA sequence is the signature of 'master' replication origins in human cells.

    Science.gov (United States)

    Audit, Benjamin; Zaghloul, Lamia; Vaillant, Cédric; Chevereau, Guillaume; d'Aubenton-Carafa, Yves; Thermes, Claude; Arneodo, Alain

    2009-10-01

    For years, progress in elucidating the mechanisms underlying replication initiation and its coupling to transcriptional activities and to local chromatin structure has been hampered by the small number (approximately 30) of well-established origins in the human genome and more generally in mammalian genomes. Recent in silico studies of compositional strand asymmetries revealed a high level of organization of human genes around 1000 putative replication origins. Here, by comparing with recently experimentally identified replication origins, we provide further support that these putative origins are active in vivo. We show that regions approximately 300-kb wide surrounding most of these putative replication origins that replicate early in the S phase are hypersensitive to DNase I cleavage, hypomethylated and present a significant enrichment in genomic energy barriers that impair nucleosome formation (nucleosome-free regions). This suggests that these putative replication origins are specified by an open chromatin structure favored by the DNA sequence. We discuss how this distinctive attribute makes these origins, further qualified as 'master' replication origins, privileged loci for future research to decipher the human spatio-temporal replication program. Finally, we argue that these 'master' origins are likely to play a key role in genome dynamics during evolution and in pathological situations.

  17. Human liver phosphatase 2A: cDNA and amino acid sequence of two catalytic subunit isotypes

    Energy Technology Data Exchange (ETDEWEB)

    Arino, J.; Woon, Chee Wai; Brautigan, D.L.; Miller, T.B. Jr.; Johnson, G.L. (Univ. of Massachusetts Medical School, Worcester (USA))

    1988-06-01

    Two cDNA clones were isolated from a human liver library that encode two phosphatase 2A catalytic subunits. The two cDNAs differed in eight amino acids (97% identity) with three nonconservative substitutions. All of the amino acid substitutions were clustered in the amino-terminal domain of the protein. Amino acid sequence of one human liver clone (HL-14) was identical to the rabbit skeletal muscle phosphatase 2A cDNA (with 97% nucleotide identity). The second human liver clone (HL-1) is encoded by a separate gene, and RNA gel blot analysis indicates that both mRNAs are expressed similarly in several human clonal cell lines. Sequence comparison with phosphatase 1 and 2A indicates highly divergent amino acid sequences at the amino and carboxyl termini of the proteins and identifies six highly conserved regions between the two proteins that are predicted to be important for phosphatase enzymatic activity.

  18. mtDNA-Server: next-generation sequencing data analysis of human mitochondrial DNA in the cloud

    Science.gov (United States)

    Weissensteiner, Hansi; Forer, Lukas; Fuchsberger, Christian; Schöpf, Bernd; Kloss-Brandstätter, Anita; Specht, Günther; Kronenberg, Florian; Schönherr, Sebastian

    2016-01-01

    Next generation sequencing (NGS) allows investigating mitochondrial DNA (mtDNA) characteristics such as heteroplasmy (i.e. intra-individual sequence variation) to a higher level of detail. While several pipelines for analyzing heteroplasmies exist, issues in usability, accuracy of results and interpreting final data limit their usage. Here we present mtDNA-Server, a scalable web server for the analysis of mtDNA studies of any size with a special focus on usability as well as reliable identification and quantification of heteroplasmic variants. The mtDNA-Server workflow includes parallel read alignment, heteroplasmy detection, artefact or contamination identification, variant annotation as well as several quality control metrics, often neglected in current mtDNA NGS studies. All computational steps are parallelized with Hadoop MapReduce and executed graphically with Cloudgene. We validated the underlying heteroplasmy and contamination detection model by generating four artificial sample mix-ups on two different NGS devices. Our evaluation data shows that mtDNA-Server detects heteroplasmies and artificial recombinations down to the 1% level with perfect specificity and outperforms existing approaches regarding sensitivity. mtDNA-Server is currently able to analyze the 1000G Phase 3 data (n = 2,504) in less than 5 h and is freely accessible at https://mtdna-server.uibk.ac.at. PMID:27084948

  19. mtDNA-Server: next-generation sequencing data analysis of human mitochondrial DNA in the cloud.

    Science.gov (United States)

    Weissensteiner, Hansi; Forer, Lukas; Fuchsberger, Christian; Schöpf, Bernd; Kloss-Brandstätter, Anita; Specht, Günther; Kronenberg, Florian; Schönherr, Sebastian

    2016-07-08

    Next generation sequencing (NGS) allows investigating mitochondrial DNA (mtDNA) characteristics such as heteroplasmy (i.e. intra-individual sequence variation) to a higher level of detail. While several pipelines for analyzing heteroplasmies exist, issues in usability, accuracy of results and interpreting final data limit their usage. Here we present mtDNA-Server, a scalable web server for the analysis of mtDNA studies of any size with a special focus on usability as well as reliable identification and quantification of heteroplasmic variants. The mtDNA-Server workflow includes parallel read alignment, heteroplasmy detection, artefact or contamination identification, variant annotation as well as several quality control metrics, often neglected in current mtDNA NGS studies. All computational steps are parallelized with Hadoop MapReduce and executed graphically with Cloudgene. We validated the underlying heteroplasmy and contamination detection model by generating four artificial sample mix-ups on two different NGS devices. Our evaluation data shows that mtDNA-Server detects heteroplasmies and artificial recombinations down to the 1% level with perfect specificity and outperforms existing approaches regarding sensitivity. mtDNA-Server is currently able to analyze the 1000G Phase 3 data (n = 2,504) in less than 5 h and is freely accessible at https://mtdna-server.uibk.ac.at.
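
    The core idea of heteroplasmy quantification at a single mtDNA position, comparing the second most frequent base against total coverage, can be sketched as below. This is not mtDNA-Server's model, which according to the abstract also handles artefacts and contamination; the function name, the `min_coverage` default and the use of a bare 1% cut-off without strand or quality checks are all assumptions for illustration.

```python
from typing import Dict, Optional, Tuple


def call_heteroplasmy(
    base_counts: Dict[str, int],
    min_fraction: float = 0.01,   # 1% level, as quoted in the abstract
    min_coverage: int = 100,      # assumption: skip poorly covered sites
) -> Optional[Tuple[str, str, float]]:
    """Return (major base, minor base, minor fraction) or None.

    `base_counts` maps each observed base to its read count at one position.
    """
    coverage = sum(base_counts.values())
    if coverage < min_coverage or len(base_counts) < 2:
        return None
    # Rank bases by read count, most frequent first.
    ranked = sorted(base_counts.items(), key=lambda item: item[1], reverse=True)
    (major, _), (minor, minor_count) = ranked[0], ranked[1]
    fraction = minor_count / coverage
    if fraction < min_fraction:
        return None
    return major, minor, fraction
```

    A production caller would additionally require support on both strands and model the per-base error rate, since at the 1% level sequencing errors and true low-level variants are otherwise hard to tell apart.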

  20. Analysis of human mitochondrial DNA sequences from fecally polluted environmental waters as a tool to study population diversity

    Directory of Open Access Journals (Sweden)

    Vikram Kapoor

    2017-05-01

    Mitochondrial signature sequences have frequently been used to study human population diversity around the world. Traditionally, this requires obtaining samples directly from individuals, which is cumbersome, time consuming and limited to the number of individuals that participated in these types of surveys. Here, we used environmental DNA extracts to determine the presence and sequence variability of human mitochondrial sequences as a means to study the diversity of populations inhabiting areas near a tropical watershed impacted with human fecal pollution. We used high-throughput sequencing (Illumina) and barcoding to obtain thousands of sequences from the mitochondrial hypervariable region 2 (HVR2) and determined the different haplotypes present in 10 different water samples. Sequence analyses indicated a total of 19 distinct variants with frequency greater than 5%. The HVR2 sequences were associated with haplogroups of West Eurasian (57.6%), Sub-Saharan African (23.9%), and American Indian (11%) ancestry. This was in relative accordance with population census data from the watershed sites. The results from this study demonstrate the potential value of mitochondrial sequence data retrieved from fecally impacted environmental waters to study the population diversity of local municipalities. This environmental DNA approach may also have other public health implications such as tracking background levels of human mitochondrial genes associated with diseases. It may be possible to expand this approach to other animal species inhabiting or using natural water systems.

  1. PCR-Free Enrichment of Mitochondrial DNA from Human Blood and Cell Lines for High Quality Next-Generation DNA Sequencing.

    Directory of Open Access Journals (Sweden)

    Meetha P Gould

    Recent advances in sequencing technology allow for accurate detection of mitochondrial sequence variants, even those in low abundance at heteroplasmic sites. Considerable sequencing cost savings can be achieved by enriching samples for mitochondrial (relative to nuclear) DNA. Reduction in nuclear DNA (nDNA) content can also help to avoid false positive variants resulting from nuclear mitochondrial sequences (numts). We isolate intact mitochondrial organelles from both human cell lines and blood components using two separate methods: a magnetic bead binding protocol and differential centrifugation. DNA is extracted and further enriched for mitochondrial DNA (mtDNA) by an enzyme digest. Only 1 ng of the purified DNA is necessary for library preparation and next generation sequence (NGS) analysis. Enrichment methods are assessed and compared using mtDNA (versus nDNA) content as a metric, measured by using real-time quantitative PCR and NGS read analysis. Among the various strategies examined, the optimal is differential centrifugation isolation followed by exonuclease digest. This strategy yields >35% mtDNA reads in blood and cell lines, which corresponds to hundreds-fold enrichment over baseline. The strategy also avoids false variant calls that, as we show, can be induced by the long-range PCR approaches that are the current standard in enrichment procedures. This optimization procedure allows mtDNA enrichment for efficient and accurate massively parallel sequencing, enabling NGS from samples with small amounts of starting material. This will decrease costs by increasing the number of samples that may be multiplexed, ultimately facilitating efforts to better understand mitochondria-related diseases.

  2. Extensive sequence-influenced DNA methylation polymorphism in the human genome

    OpenAIRE

    Hellman Asaf; Chess Andrew

    2010-01-01

    Abstract Background Epigenetic polymorphisms are a potential source of human diversity, but their frequency and relationship to genetic polymorphisms are unclear. DNA methylation, an epigenetic mark that is a covalent modification of the DNA itself, plays an important role in the regulation of gene expression. Most studies of DNA methylation in mammalian cells have focused on CpG methylation present in CpG islands (areas of concentrated CpGs often found near promoters), but there are also int...

  3. Proceedings of the relevance of mass spectrometry to DNA sequence determination: Research needs for the Human Genome Program

    Energy Technology Data Exchange (ETDEWEB)

    Edmonds, C.G.; Smith, R.D. (Pacific Northwest Lab., Richland, WA (USA)); Smith, L.M. (Wisconsin Univ., Madison, WI (USA))

    1990-11-01

    A workshop was sponsored for the US Department of Energy (DOE), Office of Health and Environmental Research by Pacific Northwest Laboratory, April 4--5, 1990, in Seattle, Washington, to examine the potential role of mass spectrometry in the joint DOE/National Institutes of Health (NIH) Human Genome Program. The workshop was occasioned by recent developments in mass spectrometry that are providing new levels for selectivity, sensitivity, and, in particular, new methods of ionization appropriate for large biopolymers such as DNA. During discussions, three general mass spectrometric approaches to the determination of DNA sequence were considered: (1) the mass spectrometric detection of isotopic labels from DNA sequencing mixtures separated using gel electrophoresis, (2) the direct mass spectrometric analysis from direct ionization of unfractionated sequencing mixtures where the measured mass of the constituents functions to identify and order the base sequence (replacing separation by gel electrophoresis), and (3) an approach in which a single highly charged molecular ion of a large DNA segment produced is rapidly sequenced in an ion cyclotron resonance ion trap. The consensus of the workshop was that, on the basis of the new developments, mass spectrometry has the potential to provide the substantial increases in sequencing speed required for the Human Genome Program. 66 refs., 3 tabs.

  5. A Mini-Library of Sequenced Human DNA Fragments: Linking Bench Experiments with Informatics

    Science.gov (United States)

    Dalgleish, Raymond; Shanks, Morag E.; Monger, Karen; Butler, Nicola J.

    2012-01-01

    We describe the development of a mini-library of human DNA fragments for use in an enquiry-based learning (EBL) undergraduate practical incorporating "wet-lab" and bioinformatics tasks. In spite of the widespread emergence of the polymerase chain reaction (PCR), the cloning and analysis of DNA fragments in "Escherichia coli"…

  7. Fragmentation of contaminant and endogenous DNA in ancient samples determined by shotgun sequencing; prospects for human palaeogenomics.

    Directory of Open Access Journals (Sweden)

    Marc García-Garcerà

    BACKGROUND: Despite the successful retrieval of genomes from past remains, the prospects for human palaeogenomics remain unclear because of the difficulty of distinguishing contaminant from endogenous DNA sequences. Previous sequence data generated on high-throughput sequencing platforms indicate that fragmentation of ancient DNA sequences is a characteristic trait primarily arising due to depurination processes that create abasic sites leading to DNA breaks. METHODOLOGY/PRINCIPAL FINDINGS: To investigate whether this pattern is present in ancient remains from a temperate environment, we have 454-FLX pyrosequenced different samples dated between 5,500 and 49,000 years ago: a bone from an extinct goat (Myotragus balearicus) that was treated with a depurinating agent (bleach), an Iberian lynx bone not subjected to any treatment, a human Neolithic sample from Barcelona (Spain), and a Neandertal sample from the El Sidrón site (Asturias, Spain). The efficiency of retrieval of endogenous sequences is below 1% in all cases. We have used the non-human samples to identify human sequences (0.35 and 1.4%, respectively) that we positively know are contaminants. CONCLUSIONS: We observed that bleach treatment appears to create a depurination-associated fragmentation pattern in resulting contaminant sequences that is indistinguishable from previously described endogenous sequences. Furthermore, the nucleotide composition pattern observed in 5' and 3' ends of contaminant sequences is much more complex than the flat pattern previously described in some Neandertal contaminants. Although much research on samples with known contaminant histories is needed, our results suggest that endogenous and contaminant sequences cannot be distinguished by the fragmentation pattern alone.

  8. Sequencing the hypervariable regions of human mitochondrial DNA using massively parallel sequencing: Enhanced data acquisition for DNA samples encountered in forensic testing.

    Science.gov (United States)

    Davis, Carey; Peters, Dixie; Warshauer, David; King, Jonathan; Budowle, Bruce

    2015-03-01

    Mitochondrial DNA testing is a useful tool in the analysis of forensic biological evidence. In cases where nuclear DNA is damaged or limited in quantity, the higher copy number of mitochondrial genomes available in a sample can provide information about the source of a sample. Currently, Sanger-type sequencing (STS) is the primary method to develop mitochondrial DNA profiles. This method is laborious and time consuming. Massively parallel sequencing (MPS) can increase the amount of information obtained from mitochondrial DNA samples while improving turnaround time by decreasing the numbers of manipulations and more so by exploiting high throughput analyses to obtain interpretable results. In this study 18 buccal swabs, three different tissue samples from five individuals, and four bone samples from casework were sequenced at hypervariable regions I and II using STS and MPS. Sample enrichment for STS and MPS was PCR-based. Library preparation for MPS was performed using Nextera® XT DNA Sample Preparation Kit and sequencing was performed on the MiSeq™ (Illumina, Inc.). MPS yielded full concordance of base calls with STS results, and the newer methodology was able to resolve length heteroplasmy in homopolymeric regions. This study demonstrates that short amplicon MPS of mitochondrial DNA is feasible, can provide information not possible with STS, and lays the groundwork for development of a whole genome sequencing strategy for degraded samples.

  9. Frequent mutations in EGFR, KRAS and TP53 genes in human lung cancer tumors detected by ion torrent DNA sequencing.

    Directory of Open Access Journals (Sweden)

    Xin Cai

    Lung cancer is the most common malignancy and the leading cause of cancer deaths worldwide. While smoking is by far the leading cause of lung cancer, other environmental and genetic factors influence the development and progression of the cancer. Since unique mutation patterns have been observed in individual cancer samples, identification and characterization of the distinctive lung cancer molecular profile is essential for developing more effective, tailored therapies. Until recently, personalized DNA sequencing to identify genetic mutations in cancer was impractical and expensive. The recent technological advancements in next-generation DNA sequencing, such as the semiconductor-based Ion Torrent sequencing platform, have made DNA sequencing cost and time effective with more reliable results. Using the Ion Torrent Ampliseq Cancer Panel, we sequenced 737 loci from 45 cancer-related genes to identify genetic mutations in 76 human lung cancer samples. The sequencing analysis revealed missense mutations in KRAS, EGFR, and TP53 genes in the lung cancer samples of various histologic types. Thus, this study demonstrates the necessity of sequencing individual human cancers in order to develop personalized drugs or combination therapies to effectively target individual, lung cancer-specific mutations.

  10. PIK3CA and TP53 gene mutations in human breast cancer tumors frequently detected by ion torrent DNA sequencing.

    Directory of Open Access Journals (Sweden)

    Xusheng Bai

    Breast cancer is the most common malignancy and the leading cause of cancer deaths in women worldwide. While specific genetic mutations have been linked to 5-10% of breast cancer cases, other environmental and epigenetic factors influence the development and progression of the cancer. Since unique mutation patterns have been observed in individual cancer samples, identification and characterization of the distinctive breast cancer molecular profile is needed to develop more effective targeted therapies. Until recently, identifying genetic cancer mutations via personalized DNA sequencing was impractical and expensive. The recent technological advancements in next-generation DNA sequencing, such as the semiconductor-based Ion Torrent sequencing platform, have made DNA sequencing cost and time effective with more reliable results. Using the Ion Torrent Ampliseq Cancer Panel, we sequenced 737 loci from 45 cancer-related genes to identify genetic mutations in 105 human breast cancer samples. The sequencing analysis revealed missense mutations in PIK3CA and TP53 genes in the breast cancer samples of various histologic types. Thus, this study demonstrates the necessity of sequencing individual human cancers in order to develop personalized drugs or combination therapies to effectively target individual, breast cancer-specific mutations.

  11. Interspecies hybridization on DNA resequencing microarrays: efficiency of sequence recovery and accuracy of SNP detection in human, ape, and codfish mitochondrial DNA genomes sequenced on a human-specific MitoChip

    Directory of Open Access Journals (Sweden)

    Carr Steven M

    2007-09-01

    Abstract Background Iterative DNA "resequencing" on oligonucleotide microarrays offers a high-throughput method to measure intraspecific biodiversity, one that is especially suited to SNP-dense gene regions such as vertebrate mitochondrial (mtDNA) genomes. However, costs of single-species design and microarray fabrication are prohibitive. A cost-effective, multi-species strategy is to hybridize experimental DNAs from diverse species to a common microarray that is tiled with oligonucleotide sets from multiple, homologous reference genomes. Such a strategy requires that cross-hybridization between the experimental DNAs and reference oligos from the different species not interfere with the accurate recovery of species-specific data. To determine the pattern and limits of such interspecific hybridization, we compared the efficiency of sequence recovery and accuracy of SNP identification by a 15,452-base human-specific microarray challenged with human, chimpanzee, gorilla, and codfish mtDNA genomes. Results In the human genome, 99.67% of the sequence was recovered with 100.0% accuracy. Accuracy of SNP identification declines log-linearly with sequence divergence from the reference, from 0.067 to 0.247 errors per SNP in the chimpanzee and gorilla genomes, respectively. Efficiency of sequence recovery declines with the increase of the number of interspecific SNPs in the 25b interval tiled by the reference oligonucleotides. In the gorilla genome, which differs from the human reference by 10%, and in which 46% of these 25b regions contain 3 or more SNP differences from the reference, only 88% of the sequence is recoverable. In the codfish genome, which differs from the reference by > 30%, less than 4% of the sequence is recoverable, in short islands ≥ 12b that are conserved between primates and fish. Conclusion Experimental DNAs bind inefficiently to homologous reference oligonucleotide sets on a re-sequencing microarray when their sequences differ by

  12. An Internet-Accessible DNA Sequence Database for Identifying Fusaria from Human and Animal Infections

    Science.gov (United States)

    Because less than one-third of clinically relevant fusaria can be accurately identified to species level using phenotypic data (i.e., morphological species recognition), we constructed a three-locus DNA sequence database to facilitate molecular identification of the 69 Fusarium species associated wi...

  13. An internet-accessible DNA sequence database for identifying fusaria from human and animal infections

    NARCIS (Netherlands)

    O'Donnell, K.; Sutton, D.A.; Rinaldi, M.G.; Sarver, B.A.J.; Balajee, S.A.; Schroers, H.J.; Summerbell, R.C.; Robert, V.A.R.G.; Crous, P.W.; Zhang, N.; Aoki, T.; Jung, K.; Park, J.; Lee, Y.H.; Kang, S.; Park, B.; Geiser, D.M.

    2010-01-01

    Because less than one-third of clinically relevant fusaria can be accurately identified to species level using phenotypic data (i.e., morphological species recognition), we constructed a three-locus DNA sequence database to facilitate molecular identification of the 69 Fusarium species associated wi

  14. Extensive sequence-influenced DNA methylation polymorphism in the human genome

    Directory of Open Access Journals (Sweden)

    Hellman Asaf

    2010-05-01

    Abstract Background Epigenetic polymorphisms are a potential source of human diversity, but their frequency and relationship to genetic polymorphisms are unclear. DNA methylation, an epigenetic mark that is a covalent modification of the DNA itself, plays an important role in the regulation of gene expression. Most studies of DNA methylation in mammalian cells have focused on CpG methylation present in CpG islands (areas of concentrated CpGs often found near promoters), but there are also interesting patterns of CpG methylation found outside of CpG islands. Results We compared DNA methylation patterns on both alleles between many pairs (and larger groups) of related and unrelated individuals. Direct observation and simulation experiments revealed that around 10% of common single nucleotide polymorphisms (SNPs) reside in regions with differences in the propensity for local DNA methylation between the two alleles. We further showed that for the most common form of SNP, a polymorphism at a CpG dinucleotide, the presence of the CpG at the SNP positively affected local DNA methylation in cis. Conclusions Taken together with the known effect of DNA methylation on mutation rate, our results suggest an interesting interdependence between genetics and epigenetics underlying diversity in the human genome.

  15. DNA Sequencing Using Capillary Electrophoresis

    Energy Technology Data Exchange (ETDEWEB)

    Dr. Barry Karger

    2011-05-09

    The overall goal of this program was to develop capillary electrophoresis as the tool to be used to sequence for the first time the Human Genome. Our program was part of the Human Genome Project. In this work, we were highly successful and the replaceable polymer we developed, linear polyacrylamide, was used by the DOE sequencing lab in California to sequence a significant portion of the human genome using the MegaBase multiple capillary array electrophoresis instrument. In this final report, we summarize our efforts and success. We began our work by using capillary electrophoresis to separate double strand oligonucleotides on cross-linked polyacrylamide gels in fused silica capillaries. This work showed the potential of the methodology. However, preparation of such cross-linked gel capillaries was difficult with poor reproducibility, and even more important, the columns were not very stable. We improved stability by using non-cross linked linear polyacrylamide. Here, the entangled linear chains could move when osmotic pressure (e.g. sample injection) was imposed on the polymer matrix. This relaxation of the polymer dissipated the stress in the column. Our next advance was to use significantly lower concentrations of the linear polyacrylamide so that the polymer could be automatically blown out after each run and replaced with fresh linear polymer solution. In this way, a new column was available for each analytical run. Finally, while testing many linear polymers, we selected linear polyacrylamide as the best matrix as it was the most hydrophilic polymer available. Under our DOE program, we demonstrated initially the success of the linear polyacrylamide to separate double strand DNA. We note that the method is used even today to assay purity of double stranded DNA fragments. Our focus, of course, was on the separation of single stranded DNA for sequencing purposes. In one paper, we demonstrated the success of our approach in sequencing up to 500 bases. Other

  16. Validated primer set that prevents nuclear DNA sequences of mitochondrial origin co-amplification: a revision based on the New Human Genome Reference Sequence (GRCh37).

    Science.gov (United States)

    Ramos, Amanda; Santos, Cristina; Barbena, Elena; Mateiu, Ligia; Alvarez, Luis; Nogués, Ramon; Aluja, Maria Pilar

    2011-03-01

    A new human genome reference sequence--GRCh37--was recently generated and made available by the Genome Reference Consortium. Since the previously available human reference sequence--hg18--was used for the mitochondrial DNA primer BLAST validation, a revision of those previously published primer pairs is required. Thus, the aim of this Short Communication is to perform an in silico BLAST test of the nine published primer pairs using the new human reference sequence and to report the pertinent modifications. The new analysis showed that one of the tested primer pairs requires a revision. Therefore, a new validated primer pair, which specifically amplifies the mitochondrial region located between positions 6520 and 9184, is presented.

  17. Simultaneous detection of human mitochondrial DNA and nuclear-inserted mitochondrial-origin sequences (NumtS) using forensic mtDNA amplification strategies and pyrosequencing technology.

    Science.gov (United States)

    Bintz, Brittania J; Dixon, Groves B; Wilson, Mark R

    2014-07-01

    Next-generation sequencing technologies enable the identification of minor mitochondrial DNA variants with higher sensitivity than Sanger methods, allowing for enhanced identification of minor variants. In this study, mixtures of human mtDNA control region amplicons were subjected to pyrosequencing to determine the detection threshold of the Roche GS Junior® instrument (Roche Applied Science, Indianapolis, IN). In addition to expected variants, a set of reproducible variants was consistently found in reads from one particular amplicon. A BLASTn search of the variant sequence revealed identity to a segment of a 611-bp nuclear insertion of the mitochondrial control region (NumtS) spanning the primer-binding sites of this amplicon (Nature 1995;378:489). Primers (Hum Genet 2012;131:757; Hum Biol 1996;68:847) flanking the insertion were used to confirm the presence or absence of the NumtS in buccal DNA extracts from twenty donors. These results further our understanding of human mtDNA variation and are expected to have a positive impact on the interpretation of mtDNA profiles using deep-sequencing methods in casework.

  18. Information Theory of DNA Sequencing

    CERN Document Server

    Motahari, Abolfazl; Tse, David

    2012-01-01

    DNA sequencing is the basic workhorse of modern day biology and medicine. Shotgun sequencing is the dominant technique used: many randomly located short fragments called reads are extracted from the DNA sequence, and these reads are assembled to reconstruct the original sequence. By drawing an analogy between the DNA sequencing problem and the classic communication problem, we define an information theoretic notion of sequencing capacity. This is the maximum number of DNA base pairs that can be resolved reliably per read, and provides a fundamental limit to the performance that can be achieved by any assembly algorithm. We compute the sequencing capacity explicitly for a simple statistical model of the DNA sequence and the read process. Using this framework, we also study the impact of noise in the read process on the sequencing capacity.
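
The read process described in this abstract (many short reads taken from random locations of the sequence) can be illustrated with a small simulation. This is only a sketch under simplifying assumptions: reads are noise-free, all of one length, and start positions are uniform. It is not the paper's sequencing-capacity calculation, and the names `sample_reads` and `covered_fraction` are made up for the example. The start positions are kept here purely to measure coverage; a real assembler does not know them.

```python
import random

def sample_reads(sequence, read_len, n_reads, rng):
    """Draw n_reads noise-free reads of length read_len from uniformly
    random start positions. Returns a list of (start, read) pairs."""
    if read_len <= 0 or read_len > len(sequence):
        raise ValueError("read_len must be between 1 and len(sequence)")
    max_start = len(sequence) - read_len
    starts = [rng.randint(0, max_start) for _ in range(n_reads)]
    return [(s, sequence[s:s + read_len]) for s in starts]

def covered_fraction(seq_len, reads):
    """Fraction of positions in the sequence touched by at least one read."""
    covered = [False] * seq_len
    for start, read in reads:
        for i in range(start, start + len(read)):
            covered[i] = True
    return sum(covered) / seq_len

rng = random.Random(0)  # fixed seed so the run is repeatable
genome = "".join(rng.choice("ACGT") for _ in range(1000))
reads = sample_reads(genome, read_len=50, n_reads=200, rng=rng)
print(f"fraction of bases covered: {covered_fraction(len(genome), reads):.3f}")
```

Full coverage is necessary but not sufficient for reconstruction: reads must also overlap enough to be ordered unambiguously, which is the kind of condition the information-theoretic analysis makes precise.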

  19. Clinical identification of bacteria in human chronic wound infections: culturing vs. 16S ribosomal DNA sequencing

    Directory of Open Access Journals (Sweden)

    Rhoads Daniel D

    2012-11-01

    Abstract Background Chronic wounds affect millions of people and cost billions of dollars in the United States each year. These wounds harbor polymicrobial biofilm communities, which can be difficult to elucidate using culturing methods. Clinical molecular microbiological methods are increasingly being employed to investigate the microbiota of chronic infections, including wounds, as part of standard patient care. However, molecular testing is more sensitive than culturing, which results in markedly different results being reported to clinicians. This study compares the results of aerobic culturing and molecular testing (culture-free 16S ribosomal DNA sequencing), and it examines the relative abundance score that is generated by the molecular test and the usefulness of the relative abundance score in predicting the likelihood that the same organism would be detected by culture. Methods Parallel samples from 51 chronic wounds were studied using aerobic culturing and 16S DNA sequencing for the identification of bacteria. Results One hundred forty-five (145) unique genera were identified using molecular methods, and 68 of these genera were aerotolerant. Fourteen (14) unique genera were identified using aerobic culture methods. One-third (31/92) of the cultures were determined to be [...] Staphylococcus aureus, Pseudomonas aeruginosa, and Enterococcus faecalis with higher relative abundance scores were more likely to be detected by culture as demonstrated with regression modeling. Conclusion Discordance between molecular and culture testing is often observed. However, culture-free 16S ribosomal DNA sequencing and its relative abundance score can provide clinicians with insight into which bacteria are most abundant in a sample and which are most likely to be detected by culture.

  20. Cloning and comparative mapping of a human chromosome 4-specific alpha satellite DNA sequence.

    Science.gov (United States)

    D'Aiuto, L; Antonacci, R; Marzella, R; Archidiacono, N; Rocchi, M

    1993-11-01

    We have isolated and characterized two human alphoid DNA clones: p4n1/4 and pZ4.1. Clone p4n1/4 identifies specifically the centromeric region of chromosome 4; pZ4.1 recognizes a subset of alphoid DNA shared by chromosomes 4 and 9. The specificity was determined using fluorescence in situ hybridization experiments on metaphase spreads and Southern blotting analysis of human-hamster somatic cell hybrids. The genomic organization of both subsets was also investigated. Comparative mapping on chimpanzee and gorilla chromosomes was performed. p4n1/4 hybridizes to chimpanzee chromosomes 11 and 13, homologs of human chromosomes 9 and 2q, respectively. On gorilla metaphase spreads, p4n1/4 hybridizes exclusively to the centromeric region of chromosome 19, partially homologous to human chromosome 17. No hybridization signal was detected on chromosome 3 of both chimpanzee and gorilla, in both species homolog of human chromosome 4. Identical comparative mapping results were obtained using pZ4.1 probe, although the latter recognizes an alphoid subset distinct from the one recognized by p4n1/4. The implications of these results in the evolution of centromeric regions of primate chromosomes are discussed.

  1. Cloning and comparative mapping of a human chromosome 4-specific alpha satellite DNA sequence

    Energy Technology Data Exchange (ETDEWEB)

    D'Aiuto, L.; Marzella, R.; Archidiacono, N.; Rocchi, M. (Universita di Bari (Italy)); Antonacci, R. (Instituto Anatomia Umana Normale, Modena (Italy))

    1993-11-01

    The authors have isolated and characterized two human alphoid DNA clones: p4n1/4 and pZ4.1. Clone p4n1/4 identifies specifically the centromeric region of chromosome 4; pZ4.1 recognizes a subset of alphoid DNA shared by chromosomes 4 and 9. The specificity was determined using fluorescence in situ hybridization experiments on metaphase spreads and Southern blotting analysis of human-hamster somatic cell hybrids. The genomic organization of both subsets was also investigated. Comparative mapping on chimpanzee and gorilla chromosomes was performed. p4n1/4 hybridizes to chimpanzee chromosomes 11 and 13, homologs of human chromosomes 9 and 2q, respectively. On gorilla metaphase spreads, p4n1/4 hybridizes exclusively to the centromeric region of chromosome 19, partially homologous to human chromosome 17. No hybridization signal was detected on chromosome 3 of both chimpanzee and gorilla, in both species homolog of human chromosome 4. Identical comparative mapping results were obtained using pZ4.1 probe, although the latter recognizes an alphoid subset distinct from the one recognized by p4n1/4. The implications of these results in the evolution of centromeric regions of primate chromosomes are discussed. 33 refs., 4 figs.

  2. Comparative analysis of human mitochondrial DNA from World War I bone samples by DNA sequencing and ESI-TOF mass spectrometry.

    Science.gov (United States)

    Howard, Rebecca; Encheva, Vesela; Thomson, Jim; Bache, Katherine; Chan, Yuen-Ting; Cowen, Simon; Debenham, Paul; Dixon, Alan; Krause, Jens-Uwe; Krishan, Elaina; Moore, Daniel; Moore, Victoria; Ojo, Michael; Rodrigues, Sid; Stokes, Peter; Walker, James; Zimmermann, Wolfgang; Barallon, Rita

    2013-01-01

    Mitochondrial DNA is commonly used in identity testing for the analysis of old or degraded samples or to give evidence of familial links. The Abbott T5000 mass spectrometry platform provides an alternative to the more commonly used Sanger sequencing for the analysis of human mitochondrial DNA. The robustness of the T5000 system has previously been demonstrated using DNA extracted from volunteer buccal swabs but the system has not been tested using more challenging sample types. For mass spectrometry to be considered as a valid alternative to Sanger sequencing it must also be demonstrated to be suitable for use with more limiting sample types such as old teeth, bone fragments, and hair shafts. In 2009 the Commonwealth War Graves Commission launched a project to identify the remains of 250 World War I soldiers discovered in a mass grave in Fromelles, France. This study characterises the performance of both Sanger sequencing and the T5000 platform for the analysis of the mitochondrial DNA extracted from 225 of these remains, both in terms of the ability to amplify and characterise DNA regions of interest and the relative information content and ease-of-use associated with each method.

  3. Comparison of Boiling and Robotics Automation Method in DNA Extraction for Metagenomic Sequencing of Human Oral Microbes.

    Science.gov (United States)

    Yamagishi, Junya; Sato, Yukuto; Shinozaki, Natsuko; Ye, Bin; Tsuboi, Akito; Nagasaki, Masao; Yamashita, Riu

    2016-01-01

    The rapid improvement of next-generation sequencing performance now enables us to analyze huge sample sets with more than ten thousand specimens. However, DNA extraction can still be a limiting step in such metagenomic approaches. In this study, we analyzed human oral microbes to compare the performance of three DNA extraction methods: PowerSoil (a method widely used in this field), QIAsymphony (a robotics method), and a simple boiling method. Dental plaque was initially collected from three volunteers in the pilot study and then expanded to 12 volunteers in the follow-up study. Bacterial flora was estimated by sequencing the V4 region of 16S rRNA following species-level profiling. Our results indicate that the efficiency of PowerSoil and QIAsymphony was comparable to the boiling method. Therefore, the boiling method may be a promising alternative because of its simplicity, cost effectiveness, and short handling time. Moreover, this method was reliable for estimating bacterial species and could be used in the future to examine the correlation between oral flora and health status. Despite this, differences in the efficiency of DNA extraction for various bacterial species were observed among the three methods. Based on these findings, there is no "gold standard" for DNA extraction. In future, we suggest that the DNA extraction method should be selected on a case-by-case basis considering the aims and specimens of the study.

  4. Interference of Co-amplified nuclear mitochondrial DNA sequences on the determination of human mtDNA heteroplasmy by Using the SURVEYOR nuclease and the WAVE HS system.

    Science.gov (United States)

    Yen, Hsiu-Chuan; Li, Shiue-Li; Hsu, Wei-Chien; Tang, Petrus

    2014-01-01

    High-sensitivity and high-throughput mutation detection techniques are useful for screening the homoplasmy or heteroplasmy status of mitochondrial DNA (mtDNA), but might be susceptible to interference from nuclear mitochondrial DNA sequences (NUMTs) co-amplified during polymerase chain reaction (PCR). In this study, we first evaluated the platform of SURVEYOR Nuclease digestion of heteroduplexed DNA followed by the detection of cleaved DNA by using the WAVE HS System (SN/WAVE-HS) for detecting human mtDNA variants and found that its performance was slightly better than that of denaturing high-performance liquid chromatography (DHPLC). The potential interference from co-amplified NUMTs on screening mtDNA heteroplasmy when using these 2 highly sensitive techniques was further examined by using 2 published primer sets containing a total of 65 primer pairs, which were originally designed to be used with one of the 2 techniques. We confirmed that 24 primer pairs could amplify NUMTs by conducting bioinformatic analysis and PCR with the DNA from 143B-ρ0 cells. Using mtDNA extracted from the mitochondria of human 143B cells and a cybrid line with the nuclear background of 143B-ρ0 cells, we demonstrated that NUMTs could affect the patterns of chromatograms for cell DNA during SN-WAVE/HS analysis of mtDNA, leading to incorrect judgment of mtDNA homoplasmy or heteroplasmy status. However, we observed such interference only in 2 of 24 primer pairs selected, and did not observe such effects during DHPLC analysis. These results indicate that NUMTs can affect the screening of low-level mtDNA variants, but it might not be predicted by bioinformatic analysis or the amplification of DNA from 143B-ρ0 cells. Therefore, using purified mtDNA from cultured cells with proven purity to evaluate the effects of NUMTs from a primer pair on mtDNA detection by using PCR-based high-sensitivity methods prior to the use of a primer pair in real studies would be a more practical strategy.

  5. Interference of Co-amplified nuclear mitochondrial DNA sequences on the determination of human mtDNA heteroplasmy by Using the SURVEYOR nuclease and the WAVE HS system.

    Directory of Open Access Journals (Sweden)

    Hsiu-Chuan Yen

    High-sensitivity and high-throughput mutation detection techniques are useful for screening the homoplasmy or heteroplasmy status of mitochondrial DNA (mtDNA), but might be susceptible to interference from nuclear mitochondrial DNA sequences (NUMTs) co-amplified during polymerase chain reaction (PCR). In this study, we first evaluated the platform of SURVEYOR Nuclease digestion of heteroduplexed DNA followed by the detection of cleaved DNA by using the WAVE HS System (SN/WAVE-HS) for detecting human mtDNA variants and found that its performance was slightly better than that of denaturing high-performance liquid chromatography (DHPLC). The potential interference from co-amplified NUMTs on screening mtDNA heteroplasmy when using these 2 highly sensitive techniques was further examined by using 2 published primer sets containing a total of 65 primer pairs, which were originally designed to be used with one of the 2 techniques. We confirmed that 24 primer pairs could amplify NUMTs by conducting bioinformatic analysis and PCR with the DNA from 143B-ρ0 cells. Using mtDNA extracted from the mitochondria of human 143B cells and a cybrid line with the nuclear background of 143B-ρ0 cells, we demonstrated that NUMTs could affect the patterns of chromatograms for cell DNA during SN-WAVE/HS analysis of mtDNA, leading to incorrect judgment of mtDNA homoplasmy or heteroplasmy status. However, we observed such interference only in 2 of 24 primer pairs selected, and did not observe such effects during DHPLC analysis. These results indicate that NUMTs can affect the screening of low-level mtDNA variants, but it might not be predicted by bioinformatic analysis or the amplification of DNA from 143B-ρ0 cells. Therefore, using purified mtDNA from cultured cells with proven purity to evaluate the effects of NUMTs from a primer pair on mtDNA detection by using PCR-based high-sensitivity methods prior to the use of a primer pair in real studies would be a more practical

  6. Multiplex genotype determination at a DNA sequence polymorphism cluster in the human immunoglobulin heavy-chain region

    Energy Technology Data Exchange (ETDEWEB)

    Li, H.; Hood, L. [California Institute of Technology, Pasadena, CA (United States)]

    1995-03-20

    We have developed a method for multilocus genotype determination. The method involves using restriction fragment length polymorphisms (RFLPs) for allele discrimination. If a polymorphism is not an RFLP, it is converted into an RFLP during the polymerase chain reaction (PCR). After amplification and restriction enzyme digestion, samples are analyzed by sequential gel loading during electrophoresis. The efficiency of this method was demonstrated by determining the genotypes of 108 semen samples at seven DNA sequence polymorphic sites identified in the human immunoglobulin heavy-chain variable region. It was shown that more than 1000 PCR products could be easily analyzed per day per investigator. To show the reliability of this method, some of the typing results were confirmed by DNA sequence analysis. By computer simulation, most (98%) polymorphisms were shown to be natural or convertible (by changing 1 bp close to or next to each polymorphic site) RFLPs for the commercially available 4-base cutters. 47 refs., 4 figs., 3 tabs.

  7. DETECTION OF HUMAN PAPILLOMAVIRUS DNA SEQUENCES IN ORAL LESIONS USING POLYMERASE CHAIN REACTION

    Directory of Open Access Journals (Sweden)

    M. R. Zarei

    2007-07-01

    The purpose of the present study was to estimate the frequency of HPV DNA in four groups of oral lesions, including oral squamous cell carcinoma. Sixty paraffin-embedded oral tissue samples were examined for the presence of HPV DNA using the PCR technique. These specimens were obtained from patients with oral squamous cell carcinoma (OSCC), leukoplakia, oral lichen planus (OLP), and pyogenic granuloma (PG). Consensus primers for the L1 region (MY09 and MY11) and specific primers were used for detection of HPV DNA sequences in this study. We detected HPV DNA in 60% (9 out of 15) of OSCCs, 26.7% (4 out of 15) of leukoplakias, 13.3% (2 out of 15) of OLPs, and 6.7% (1 out of 15) of PGs. Statistical analysis showed that the prevalence of HPV in OSCC was significantly higher than in the other groups (P < 0.05). The frequencies of HPV-16 and HPV-18 detection in OSCC samples were 40% and 20%, respectively. The prevalence of these high-risk HPVs was significantly higher in the OSCC group (P < 0.05). The results of the present study show a successive increase in the detection rate of HPV-16 and HPV-18 DNA from a low level in samples of pyogenic granuloma and non-premalignant or questionably premalignant lesions of OLP to premalignant leukoplakia and to OSCC.

  8. Extracting biological knowledge from DNA sequences

    Energy Technology Data Exchange (ETDEWEB)

    De La Vega, F.M. [CINVESTAV-IPN (Mexico)]; Thieffry, D. [Universite Libre de Bruxelles, Rhode-Saint-Genese (Belgium)]|[Universidad Nacional Autonoma de Mexico, Morelos (Mexico)]; Collado-Vides, J. [Universidad Nacional Autonoma de Mexico, Morelos (Mexico)]

    1996-12-31

    This session describes the elucidation of information from DNA sequences and the challenges computational biologists face in summarizing and deciphering the human genome. Techniques discussed include methods from statistics, information theory, artificial intelligence and linguistics. 1 ref.

  9. DNA Sequencing Using Capillary Electrophoresis

    Energy Technology Data Exchange (ETDEWEB)

    Dr. Barry Karger

    2011-05-09

    The overall goal of this program was to develop capillary electrophoresis as the tool to be used to sequence the human genome for the first time. Our program was part of the Human Genome Project. In this work, we were highly successful, and the replaceable polymer we developed, linear polyacrylamide, was used by the DOE sequencing lab in California to sequence a significant portion of the human genome using the MegaBACE multiple capillary array electrophoresis instrument. In this final report, we summarize our efforts and success. We began our work by using capillary electrophoresis to separate double-stranded oligonucleotides on cross-linked polyacrylamide gels in fused silica capillaries. This work showed the potential of the methodology. However, preparation of such cross-linked gel capillaries was difficult and poorly reproducible, and, even more important, the columns were not very stable. We improved stability by using non-cross-linked linear polyacrylamide. Here, the entangled linear chains could move when osmotic pressure (e.g. sample injection) was imposed on the polymer matrix. This relaxation of the polymer dissipated the stress in the column. Our next advance was to use significantly lower concentrations of the linear polyacrylamide, so that the polymer could be automatically blown out after each run and replaced with fresh linear polymer solution. In this way, a new column was available for each analytical run. Finally, after testing many linear polymers, we selected linear polyacrylamide as the best matrix, as it was the most hydrophilic polymer available. Under our DOE program, we initially demonstrated the success of linear polyacrylamide in separating double-stranded DNA. We note that the method is used even today to assay the purity of double-stranded DNA fragments. Our focus, of course, was on the separation of single-stranded DNA for sequencing purposes. In one paper, we demonstrated the success of our approach in sequencing up to 500 bases. Other

  10. Chromosomal assignment of human DNA fingerprint sequences by simultaneous hybridization to arbitrarily primed PCR products from human/rodent monochromosome cell hybrids

    Energy Technology Data Exchange (ETDEWEB)

    Yasuda, Jun; Sekiya, Takao [National Cancer Center Research Institute, Chuo-ku, Tokyo (Japan)]; Navarro, J.M. [Burnham Institute, La Jolla, CA (United States)] [and others]

    1996-05-15

    We have developed a technique for the simultaneous chromosomal assignment of multiple human DNA sequences from DNA fingerprints obtained by the arbitrarily primed polymerase chain reaction (AP-PCR). Radioactively labeled human AP-PCR products are hybridized to DNA fingerprints generated with the same arbitrary primer from human/rodent monochromosome cell hybrids after electroblotting to a nylon membrane. Human-specific hybridization bands in the human/rodent fingerprints unambiguously determine their chromosome of origin. We named this method simultaneous hybridization of arbitrarily primed PCR DNA fingerprinting products (SHARP). Using this approach, we determined the chromosomal origins of most major bands of human AP-PCR fingerprints obtained with two arbitrary primers. Altogether, the chromosomal localization of nearly 50 DNA fragments, covering all human chromosomes except chromosomes 21 and Y, was achieved in this simple manner. Chromosome assignment of fingerprint bands is essential for molecular karyotyping of cancer by AP-PCR DNA fingerprinting. The SHARP method provides a convenient and powerful tool for this purpose. 23 refs., 3 figs., 2 tabs.

  11. DNA sequence comparative analysis of the 3pter-p26 region of human genome

    Institute of Scientific and Technical Information of China (English)

    LUO Chunqing; LI Yan; ZHANG Xiaowei; ZHANG Yilin; ZHAN

    2005-01-01

    Most proterminal regions of human chromosomes are GC-rich and gene-rich. Chromosome 3p is an exception. Its proterminal region is GC-poor and prone to loss of heterozygosity, which causes a number of fatal diseases. Except for one gap left in the telomeric position, the proterminal region of human chromosome 3p has been completely sequenced. The detailed sequence analysis showed: (i) the GC content of this region was 38.5%, the lowest among all the human proterminal regions; (ii) this region contained 20 known genes and 22 predicted genes, with an average gene size of 97.5 kb. The previously mapped gene Cntn3 was not found in this region, but was instead located at the 74 Mb position of human chromosome 3p; (iii) the interspersed repeats of this region were more active than the average level of the whole human genome, especially (TA)n, the content of which was twice the genome average; (iv) this region had a conserved synteny extending from 104.1 Mb to 112.4 Mb on mouse chromosome 6, which was 8% larger in size, not in accordance with the whole-genome comparison, probably because the 3pter-p26 region was more likely to lose nucleotides and its mouse synteny had more active interspersed repeats.
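
    The GC-content figure quoted in (i) is a simple base-composition statistic. The sketch below is not the authors' pipeline; it only illustrates, under the assumption that the region is available as a plain string of bases, how such a percentage and a per-window profile could be computed. The function names and the window parameters are hypothetical.

```python
def gc_content(seq: str) -> float:
    """Percentage of G and C among the unambiguous A/C/G/T bases of seq."""
    seq = seq.upper()
    acgt = sum(seq.count(base) for base in "ACGT")
    if acgt == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    # Ambiguous bases such as N are ignored in both numerator and denominator.
    return 100.0 * (seq.count("G") + seq.count("C")) / acgt


def windowed_gc(seq: str, window: int, step: int) -> list[float]:
    """GC percentage of each full-length window, moving by `step` bases."""
    return [
        gc_content(seq[start:start + window])
        for start in range(0, len(seq) - window + 1, step)
    ]
```

    A call such as windowed_gc(region, 100_000, 100_000) would give one value per non-overlapping 100 kb window, which is one way to compare a region against a genome-wide average.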

  12. Biosensors for DNA sequence detection

    Science.gov (United States)

    Vercoutere, Wenonah; Akeson, Mark

    2002-01-01

    DNA biosensors are being developed as alternatives to conventional DNA microarrays. These devices couple signal transduction directly to sequence recognition. Some of the most sensitive and functional technologies use fibre optics or electrochemical sensors in combination with DNA hybridization. In a shift from sequence recognition by hybridization, two emerging single-molecule techniques read sequence composition using zero-mode waveguides or electrical impedance in nanoscale pores.

  13. Integrated Model of DNA Sequence Numerical Representation and Artificial Neural Network for Human Donor and Acceptor Sites Prediction

    Directory of Open Access Journals (Sweden)

    Mohammed Abo-Zahhad

    2014-07-01

    The Human Genome Project has led to a huge inflow of genomic data. After the completion of human genome sequencing, more and more effort is being put into the identification of splicing sites of exons and introns (donor and acceptor sites). This invites bioinformatics to analyze genome sequences and identify the location of exon and intron boundaries, in other words to predict splicing sites. Prediction of splice sites in genic regions of a DNA sequence is one of the most challenging aspects of gene structure recognition. Over the last two decades, artificial neural networks have gradually become one of the essential tools in bioinformatics. In this paper, artificial neural networks with different numerical mapping techniques have been employed to build an integrated model for splice site prediction in genes. An artificial neural network is trained and then used to find splice sites in human genes. A comparison between different mapping methods using the trained neural network, in terms of their precision in predicting donor and acceptor sites, is presented in this paper. Training and performance measurement of the neural network are carried out using sequences of the human genome (GRCh37/hg19, chr21). Simulation results indicate that using the Electron-Ion Interaction Potential numerical mapping method with the neural network yields the best performance in prediction.
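
    To make the idea of a numerical mapping concrete, the sketch below encodes bases with the commonly cited Electron-Ion Interaction Potential (EIIP) values and cuts fixed-length windows around candidate donor sites (the canonical GT dinucleotide). It is an illustration only: the abstract does not give the authors' window size or network architecture, so the flank length and the function names here are assumptions, and no neural network is included.

```python
# Commonly cited EIIP values for the four nucleotides.
EIIP = {"A": 0.1260, "C": 0.1340, "G": 0.0806, "T": 0.1335}


def eiip_encode(seq: str) -> list[float]:
    """Map each base to its EIIP value; raises KeyError on non-ACGT symbols."""
    return [EIIP[base] for base in seq.upper()]


def candidate_donor_windows(seq: str, flank: int = 3) -> list[tuple[int, list[float]]]:
    """Return (position, encoded window) for every GT with `flank` bases on each side.

    The window covers flank + 2 + flank bases; flank=3 is an arbitrary
    illustrative choice, not a value taken from the paper.
    """
    seq = seq.upper()
    windows = []
    for i in range(flank, len(seq) - flank - 1):
        if seq[i:i + 2] == "GT":
            windows.append((i, eiip_encode(seq[i - flank:i + 2 + flank])))
    return windows
```

    Each encoded window is a fixed-length numeric vector, which is the kind of input a classifier would need to separate true donor sites from other GT occurrences.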

  14. Effective inhibition of human cytomegalovirus gene expression by DNA-based external guide sequences

    Institute of Scientific and Technical Information of China (English)

    Zhifeng Zeng; Hongjian Li; Yueqing Li; Yanwei Cui; Qi Zhou; Yi Zou; Guang Yang; Tianhong Zhou

    2009-01-01

    To investigate whether 12-nucleotide DNA-based miniEGSs can silence the expression of the human cytomegalovirus (HCMV) UL49 gene efficiently, a HeLa cell line stably expressing the UL49 gene was constructed and the putative miniEGSs (UL49-miniEGSs) were assayed in the stable cell line. Quantitative RT-PCR and western blot results showed a reduction of 67% in UL49 expression level in HeLa cells that were transfected with UL49-miniEGSs. This was significantly different from the reductions with mock and control miniEGSs (TK-miniEGSs), which were 1 and 7%, respectively. To further confirm the gene silencing directed by UL49-miniEGSs with human RNase P, a mutant of UL49-miniEGSs was constructed and a modified 5' RACE was carried out. Data showed that the inhibition of UL49 gene expression directed by UL49-miniEGSs was RNase P-dependent and that the cleavage of UL49 mRNA by RNase P was site specific. As a result, the length of DNA-based miniEGSs that could silence gene expression efficiently was only 12 nt. That is significantly less than any other oligonucleotide-based method of gene inactivation known so far. MiniEGSs may represent novel gene-targeting agents for the inhibition of viral genes and other human disease-related gene expression.

  15. Intrachromosomal recombination between highly diverged DNA sequences is enabled in human cells deficient in Bloom helicase.

    Science.gov (United States)

    Wang, Yibin; Li, Shen; Smith, Krissy; Waldman, Barbara Criscuolo; Waldman, Alan S

    2016-05-01

    Mutation of Bloom helicase (BLM) causes Bloom syndrome (BS), a rare human genetic disorder associated with genome instability, elevation of sister chromatid exchanges, and predisposition to cancer. Deficiency in BLM homologs in Drosophila and yeast brings about significantly increased rates of recombination between imperfectly matched sequences ("homeologous recombination," or HeR). To assess whether BLM deficiency provokes an increase in HeR in human cells, we transfected an HeR substrate into a BLM-null cell line derived from a BS patient. The substrate contained a thymidine kinase (tk)-neo fusion gene disrupted by the recognition site for endonuclease I-SceI, as well as a functional tk gene to serve as a potential recombination partner for the tk-neo gene. The two tk sequences on the substrate displayed 19% divergence. A double-strand break was introduced by expression of I-SceI and repair events were recovered by selection for G418-resistant clones. Among 181 events recovered, 30 were accomplished via HeR with the balance accomplished by nonhomologous end-joining. The frequency of HeR events in the BS cells was elevated significantly compared to that seen in normal human fibroblasts or in BS cells complemented for BLM expression. We conclude that BLM deficiency enables HeR in human cells.

  16. On the sequence-directed nature of human gene mutation: the role of genomic architecture and the local DNA sequence environment in mediating gene mutations underlying human inherited disease.

    Science.gov (United States)

    Cooper, David N; Bacolla, Albino; Férec, Claude; Vasquez, Karen M; Kehrer-Sawatzki, Hildegard; Chen, Jian-Min

    2011-10-01

    Different types of human gene mutation may vary in size, from structural variants (SVs) to single base-pair substitutions, but what they all have in common is that their nature, size and location are often determined either by specific characteristics of the local DNA sequence environment or by higher order features of the genomic architecture. The human genome is now recognized to contain "pervasive architectural flaws" in that certain DNA sequences are inherently mutation prone by virtue of their base composition, sequence repetitivity and/or epigenetic modification. Here, we explore how the nature, location and frequency of different types of mutation causing inherited disease are shaped in large part, and often in remarkably predictable ways, by the local DNA sequence environment. The mutability of a given gene or genomic region may also be influenced indirectly by a variety of noncanonical (non-B) secondary structures whose formation is facilitated by the underlying DNA sequence. Since these non-B DNA structures can interfere with subsequent DNA replication and repair and may serve to increase mutation frequencies in generalized fashion (i.e., both in the context of subtle mutations and SVs), they have the potential to serve as a unifying concept in studies of mutational mechanisms underlying human inherited disease. © 2011 Wiley-Liss, Inc.

  17. Sequence information encoded in DNA that may influence long-range chromatin structure correlates with human chromosome functions.

    Directory of Open Access Journals (Sweden)

    Taichi E Takasuka

    Little is known about the possible function of the bulk of the human genome. We have recently shown that long-range regular oscillation in the motif non-T, A/T, G (VWG) existing at ten-nucleotide multiples influences large-scale nucleosome array formation. In this work, we have determined the locations of all 100 kb regions that are predicted to form distinctive chromatin structures throughout each human chromosome (except Y). Using these data, we found that a significantly greater fraction of 300 kb sequences lacked annotated transcripts in genomic DNA regions ≥ 300 kb that contained nearly continuous chromatin organizing signals than in control regions. We also found a relationship between the meiotic recombination frequency and the presence of strong VWG chromatin organizing signals. Large (≥ 300 kb) genomic DNA regions having low average recombination frequency are enriched in chromatin organizing signals. As additional controls, we show using chromosome 1 that the VWG motif signals are not enriched in randomly selected DNA regions having the mean size of the recombination coldspots, and that non-VWG motif sets do not generate signals that are enriched in recombination coldspots. We also show that tandemly repeated alpha satellite DNA contains strong VWG signals for the formation of distinctive nucleosome arrays, consistent with the low recombination activity of centromeres. Our correlations cannot be explained simply by variations in the GC content. Our findings suggest that a specific set of periodic DNA motifs encoded in genomic DNA, which provide signals for chromatin organization, influence human chromosome function.

  18. Graphene nanodevices for DNA sequencing

    Science.gov (United States)

    Heerema, Stephanie J.; Dekker, Cees

    2016-02-01

    Fast, cheap, and reliable DNA sequencing could be one of the most disruptive innovations of this decade, as it will pave the way for personalized medicine. In pursuit of such technology, a variety of nanotechnology-based approaches have been explored and established, including sequencing with nanopores. Owing to its unique structure and properties, graphene provides interesting opportunities for the development of a new sequencing technology. In recent years, a wide range of creative ideas for graphene sequencers have been theoretically proposed and the first experimental demonstrations have begun to appear. Here, we review the different approaches to using graphene nanodevices for DNA sequencing, which involve DNA passing through graphene nanopores, nanogaps, and nanoribbons, and the physisorption of DNA on graphene nanostructures. We discuss the advantages and problems of each of these key techniques, and provide a perspective on the use of graphene in future DNA sequencing technology.

  19. Intermittency as a universal characteristic of the complete chromosome DNA sequences of eukaryotes: From protozoa to human genomes

    Science.gov (United States)

    Rybalko, S.; Larionov, S.; Poptsova, M.; Loskutov, A.

    2011-10-01

    Large-scale dynamical properties of complete chromosome DNA sequences of eukaryotes are considered. Using the proposed deterministic models with intermittency and symbolic dynamics we describe a wide spectrum of large-scale patterns inherent in these sequences, such as segmental duplications, tandem repeats, and other complex sequence structures. It is shown that the recently discovered gene number balance on the strands is not of a random nature, and certain subsystems of a complete chromosome DNA sequence exhibit the properties of deterministic chaos.

  20. Probing the functional impact of sequence variation on p53-DNA interactions using a novel microsphere assay for protein-DNA binding with human cell extracts.

    Directory of Open Access Journals (Sweden)

    Maher A Noureddine

    2009-05-01

    The p53 tumor suppressor regulates its target genes through sequence-specific binding to DNA response elements (REs). Although numerous p53 REs are established, the thousands more identified by bioinformatics are not easily subjected to comparative functional evaluation. To examine the relationship between RE sequence variation -- including polymorphisms -- and p53 binding, we have developed a multiplex format microsphere assay of protein-DNA binding (MAPD) for p53 in nuclear extracts. Using MAPD we measured sequence-specific p53 binding of doxorubicin-activated or transiently expressed p53 to REs from established p53 target genes and p53 consensus REs. To assess the sensitivity and scalability of the assay, we tested 16 variants of the p21 target sequence and a 62-multiplex set of single nucleotide (nt) variants of the p53 consensus sequence and found many changes in p53 binding that are not captured by current computational binding models. A group of eight single nucleotide polymorphisms (SNPs) was examined and binding profiles closely matched transactivation capability tested in luciferase constructs. The in vitro binding characteristics of p53 in nuclear extracts recapitulated the cellular in vivo transactivation capabilities for eight well-established human REs measured by luciferase assay. Using a set of 26 bona fide REs, we observed distinct binding patterns characteristic of transiently expressed wild type and mutant p53s. This microsphere assay system utilizes biologically meaningful cell extracts in a multiplexed, quantitative, in vitro format that provides a powerful experimental tool for elucidating the functional impact of sequence polymorphism and protein variation on protein/DNA binding in transcriptional networks.

  1. Inconsistencies in Neanderthal genomic DNA sequences.

    Directory of Open Access Journals (Sweden)

    Jeffrey D Wall

    2007-10-01

    Two recently published papers describe nuclear DNA sequences that were obtained from the same Neanderthal fossil. Our reanalyses of the data from these studies show that they are not consistent with each other and point to serious problems with the data quality in one of the studies, possibly due to modern human DNA contaminants and/or a high rate of sequencing errors.

  2. The case for the continuing use of the revised Cambridge Reference Sequence (rCRS) and the standardization of notation in human mitochondrial DNA studies.

    Science.gov (United States)

    Bandelt, Hans-Jürgen; Kloss-Brandstätter, Anita; Richards, Martin B; Yao, Yong-Gang; Logan, Ian

    2014-02-01

    Since the determination in 1981 of the sequence of the human mitochondrial DNA (mtDNA) genome, the Cambridge Reference Sequence (CRS), has been used as the reference sequence to annotate mtDNA in molecular anthropology, forensic science and medical genetics. The CRS was eventually upgraded to the revised version (rCRS) in 1999. This reference sequence is a convenient device for recording mtDNA variation, although it has often been misunderstood as a wild-type (WT) or consensus sequence by medical geneticists. Recently, there has been a proposal to replace the rCRS with the so-called Reconstructed Sapiens Reference Sequence (RSRS). Even if it had been estimated accurately, the RSRS would be a cumbersome substitute for the rCRS, as the new proposal fuses--and thus confuses--the two distinct concepts of ancestral lineage and reference point for human mtDNA. Instead, we prefer to maintain the rCRS and to report mtDNA profiles by employing the hitherto predominant circumfix style. Tree diagrams could display mutations by using either the profile notation (in conventional short forms where appropriate) or in a root-upwards way with two suffixes indicating ancestral and derived nucleotides. This would guard against misunderstandings about reporting mtDNA variation. It is therefore neither necessary nor sensible to change the present reference sequence, the rCRS, in any way. The proposed switch to RSRS would inevitably lead to notational chaos, mistakes and misinterpretations.

  3. Human DNA contains sequences homologous to the 5'-non-coding region of hepatitis C virus: characterization with restriction endonucleases reveals individual varieties

    Institute of Scientific and Technical Information of China (English)

    Reinhard H Dennin; Jianer Wo

    2003-01-01

    Objective To investigate a 272 base pair section of the 5'-non-coding region of genomic DNA from the peripheral blood mononuclear cells of healthy hepatitis C virus (HCV)-negative human subjects (not patients). Results The suspected HCV-specific sequence was found in the DNA of each subject tested. The pre-PCR digestion assay reveals individual differences in their pattern of methylation, which may be due to possible epigenetic phenomena. Conclusions The results provide formal proof that these HCV-specific sequences are contained in the genomic or extrachromosomal target DNA, and probably belong to a new class of endogenous sequences.

  4. Studies on the integration of hepatitis B virus DNA sequence in human sperm chromosomes

    Institute of Scientific and Technical Information of China (English)

    Jian-Min HUANG; Tian-Hua HUANG

    2002-01-01

    Aim: To study the integration of hepatitis B virus (HBV) DNA into sperm chromosomes in hepatitis B patients and the features of its integration. Methods: Sperm chromosomes of 14 subjects (5 healthy controls and 9 HB patients, including 1 acute hepatitis B, 2 chronic active hepatitis B, 4 chronic persistent hepatitis B, 2 HBsAg chronic carriers with no clinical symptoms) were prepared using interspecific in vitro fertilization between zona-free hamster oocytes and human spermatozoa. Fluorescence in situ hybridization (FISH) to sperm chromosome spreads was carried out with a biotin-labeled full-length HBV DNA probe to detect the specific HBV DNA sequences in the sperm chromosomes. Results: Specific fluorescent signal spots for HBV DNA were seen in sperm chromosomes of one patient with chronic persistent hepatitis B. In 9 (9/42) sperm chromosome complements containing fluorescent signal spots, one presented 5 obvious FISH spots and the others 2 to 4 signals. The fluorescence intensity showed significant differences among the signal spots. The distribution of signal sites among chromosomes seems to be random. Conclusion: HBV could integrate into human sperm chromosomes. The results suggest that the possibility of vertical transmission of HBV via the germ line to the next generation is present.

  5. Interference of Co-Amplified Nuclear Mitochondrial DNA Sequences on the Determination of Human mtDNA Heteroplasmy by Using the SURVEYOR Nuclease and the WAVE HS System

    OpenAIRE

    Hsiu-Chuan Yen; Shiue-Li Li; Wei-Chien Hsu; Petrus Tang

    2014-01-01

    High-sensitivity and high-throughput mutation detection techniques are useful for screening the homoplasmy or heteroplasmy status of mitochondrial DNA (mtDNA), but might be susceptible to interference from nuclear mitochondrial DNA sequences (NUMTs) co-amplified during polymerase chain reaction (PCR). In this study, we first evaluated the platform of SURVEYOR Nuclease digestion of heteroduplexed DNA followed by the detection of cleaved DNA by using the WAVE HS System (SN/WAVE-HS) for detectin...

  6. Genetic heterogeneity and phylogeny of Trichuris spp. from captive non-human primates based on ribosomal DNA sequence data.

    Science.gov (United States)

    Cavallero, Serena; De Liberato, Claudio; Friedrich, Klaus G; Di Cave, David; Masella, Valentina; D'Amelio, Stefano; Berrilli, Federica

    2015-08-01

    Nematodes of the genus Trichuris, known as whipworms, are recognized to infect numerous mammalian species including humans and non-human primates. Several Trichuris spp. have been described and species designation/identification is traditionally based on host-affiliation, although cross-infection and hybridization events may complicate species boundaries. The main aims of the present study were to genetically characterize adult Trichuris specimens from captive Japanese macaques (Macaca fuscata) and grivets (Chlorocebus aethiops), using the ribosomal DNA (ITS) as a molecular marker, and to investigate the phylogeny and the extent of genetic variation by comparison with data on isolates from humans, other non-human primates and other hosts. The phylogenetic analysis of Trichuris sequences from M. fuscata and C. aethiops provided evidence of distinct clades and subclades, thus supporting the existence of additional separate taxa. Neighbor Joining and Bayesian trees suggest that specimens from M. fuscata may be distinct from, but related to, Trichuris trichiura, while a close relationship is suggested between the subclade formed by the specimens from C. aethiops and the subclade formed by T. suis. The tendency to assign Trichuris sp. according to host species can lead to misleading taxonomic interpretations (i.e. whipworms found in primates are identified as T. trichiura). The results obtained here confirm previous evidence suggesting the existence of Trichuris spp. other than T. trichiura infecting living non-human primates.

  7. Duplication in DNA Sequences

    Science.gov (United States)

    Ito, Masami; Kari, Lila; Kincaid, Zachary; Seki, Shinnosuke

    The duplication and repeat-deletion operations are the basis of a formal language theoretic model of errors that can occur during DNA replication. During DNA replication, subsequences of a strand of DNA may be copied several times (resulting in duplications) or skipped (resulting in repeat-deletions). As formal language operations, iterated duplication and repeat-deletion of words and languages have been well studied in the literature. However, little is known about single-step duplications and repeat-deletions. In this paper, we investigate several properties of these operations, including closure properties of language families in the Chomsky hierarchy and equations involving these operations. We also make progress toward a characterization of regular languages that are generated by duplicating a regular language.
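
    The two single-step operations described above can be written down directly for finite words. The sketch below assumes the usual definitions (duplication maps a word xyz to xyyz for a non-empty factor y; repeat-deletion maps xyyz back to xyz) and enumerates all results by brute force. The function names are illustrative and are not taken from the paper.

```python
def duplications(word: str) -> set[str]:
    """All words xyyz obtainable from word = xyz with y non-empty."""
    n = len(word)
    return {
        word[:i] + word[i:j] + word[i:j] + word[j:]
        for i in range(n)
        for j in range(i + 1, n + 1)
    }


def repeat_deletions(word: str) -> set[str]:
    """All words xyz obtainable from word = xyyz with y non-empty."""
    n = len(word)
    results = set()
    for i in range(n):
        # k is the length of the repeated factor y starting at position i.
        for k in range(1, (n - i) // 2 + 1):
            if word[i:i + k] == word[i + k:i + 2 * k]:
                results.add(word[:i + k] + word[i + 2 * k:])
    return results
```

    By construction, every word produced by duplications(w) has w among its repeat-deletions, which mirrors the inverse relationship between copying and skipping a subsequence during replication.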

  8. Construction and packaging of pseudotype retrovirus containing human N-ras cDNA antisense sequence and its biological effects on human hepatoma cells

    Institute of Scientific and Technical Information of China (English)

    JIALIBIN; WANGXIANG; et al.

    1990-01-01

    N-ras is one of the transforming genes in human hepatic cancer cells. It has been found that N-ras was overexpressed at the mRNA and protein level in hepatoma cells. In order to explore the biological roles of N-ras in human hepatic carcinogenesis and the potential application in control of cancer cell growth, a pseudotype retrovirus containing antisense sequence of human N-ras was constructed and packaged. A recombinant retrovirus vector containing antisense or sense sequences of N-ras cDNA was constructed by pZIP-NeoSV(X)1. The pseudotype virus was packaged and rescued by transfection and infection in PA317 and ψ2 helper cells. It has been demonstrated that the pseudotype retrovirus containing antisense N-ras sequence did inhibit the growth of human PLC/PRF/5 hepatoma cells accompanied with inhibition of p21 expression, while the retrovirus containing sense sequence had none. The pseudotype virus had no effect on human diploid fibroblasts.

  9. UMD‐Predictor: A High‐Throughput Sequencing Compliant System for Pathogenicity Prediction of any Human cDNA Substitution

    Science.gov (United States)

    Salgado, David; Desvignes, Jean‐Pierre; Rai, Ghadi; Blanchard, Arnaud; Miltgen, Morgane; Pinard, Amélie; Lévy, Nicolas; Collod‐Béroud, Gwenaëlle

    2016-01-01

    ABSTRACT Whole‐exome sequencing (WES) is increasingly applied to research and clinical diagnosis of human diseases. It typically results in large amounts of genetic variations. Depending on the mode of inheritance, only one or two correspond to pathogenic mutations responsible for the disease and present in affected individuals. Therefore, it is crucial to filter out nonpathogenic variants and limit downstream analysis to a handful of candidate mutations. We have developed a new computational combinatorial system UMD‐Predictor (http://umd‐predictor.eu) to efficiently annotate cDNA substitutions of all human transcripts for their potential pathogenicity. It combines biochemical properties, impact on splicing signals, localization in protein domains, variation frequency in the global population, and conservation through the BLOSUM62 global substitution matrix and a protein‐specific conservation among 100 species. We compared its accuracy with the seven most used and reliable prediction tools, using the largest reference variation datasets including more than 140,000 annotated variations. This system consistently demonstrated a better accuracy, specificity, Matthews correlation coefficient, diagnostic odds ratio, speed, and provided the shortest list of candidate mutations for WES. Webservices allow its implementation in any bioinformatics pipeline for next‐generation sequencing analysis. It could benefit to a wide range of users and applications varying from gene discovery to clinical diagnosis. PMID:26842889

  10. DNA Sequences Proximal to Human Mitochondrial DNA Deletion Breakpoints Prevalent in Human Disease Form G-quadruplexes, a Class of DNA Structures Inefficiently Unwound by the Mitochondrial Replicative Twinkle Helicase

    NARCIS (Netherlands)

    Bharti, S.K.; Sommers, J.A.; Zhou, J.; Kaplan, D.L.; Spelbrink, J.N.; Mergny, J.L.; Brosh, R.M., Jr.

    2014-01-01

    Mitochondrial DNA deletions are prominent in human genetic disorders, cancer, and aging. It is thought that stalling of the mitochondrial replication machinery during DNA synthesis is a prominent source of mitochondrial genome instability; however, the precise molecular determinants of defective

  11. DNA Sequencing Sensors: An Overview

    Directory of Open Access Journals (Sweden)

    Jose Antonio Garrido-Cardenas

    2017-03-01

    Full Text Available The first sequencing of a complete genome was published forty years ago by the double Nobel Prize in Chemistry winner Frederick Sanger. That corresponded to the small sized genome of a bacteriophage, but since then there have been many complex organisms whose DNA has been sequenced. This was possible thanks to continuous advances in the fields of biochemistry and molecular genetics, but also in other areas such as nanotechnology and computing. Nowadays, sequencing sensors based on genetic material have little to do with those used by Sanger. The emergence of mass sequencing sensors, or new generation sequencing (NGS), meant a quantitative leap both in the volume of genetic material that was able to be sequenced in each trial, as well as in the time per run and its cost. One can envisage that incoming technologies, already known as fourth generation sequencing, will continue to cheapen the trials by increasing DNA reading lengths in each run. All of this would be impossible without sensors and detection systems becoming smaller and more precise. This article provides a comprehensive overview on sensors for DNA sequencing developed within the last 40 years.

  12. New sequence-based data on the relative DNA contents of chromosomes in the normal male and female human diploid genomes for radiation molecular cytogenetics

    Directory of Open Access Journals (Sweden)

    Repin Mikhail V

    2009-06-01

    Full Text Available Abstract. Background: The objective of this work is to obtain the correct relative DNA contents of chromosomes in the normal male and female human diploid genomes for use in FISH analysis of radiation-induced chromosome aberrations. Results: The relative DNA contents of chromosomes in the male and female human diploid genomes have been calculated from the publicly available international Human Genome Project data. New sequence-based data on the relative DNA contents of human chromosomes were compared with the data recommended by the International Atomic Energy Agency in 2001. The differences in the values of the relative DNA contents of chromosomes obtained by using different approaches for 15 human chromosomes, mainly for large chromosomes, were below 2%. For the chromosomes 13, 17, 20 and 22 the differences were above 5%. Conclusion: New sequence-based data on the relative DNA contents of chromosomes in the normal male and female human diploid genomes were obtained. This approach, based on the genome sequence, can be recommended for use in radiation molecular cytogenetics.

  13. Evolutionarily conserved sequences on human chromosome 21

    Energy Technology Data Exchange (ETDEWEB)

    Frazer, Kelly A.; Sheehan, John B.; Stokowski, Renee P.; Chen, Xiyin; Hosseini, Roya; Cheng, Jan-Fang; Fodor, Stephen P.A.; Cox, David R.; Patil, Nila

    2001-09-01

    Comparison of human sequences with the DNA of other mammals is an excellent means of identifying functional elements in the human genome. Here we describe the utility of high-density oligonucleotide arrays as a rapid approach for comparing human sequences with the DNA of multiple species whose sequences are not presently available. High-density arrays representing approximately 22.5 Mb of nonrepetitive human chromosome 21 sequence were synthesized and then hybridized with mouse and dog DNA to identify sequences conserved between humans and mice (human-mouse elements) and between humans and dogs (human-dog elements). Our data show that sequence comparison of multiple species provides a powerful empiric method for identifying actively conserved elements in the human genome. A large fraction of these evolutionarily conserved elements are present in regions on chromosome 21 that do not encode known genes.

  14. Structural Complexity of DNA Sequence

    Directory of Open Access Journals (Sweden)

    Cheng-Yuan Liou

    2013-01-01

    Full Text Available In modern bioinformatics, finding an efficient way to allocate sequence fragments with biological functions is an important issue. This paper presents a structural approach based on context-free grammars extracted from original DNA or protein sequences. This approach is radically different from all those statistical methods. Furthermore, this approach is compared with a topological entropy-based method for consistency and difference of the complexity results.

  15. Fractals in DNA sequence analysis

    Institute of Scientific and Technical Information of China (English)

    Yu Zu-Guo(喻祖国); Vo Anh; Gong Zhi-Min(龚志民); Long Shun-Chao(龙顺潮)

    2002-01-01

    Fractal methods have been successfully used to study many problems in physics, mathematics, engineering, finance, and even in biology. There has been an increasing interest in unravelling the mysteries of DNA; for example, how can we distinguish coding and noncoding sequences, and the problems of classification and evolution relationship of organisms are key problems in bioinformatics. Although much research has been carried out by taking into consideration the long-range correlations in DNA sequences, and the global fractal dimension has been used in these works by other people, the models and methods are somewhat rough and the results are not satisfactory. In recent years, our group has introduced a time series model (statistical point of view) and a visual representation (geometrical point of view) to DNA sequence analysis. We have also used fractal dimension, correlation dimension, the Hurst exponent and the dimension spectrum (multifractal analysis) to discuss problems in this field. In this paper, we introduce these fractal models and methods and the results of DNA sequence analysis.

  16. The DNA sequence specificity of bleomycin cleavage in a systematically altered DNA sequence.

    Science.gov (United States)

    Gautam, Shweta D; Chen, Jon K; Murray, Vincent

    2017-08-01

    Bleomycin is an anti-tumour agent that is clinically used to treat several types of cancers. Bleomycin cleaves DNA at specific DNA sequences and recent genome-wide DNA sequencing specificity data indicated that the sequence 5'-RTGT*AY (where T* is the site of bleomycin cleavage, R is G/A and Y is T/C) is preferentially cleaved by bleomycin in human cells. Based on this DNA sequence, we constructed a plasmid clone to explore this bleomycin cleavage preference. By systematic variation of single nucleotides in the 5'-RTGT*AY sequence, we were able to investigate the effect of nucleotide changes on bleomycin cleavage efficiency. We observed that the preferred consensus DNA sequence for bleomycin cleavage in the plasmid clone was 5'-YYGT*AW (where W is A/T). The most highly cleaved sequence was 5'-TCGT*AT and, in fact, the seven most highly cleaved sequences conformed to the consensus sequence 5'-YYGT*AW. A comparison with genome-wide results was also performed and while the core sequence was similar in both environments, the surrounding nucleotides were different.
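
    The degenerate motifs in this abstract use IUPAC ambiguity codes (R = G/A, Y = T/C, W = A/T, as defined in the text). As a rough sketch of how such a consensus could be scanned for in a sequence, the motif can be expanded into a regular expression. The helper names and the example sequences in the usage note are illustrative assumptions, not part of the study.

```python
import re

# Only the ambiguity codes defined in the abstract are included here.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "[AG]", "Y": "[CT]", "W": "[AT]"}


def motif_to_regex(motif):
    """Compile a motif such as 'YYGT*AW'; the '*' cleavage marker is ignored."""
    body = "".join(IUPAC[base] for base in motif.replace("*", ""))
    # A lookahead keeps overlapping occurrences from being skipped.
    return re.compile("(?=(" + body + "))")


def find_sites(motif, seq):
    """Return the 0-based start positions of every motif occurrence in seq."""
    return [m.start() for m in motif_to_regex(motif).finditer(seq.upper())]
```

    For instance, the most highly cleaved sequence reported above, TCGTAT, matches YYGT*AW but not RTGT*AY, because its first base is a pyrimidine.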

  17. Lack of evidence of human herpesvirus 8 DNA sequences in HIV-negative patients with various lymphoproliferative disorders of the skin.

    Science.gov (United States)

    Dupin, N; Franck, N; Calvez, V; Gorin, I; Grandadam, M; Huraux, J M; Leibowitch, M; Agut, H; Escande, J P

    1997-06-01

    Human herpesvirus 8 (HHV-8) is a new virus which has been reported in Kaposi's sarcoma and some lymphoproliferative disorders such as Castleman's disease and body-cavity-based lymphoma. Because HHV-8 shares homology with Epstein-Barr virus (EBV), we searched for the presence of HHV-8 DNA sequences in various cutaneous T- and B-cell lymphoma by the polymerase chain reaction (PCR). Forty-seven HIV-negative patients with cutaneous lymphoma or large plaque parapsoriasis were enrolled in the study. For the detection of HHV-8 DNA sequences we used PCR followed by a hybridization with a digoxigenin-labelled probe and nested-PCR. HHV-8 DNA sequences could only be detected in a patient with large plaque parapsoriasis. Our study does not suggest any direct implication of HHV-8 in the pathogenesis of most cutaneous lymphoma. Serological studies will be helpful to appreciate if there is an epidemiological link between HHV-8 and cutaneous lymphomas.

  19. Triplet repeat sequences in human DNA can be detected by hybridization to a synthetic (5'-CGG-3')17 oligodeoxyribonucleotide

    DEFF Research Database (Denmark)

    Behn-Krappa, A; Mollenhauer, J; Doerfler, W

    1993-01-01

    The seemingly autonomous amplification of naturally occurring triplet repeat sequences in the human genome has been implicated in the causation of human genetic disease, such as the fragile X (Martin-Bell) syndrome, myotonic dystrophy (Curshmann-Steinert), spinal and bulbar muscular atrophy...

  20. Information Analysis of DNA Sequences

    CERN Document Server

    Mohammed, Riyazuddin

    2010-01-01

    The problem of differentiating the informational content of coding (exons) and non-coding (introns) regions of a DNA sequence is one of the central problems of genomics. The introns are estimated to be nearly 95% of the DNA and since they do not seem to participate in the process of transcription of amino-acids, they have been termed "junk DNA." Although it is believed that the non-coding regions in genomes have no role in cell growth and evolution, demonstration that these regions carry useful information would tend to falsify this belief. In this paper, we consider entropy as a measure of information by modifying the entropy expression to take into account the varying length of these sequences. Exons are usually much shorter in length than introns; therefore the comparison of the entropy values needs to be normalized. A length correction strategy was employed using randomly generated nucleonic base strings built out of the alphabet of the same size as the exons under question. Our analysis shows that intron...
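
    For orientation, the unmodified Shannon entropy of a base string can be computed as below. This is only the textbook quantity; the length-corrected expression the authors describe is not given in the abstract and is not reproduced here. The function name is a hypothetical choice.

```python
import math
from collections import Counter


def shannon_entropy(seq: str) -> float:
    """Shannon entropy in bits per symbol of the base frequencies in seq."""
    counts = Counter(seq)
    n = len(seq)
    # H = -sum(p * log2(p)) over the observed symbols
    return -sum((c / n) * math.log2(c / n) for c in counts.values())
```

    With a four-letter alphabet the value ranges from 0 bits (a single repeated base) to 2 bits (all four bases equally frequent). Short sequences give noisier frequency estimates, which is one reason a length normalization is needed before comparing exons with introns.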

  1. Human cellular protein patterns and their link to genome DNA sequence data: usefulness of two-dimensional gel electrophoresis and microsequencing

    DEFF Research Database (Denmark)

    Celis, J E; Rasmussen, H H; Leffers, H

    1991-01-01

    Analysis of cellular protein patterns by computer-aided 2-dimensional gel electrophoresis together with recent advances in protein sequence analysis have made possible the establishment of comprehensive 2-dimensional gel protein databases that may link protein and DNA information and that offer a global approach to the study of the cell. Using the integrated approach offered by 2-dimensional gel protein databases it is now possible to reveal phenotype specific protein (or proteins), to microsequence them, to search for homology with previously identified proteins, to clone the cDNAs, to assign partial protein sequence to genes for which the full DNA sequence and the chromosome location is known, and to study the regulatory properties and function of groups of proteins that are coordinately expressed in a given biological process. Human 2-dimensional gel protein databases are becoming...

  2. Statistical assignment of DNA sequences using Bayesian phylogenetics

    DEFF Research Database (Denmark)

    Terkelsen, Kasper Munch; Boomsma, Wouter Krogh; Huelsenbeck, John P.;

    2008-01-01

    -analysis of previously published ancient DNA data and show that, with high statistical confidence, most of the published sequences are in fact of Neanderthal origin. However, there are several cases of chimeric sequences that are comprised of a combination of both Neanderthal and modern human DNA....

  3. DNA Sequence Determination by Hybridization: A Strategy for Efficient Large-Scale Sequencing

    Science.gov (United States)

    Drmanac, R.; Drmanac, S.; Strezoska, Z.; Paunesku, T.; Labat, I.; Zeremski, M.; Snoddy, J.; Funkhouser, W. K.; Koop, B.; Hood, L.; Crkvenjakov, R.

    1993-06-01

    The concept of sequencing by hybridization (SBH) makes use of an array of all possible n-nucleotide oligomers (n-mers) to identify n-mers present in an unknown DNA sequence. Computational approaches can then be used to assemble the complete sequence. As a validation of this concept, the sequences of three DNA fragments, 343 base pairs in length, were determined with octamer oligonucleotides. Possible applications of SBH include physical mapping (ordering) of overlapping DNA clones, sequence checking, DNA fingerprinting comparisons of normal and disease-causing genes, and the identification of DNA fragments with particular sequence motifs in complementary DNA and genomic libraries. The SBH techniques may accelerate the mapping and sequencing phases of the human genome project.
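
    The idea of SBH can be illustrated with a toy example: collect the set of n-mers present in a sequence (its spectrum), then search for sequences consistent with that spectrum. The sketch below is a naive depth-first search that assumes n is at least 2 and that every n-mer occurs exactly once in the target; it is not the computational approach used by the authors, and the function names are hypothetical.

```python
def spectrum(seq, n):
    """Set of all n-mers present in seq (what an ideal array would report)."""
    return {seq[i:i + n] for i in range(len(seq) - n + 1)}


def reconstruct(kmers):
    """All sequences that use each n-mer exactly once, overlapping by n-1 bases."""
    kmers = set(kmers)
    n = len(next(iter(kmers)))
    results = []

    def extend(path, unused):
        if not unused:
            results.append(path)
            return
        for k in sorted(unused):
            # k may follow if its first n-1 bases equal the last n-1 bases of path
            if k[:-1] == path[-(n - 1):]:
                extend(path + k[-1], unused - {k})

    for start in sorted(kmers):
        extend(start, kmers - {start})
    return results
```

    With trimers, the sequences ATGGCGTGCA and ATGCGTGGCA share the same spectrum, so the search returns both. Ambiguity of this kind shrinks as the probes get longer relative to the target.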

  4. MitoBamAnnotator: A web-based tool for detecting and annotating heteroplasmy in human mitochondrial DNA sequences.

    Science.gov (United States)

    Zhidkov, Ilia; Nagar, Tal; Mishmar, Dan; Rubin, Eitan

    2011-11-01

    The use of Next-Generation Sequencing of mitochondrial DNA is becoming widespread in biological and clinical research. This, in turn, creates a need for a convenient tool that detects and analyzes heteroplasmy. Here we present MitoBamAnnotator, a user friendly web-based tool that allows maximum flexibility and control in heteroplasmy research. MitoBamAnnotator provides the user with a comprehensively annotated overview of mitochondrial genetic variation, allowing for an in-depth analysis with no prior knowledge in programming.

  5. Molecular cloning and nucleotide sequence of a full-length cDNA for human alpha enolase.

    Science.gov (United States)

    Giallongo, A; Feo, S; Moore, R; Croce, C M; Showe, L C

    1986-01-01

    We previously purified a 48-kDa protein (p48) that specifically reacts with an antiserum directed against the 12 carboxyl-terminal amino acids of the c-myc gene product. Using an antiserum directed against the purified p48, we have cloned a cDNA from a human expression library. This cDNA hybrid-selects an mRNA that translates to a 48-kDa protein that specifically reacts with anti-p48 serum. We have isolated a full-length cDNA that encodes p48 and spans 1755 bases. The coding region is 1299 bases long; 94 bases are 5' noncoding and 359 bases are 3' noncoding. The cDNA encodes a 433 amino acid protein that is 67% homologous to yeast enolase and 94% homologous to the rat non-neuronal enolase. The purified protein has been shown to have enolase activity and has been identified to be of the alpha type by isoenzyme analysis. The transcriptional regulation of enolase expression in response to mitogenic stimulation of peripheral blood lymphocytes and in response to heat shock is also discussed. PMID:3529090
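
    The lengths quoted in this abstract can be cross-checked with simple arithmetic. The sketch below only restates the reported numbers. The reading of the three left-over bases as a stop codon is an assumption made here, since the abstract does not say how they are counted.

```python
CODON_LENGTH = 3

amino_acids = 433        # reported protein length
coding_bases = 1299      # reported coding region
five_prime_utr = 94      # reported 5' noncoding bases
three_prime_utr = 359    # reported 3' noncoding bases
total_reported = 1755    # reported cDNA length

# 433 codons account exactly for the 1299-base coding region.
codons_as_bases = amino_acids * CODON_LENGTH

# Bases not covered by the three reported segments
# (assumed here to be a stop codon; the abstract does not state this).
unaccounted = total_reported - (five_prime_utr + coding_bases + three_prime_utr)
```

    The coding region matches 433 codons exactly, and the three segments sum to 1752 bases, leaving 3 bases of the 1755 unaccounted for.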

  6. Nucleosome DNA sequence structure of isochores

    Directory of Open Access Journals (Sweden)

    Trifonov Edward N

    2011-04-01

    Full Text Available Abstract. Background: Significant differences in G+C content between different isochore types suggest that the nucleosome positioning patterns in DNA of the isochores should be different as well. Results: Extraction of the patterns from the isochore DNA sequences by Shannon N-gram extension reveals that while the general motif YRRRRRYYYYYR is characteristic for all isochore types, the dominant positioning patterns of the isochores vary between TAAAAATTTTTA and CGGGGGCCCCCG due to the large differences in G+C composition. This is observed in human, mouse and chicken isochores, demonstrating that the variations of the positioning patterns are largely G+C dependent rather than species-specific. The species-specificity of nucleosome positioning patterns is revealed by dinucleotide periodicity analyses in isochore sequences. While human sequences show CG periodicity, chicken isochores display AG (CT) periodicity. Mouse isochores show very weak CG periodicity only. Conclusions: Nucleosome positioning pattern as revealed by Shannon N-gram extension is strongly dependent on G+C content and different in different isochores. Species-specificity of the pattern is subtle. It is reflected in the choice of preferentially periodical dinucleotides.
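
    The relation between the general motif and the two extreme patterns is easy to verify by collapsing each base to its purine (R) or pyrimidine (Y) class. This is a small sketch with hypothetical helper names; it does not reproduce the Shannon N-gram extension itself.

```python
# Purines (A, G) map to R; pyrimidines (C, T) map to Y.
RY_TABLE = str.maketrans({"A": "R", "G": "R", "C": "Y", "T": "Y"})


def to_purine_pyrimidine(seq: str) -> str:
    """Collapse a DNA string to the two-letter R/Y alphabet."""
    return seq.upper().translate(RY_TABLE)


def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in seq."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)
```

    Both TAAAAATTTTTA and CGGGGGCCCCCG collapse to YRRRRRYYYYYR, while their G+C fractions are 0 and 1 respectively, which is the sense in which the patterns differ only in G+C composition.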

  7. PCR-SSCP-DNA sequencing method in detecting PTEN gene mutation and its significance in human gastric cancer

    Institute of Scientific and Technical Information of China (English)

    Chuan-Yong Guo; Xuan-Fu Xu; Jian-Ye Wu; Shu-Fang Liu

    2008-01-01

    AIM: To discuss the possible effect of PTEN gene mutations on occurrence and development of gastric cancer. METHODS: Fifty-three gastric cancer specimens were selected to probe PTEN gene mutations in genome of gastric cancer and paracancerous tissues using PCR-SSCP-DNA sequencing method based on microdissection and to observe the protein expression by immunohistochemistry technique. RESULTS: PCR-SSCP-DNA sequencing indicated that 4 kinds of mutation sites were found in 5 of 53 gastric cancer specimens. One kind of mutation was found in exons. AA-TCC mutation was located at 40 bp upstream of 3' lateral exon 7 (115946 AA-TCC). Such mutations led to terminator formation in the 297th codon of the PTEN gene. The other 3 kinds of mutation were found in introns, including a G-C point mutation at 91 bp upstream of 5' lateral exon 5 (90896 G-C), a T-G point mutation at 24 bp upstream of 5' lateral exon 5 (90963 T-G), and a single base A mutation at 7 bp upstream of 5' lateral exon 5 (90980 A del). The PTEN protein expression in gastric cancer and paracancerous tissues detected using immunohistochemistry technique indicated that the total positive rate of PTEN protein expression was 66% in gastric cancer tissue, which was significantly lower than that (100%) in paracancerous tissues (P<0.005). CONCLUSION: PTEN gene mutation and expression may play an important role in the occurrence and development of gastric cancer. (C) 2008 The WJG Press. All rights reserved.

  8. [DNA sequencing technology and automatization of it].

    Science.gov (United States)

    Kraev, A S

    1991-01-01

    Precise manipulations with genetic material, typical for modern experiments in molecular biology and in new biotechnology, require a capability to determine DNA base sequence. This capability makes it possible today to exploit specific genetic knowledge for the dissection of complex cell processes and for modulation of cell metabolism in transgenic organisms. The review focuses on such DNA sequencing technologies that are widespread in general laboratory practice. They can safely be called, with the availability of commercial reagents, industrial techniques. Modern DNA sequencing requires recurrent breakdown of large genomic DNA into smaller pieces, which are then amplified and sequenced, and the initial long stretch is reconstructed via overlap of small pieces. The DNA sequencing process has several steps: a DNA fragment is obtained in sufficient quantity and purity, it is converted to a form suitable for a particular sequencing method, a sequencing reaction is performed and its products fractionated; and finally the resultant data are interpreted (i.e. an autoradiograph is read into a computer memory) and a long sequence is reconstructed via overlap of short stretches. These steps are considered in separate parts; emphasis is placed on sequencing strategies with respect to their biological task. In the last part, possibilities for automation of the sequencing experiment are considered, followed by a discussion of domestic problems in DNA sequencing.

  9. Fibonacci Sequence and Supramolecular Structure of DNA.

    Science.gov (United States)

    Shabalkin, I P; Grigor'eva, E Yu; Gudkova, M V; Shabalkin, P I

    2016-05-01

    We proposed a new model of supramolecular DNA structure. Similar to our previously developed model of primary DNA structure [11-15], the 3D structure of the DNA molecule is assembled in accordance with a mathematical rule known as the Fibonacci sequence. Unlike primary DNA structure, supramolecular 3D structure is assembled from complex moieties including a regular tetrahedron and a regular octahedron consisting of monomers, elements of the primary DNA structure. The moieties of the supramolecular DNA structure forming fragments of regular spatial lattice are bound via linker (joint) sequences of the DNA chain. The lattice perceives and transmits information signals over a considerable distance without acoustic aberrations. Linker sequences expand conformational space between lattice segments allowing their sliding relative to each other under the action of external forces. In this case, sliding is provided by stretching of the stacked linker sequences.

  10. Expression of a chimeric human/salmon calcitonin gene integrated into the Saccharomyces cerevisiae genome using rDNA sequences as recombination sites.

    Science.gov (United States)

    Sun, Hengyi; Zang, Xiaonan; Liu, Yuantao; Cao, Xiaofei; Wu, Fei; Huang, Xiaoyun; Jiang, Minjie; Zhang, Xuecheng

    2015-12-01

    Calcitonin participates in controlling homeostasis of calcium and phosphorus and plays an important role in bone metabolism. The aim of this study was to endow an industrial strain of Saccharomyces cerevisiae with the ability to express chimeric human/salmon calcitonin (hsCT) without the use of antibiotics. To do so, a homologous recombination plasmid pUC18-rDNA2-ura3-P pgk -5hsCT-rDNA1 was constructed, which contains two segments of ribosomal DNA of 1.1 kb (rDNA1) and 1.4 kb (rDNA2), to integrate the heterologous gene into host rDNA. A DNA fragment containing five copies of a chimeric human/salmon calcitonin gene (5hsCT) under the control of the promoter for phosphoglycerate kinase (P pgk ) was constructed to express 5hsCT in S. cerevisiae using ura3 as a selectable auxotrophic marker gene. After digestion by restriction endonuclease HpaI, a linear fragment, rDNA2-ura3-P pgk -5hsCT-rDNA1, was obtained and transformed into the △ura3 mutant of S. cerevisiae by the lithium acetate method. The ura3-P pgk -5hsCT sequence was introduced into the genome at rDNA sites by homologous recombination, and the recombinant strain YS-5hsCT was obtained. Southern blot analysis revealed that the 5hsCT had been integrated successfully into the genome of S. cerevisiae. The results of Western blot and ELISA confirmed that the 5hsCT protein had been expressed in the recombinant strain YS-5hsCT. The expression level reached 2.04 % of total proteins. S. cerevisiae YS-5hsCT decreased serum calcium in mice by oral administration and even 0.01 g lyophilized S. cerevisiae YS-5hsCT/kg decreased serum calcium by 0.498 mM. This work has produced a commercial yeast strain potentially useful for the treatment of osteoporosis.

  11. Mitochondrial DNA sequence evolution in shorebird populations.

    NARCIS (Netherlands)

    Wenink, P.W.

    1994-01-01

    This thesis describes the global molecular population structure of two shorebird species, in particular of the dunlin, Calidris alpina, by means of comparative sequence analysis of the most variable part of the mitochondrial DNA (mtDNA) genome. There are several reasons why mtDNA is the molecule of

  12. Ubiquitous human 'master' origins of replication are encoded in the DNA sequence via a local enrichment in nucleosome excluding energy barriers.

    Science.gov (United States)

    Drillon, Guénola; Audit, Benjamin; Argoul, Françoise; Arneodo, Alain

    2015-02-18

    As the elementary building block of eukaryotic chromatin, the nucleosome is at the heart of the compromise between the necessity of compacting DNA in the cell nucleus and the required accessibility to regulatory proteins. The recent availability of genome-wide experimental maps of nucleosome positions for many different organisms and cell types has provided an unprecedented opportunity to elucidate to what extent the DNA sequence conditions the primary structure of chromatin and in turn participates in the chromatin-mediated regulation of nuclear functions, such as gene expression and DNA replication. In this study, we use in vivo and in vitro genome-wide nucleosome occupancy data together with the set of nucleosome-free regions (NFRs) predicted by a physical model of nucleosome formation based on sequence-dependent bending properties of the DNA double-helix, to investigate the role of intrinsic nucleosome occupancy in the regulation of the replication spatio-temporal programme in human. We focus our analysis on the so-called replication U/N-domains that were shown to cover about half of the human genome in the germline (skew-N domains) as well as in embryonic stem cells, somatic and HeLa cells (mean replication timing U-domains). The 'master' origins of replication (MaOris) that border these megabase-sized U/N-domains were found to be specified by a few hundred kb wide regions that are hyper-sensitive to DNase I cleavage, hypomethylated, and enriched in epigenetic marks involved in transcription regulation, the hallmarks of localized open chromatin structures. Here we show that replication U/N-domain borders that are conserved in all considered cell lines have an environment highly enriched in nucleosome-excluding-energy barriers, suggesting that these ubiquitous MaOris have been selected during evolution. In contrast, MaOris that are cell-type-specific are mainly regulated epigenetically and are no longer favoured by a local abundance of intrinsic NFRs encoded in

  13. Ubiquitous human ‘master’ origins of replication are encoded in the DNA sequence via a local enrichment in nucleosome excluding energy barriers

    Science.gov (United States)

    Drillon, Guénola; Audit, Benjamin; Argoul, Françoise; Arneodo, Alain

    2015-02-01

    As the elementary building block of eukaryotic chromatin, the nucleosome is at the heart of the compromise between the necessity of compacting DNA in the cell nucleus and the required accessibility to regulatory proteins. The recent availability of genome-wide experimental maps of nucleosome positions for many different organisms and cell types has provided an unprecedented opportunity to elucidate to what extent the DNA sequence conditions the primary structure of chromatin and in turn participates in the chromatin-mediated regulation of nuclear functions, such as gene expression and DNA replication. In this study, we use in vivo and in vitro genome-wide nucleosome occupancy data together with the set of nucleosome-free regions (NFRs) predicted by a physical model of nucleosome formation based on sequence-dependent bending properties of the DNA double-helix, to investigate the role of intrinsic nucleosome occupancy in the regulation of the replication spatio-temporal programme in human. We focus our analysis on the so-called replication U/N-domains that were shown to cover about half of the human genome in the germline (skew-N domains) as well as in embryonic stem cells, somatic and HeLa cells (mean replication timing U-domains). The ‘master’ origins of replication (MaOris) that border these megabase-sized U/N-domains were found to be specified by a few hundred kb wide regions that are hyper-sensitive to DNase I cleavage, hypomethylated, and enriched in epigenetic marks involved in transcription regulation, the hallmarks of localized open chromatin structures. Here we show that replication U/N-domain borders that are conserved in all considered cell lines have an environment highly enriched in nucleosome-excluding-energy barriers, suggesting that these ubiquitous MaOris have been selected during evolution. In contrast, MaOris that are cell-type-specific are mainly regulated epigenetically and are no longer favoured by a local abundance of intrinsic NFRs

  14. Biased distribution of DNA uptake sequences towards genome maintenance genes

    DEFF Research Database (Denmark)

    Davidsen, T.; Rodland, E.A.; Lagesen, K.

    2004-01-01

    Repeated sequence signatures are characteristic features of all genomic DNA. We have made a rigorous search for repeat genomic sequences in the human pathogens Neisseria meningitidis, Neisseria gonorrhoeae and Haemophilus influenzae and found that by far the most frequent 9-10mers residing within coding regions are the DNA uptake sequences (DUS) required for natural genetic transformation. More importantly, we found a significantly higher density of DUS within genes involved in DNA repair, recombination, restriction-modification and replication than in any other annotated gene group...
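The search described in this abstract amounts to counting short words (k-mers) and comparing motif densities between gene groups. A minimal sketch of both steps follows, assuming plain A/C/G/T strings; the function names, the toy sequence and the placeholder 10-mer are illustrative and are not taken from the paper.

```python
from collections import Counter


def most_frequent_kmers(sequence, k, top=3):
    """Return the `top` most common k-mers in `sequence` with their counts."""
    sequence = sequence.upper()
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    return counts.most_common(top)


def density_per_kb(sequence, motif):
    """Occurrences of `motif` (overlaps allowed) per 1000 bases of `sequence`."""
    sequence, motif = sequence.upper(), motif.upper()
    hits = sum(1 for i in range(len(sequence) - len(motif) + 1)
               if sequence.startswith(motif, i))
    return 1000.0 * hits / len(sequence)


# Toy input: a placeholder 10-mer embedded twice in arbitrary flanking bases.
MOTIF = "GCCGTCTGAA"  # illustrative motif only, an assumption of this sketch
toy = "ACGT" + MOTIF + "TTACG" + MOTIF + "CATG"

print(most_frequent_kmers(toy, 10, top=1))
print(density_per_kb(toy, MOTIF))
```

Comparing `density_per_kb` across sets of gene sequences (for example repair genes versus all other genes) would give the kind of density contrast the abstract reports; the statistical testing used by the authors is not reproduced here.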

  15. Comparison of repair of DNA double-strand breaks in identical sequences in primary human fibroblast and immortal hamster-human hybrid cells harboring a single copy of human chromosome 11

    Science.gov (United States)

    Fouladi, B.; Waldren, C. A.; Rydberg, B.; Cooper, P. K.; Chatterjee, A. (Principal Investigator)

    2000-01-01

    We have optimized a pulsed-field gel electrophoresis assay that measures induction and repair of double-strand breaks (DSBs) in specific regions of the genome (Lobrich et al., Proc. Natl. Acad. Sci. USA 92, 12050-12054, 1995). The increased sensitivity resulting from these improvements makes it possible to analyze the size distribution of broken DNA molecules immediately after the introduction of DSBs and after repair incubation. This analysis shows that the distribution of broken DNA pieces after exposure to sparsely ionizing radiation is consistent with the distribution expected from randomly induced DSBs. It is apparent from the distribution of rejoined DNA pieces after repair incubation that DNA ends continue to rejoin between 3 and 24 h postirradiation and that some of these rejoining events are in fact misrejoining events, since novel restriction fragments both larger and smaller than the original fragment are generated after repair. This improved assay was also used to study the kinetics of DSB rejoining and the extent of misrejoining in identical DNA sequences in human GM38 cells and human-hamster hybrid A(L) cells containing a single human chromosome 11. Despite the numerous differences between these cells, which include species and tissue of origin, levels of TP53, expression of telomerase, and the presence or absence of a homologous chromosome for the restriction fragments examined, the kinetics of rejoining of radiation-induced DSBs and the extent of misrejoining were similar in the two cell lines when studied in the G(1) phase of the cell cycle. Furthermore, DSBs were removed from the single-copy human chromosome in the hamster A(L) cells with similar kinetics and misrejoining frequency as at a locus on this hybrid's CHO chromosomes.

  16. cDNA sequence and gene locus of the human retinal phosphoinositide-specific phospholipase-Cβ4 (PLCB4)

    Energy Technology Data Exchange (ETDEWEB)

    Alvarez, R.A.; Ghalayini, A.J.; Anderson, R.E. [Baylor College of Medicine, Houston, TX (United States)] [and others]

    1995-09-01

    Defects in the Drosophila norpA (no receptor potential A) gene encoding a phosphoinositide-specific phospholipase C (PLC) block invertebrate phototransduction and lead to retinal degeneration. The mammalian homolog, PLCB4, is expressed in rat brain, bovine cerebellum, and the bovine retina in several splice variants. To determine a possible role of PLCB4 gene defects in human disease, we isolated several overlapping cDNA clones from a human retina library. The composite cDNA sequence predicts a human PLCβ4 polypeptide of 1022 amino acid residues (MW 117,000). This PLCβ4 variant lacks a 165-amino-acid N-terminal domain characteristic for the rat brain isoforms, but has a distinct putative exon 1 unique for human and bovine retina isoforms. A PLCβ4 monospecific antibody detected a major (130 kDa) and a minor (160 kDa) isoform in retina homogenates. Somatic cell hybrids and deletion panels were used to localize the PLCB4 gene to the short arm of chromosome 20. The gene was further sublocalized to 20p12 by fluorescence in situ hybridization. 4 refs., 5 figs.

  17. Long range correlations in DNA sequences

    CERN Document Server

    Mohanty, A K

    2002-01-01

    The so-called long-range correlation properties of DNA sequences are studied using the variance analyses of the density distribution of a single or a group of nucleotides in a model-independent way. This new method, which was suggested earlier, has been applied to extract slope parameters that characterize the correlation properties for several intron-containing and intron-less DNA sequences. An important aspect of all the DNA sequences is the property of complementarity, by virtue of which any two complementary distributions (like GA is complementary to TC or G is complementary to ATC) have identical fluctuations at all scales although their distribution functions need not be identical. Due to this complementarity, the famous DNA walk representation, whose statistical interpretation is still unresolved, is shown to be a special case of the present formalism with a density distribution corresponding to a purine or a pyrimidine group. Another interesting aspect of most of the DNA sequences is that the factorial m...
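The complementarity property mentioned in this abstract can be checked numerically: in any window of fixed length, the pyrimidine count is the window length minus the purine count, so both counts fluctuate with the same variance. The sketch below illustrates that point and the purine/pyrimidine DNA walk; it assumes sequences containing only A, C, G and T, and the helper names and toy sequence are hypothetical rather than the author's formalism.

```python
def window_counts(sequence, group, window):
    """Count members of `group` in consecutive non-overlapping windows."""
    seq = sequence.upper()
    return [sum(1 for base in seq[i:i + window] if base in group)
            for i in range(0, len(seq) - window + 1, window)]


def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def dna_walk(sequence):
    """Cumulative walk: step +1 for a purine (A, G), -1 for a pyrimidine (C, T)."""
    position, path = 0, []
    for base in sequence.upper():
        position += 1 if base in "AG" else -1
        path.append(position)
    return path


seq = "AGGCTTACGATCGGATCCTAAGCT"  # arbitrary toy sequence
purine_var = variance(window_counts(seq, "AG", 4))
pyrimidine_var = variance(window_counts(seq, "CT", 4))
print(purine_var, pyrimidine_var)  # equal up to floating-point rounding
print(dna_walk("AGCT"))
```

Repeating the variance calculation over a range of window sizes and plotting it on log-log axes is one common way to read off a slope parameter, though the exact estimator used in the paper is not given in the abstract.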

  18. Dynamics and Control of DNA Sequence Amplification

    CERN Document Server

    Marimuthu, Karthikeyan

    2014-01-01

    DNA amplification is the process of replication of a specified DNA sequence in vitro through time-dependent manipulation of its external environment. A theoretical framework for determination of the optimal dynamic operating conditions of DNA amplification reactions, for any specified amplification objective, is presented based on first-principles biophysical modeling and control theory. Amplification of DNA is formulated as a problem in control theory with optimal solutions that can differ considerably from strategies typically used in practice. Using the Polymerase Chain Reaction (PCR) as an example, sequence-dependent biophysical models for DNA amplification are cast as control systems, wherein the dynamics of the reaction are controlled by a manipulated input variable. Using these control systems, we demonstrate that there exists an optimal temperature cycling strategy for geometric amplification of any DNA sequence and formulate optimal control problems that can be used to derive the optimal tempe...

  19. DNA display I. Sequence-encoded routing of DNA populations.

    Directory of Open Access Journals (Sweden)

    David R Halpin

    2004-07-01

    Recently reported technologies for DNA-directed organic synthesis and for DNA computing rely on routing DNA populations through complex networks. The reduction of these ideas to practice has been limited by a lack of practical experimental tools. Here we describe a modular design for DNA routing genes, and routing machinery made from oligonucleotides and commercially available chromatography resins. The routing machinery partitions nanomole quantities of DNA into physically distinct subpools based on sequence. Partitioning steps can be iterated indefinitely, with worst-case yields of 85% per step. These techniques facilitate DNA-programmed chemical synthesis, and thus enable a materials biology that could revolutionize drug discovery.

  20. Current-voltage characteristics of double-strand DNA sequences

    Science.gov (United States)

    Bezerril, L. M.; Moreira, D. A.; Albuquerque, E. L.; Fulco, U. L.; de Oliveira, E. L.; de Sousa, J. S.

    2009-09-01

    We use a tight-binding formulation to investigate the transmissivity and the current-voltage (I-V) characteristics of sequences of double-strand DNA molecules. In order to reveal the relevance of the underlying correlations in the nucleotide distribution, we compare the results for the genomic DNA sequence with those of artificial sequences (the long-range correlated Fibonacci and Rudin-Shapiro ones) and a random sequence, which is a kind of prototype of a short-range correlated system. The random sequence is presented here with the same first-neighbor pair correlations as the human DNA sequence. We found that the long-range character of the correlations is important to the transmissivity spectra, although the I-V curves seem to be mostly influenced by the short-range correlations.

  1. Current-voltage characteristics of double-strand DNA sequences

    Energy Technology Data Exchange (ETDEWEB)

    Bezerril, L.M.; Moreira, D.A. [Departamento de Fisica, Universidade Federal do Rio Grande do Norte, 59072-970, Natal-RN (Brazil); Albuquerque, E.L., E-mail: eudenilson@dfte.ufrn.b [Departamento de Fisica, Universidade Federal do Rio Grande do Norte, 59072-970, Natal-RN (Brazil); Fulco, U.L. [Departamento de Biofisica e Farmacologia, Universidade Federal do Rio Grande do Norte, 59072-970, Natal-RN (Brazil); Oliveira, E.L. de; Sousa, J.S. de [Departamento de Fisica, Universidade Federal do Ceara, 60455-760, Fortaleza-CE (Brazil)

    2009-09-07

    We use a tight-binding formulation to investigate the transmissivity and the current-voltage (I-V) characteristics of sequences of double-strand DNA molecules. In order to reveal the relevance of the underlying correlations in the nucleotide distribution, we compare the results for the genomic DNA sequence with those of artificial sequences (the long-range correlated Fibonacci and Rudin-Shapiro ones) and a random sequence, which is a kind of prototype of a short-range correlated system. The random sequence is presented here with the same first-neighbor pair correlations as the human DNA sequence. We found that the long-range character of the correlations is important to the transmissivity spectra, although the I-V curves seem to be mostly influenced by the short-range correlations.

  2. Isolation of a human anti-haemophilic factor IX cDNA clone using a unique 52-base synthetic oligonucleotide probe deduced from the amino acid sequence of bovine factor IX.

    Science.gov (United States)

    Jaye, M; de la Salle, H; Schamber, F; Balland, A; Kohli, V; Findeli, A; Tolstoshev, P; Lecocq, J P

    1983-04-25

    A unique 52mer oligonucleotide deduced from the amino acid sequence of bovine Factor IX was synthesized and used as a probe to screen a human liver cDNA bank. The Factor IX clone isolated shows 5 differences in nucleotide and deduced amino acid sequence as compared to a previously isolated clone. In addition, precisely one codon has been deleted.

  3. Visible periodicity of strong nucleosome DNA sequences.

    Science.gov (United States)

    Salih, Bilal; Tripathi, Vijay; Trifonov, Edward N

    2015-01-01

    Fifteen years ago, Lowary and Widom assembled nucleosomes on synthetic random sequence DNA molecules, selected the strongest nucleosomes and discovered that the TA dinucleotides in these strong nucleosome sequences often appear at 10-11 bases from one another or at distances which are multiples of this period. We repeated this experiment computationally, on large ensembles of natural genomic sequences, by selecting the strongest nucleosomes--i.e. those with such distances between like-named dinucleotides, multiples of 10.4 bases, the structural and sequence period of nucleosome DNA. The analysis confirmed the periodicity of TA dinucleotides in the strong nucleosomes, and revealed as well other periodic sequence elements, notably classical AA and TT dinucleotides. The matrices of DNA bendability and their simple linear forms--nucleosome positioning motifs--are calculated from the strong nucleosome DNA sequences. The motifs are in full accord with nucleosome positioning sequences derived earlier, thus confirming that the new technique, indeed, detects strong nucleosomes. Species- and isochore-specific variations of the matrices and of the positioning motifs are demonstrated. The strong nucleosome DNA sequences manifest the highest hitherto nucleosome positioning sequence signals, showing the dinucleotide periodicities in directly observable rather than in hidden form.
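The periodicity described in this abstract shows up as peaks in a histogram of distances between occurrences of the same dinucleotide. The following is a minimal sketch of such a histogram, assuming a plain A/C/G/T string; the function names, the distance cut-off and the toy sequence are assumptions made for illustration and do not reproduce the authors' selection procedure.

```python
from collections import Counter


def dinucleotide_positions(sequence, dinucleotide="TA"):
    """Start positions (0-based) of every occurrence of `dinucleotide`."""
    seq = sequence.upper()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == dinucleotide]


def distance_histogram(sequence, dinucleotide="TA", max_distance=35):
    """Histogram of distances between all pairs of like-named dinucleotides."""
    positions = dinucleotide_positions(sequence, dinucleotide)
    histogram = Counter()
    for index, first in enumerate(positions):
        for second in positions[index + 1:]:
            if second - first > max_distance:
                break  # positions are sorted, so later ones are farther away
            histogram[second - first] += 1
    return histogram


# Toy sequence with TA placed at positions 0, 10, 21 and 31.
toy = "TA" + "G" * 8 + "TA" + "C" * 9 + "TA" + "G" * 8 + "TA"
print(sorted(distance_histogram(toy).items()))
```

In a strongly positioned nucleosome sequence one would expect such a histogram to peak near 10-11 bases and near multiples of roughly 10.4 bases, which is the signal the abstract refers to.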

  4. Localization of the human fibromodulin gene (FMOD) to chromosome 1q32 and completion of the cDNA sequence

    Energy Technology Data Exchange (ETDEWEB)

    Sztrolovics, R.; Grover, J.; Roughley, P.J. [McGill Univ., Montreal (Canada)] [and others]

    1994-10-01

    This report describes the cloning of the 3′-untranslated region of the human fibromodulin cDNA and its use to map the gene. For somatic cell hybrids, the generation of the PCR product was concordant with the presence of chromosome 1 and discordant with the presence of all other chromosomes, confirming that the fibromodulin gene is located within region q32 of chromosome 1. The physical mapping of genes is a critical step in the process of identifying which genes may be responsible for various inherited disorders. Specifically, the mapping of the fibromodulin gene now provides the information necessary to evaluate its potential role in genetic disorders of connective tissues. The analysis of previously reported diseases mapped to chromosome 1 reveals two genes located in the proximity of the fibromodulin locus. These are Usher syndrome type II, a recessive disorder characterized by hearing loss and retinitis pigmentosa, and Van der Woude syndrome, a dominant condition associated with abnormalities such as cleft lip and palate and hyperdontia. The genes for both of these disorders have been projected to be localized to 1q32 of a physical map that integrates available genetic linkage and physical data. However, it seems improbable that either of these disorders, exhibiting restricted tissue involvement, could be linked to the fibromodulin gene, given the wide tissue distribution of the encoded proteoglycan, although it remains possible that the relative importance of the quantity and function of the proteoglycan may vary between tissues. 11 refs., 1 fig.

  5. Applications of mass spectrometry to DNA fingerprinting and DNA sequencing

    Energy Technology Data Exchange (ETDEWEB)

    Jacobson, K.B.; Buchanan, M.V.; Chen, C.H.; Doktycz, M.J.; McLuckey, S.A. (Oak Ridge National Lab., TN (United States)); Arlinghaus, H.F. (Atom Sciences, Inc., Oak Ridge, TN (United States))

    1993-01-01

    DNA fingerprinting and sequencing rely on polyacrylamide gel electrophoresis to determine the sizes of the DNA fragments. Innovative alternatives to polyacrylamide gel electrophoresis are under investigation for characterization of such fingerprinting and sequencing. One method uses stable isotopes of tin and other elements to label the DNA, whereas other procedures do not require labels. The detectors in each case are mass spectrometers that detect either the stable isotopes or the DNA fragments themselves. If successful, these methods will speed up the rate of DNA analysis by one or two orders of magnitude.

  6. Applications of mass spectrometry to DNA fingerprinting and DNA sequencing

    Energy Technology Data Exchange (ETDEWEB)

    Jacobson, K.B.; Buchanan, M.V.; Chen, C.H.; Doktycz, M.J.; McLuckey, S.A. [Oak Ridge National Lab., TN (United States)]; Arlinghaus, H.F. [Atom Sciences, Inc., Oak Ridge, TN (United States)]

    1993-06-01

    DNA fingerprinting and sequencing rely on polyacrylamide gel electrophoresis to determine the sizes of the DNA fragments. Innovative alternatives to polyacrylamide gel electrophoresis are under investigation for characterization of such fingerprinting and sequencing. One method uses stable isotopes of tin and other elements to label the DNA, whereas other procedures do not require labels. The detectors in each case are mass spectrometers that detect either the stable isotopes or the DNA fragments themselves. If successful, these methods will speed up the rate of DNA analysis by one or two orders of magnitude.

  7. EGNAS: an exhaustive DNA sequence design algorithm

    Directory of Open Access Journals (Sweden)

    Kick Alfred

    2012-06-01

    Background: Molecular recognition based on the complementary base pairing of deoxyribonucleic acid (DNA) is the fundamental principle in the fields of genetics, DNA nanotechnology and DNA computing. We present an exhaustive DNA sequence design algorithm that generates sets containing a maximum number of sequences with defined properties. EGNAS (Exhaustive Generation of Nucleic Acid Sequences) offers the possibility of controlling both interstrand and intrastrand properties. The guanine-cytosine content can be adjusted. Sequences can be forced to start and end with guanine or cytosine. This option reduces the risk of “fraying” of DNA strands. It is possible to limit cross hybridizations of a defined length, and to adjust the uniqueness of sequences. Self-complementarity and hairpin structures of certain length can be avoided. Sequences and subsequences can optionally be forbidden. Furthermore, sequences can be designed to have minimum interactions with predefined strands and neighboring sequences. Results: The algorithm is realized in a C++ program. TAG sequences can be generated and combined with primers for single-base extension reactions, which were described for multiplexed genotyping of single nucleotide polymorphisms. Thereby, possible foldback through intrastrand interaction of TAG-primer pairs can be limited. The design of sequences for specific attachment of molecular constructs to DNA origami is presented. Conclusions: We developed a new software tool called EGNAS for the design of unique nucleic acid sequences. The presented exhaustive algorithm generates larger sets of sequences than previous software under equal constraints. EGNAS is freely available for noncommercial use at http://www.chm.tu-dresden.de/pc6/EGNAS.
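Several of the constraints listed in this abstract (GC content, G/C at both ends, no self-complementary stretch of a given length) are simple to express as filters over an exhaustive enumeration. The sketch below is a brute-force illustration of that idea only; it is not the EGNAS algorithm, and all function names, default thresholds and the enumeration strategy are assumptions made here.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase A/C/G/T string."""
    return seq.translate(COMPLEMENT)[::-1]


def gc_content(seq):
    """Fraction of G and C bases in `seq`."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def has_self_complementary_stretch(seq, length):
    """True if some `length`-mer of `seq` also occurs reverse-complemented in `seq`."""
    kmers = {seq[i:i + length] for i in range(len(seq) - length + 1)}
    return any(reverse_complement(kmer) in kmers for kmer in kmers)


def acceptable(seq, gc_range=(0.4, 0.6), stretch=4):
    """Apply the three illustrative constraints to one candidate sequence."""
    return (seq[0] in "GC" and seq[-1] in "GC"
            and gc_range[0] <= gc_content(seq) <= gc_range[1]
            and not has_self_complementary_stretch(seq, stretch))


def enumerate_candidates(n, **constraints):
    """Yield every length-n sequence that passes `acceptable` (4**n candidates)."""
    for bases in product("ACGT", repeat=n):
        seq = "".join(bases)
        if acceptable(seq, **constraints):
            yield seq


candidates = list(enumerate_candidates(5, stretch=3))
print(len(candidates), candidates[:5])
```

Plain enumeration grows as 4**n and is only practical for very short sequences; a real design tool also has to control interactions between the sequences of a set, which this sketch does not attempt.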

  8. Persistence of DNA sequences of BK virus and JC virus in normal human tissues and in diseased tissues.

    Science.gov (United States)

    Chesters, P M; Heritage, J; McCance, D J

    1983-04-01

    Available evidence suggests that BK virus (BKV) and JC virus (JCV) persist in the kidneys of healthy individuals after primary infection and may reactivate when the host's immune response is impaired. Data supporting this hypothesis are presented. A previous study had shown BKV to be present in the kidneys of eight (57%) of 14 subjects. In the present study, which extended the investigation to a total of 30 subjects, BKV DNA was found in the renal tissues of 10 (33%) subjects, and JCV DNA was found in the renal tissues of three (10%) subjects. The viral DNA detected appeared not to be integrated with host DNA and to be isolated in foci. Investigation of normal and diseased brain tissue, including tissue from six subjects with multiple sclerosis, failed to reveal the presence of either JCV DNA or BKV DNA.

  9. Spectroscopic investigation on the telomeric DNA base sequence repeat

    Institute of Scientific and Technical Information of China (English)

    2002-01-01

    Telomeres are protein-DNA complexes at the ends of linear chromosomes, which protect chromosomal integrity and maintain cellular replicative capacity. From single-cell organisms to advanced animals and plants, the structures and functions of telomeres are both highly conserved. In cells of humans and other vertebrates, the telomeric DNA base sequence is (TTAGGG)n. In the present work, we have obtained absorption and fluorescence spectra measured from seven synthesized oligonucleotides to simulate the telomeric DNA system and calculated their relative fluorescence quantum yields on which not only telomeric DNA characteristics are predicted but also possibly the shortened telomeric sequences during cell division are im... relative fluorescence quantum yield and remarkable excitation energy internal conversion, which tallies with the telomeric sequence of (TTAGGG)n. This result shows that telomeric DNA has a strong non-radiative or internal-conversion capability.

  10. Nanopore DNA sequencing using kinetic proofreading

    Science.gov (United States)

    Ling, Xinsheng

    We propose a method of DNA sequencing by combining the physical method of nanopore electrical measurements and Southern's sequencing-by-hybridization. The new key ingredient, essential to both lowering the costs and increasing the precision, is an asymmetric nanopore sandwich device capable of measuring the DNA hybridization probe twice separated by a designed waiting time. Those incorrect probes appearing only once in nanopore ionic current traces are discriminated from the correct ones that appear twice. This method of discrimination is similar to the principle of kinetic proofreading proposed by Hopfield and Ninio in gene transcription and translation processes. An error analysis of this nanopore kinetic proofreading (nKP) technique for DNA sequencing is carried out in comparison with the most precise 3' dideoxy termination method developed by Sanger.

  11. An approach to sequence DNA without tagging

    Science.gov (United States)

    Niu, Sanjun; Saraf, Ravi F.

    2002-10-01

    Microarray technology is playing an increasingly important role in biology and medicine and its application to genomics for gene expression analysis has already reached the market with a variety of commercially available instruments. In these combinatorial analysis methods, known probe single-strand DNA (ssDNA) 'primers' are attached in clusters of typically 100 µm × 100 µm pixels. Each pixel of the array has a slightly different sequence. On exposure to 'unknown' target ssDNA, the pixels with the right complementary probe ssDNA sequence convert to double-stranded DNA (dsDNA) by a hybridization reaction. To transduce the conversion of the pixel to dsDNA, the target ssDNA is labelled with a photoluminescent tag during the polymerase chain reaction (PCR) amplification process. Due to the statistical distribution of the tags in the target ssDNA, it becomes significantly difficult to implement these methods as a diagnostic tool in a pathology laboratory. A method to sequence DNA without tagging the molecule is developed. The fabrication process is compatible with current microelectronics and (emerging) soft-material fabrication technologies, allowing the method to be integrable with micro-electromechanical systems (MEMS) and lab-on-a-chip devices. An estimated sensitivity of 10⁻¹² g on a 1 cm² device area is obtained.

  12. gargammel: a sequence simulator for ancient DNA.

    Science.gov (United States)

    Renaud, Gabriel; Hanghøj, Kristian; Willerslev, Eske; Orlando, Ludovic

    2016-10-29

    Ancient DNA has emerged as a remarkable tool to infer the history of extinct species and past populations. However, many of its characteristics, such as extensive fragmentation, damage and contamination, can influence downstream analyses. To help investigators measure how these could impact their analyses in silico, we have developed gargammel, a package that simulates ancient DNA fragments given a set of known reference genomes. Our package simulates the entire molecular process from post-mortem DNA fragmentation and DNA damage to experimental sequencing errors, and reproduces the most common biases observed in ancient DNA datasets.

  13. Next-generation sequencing of RNA and DNA isolated from paired fresh-frozen and formalin-fixed paraffin-embedded samples of human cancer and normal tissue.

    Directory of Open Access Journals (Sweden)

    Jakob Hedegaard

    Formalin-fixed, paraffin-embedded (FFPE) tissues are an invaluable resource for clinical research. However, nucleic acids extracted from FFPE tissues are fragmented and chemically modified, making them challenging to use in molecular studies. We analysed 23 fresh-frozen (FF), 35 FFPE and 38 paired FF/FFPE specimens, representing six different human tissue types (bladder, prostate and colon carcinoma; liver and colon normal tissue; reactive tonsil), in order to examine the potential use of FFPE samples in next-generation sequencing (NGS) based retrospective and prospective clinical studies. Two methods for DNA and three methods for RNA extraction from FFPE tissues were compared and were found to affect nucleic acid quantity and quality. DNA and RNA from selected FFPE and paired FF/FFPE specimens were used for exome and transcriptome analysis. Preparation of DNA Exome-Seq libraries was more challenging (29.5% success) than that of RNA-Seq libraries, presumably because of modifications to FFPE tissue-derived DNA. Libraries could still be prepared from RNA isolated from two-decade-old FFPE tissues. Data were analysed using the CLC Bio Genomics Workbench and revealed systematic differences between FF and FFPE tissue-derived nucleic acid libraries. In spite of this, pairwise analysis of DNA Exome-Seq data showed concordance for 70-80% of variants in FF and FFPE samples stored for fewer than three years. RNA-Seq data showed high correlation of expression profiles in FF/FFPE pairs (Pearson correlations of 0.90 +/- 0.05), irrespective of storage time (up to 244 months) and tissue type. A common set of 1,494 genes was identified with expression profiles that were significantly different between paired FF and FFPE samples irrespective of tissue type. Our results are promising and suggest that NGS can be used to study FFPE specimens in both prospective and retrospective archive-based studies in which FF specimens are not available.

  14. Mitochondrial DNA sequence evolution in shorebird populations

    NARCIS (Netherlands)

    Wenink, P.W.

    1994-01-01

    This thesis describes the global molecular population structure of two shorebird species, in particular of the dunlin, Calidris alpina, by means of comparative sequence analysis of the most variable part of the mitochondrial DNA (mtDNA) genome. There are several reasons

  15. Ribosomal DNA copy number loss and sequence variation in cancer.

    Science.gov (United States)

    Xu, Baoshan; Li, Hua; Perry, John M; Singh, Vijay Pratap; Unruh, Jay; Yu, Zulin; Zakari, Musinu; McDowell, William; Li, Linheng; Gerton, Jennifer L

    2017-06-01

    Ribosomal DNA is one of the most variable regions in the human genome with respect to copy number. Despite the importance of rDNA for cellular function, we know virtually nothing about what governs its copy number, stability, and sequence in the mammalian genome due to challenges associated with mapping and analysis. We applied computational and droplet digital PCR approaches to measure rDNA copy number in normal and cancer states in human and mouse genomes. We find that copy number and sequence can change in cancer genomes. Counterintuitively, human cancer genomes show a loss of copies, accompanied by global copy number co-variation. The sequence can also be more variable in the cancer genome. Cancer genomes with lower copies have mutational evidence of mTOR hyperactivity. The PTEN phosphatase is a tumor suppressor that is critical for genome stability and a negative regulator of the mTOR kinase pathway. Surprisingly, but consistent with the human cancer genomes, hematopoietic cancer stem cells from a Pten-/- mouse model for leukemia have lower rDNA copy number than normal tissue, despite increased proliferation, rRNA production, and protein synthesis. Loss of copies occurs early and is associated with hypersensitivity to DNA damage. Therefore, copy loss is a recurrent feature in cancers associated with mTOR activation. Ribosomal DNA copy number may be a simple and useful indicator of whether a cancer will be sensitive to DNA damaging treatments.

  16. Nanogrid rolling circle DNA sequencing

    Energy Technology Data Exchange (ETDEWEB)

    Church, George M.; Porreca, Gregory J.; Shendure, Jay; Rosenbaum, Abraham Meir

    2017-04-18

    The present invention relates to methods for sequencing a polynucleotide immobilized on an array having a plurality of specific regions each having a defined diameter size, including synthesizing a concatemer of a polynucleotide by rolling circle amplification, wherein the concatemer has a cross-sectional diameter greater than the diameter of a specific region, immobilizing the concatemer to the specific region to make an immobilized concatemer, and sequencing the immobilized concatemer.

  17. Nanopore-based Fourth-generation DNA Sequencing Technology

    Institute of Scientific and Technical Information of China (English)

    Yanxiao Feng; Yuechuan Zhang; Cuifeng Ying; Deqiang Wang; Chunlei Du

    2015-01-01

    Nanopore-based sequencers, as the fourth-generation DNA sequencing technology, have the potential to quickly and reliably sequence the entire human genome for less than $1000, and possibly for even less than $100. The single-molecule techniques used by this technology allow us to further study the interaction between DNA and protein, as well as between protein and protein. Nanopore analysis opens a new door to molecular biology investigation at the single-molecule scale. In this article, we have reviewed academic achievements in nanopore technology from the past as well as the latest advances, including both biological and solid-state nanopores, and discussed their recent and potential applications.

  18. Sequencing intractable DNA to close microbial genomes.

    Science.gov (United States)

    Hurt, Richard A; Brown, Steven D; Podar, Mircea; Palumbo, Anthony V; Elias, Dwayne A

    2012-01-01

    Advancement in high throughput DNA sequencing technologies has supported a rapid proliferation of microbial genome sequencing projects, providing the genetic blueprint for in-depth studies. Oftentimes, difficult to sequence regions in microbial genomes are ruled "intractable" resulting in a growing number of genomes with sequence gaps deposited in databases. A procedure was developed to sequence such problematic regions in the "non-contiguous finished" Desulfovibrio desulfuricans ND132 genome (6 intractable gaps) and the Desulfovibrio africanus genome (1 intractable gap). The polynucleotides surrounding each gap formed GC rich secondary structures making the regions refractory to amplification and sequencing. Strand-displacing DNA polymerases used in concert with a novel ramped PCR extension cycle supported amplification and closure of all gap regions in both genomes. The developed procedures support accurate gene annotation, and provide a step-wise method that reduces the effort required for genome finishing.

  19. Sequencing intractable DNA to close microbial genomes.

    Directory of Open Access Journals (Sweden)

    Richard A Hurt

    Advancement in high throughput DNA sequencing technologies has supported a rapid proliferation of microbial genome sequencing projects, providing the genetic blueprint for in-depth studies. Oftentimes, difficult to sequence regions in microbial genomes are ruled "intractable" resulting in a growing number of genomes with sequence gaps deposited in databases. A procedure was developed to sequence such problematic regions in the "non-contiguous finished" Desulfovibrio desulfuricans ND132 genome (6 intractable gaps) and the Desulfovibrio africanus genome (1 intractable gap). The polynucleotides surrounding each gap formed GC rich secondary structures making the regions refractory to amplification and sequencing. Strand-displacing DNA polymerases used in concert with a novel ramped PCR extension cycle supported amplification and closure of all gap regions in both genomes. The developed procedures support accurate gene annotation, and provide a step-wise method that reduces the effort required for genome finishing.

  20. Nanopore-CMOS Interfaces for DNA Sequencing.

    Science.gov (United States)

    Magierowski, Sebastian; Huang, Yiyun; Wang, Chengjie; Ghafar-Zadeh, Ebrahim

    2016-08-06

    DNA sequencers based on nanopore sensors present an opportunity for a significant break from the template-based incumbents of the last forty years. Key advantages ushered by nanopore technology include a simplified chemistry and the ability to interface to CMOS technology. The latter opportunity offers substantial promise for improvement in sequencing speed, size and cost. This paper reviews existing and emerging means of interfacing nanopores to CMOS technology with an emphasis on massively-arrayed structures. It presents this in the context of incumbent DNA sequencing techniques, reviews and quantifies nanopore characteristics and models and presents CMOS circuit methods for the amplification of low-current nanopore signals in such interfaces.

  1. Osmylated DNA, a novel concept for sequencing DNA using nanopores

    Science.gov (United States)

    Kanavarioti, Anastassia

    2015-03-01

    Sanger sequencing has led the advances in molecular biology, while faster and cheaper next generation technologies are urgently needed. A newer approach exploits nanopores, natural or solid-state, set in an electrical field, and obtains base sequence information from current variations due to the passage of a ssDNA molecule through the pore. A hurdle in this approach is the fact that the four bases are chemically comparable to each other, which leads to small differences in current obstruction. ‘Base calling’ becomes even more challenging because most nanopores sense a short sequence and not individual bases. Perhaps sequencing DNA via nanopores would be more manageable if there were only two bases, chemically very different from each other; a sequence of 1s and 0s comes to mind. Osmylated DNA comes close to such a sequence of 1s and 0s. Osmylation is the addition of osmium tetroxide bipyridine across the C5-C6 double bond of the pyrimidines. Osmylation adds almost 400% mass to the reactive base and creates a sterically and electronically notably different molecule, labeled 1, compared to the unreactive purines, labeled 0. If osmylated DNA were successfully sequenced, the result would be a sequence of osmylated pyrimidines (1) and purines (0), and not of the actual nucleobases. To solve this problem we studied the osmylation reaction with short oligos and with M13mp18, a long ssDNA, developed a UV-vis assay to measure the extent of osmylation, and designed two protocols. Protocol A uses mild conditions and yields osmylated thymidines (1), while leaving the other three bases (0) practically intact. Protocol B uses harsher conditions and effectively osmylates both pyrimidines, but not the purines. Applying these two protocols also to the complement of the target polynucleotide yields a total of four osmylated strands that collectively could define the actual base sequence of the target DNA.
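
    The decoding logic implied by the two protocols can be sketched in Python. This is an idealized illustration, not the author's procedure: it assumes error-free 0/1 readouts, and it assumes the complement-strand readouts have already been re-indexed to target coordinates. The function names are hypothetical.

```python
def complement(seq):
    """Base-by-base complement, kept index-aligned with the target strand."""
    pairs = {"A": "T", "T": "A", "C": "G", "G": "C"}
    return "".join(pairs[base] for base in seq)


def protocol_a(seq):
    """Protocol A (mild): only T is osmylated and reads as 1."""
    return [1 if base == "T" else 0 for base in seq]


def protocol_b(seq):
    """Protocol B (harsher): both pyrimidines (T and C) read as 1."""
    return [1 if base in "TC" else 0 for base in seq]


def decode(a_target, b_target, a_comp, b_comp):
    """Recover the target bases from four index-aligned binary readouts."""
    bases = []
    for at, bt, ac, bc in zip(a_target, b_target, a_comp, b_comp):
        if at:        # marked under protocol A on the target: T
            bases.append("T")
        elif bt:      # pyrimidine on the target but not T: C
            bases.append("C")
        elif ac:      # T on the complement means A on the target
            bases.append("A")
        elif bc:      # C on the complement means G on the target
            bases.append("G")
        else:
            raise ValueError("inconsistent readouts at this position")
    return "".join(bases)


target = "GATTACAGC"
comp = complement(target)
decoded = decode(protocol_a(target), protocol_b(target),
                 protocol_a(comp), protocol_b(comp))
```

    In this sketch the target strand alone separates T, C and "purine"; the complement strand is what distinguishes A from G.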

  2. A new baseline for fascioliasis in Venezuela: lymnaeid vectors ascertained by DNA sequencing and analysis of their relationships with human and animal infection

    Science.gov (United States)

    2011-01-01

    Background: Human and animal fascioliasis poses serious public health problems in South America. In Venezuela, livestock infection represents an important veterinary problem whereas there appear to be few human cases reported, most of which are passively detected in health centres. However, results of recent surveys suggest that the situation may be underestimated in particular areas. To obtain a baseline for future fascioliasis assessment, studies were undertaken by means of rDNA ITS-2 and ITS-1 and mtDNA cox1 sequencing to clarify the specific status of Venezuelan lymnaeids, their geographical distribution and fascioliasis transmission capacity, by comparison with other American countries and other continents. Results: The results obtained completely change the lymnaeid scenario known so far. The relatively rich lymnaeid fauna of Venezuela has been proven to include (i) Lymnaea meridensis and L. neotropica as the only native members, (ii) L. cubensis and Pseudosuccinea columella introduced from the Caribbean area, and (iii) Galba truncatula and L. schirazensis introduced from the Old World. The absence of representatives of the stagnicoline and Radix groups is remarkable. Four species are fascioliasis vectors: G. truncatula, L. cubensis and L. neotropica, which have the capacity to give rise to human endemic areas, and P. columella, which is a source of animal infection and is responsible for the spread of disease. Vector capacity in the apparently highland endemic L. meridensis is to be confirmed, although it may be expected given its phylogenetic relationships. As elsewhere, the non-transmitting L. schirazensis has been confused with L. cubensis, also with G. truncatula and possibly with L. neotropica. Conclusions: The new scenario leads to the re-opening of many disease aspects. In Venezuela, altitude appears to be the main factor influencing fascioliasis distribution. Human infection shows an altitude pattern similar to other Andean countries, although a

  3. Electrochemical measurement for analysis of DNA sequence

    Energy Technology Data Exchange (ETDEWEB)

    Cho, S.B.; Hong, J.S.; Pak, J.H. [Korea University, Seoul (Korea)]; Kim, Y.M. [National Institute of Health, Seoul (Korea)]

    2002-02-01

    One of the important roles of a DNA chip is the capability of detecting genetic diseases and mutations by analyzing DNA sequence. For successful electrochemical genotyping, several aspects should be considered, including the chemical treatment of the electrode surface, DNA immobilization on the electrode, hybridization, choice of an intercalator to be selectively bound to double stranded DNA, and equipment for detecting and analyzing the output signal. Au was used as the electrode material, 2-mercaptoethanol was used for linking DNA to the Au electrode, and methylene blue was used as an indicator that can be bound to double stranded DNA selectively. From the analysis of the reductive current of this indicator bound to double stranded DNA on an electrode, a normal double stranded DNA could be distinguished from a single stranded DNA in just a few seconds. Also, it was found that the peak reduction current of the indicator is proportional to the concentration of target DNA to be hybridized with probe DNA. Therefore, it is possible to realize a simple and cheap DNA sensor using the electrochemical measurement for genotyping. (author). 20 refs., 8 figs., 1 tab.

  4. Dynamics and control of DNA sequence amplification

    Energy Technology Data Exchange (ETDEWEB)

    Marimuthu, Karthikeyan [Department of Chemical Engineering and Center for Advanced Process Decision-Making, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213 (United States)]; Chakrabarti, Raj, E-mail: raj@pmc-group.com, E-mail: rajc@andrew.cmu.edu [Department of Chemical Engineering and Center for Advanced Process Decision-Making, Carnegie Mellon University, Pittsburgh, Pennsylvania 15213 (United States); Division of Fundamental Research, PMC Advanced Technology, Mount Laurel, New Jersey 08054 (United States)]

    2014-10-28

    DNA amplification is the process of replication of a specified DNA sequence in vitro through time-dependent manipulation of its external environment. A theoretical framework for determination of the optimal dynamic operating conditions of DNA amplification reactions, for any specified amplification objective, is presented based on first-principles biophysical modeling and control theory. Amplification of DNA is formulated as a problem in control theory with optimal solutions that can differ considerably from strategies typically used in practice. Using the Polymerase Chain Reaction as an example, sequence-dependent biophysical models for DNA amplification are cast as control systems, wherein the dynamics of the reaction are controlled by a manipulated input variable. Using these control systems, we demonstrate that there exists an optimal temperature cycling strategy for geometric amplification of any DNA sequence and formulate optimal control problems that can be used to derive the optimal temperature profile. Strategies for the optimal synthesis of the DNA amplification control trajectory are proposed. Analogous methods can be used to formulate control problems for more advanced amplification objectives corresponding to the design of new types of DNA amplification reactions.
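
    As a point of reference for "geometric amplification", the textbook idealization (not the authors' sequence-dependent biophysical model) multiplies the copy number by (1 + efficiency) in each cycle. A minimal Python sketch, with a hypothetical function name and per-cycle efficiencies supplied as assumed inputs:

```python
def amplify(initial_copies, efficiencies):
    """Copy number after successive cycles, each with efficiency in [0, 1].

    Idealized model: N_k+1 = N_k * (1 + eta_k). An efficiency of 1.0 means
    perfect doubling; lower values stand in for incomplete melting,
    annealing or extension in that cycle.
    """
    copies = float(initial_copies)
    for eta in efficiencies:
        if not 0.0 <= eta <= 1.0:
            raise ValueError("cycle efficiency must lie between 0 and 1")
        copies *= 1.0 + eta
    return copies


perfect = amplify(100, [1.0] * 10)   # ten ideal doubling cycles
partial = amplify(100, [0.5] * 2)    # two cycles at 50% efficiency
```

    In a control formulation such as the one the abstract describes, the per-cycle efficiency would itself depend on the temperature profile; here it is simply given.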

  5. Female-specific DNA sequences in geese.

    Science.gov (United States)

    Huang, M C; Lin, W C; Horng, Y M; Rouvier, R; Huang, C W

    2003-07-01

    1. The OPAE random primers (Operon Technologies, Inc., CA) were used for random amplified polymorphic DNA (RAPD) fingerprinting in Chinese, White Roman and Landaise geese. One of these primers, OPAE-06, produced a 938-bp sex-specific fragment in all females and in no males of Chinese geese only. 2. A novel female-specific DNA sequence in Chinese goose was cloned and sequenced. Two primers, CGSex-F and CGSex-R, were designed in order to amplify a 912-bp sex-specific polymerase chain reaction (PCR) fragment on genomic DNA from female geese. 3. It was shown that a simple and effective PCR-based sexing technique could be used in the three goose breeds studied. 4. Nucleotide sequencing of the sex-specific fragments in White Roman and Landaise geese was performed and sequence differences were observed among these three breeds.

  6. PREDICTION OF CHROMATIN STATES USING DNA SEQUENCE PROPERTIES

    KAUST Repository

    Bahabri, Rihab R.

    2013-06-01

    Activities of DNA are to a great extent controlled epigenetically through the internal structure of chromatin. This structure is dynamic and is influenced by different modifications of histone proteins. Various combinations of epigenetic modification of histones pinpoint different functional regions of the DNA, determining the so-called chromatin states. However, the characterization of chromatin states by DNA sequence properties remains largely unknown. In this study we aim to explore whether DNA sequence patterns in the human genome can characterize different chromatin states. Using DNA sequence motifs we built binary classifiers for each chromatin state to evaluate whether a given genomic sequence is a good candidate for belonging to a particular chromatin state. Of four classification algorithms (C4.5, Naive Bayes, Random Forest, and SVM) used for this purpose, the decision tree based classifiers (C4.5 and Random Forest) yielded the best results among those we evaluated. Our results suggest that in general these models lack sufficient predictive power, although for four chromatin states (insulators, heterochromatin, and two types of copy number variation) we found that the presence of certain motifs in DNA sequences does imply an increased probability that such a sequence is one of these chromatin states.
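
    The abstract does not say how motifs were turned into classifier inputs. One common choice, shown here only as an assumed illustration, is a vector of motif occurrence counts per sequence; the motif list below is hypothetical and the function names are not from the source.

```python
def count_overlapping(seq, motif):
    """Count occurrences of motif in seq, allowing overlaps."""
    count = 0
    start = seq.find(motif)
    while start != -1:
        count += 1
        start = seq.find(motif, start + 1)
    return count


def motif_features(seq, motifs):
    """Feature vector: one overlapping-occurrence count per motif."""
    seq = seq.upper()
    return [count_overlapping(seq, motif) for motif in motifs]


# Hypothetical motifs chosen only to exercise the function.
features = motif_features("ttagggttaggg", ["TTAGGG", "GG", "CCCTC"])
```

    A vector like this, computed for every labeled genomic segment, is the kind of input a decision-tree or SVM classifier would be trained on.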

  7. Compressing DNA sequence databases with coil

    Directory of Open Access Journals (Sweden)

    Hendy Michael D

    2008-05-01

    Background: Publicly available DNA sequence databases such as GenBank are large, and are growing at an exponential rate. The sheer volume of data being dealt with presents serious storage and data communications problems. Currently, sequence data is usually kept in large "flat files," which are then compressed using standard Lempel-Ziv (gzip) compression, an approach which rarely achieves good compression ratios. While much research has been done on compressing individual DNA sequences, surprisingly little has focused on the compression of entire databases of such sequences. In this study we introduce the sequence database compression software coil. Results: We have designed and implemented a portable software package, coil, for compressing and decompressing DNA sequence databases based on the idea of edit-tree coding. coil is geared towards achieving high compression ratios at the expense of execution time and memory usage during compression; the compression time represents a "one-off investment" whose cost is quickly amortised if the resulting compressed file is transmitted many times. Decompression requires little memory and is extremely fast. We demonstrate a 5% improvement in compression ratio over state-of-the-art general-purpose compression tools for a large GenBank database file containing Expressed Sequence Tag (EST) data. Finally, coil can efficiently encode incremental additions to a sequence database. Conclusion: coil presents a compelling alternative to conventional compression of flat files for the storage and distribution of DNA sequence databases having a narrow distribution of sequence lengths, such as EST data. Increasing compression levels for databases having a wide distribution of sequence lengths is a direction for future work.

  8. A two-locus DNA sequence database for typing plant and human pathogens within the Fusarium oxysporum species complex

    DEFF Research Database (Denmark)

    O'Donnell, Kerry; Gueidan, C; Sink, S

    2009-01-01

    mycoses were genetically diverse, including several which appear to be nosocomial in origin. A congruence analysis, comparing partial EF-1alpha and IGS rDNA bootstrap consensus, identified a significant number of conflicting relationships dispersed throughout the bipartitions, suggesting that some...

  9. DNA Sequencing in Cultural Heritage.

    Science.gov (United States)

    Vai, Stefania; Lari, Martina; Caramelli, David

    2016-02-01

    During the last three decades, DNA analysis of degraded samples has proved to be an important research tool in anthropology, archaeozoology, molecular evolution, and population genetics. Its application to topics such as determination of the species origin of prehistoric and historic objects, individual identification of famous personalities, and characterization of particular samples important for historical, archeological, or evolutionary reconstructions also gives paleogenetics an important role in the enhancement of cultural heritage. Rapid improvement in methodologies in recent years led to a revolution that permitted recovering even complete genomes from highly degraded samples, with the possibility of going back in time 400,000 years for samples from temperate regions and 700,000 years for permafrozen remains, and of analyzing even more recent material that has been subjected to harsh biochemical treatments. Here we review the different methodological approaches used so far for the molecular analysis of degraded samples and their application to some case studies.

  10. cDNA sequence quality data - Budding yeast cDNA sequencing project | LSDB Archive [Life Science Database Archive metadata]

    Lifescience Database Archive (English)

    Budding yeast cDNA sequencing project: cDNA sequence quality data. Data name: cDNA sequence quality... data. Description of data contents: Phred's quality score. PHD format, one file to a single cDNA data, and co...
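
    For readers unfamiliar with Phred quality scores, the standard definition is Q = -10 log10(P), where P is the estimated probability that a base call is wrong. The Python sketch below shows only this conversion; it does not parse PHD files, and the function names are illustrative.

```python
import math


def phred_to_error(q):
    """Estimated base-call error probability for a Phred score q."""
    return 10.0 ** (-q / 10.0)


def error_to_phred(p):
    """Phred score for an error probability p (0 < p <= 1)."""
    if not 0.0 < p <= 1.0:
        raise ValueError("error probability must be in (0, 1]")
    return -10.0 * math.log10(p)


p20 = phred_to_error(20)     # about 1 error in 100 calls
q30 = error_to_phred(0.001)  # about Q30
```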

  11. Analysis of p53 gene mutations in human gliomas by polymerase chain reaction-based single-strand conformation polymorphism and DNA sequencing.

    Science.gov (United States)

    Sarkar, F H; Kupsky, W J; Li, Y W; Sreepathi, P

    1994-03-01

    Mutations in the p53 gene have been recognized in brain tumors, and clonal expansion of p53 mutant cells has been shown to be associated with glioma progression. However, studies on the p53 gene have been limited by the need for frozen tissues. We have developed a method utilizing polymerase chain reaction (PCR) for the direct analysis of p53 mutation by single-strand conformation polymorphism (SSCP) and by direct DNA sequencing of the p53 gene using a single 10-microns paraffin-embedded tissue section. We applied this method to screen for p53 gene mutations in exons 5-8 in human gliomas utilizing paraffin-embedded tissues. Twenty paraffin blocks containing tumor were selected from surgical specimens from 17 different adult patients. Tumors included six anaplastic astrocytomas (AAs), nine glioblastomas (GBs), and two mixed malignant gliomas (MMGs). The tissue section on the stained glass slide was used to guide microdissection of an unstained adjacent tissue section to ensure > 90% of the tumor cell population for p53 mutational analysis. Simultaneously, microdissection of the tissue was also carried out to obtain normal tissue from adjacent areas as a control. Mutations in the p53 gene were identified in 3 of 17 (18%) patients by PCR-SSCP analysis and subsequently confirmed by PCR-based DNA sequencing. Mutations in exon 5 resulting in amino acid substitution were found in one thalamic AA (codon 158, CGC > CTT: Arg > Leu) and one cerebral hemispheric GB (codon 151, CCG > CTG: Pro > Leu).(ABSTRACT TRUNCATED AT 250 WORDS)

  12. Cluster analysis of human and animal pathogenic Microsporum species and their teleomorphic states, Arthroderma species, based on the DNA sequences of nuclear ribosomal internal transcribed spacer 1.

    Science.gov (United States)

    Makimura, K; Tamura, Y; Murakami, A; Kano, R; Nakamura, Y; Hasegawa, A; Uchida, K; Yamaguchi, H

    2001-01-01

    We performed a cluster analysis of human and animal pathogenic Microsporum species and their teleomorphic states, Arthroderma species, including A. otae-related species (M. canis, M. audouinii, M. distortum, M. equinum, M. langeronii, and M. ferrugineum) and M. gypseum complex (A. fulvum, A. gypseum, and A. incurvatum) using DNA sequences of nuclear ribosomal internal transcribed spacer 1 (ITS1). The dendrogram showed the members of A. otae-related species to be monophyletic and to construct an extremely closely related cluster with a long horizontal branch. This ITS1-homologous group of A. otae was organized in 6 unique genotypes, while sequences of the members of the ITS1-homologous group of M. gypseum complex are more diverse. This ITS1-based database of Microsporum species and their teleomorphic states will provide a useful and reliable species identification system: it is time-saving (takes two to three days), accurate and applicable even to strains with atypical morphological features or in a non-culturable state.

  13. DNA Amplification and Nucleotide Sequence Determination of a Region of Mitochondrial DNA in the Sea Snake, Laticauda Semifasciata

    OpenAIRE

    Eguchi, Tomoko; Eguchi, Yukinori; Oshiro, Minoru; Asato, Tsuyoshi; Takei, Hiroshi; Nakashima, Yasutsugu

    1993-01-01

    We determined the nucleotide sequence of a region of the 12S ribosomal RNA (rRNA) gene in the mitochondrial DNA (mtDNA) of the sea snake, Laticauda semifasciata, using the polymerase chain reaction (PCR). We synthesized oligonucleotide primers according to the nucleotide sequence of human mt DNA 12S rRNA gene and found that the target sequence (386bp) of the sea snake mtDNA could be amplified with these primers. The nucleotide sequence of the amplified region of the sea snake mt DNA was deter...

  14. DNA sequencing by synthesis with degenerate primers

    Institute of Scientific and Technical Information of China (English)

    2008-01-01

    A degenerate primer-based sequencing-by-synthesis method (DP-SBS) was developed for high-throughput DNA sequencing, in which a set of degenerate primers is hybridized to arrayed DNA templates and extended by DNA polymerase on microarrays. In this method, a different set of degenerate primers containing a given number (n) of degenerate nucleotides at the 3'-ends was annealed to the sequenced templates that were immobilized on the solid surface. The nucleotides (n+1) on the template sequences were determined by detecting the incorporation of fluorescently labeled nucleotides. The fluorescently labeled nucleotide was incorporated into the primer in a base-specific manner after the enzymatic primer extension reactions, and a nine-base length was read out accurately. The main advantage of DP-SBS is that the method uses only very conventional biochemical reagents and avoids the complicated special chemical reagents for removing the labeled nucleotides and reactivating the primer for further extension. From the present study, it is found that the DP-SBS method is reliable, simple, and cost-effective for laboratory sequencing of a large number of short DNA fragments.

  15. Glycome mapping on DNA sequencing equipment.

    Science.gov (United States)

    Laroy, Wouter; Contreras, Roland; Callewaert, Nico

    2006-01-01

    Here we provide a detailed protocol for the analysis of protein-linked glycans on DNA sequencing equipment. This protocol satisfies the glyco-analytical needs of many projects and can form the basis of 'glycomics' studies, in which robustness, high throughput, high sensitivity and reliable quantification are of paramount importance. The protocol routinely resolves isobaric glycan stereoisomers, which is much more difficult by mass spectrometry (MS). Earlier methods made use of polyacrylamide gel-based sequencers, but we have now adapted the technique to multicapillary DNA sequencers, which represent the state of the art today. In addition, we have integrated an option for HPLC-based fractionation of highly anionic 8-amino-1,3,6-pyrenetrisulfonic acid (APTS)-labeled glycans before rapid capillary electrophoretic profiling. This option facilitates either two-dimensional profiling of complex glycan mixtures and exoglycosidase sequencing, or MS analysis of particular compounds of interest rather than of the total pool of glycans in a sample.

  16. The complete DNA sequence of vaccinia virus.

    Science.gov (United States)

    Goebel, S J; Johnson, G P; Perkus, M E; Davis, S W; Winslow, J P; Paoletti, E

    1990-11-01

    The complete DNA sequence of the genome of vaccinia virus has been determined. The genome consisted of 191,636 bp with a base composition of 66.6% A + T. We have identified 198 "major" protein-coding regions and 65 overlapping "minor" regions, for a total of 263 potential genes. Genes encoded by the virus were located by examination of DNA sequence characteristics and compared with existing vaccinia virus mapping analyses, sequence data, and transcription data. These genes were found to be compactly organized along the genome with relatively few regions of noncoding sequences. Whereas several similarities to proteins of known function were discerned, the function of the majority of proteins encoded by these open reading frames is as yet undetermined.

  17. RNA-DNA sequence differences spell genetic code ambiguities

    DEFF Research Database (Denmark)

    Bentin, Thomas; Nielsen, Michael L

    2013-01-01

    A recent paper in Science by Li et al. 2011(1) reports widespread sequence differences in the human transcriptome between RNAs and their encoding genes termed RNA-DNA differences (RDDs). The findings could add a new layer of complexity to gene expression but the study has been criticized. ...

  18. Directed PCR-free engineering of highly repetitive DNA sequences

    Directory of Open Access Journals (Sweden)

    Preissler Steffen

    2011-09-01

    Background: Highly repetitive nucleotide sequences are commonly found in nature, e.g. in telomeres, microsatellite DNA, polyadenine (poly(A)) tails of eukaryotic messenger RNA, as well as in several inherited human disorders linked to trinucleotide repeat expansions in the genome. Therefore, studying repetitive sequences is of biological, biotechnological and medical relevance. However, cloning of such repetitive DNA sequences is challenging because specific PCR-based amplification is hampered by the lack of unique primer binding sites, resulting in unspecific products. Results: For the PCR-free generation of repetitive DNA sequences we used antiparallel oligonucleotides flanked by restriction sites of Type IIS endonucleases. The arrangement of recognition sites allowed for stepwise and seamless elongation of repetitive sequences. This facilitated the assembly of repetitive DNA segments and open reading frames encoding polypeptides with periodic amino acid sequences of any desired length. By this strategy we cloned a series of polyglutamine encoding sequences as well as highly repetitive polyadenine tracts. Such repetitive sequences can be used for diverse biotechnological applications. As an example, the polyglutamine sequences were expressed as His6-SUMO fusion proteins in Escherichia coli cells to study their aggregation behavior in vitro. The His6-SUMO moiety enabled affinity purification of the polyglutamine proteins, increased their solubility, and allowed controlled induction of the aggregation process. We successfully purified the fusion proteins and provide an example for their applicability in filter retardation assays. Conclusion: Our seamless cloning strategy is PCR-free and allows the directed and efficient generation of highly repetitive DNA sequences of defined lengths by simple standard cloning procedures.

  19. Z-DNA-forming sequences generate large-scale deletions in mammalian cells

    OpenAIRE

    Wang, Guliang; Christensen, Laura A.; Vasquez, Karen M.

    2006-01-01

    Spontaneous chromosomal breakages frequently occur at genomic hot spots in the absence of DNA damage and can result in translocation-related human disease. Chromosomal breakpoints are often mapped near purine–pyrimidine Z-DNA-forming sequences in human tumors. However, it is not known whether Z-DNA plays a role in the generation of these chromosomal breakages. Here, we show that Z-DNA-forming sequences induce high levels of genetic instability in both bacterial and mammalian cells. In mammali...

  20. DNA Sequence Alignment during Homologous Recombination.

    Science.gov (United States)

    Greene, Eric C

    2016-05-27

    Homologous recombination allows for the regulated exchange of genetic information between two different DNA molecules of identical or nearly identical sequence composition, and is a major pathway for the repair of double-stranded DNA breaks. A key facet of homologous recombination is the ability of recombination proteins to perfectly align the damaged DNA with homologous sequence located elsewhere in the genome. This reaction is referred to as the homology search and is akin to the target searches conducted by many different DNA-binding proteins. Here I briefly highlight early investigations into the homology search mechanism, and then describe more recent research. Based on these studies, I summarize a model that includes a combination of intersegmental transfer, short-distance one-dimensional sliding, and length-specific microhomology recognition to efficiently align DNA sequences during the homology search. I also suggest some future directions to help further our understanding of the homology search. Where appropriate, I direct the reader to other recent reviews describing various issues related to homologous recombination.

  1. Automated Template Quantification for DNA Sequencing Facilities

    Science.gov (United States)

    Ivanetich, Kathryn M.; Yan, Wilson; Wunderlich, Kathleen M.; Weston, Jennifer; Walkup, Ward G.; Simeon, Christian

    2005-01-01

    The quantification of plasmid DNA by the PicoGreen dye binding assay has been automated, and the effect of quantification of user-submitted templates on DNA sequence quality in a core laboratory has been assessed. The protocol pipets, mixes and reads standards, blanks and up to 88 unknowns, generates a standard curve, and calculates template concentrations. For pUC19 replicates at five concentrations, coefficients of variance were 0.1, and percent errors were from 1% to 7% (n = 198). Standard curves with pUC19 DNA were nonlinear over the 1 to 1733 ng/μL concentration range required to assay the majority (98.7%) of user-submitted templates. Over 35,000 templates have been quantified using the protocol. For 1350 user-submitted plasmids, 87% deviated by ≥ 20% from the requested concentration (500 ng/μL). Based on data from 418 sequencing reactions, quantification of user-submitted templates was shown to significantly improve DNA sequence quality. The protocol is applicable to all types of double-stranded DNA, is unaffected by primer (1 pmol/μL), and is user modifiable. The protocol takes 30 min, saves 1 h of technical time, and costs approximately $0.20 per unknown. PMID:16461949
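
    The calculation step (standard curve, then template concentration, then deviation from the requested 500 ng/μL) can be sketched in Python. The abstract does not state which curve model was fitted, so piecewise-linear interpolation between standards is used here as an assumed stand-in for a nonlinear standard curve; the standards and readings are invented for illustration.

```python
def interpolate_concentration(signal, standards):
    """Concentration for a blank-subtracted fluorescence reading.

    standards: (concentration, fluorescence) pairs whose fluorescence
    values are strictly increasing with concentration. Interpolates
    linearly between the two standards that bracket the reading.
    """
    points = sorted(standards, key=lambda pair: pair[1])
    for (c0, f0), (c1, f1) in zip(points, points[1:]):
        if f0 <= signal <= f1:
            return c0 + (c1 - c0) * (signal - f0) / (f1 - f0)
    raise ValueError("signal outside the range of the standards")


def percent_deviation(measured, requested=500.0):
    """Signed percent deviation from the requested concentration."""
    return 100.0 * (measured - requested) / requested


# Invented standards (ng/uL, fluorescence units) with a flattening curve.
standards = [(0, 0), (100, 900), (500, 4000), (1000, 7000)]
conc = interpolate_concentration(2450, standards)
deviation = percent_deviation(conc)
off_target = abs(deviation) >= 20.0   # the abstract's 20% criterion
```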

  2. DNA Sequence Alignment during Homologous Recombination*

    Science.gov (United States)

    Greene, Eric C.

    2016-01-01

    Homologous recombination allows for the regulated exchange of genetic information between two different DNA molecules of identical or nearly identical sequence composition, and is a major pathway for the repair of double-stranded DNA breaks. A key facet of homologous recombination is the ability of recombination proteins to perfectly align the damaged DNA with homologous sequence located elsewhere in the genome. This reaction is referred to as the homology search and is akin to the target searches conducted by many different DNA-binding proteins. Here I briefly highlight early investigations into the homology search mechanism, and then describe more recent research. Based on these studies, I summarize a model that includes a combination of intersegmental transfer, short-distance one-dimensional sliding, and length-specific microhomology recognition to efficiently align DNA sequences during the homology search. I also suggest some future directions to help further our understanding of the homology search. Where appropriate, I direct the reader to other recent reviews describing various issues related to homologous recombination. PMID:27129270

  3. Replacement of Homologous Mouse DNA Sequence With Pathogenic 6-Base Human CREB1 Promoter Sequence Creates Murine Model of Major Depressive Disorder

    OpenAIRE

    Zubenko, George S.; Hughes, Hugh B.

    2011-01-01

    Major Depressive Disorder (MDD) is a leading cause of disability worldwide. Families with Recurrent, Early-Onset MDD (RE-MDD), a severe, familial form of MDD, have provided an important resource for identifying and characterizing genetic variants that confer susceptibility to MDD and related disorders. Previous studies identified a rare, highly penetrant A(-115)G transition within the human CREB1 promoter that reduced promoter activity in vitro and was associated with depressive disorders in ...

  4. Detection and quantitation of single nucleotide polymorphisms, DNA sequence variations, DNA mutations, DNA damage and DNA mismatches

    Science.gov (United States)

    McCutchen-Maloney, Sandra L.

    2002-01-01

    DNA mutation binding proteins alone and as chimeric proteins with nucleases are used with solid supports to detect DNA sequence variations, DNA mutations and single nucleotide polymorphisms. The solid supports may be flow cytometry beads, DNA chips, glass slides or DNA dipsticks. DNA molecules are coupled to solid supports to form DNA-support complexes. Labeled DNA is used with unlabeled DNA mutation binding proteins such as TthMutS to detect DNA sequence variations, DNA mutations and single nucleotide length polymorphisms by binding, which gives an increase in signal. Unlabeled DNA is utilized with labeled chimeras to detect DNA sequence variations, DNA mutations and single nucleotide length polymorphisms by nuclease activity of the chimera, which gives a decrease in signal.

  5. Human identification from forensic materials by amplification of a human-specific sequence in the myoglobin gene.

    OpenAIRE

    Ono T; Miyaishi S; Yamamoto Y; Yoshitome K; Ishikawa T; Ishizu H

    2001-01-01

    We developed a method for human identification of forensic biological materials by PCR-based detection of a human-specific sequence in exon 3 of the myoglobin gene. This human-specific DNA sequence was deduced from differences in the amino acid sequences of myoglobins between humans and other animal species. The new method enabled amplification of the target DNA fragment from 30 samples of human DNA, and the amplified sequences were identical with that already reported. Using this method, we ...

  6. The first determination of DNA sequence of a specific gene.

    Science.gov (United States)

    Inouye, Masayori

    2016-05-10

    How and when was the first DNA sequence of a gene determined? In 1977, F. Sanger came up with an innovative technology to sequence DNA by using chain terminators, and determined the entire DNA sequence of the 5375-base genome of bacteriophage φX 174 (Sanger et al., 1977). While Sanger's achievement has been recognized as the first DNA sequencing of genes, we had determined the DNA sequence of a gene, albeit a partial sequence, 11 years before Sanger's DNA sequence (Okada et al., 1966).

  7. Nonlinear Aspects of Coding and Noncoding DNA Sequences

    Science.gov (United States)

    Stanley, H. Eugene

    2001-03-01

    One of the most remarkable features of human DNA is that 97 percent is not coding for proteins. Studying this noncoding DNA is important both for practical reasons (to distinguish it from the coding DNA as the human genome is sequenced), and for scientific reasons (why is the noncoding DNA present at all, if it appears to have little if any purpose?). In this talk we discuss new methods of analyzing coding and noncoding DNA in parallel, with a view to uncovering different statistical properties of the two kinds of DNA. We also speculate on possible roles of noncoding DNA. The work reported here was carried out primarily by P. Bernaola-Galvan, S. V. Buldyrev, P. Carpena, N. Dokholyan, A. L. Goldberger, I. Grosse, S. Havlin, H. Herzel, J. L. Oliver, C.-K. Peng, M. Simons, H. E. Stanley, R. H. R. Stanley, and G. M. Viswanathan. [1] For a brief overview in language that physicists can understand, see H. E. Stanley, S. V. Buldyrev, A. L. Goldberger, S. Havlin, C.-K. Peng, and M. Simons, "Scaling Features of Noncoding DNA" [Proc. XII Max Born Symposium, Wroclaw], Physica A 273, 1-18 (1999). [2] I. Grosse, H. Herzel, S. V. Buldyrev, and H. E. Stanley, "Species Independence of Mutual Information in Coding and Noncoding DNA," Phys. Rev. E 61, 5624-5629 (2000). [3] P. Bernaola-Galvan, I. Grosse, P. Carpena, J. L. Oliver, and H. E. Stanley, "Identification of DNA Coding Regions Using an Entropic Segmentation Method," Phys. Rev. Lett. 84, 1342-1345 (2000). [4] N. Dokholyan, S. V. Buldyrev, S. Havlin, and H. E. Stanley, "Distributions of Dimeric Tandem Repeats in Non-coding and Coding DNA Sequences," J. Theor. Biol. 202, 273-282 (2000). [5] R. H. R. Stanley, N. V. Dokholyan, S. V. Buldyrev, S. Havlin, and H. E. Stanley, "Clumping of Identical Oligonucleotides in Coding and Noncoding DNA Sequences," J. Biomol. Structure and Design 17, 79-87 (1999). [6] N. Dokholyan, S. V. Buldyrev, S. Havlin, and H. E. Stanley, "Distribution of Base Pair Repeats in Coding and Noncoding DNA

  8. DNA sequencing by nanopores: advances and challenges

    Science.gov (United States)

    Agah, Shaghayegh; Zheng, Ming; Pasquali, Matteo; Kolomeisky, Anatoly B.

    2016-10-01

    Developing inexpensive and simple DNA sequencing methods capable of detecting entire genomes in short periods of time could revolutionize the world of medicine and technology. It will also lead to major advances in our understanding of fundamental biological processes. It has been shown that nanopores have the ability of single-molecule sensing of various biological molecules rapidly and at a low cost. This has stimulated significant experimental efforts in developing DNA sequencing techniques by utilizing biological and artificial nanopores. In this review, we discuss recent progress in the nanopore sequencing field with a focus on the nature of nanopores and on sensing mechanisms during the translocation. Current challenges and alternative methods are also discussed.

  9. Structure, organization, and sequence of alpha satellite DNA from human chromosome 17: evidence for evolution by unequal crossing-over and an ancestral pentamer repeat shared with the human X chromosome.

    Science.gov (United States)

    Waye, J S; Willard, H F

    1986-09-01

    The centromeric regions of all human chromosomes are characterized by distinct subsets of a diverse tandemly repeated DNA family, alpha satellite. On human chromosome 17, the predominant form of alpha satellite is a 2.7-kilobase-pair higher-order repeat unit consisting of 16 alphoid monomers. We present the complete nucleotide sequence of the 16-monomer repeat, which is present in 500 to 1,000 copies per chromosome 17, as well as that of a less abundant 15-monomer repeat, also from chromosome 17. These repeat units were approximately 98% identical in sequence, differing by the exclusion of precisely 1 monomer from the 15-monomer repeat. Homologous unequal crossing-over is suggested as a probable mechanism by which the different repeat lengths on chromosome 17 were generated, and the putative site of such a recombination event is identified. The monomer organization of the chromosome 17 higher-order repeat unit is based, in part, on tandemly repeated pentamers. A similar pentameric suborganization has been previously demonstrated for alpha satellite of the human X chromosome. Despite the organizational similarities, substantial sequence divergence distinguishes these subsets. Hybridization experiments indicate that the chromosome 17 and X subsets are more similar to each other than to the subsets found on several other human chromosomes. We suggest that the chromosome 17 and X alpha satellite subsets may be related components of a larger alphoid subfamily which have evolved from a common ancestral repeat into the contemporary chromosome-specific subsets.

  10. Mitochondrial DNA sequence variation in Greeks.

    Science.gov (United States)

    Kouvatsi, A; Karaiskou, N; Apostolidis, A; Kirmizidis, G

    2001-12-01

    Mitochondrial DNA (mtDNA) control region sequences were determined in 54 unrelated Greeks, coming from different regions in Greece, for both segments HVR-I and HVR-II. Fifty-two different mtDNA haplotypes were revealed, one of which was shared by three individuals. A very low heterogeneity was found among Greek regions. No one cluster of lineages was specific to individuals coming from a certain region. The average pairwise difference distribution showed a value of 7.599. The data were compared with that for other European or neighbor populations (British, French, Germans, Tuscans, Bulgarians, and Turks). The genetic trees that were constructed revealed homogeneity between Europeans. Median networks revealed that most of the Greek mtDNA haplotypes are clustered to the five known haplogroups and that a number of haplotypes are shared among Greeks and other European and Near Eastern populations.

  11. Markov chain for estimating human mitochondrial DNA mutation pattern

    Science.gov (United States)

    Vantika, Sandy; Pasaribu, Udjianna S.

    2015-12-01

    The Markov chain was proposed to estimate the human mitochondrial DNA mutation pattern. One DNA sequence was taken randomly from 100 sequences in Genbank. The nucleotide transition matrix and mutation transition matrix were estimated from this sequence. We determined whether the states (mutation/normal) are recurrent or transient. The results showed that both of them are recurrent.
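
    The nucleotide transition matrix mentioned in this abstract can be illustrated with a minimal sketch. It assumes a first-order Markov chain over the four bases and a simple count-based estimate; the function name `transition_matrix` and the toy sequence are hypothetical and are not taken from the paper or from GenBank.

```python
from collections import Counter

NUCLEOTIDES = "ACGT"


def transition_matrix(seq):
    """Estimate P[x][y] = P(next base is y | current base is x) from counts."""
    # Count every adjacent pair (x, y) in the sequence.
    counts = Counter(zip(seq, seq[1:]))
    matrix = {}
    for x in NUCLEOTIDES:
        row_total = sum(counts[(x, y)] for y in NUCLEOTIDES)
        # Rows for bases that never occur as a predecessor are left at zero.
        matrix[x] = {
            y: (counts[(x, y)] / row_total if row_total else 0.0)
            for y in NUCLEOTIDES
        }
    return matrix


# Toy sequence for illustration only (an assumption, not real mtDNA data).
example = "ACGTACGGTA"
P = transition_matrix(example)
```

    Each row of `P` sums to 1 whenever the corresponding base is followed by at least one other base in the input.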

  12. Delineating relative homogeneous G+C domains in DNA sequences.

    Science.gov (United States)

    Li, W

    2001-10-03

    The concept of homogeneity of G+C content is always relative and subjective. This point is emphasized and quantified in this paper using a simple example of one sequence segmented into two subsequences. Whether the sequence is homogeneous or not can be answered by whether the two-subsequence model describes the DNA sequence better than the one-sequence model. There are at least three equivalent ways of looking at the 1-to-2 segmentation: Jensen-Shannon divergence measure, log likelihood ratio test, and model selection using Bayesian information criterion. Once a criterion is chosen, a DNA sequence can be recursively segmented into multiple domains. We use one subjective criterion called segmentation strength based on the Bayesian information criterion. Whether or not a sequence is homogeneous and how many domains it has depend on this criterion. We compare six different genome sequences (yeast S. cerevisiae chromosome III and IV, bacterium M. pneumoniae, human major histocompatibility complex sequence, longest contigs in human chromosome 21 and 22) by recursive segmentations at different strength criteria. Results by recursive segmentation confirm that yeast chromosome IV is more homogeneous than yeast chromosome III, human chromosome 21 is more homogeneous than human chromosome 22, and bacterial genomes may not be homogeneous due to short segments with distinct base compositions. The recursive segmentation also provides a quantitative criterion for identifying isochores in human sequences. Some features of our recursive segmentation, such as the possibility of delineating domain borders accurately, are superior to those of the moving-window approach commonly used in such analyses.
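
    The 1-to-2 segmentation described above can be sketched with the Jensen-Shannon divergence alone. This is a simplified illustration, not the paper's procedure: the stopping rule below is a plain threshold on the divergence (in bits), used as a stand-in for the BIC-based segmentation strength, and all function names are hypothetical.

```python
import math
from collections import Counter


def entropy(seq):
    """Shannon entropy (bits) of the base composition of seq."""
    n = len(seq)
    return -sum((c / n) * math.log2(c / n) for c in Counter(seq).values())


def js_divergence(seq, i):
    """Jensen-Shannon divergence between seq[:i] and seq[i:], length-weighted."""
    left, right = seq[:i], seq[i:]
    n = len(seq)
    return (entropy(seq)
            - (len(left) / n) * entropy(left)
            - (len(right) / n) * entropy(right))


def best_split(seq):
    """Position that maximizes the divergence between the two halves."""
    return max(range(1, len(seq)), key=lambda i: js_divergence(seq, i))


def segment(seq, threshold, offset=0):
    """Recursively split seq; return absolute border positions in order."""
    if len(seq) < 2:
        return []
    i = best_split(seq)
    # Assumed stopping rule: stop when the best split is too weak.
    if js_divergence(seq, i) < threshold:
        return []
    return (segment(seq[:i], threshold, offset)
            + [offset + i]
            + segment(seq[i:], threshold, offset + i))
```

    Raising `threshold` yields fewer, larger domains, which mirrors the role the abstract assigns to the segmentation strength.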

  13. Human cellular protein patterns and their link to genome DNA sequence data: usefulness of two-dimensional gel electrophoresis and microsequencing

    DEFF Research Database (Denmark)

    Celis, J E; Rasmussen, H H; Leffers, H;

    1991-01-01

    Analysis of cellular protein patterns by computer-aided 2-dimensional gel electrophoresis together with recent advances in protein sequence analysis have made possible the establishment of comprehensive 2-dimensional gel protein databases that may link protein and DNA information and that offer a...

  14. Local Renyi entropic profiles of DNA sequences

    Directory of Open Access Journals (Sweden)

    Vinga Susana

    2007-10-01

    Full Text Available Abstract Background In a recent report the authors presented a new measure of continuous entropy for DNA sequences, which allows the estimation of their randomness level. The definition therein explored was based on the Rényi entropy of probability density estimation (pdf) using the Parzen's window method and applied to Chaos Game Representation/Universal Sequence Maps (CGR/USM). Subsequent work proposed a fractal pdf kernel as a more exact solution for the iterated map representation. This report extends the concepts of continuous entropy by defining DNA sequence entropic profiles using the new pdf estimations to refine the density estimation of motifs. Results The new methodology enables two results. On the one hand it shows that the entropic profiles are directly related with the statistical significance of motifs, allowing the study of under and over-representation of segments. On the other hand, by spanning the parameters of the kernel function it is possible to extract important information about the scale of each conserved DNA region. The computational applications, developed in Matlab m-code, the corresponding binary executables and additional material and examples are made publicly available at http://kdbio.inesc-id.pt/~svinga/ep/. Conclusion The ability to detect local conservation from a scale-independent representation of symbolic sequences is particularly relevant for biological applications where conserved motifs occur in multiple, overlapping scales, with significant future applications in the recognition of foreign genomic material and inference of motif structures.

  15. Sequence-specific recognition of DNA nanostructures.

    Science.gov (United States)

    Rusling, David A; Fox, Keith R

    2014-05-15

    DNA is the most exploited biopolymer for the programmed self-assembly of objects and devices that exhibit nanoscale-sized features. One of the most useful properties of DNA nanostructures is their ability to be functionalized with additional non-nucleic acid components. The introduction of such a component is often achieved by attaching it to an oligonucleotide that is part of the nanostructure, or hybridizing it to single-stranded overhangs that extend beyond or above the nanostructure surface. However, restrictions in nanostructure design and/or the self-assembly process can limit the suitability of these procedures. An alternative strategy is to couple the component to a DNA recognition agent that is capable of binding to duplex sequences within the nanostructure. This offers the advantage that it requires little, if any, alteration to the nanostructure and can be achieved after structure assembly. In addition, since the molecular recognition of DNA can be controlled by varying pH and ionic conditions, such systems offer tunable properties that are distinct from simple Watson-Crick hybridization. Here, we describe methodology that has been used to exploit and characterize the sequence-specific recognition of DNA nanostructures, with the aim of generating functional assemblies for bionanotechnology and synthetic biology applications.

  16. Syntenic homology of human unique DNA sequences within chromosome regions 5q31, 10q22, 13q32-33 and 19q13.1 in the great apes

    Directory of Open Access Journals (Sweden)

    Vallente-Samonte Rhea U.

    2000-01-01

    Full Text Available Homologies between chromosome banding patterns and DNA sequences in the great apes and humans suggest an apparent common origin for these two lineages. The availability of DNA probes for specific regions of human chromosomes (5q31, 10q22, 13q32-33 and 19q13.1) led us to cross-hybridize these to chimpanzee (Pan troglodytes, PTR), gorilla (Gorilla gorilla, GGO) and orangutan (Pongo pygmaeus, PPY) chromosomes in a search for equivalent regions in the great apes. Positive hybridization signals to the chromosome 5q31-specific DNA probe were observed at HSA 5q31, PTR 4q31, GGO 4q31 and PPY 4q31, while fluorescent signals using the chromosome 10q22-specific DNA probe were noted at HSA 10q22, PTR 8q22, GGO 8q22 and PPY 7q22. The chromosome arms showing hybridization signals to the Quint-Essential™ 13-specific DNA probe were identified as HSA 13q32-33, PTR 14q32-33, GGO 14q32-33 and PPY 14q32-33, while those presenting hybridization signals to the chromosome 19q13.1-specific DNA probe were identified as HSA 19q13.1, PTR 20q13, GGO 20q13 and PPY 20q13. All four probes presumably hybridized to homologous chromosomal locations in the apes, which suggests a homology of certain unique DNA sequences among hominoid species.

  17. Dialects of the DNA uptake sequence in Neisseriaceae.

    Directory of Open Access Journals (Sweden)

    Stephan A Frye

    2013-04-01

    Full Text Available In all sexual organisms, adaptations exist that secure the safe reassortment of homologous alleles and prevent the intrusion of potentially hazardous alien DNA. Some bacteria engage in a simple form of sex known as transformation. In the human pathogen Neisseria meningitidis and in related bacterial species, transformation by exogenous DNA is regulated by the presence of a specific DNA Uptake Sequence (DUS, which is present in thousands of copies in the respective genomes. DUS affects transformation by limiting DNA uptake and recombination in favour of homologous DNA. The specific mechanisms of DUS-dependent genetic transformation have remained elusive. Bioinformatic analyses of family Neisseriaceae genomes reveal eight distinct variants of DUS. These variants are here termed DUS dialects, and their effect on interspecies commutation is demonstrated. Each of the DUS dialects is remarkably conserved within each species and is distributed consistent with a robust Neisseriaceae phylogeny based on core genome sequences. The impact of individual single nucleotide transversions in DUS on meningococcal transformation and on DNA binding and uptake is analysed. The results show that a DUS core 5'-CTG-3' is required for transformation and that transversions in this core reduce DNA uptake more than two orders of magnitude although the level of DNA binding remains less affected. Distinct DUS dialects are efficient barriers to interspecies recombination in N. meningitidis, N. elongata, Kingella denitrificans, and Eikenella corrodens, despite the presence of the core sequence. The degree of similarity between the DUS dialect of the recipient species and the donor DNA directly correlates with the level of transformation and DNA binding and uptake. Finally, DUS-dependent transformation is documented in the genera Eikenella and Kingella for the first time. The results presented here advance our understanding of the function and evolution of DUS and genetic

  18. New stopping criteria for segmenting DNA sequences

    CERN Document Server

    Li, W

    2001-01-01

    We propose a solution to the stopping-criterion problem in segmenting inhomogeneous DNA sequences with complex statistical patterns. This new stopping criterion is based on the Bayesian Information Criterion (BIC) in the model selection framework. When this stopping criterion is applied to a left telomere sequence of yeast Saccharomyces cerevisiae and the complete genome sequence of bacterium Escherichia coli, borders of biologically meaningful units were identified (e.g. subtelomeric units, replication origin, and replication terminus), and a more reasonable number of domains was obtained. We also introduce a measure called segmentation strength which can be used to control the delineation of large domains. The relationship between the average domain size and the threshold of segmentation strength is determined for several genome sequences.

  19. Sequence determinants of human microsatellite variability

    Directory of Open Access Journals (Sweden)

    Jakobsson Mattias

    2009-12-01

    Full Text Available Abstract Background Microsatellite loci are frequently used in genomic studies of DNA sequence repeats and in population studies of genetic variability. To investigate the effect of sequence properties of microsatellites on their level of variability we have analyzed genotypes at 627 microsatellite loci in 1,048 worldwide individuals from the HGDP-CEPH cell line panel together with the DNA sequences of these microsatellites in the human RefSeq database. Results Calibrating PCR fragment lengths in individual genotypes by using the RefSeq sequence enabled us to infer repeat number in the HGDP-CEPH dataset and to calculate the mean number of repeats (as opposed to the mean PCR fragment length), under the assumption that differences in PCR fragment length reflect differences in the numbers of repeats in the embedded repeat sequences. We find the mean and maximum numbers of repeats across individuals to be positively correlated with heterozygosity. The size and composition of the repeat unit of a microsatellite are also important factors in predicting heterozygosity, with tetra-nucleotide repeat units high in G/C content leading to higher heterozygosity. Finally, we find that microsatellites containing more separate sets of repeated motifs generally have higher heterozygosity. Conclusions These results suggest that sequence properties of microsatellites have a significant impact in determining the features of human microsatellite variability.
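
    The calibration step described above can be sketched as simple arithmetic. This assumes, as the abstract does, that differences in PCR fragment length reflect whole repeat units; the function names, the example numbers, and the uncorrected heterozygosity estimator (one minus the sum of squared allele frequencies) are illustrative assumptions rather than the paper's exact procedure.

```python
from collections import Counter


def infer_repeats(fragment_len, ref_fragment_len, ref_repeats, unit_size):
    """Repeat count implied by a fragment length, calibrated on a reference.

    ref_fragment_len and ref_repeats describe the reference allele
    (hypothetical values here); unit_size is the repeat unit length in bp.
    """
    return ref_repeats + (fragment_len - ref_fragment_len) / unit_size


def expected_heterozygosity(alleles):
    """Uncorrected expected heterozygosity: 1 - sum of squared frequencies."""
    n = len(alleles)
    return 1.0 - sum((c / n) ** 2 for c in Counter(alleles).values())


# A 208 bp fragment at a tetranucleotide locus whose 200 bp reference
# allele carries 12 repeats implies two extra repeat units.
repeats = infer_repeats(208, 200, 12, 4)
```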

  20. Construction of a Sequencing Library from Circulating Cell-Free DNA.

    Science.gov (United States)

    Fang, Nan; Löffert, Dirk; Akinci-Tolun, Rumeysa; Heitz, Katja; Wolf, Alexander

    2016-04-01

    Circulating DNA is cell-free DNA (cfDNA) in serum or plasma that can be used for non-invasive prenatal testing, as well as cancer diagnosis, prognosis, and stratification. High-throughput sequence analysis of the cfDNA with next-generation sequencing technologies has proven to be a highly sensitive and specific method in detecting and characterizing mutations in cancer and other diseases, as well as aneuploidy during pregnancy. This unit describes detailed procedures to extract circulating cfDNA from human serum and plasma and generate sequencing libraries from a wide concentration range of circulating DNA.

  1. Electronic density of states in sequence dependent DNA molecules

    Science.gov (United States)

    de Oliveira, B. P. W.; Albuquerque, E. L.; Vasconcelos, M. S.

    2006-09-01

    We report in this work a numerical study of the electronic density of states (DOS) in π-stacked arrays of DNA single-strand segments made up from the nucleotides guanine G, adenine A, cytosine C and thymine T, forming Rudin-Shapiro (RS) as well as Fibonacci (FB) polyGC quasiperiodic sequences. Both structures are constructed starting from a G nucleotide as seed and following their respective inflation rules. Our theoretical method uses Dyson's equation together with a transfer-matrix treatment, within an electronic tight-binding Hamiltonian model, suitable to describe the DNA segments modelled by the quasiperiodic chains. We compared the DOS spectra found for the quasiperiodic structures to those obtained using a sequence of natural DNA, part of human chromosome 22 (Ch22), and found a remarkable concordance as far as the RS structure is concerned. The electronic spectrum shows several peaks, corresponding to localized states, as well as a striking self-similar aspect.
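
    Building a quasiperiodic chain from a G seed by repeated inflation can be sketched as follows. The abstract does not state the rules, so both rule sets are assumptions: the Fibonacci rule G→GC, C→G is the standard two-letter substitution, and the four-letter Rudin-Shapiro rule shown is one commonly quoted form that should be checked against the paper before use.

```python
def inflate(seed, rules, generations):
    """Apply the substitution rules to every base, `generations` times."""
    seq = seed
    for _ in range(generations):
        seq = "".join(rules[base] for base in seq)
    return seq


# Assumed inflation rules (not given in the abstract).
FIBONACCI_RULES = {"G": "GC", "C": "G"}
RUDIN_SHAPIRO_RULES = {"G": "GC", "C": "GA", "A": "TC", "T": "TA"}

fb_chain = inflate("G", FIBONACCI_RULES, 4)
rs_chain = inflate("G", RUDIN_SHAPIRO_RULES, 2)
```

    Under these rules the Fibonacci chain lengths follow the Fibonacci numbers, while the Rudin-Shapiro chain doubles in length at each generation.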

  2. Next-generation sequencing offers new insights into DNA degradation

    DEFF Research Database (Denmark)

    Overballe-Petersen, Søren; Orlando, Ludovic Antoine Alexandre; Willerslev, Eske

    2012-01-01

    The processes underlying DNA degradation are central to various disciplines, including cancer research, forensics and archaeology. The sequencing of ancient DNA molecules on next-generation sequencing platforms provides direct measurements of cytosine deamination, depurination and fragmentation r...

  3. Random Coding Bounds for DNA Codes Based on Fibonacci Ensembles of DNA Sequences

    Science.gov (United States)

    2008-07-01

    Dates covered: 6 Jul 08 – 11 Jul 08. ... coding bound on the rate of DNA codes is proved. To obtain the bound, we use some ensembles of DNA sequences which are generalizations of the Fibonacci sequences. Subject terms: DNA Codes, Fibonacci Ensembles, DNA Computing, Code Optimization.

  4. An oligonucleotide hybridization approach to DNA sequencing.

    Science.gov (United States)

    Khrapko, K R; Lysov YuP; Khorlyn, A A; Shick, V V; Florentiev, V L; Mirzabekov, A D

    1989-10-09

    We have proposed a DNA sequencing method based on hybridization of a DNA fragment to be sequenced with the complete set of fixed-length oligonucleotides (e.g., 4^8 = 65,536 possible 8-mers) immobilized individually as dots of a 2-D matrix [(1989) Dokl. Akad. Nauk SSSR 303, 1508-1511]. It was shown that the list of hybridizing octanucleotides is sufficient for the computer-assisted reconstruction of the structures for 80% of random-sequence fragments up to 200 bases long, based on the analysis of the octanucleotide overlapping. Here a refinement of the method and some experimental data are presented. We have performed hybridizations with oligonucleotides immobilized on a glass plate, and obtained their dissociation curves down to heptanucleotides. Other approaches, e.g., an additional hybridization of short oligonucleotides which continuously extend duplexes formed between the fragment and immobilized oligonucleotides, should considerably increase either the probability of unambiguous reconstruction, or the length of reconstructed sequences, or decrease the size of immobilized oligonucleotides.
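
    The overlap-based reconstruction idea can be sketched in a few lines. This is a toy illustration under stated assumptions, not the authors' algorithm: it assumes an error-free set of hybridizing k-mers, a known starting k-mer, and k of at least 2, and it simply stops when the next overlap is ambiguous. The function names are hypothetical.

```python
def spectrum(seq, k):
    """Set of all k-mers occurring in seq (the idealized hybridization list)."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def reconstruct(kmers, start, k):
    """Extend `start` while exactly one unused k-mer overlaps by k-1 bases.

    Returns (sequence, complete); complete is True only if every k-mer
    was used, i.e. the reconstruction was unambiguous to the end.
    """
    remaining = set(kmers) - {start}
    seq = start
    while True:
        suffix = seq[-(k - 1):]
        candidates = [m for m in remaining if m.startswith(suffix)]
        if len(candidates) != 1:
            break  # dead end or ambiguous branch
        seq += candidates[0][-1]
        remaining.remove(candidates[0])
    return seq, not remaining


target = "ATGGCGTGCA"  # toy fragment, chosen for illustration
rebuilt, complete = reconstruct(spectrum(target, 4), "ATGG", 4)
```

    With k = 3 the same fragment branches immediately after ATG (both TGG and TGC overlap it), which illustrates why longer probes raise the probability of unambiguous reconstruction.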

  5. Sequencing and analysis of an Irish human genome.

    LENUS (Irish Health Repository)

    Tong, Pin

    2010-01-01

    Recent studies generating complete human sequences from Asian, African and European subgroups have revealed population-specific variation and disease susceptibility loci. Here, choosing a DNA sample from a population of interest due to its relative geographical isolation and genetic impact on further populations, we extend the above studies through the generation of 11-fold coverage of the first Irish human genome sequence.

  6. Modified Genetic Algorithm for DNA Sequence Assembly by Shotgun and Hybridization Sequencing Techniques

    Directory of Open Access Journals (Sweden)

    Prof. Narayan Kumar Sahu

    2012-09-01

    Full Text Available Since the advent of rapid DNA sequencing methods in 1976, scientists have had the problem of inferring DNA sequences from sequenced fragments. Shotgun sequencing is a well-established biological and computational method used in practice. Many conventional algorithms for shotgun sequencing are based on the notion of pairwise fragment overlap. While shotgun sequencing infers a DNA sequence given the sequences of overlapping fragments, a recent and complementary method, called sequencing by hybridization (SBH), infers a DNA sequence given the set of oligomers that represents all subwords of some fixed length, k. In this paper, we propose a new computer algorithm for DNA sequence assembly that combines in a novel way the techniques of both shotgun and SBH methods. Based on our preliminary investigations, the algorithm promises to be very fast and practical for DNA sequence assembly [1].

  7. Laser mass spectrometry for DNA sequencing, disease diagnosis, and fingerprinting

    Energy Technology Data Exchange (ETDEWEB)

    Winston Chen, C.H.; Taranenko, N.I.; Zhu, Y.F.; Chung, C.N.; Allman, S.L.

    1997-03-01

    Since laser mass spectrometry has the potential for achieving very fast DNA analysis, the authors recently applied it to DNA sequencing, DNA typing for fingerprinting, and DNA screening for disease diagnosis. Two different approaches for sequencing DNA have been successfully demonstrated. One is to sequence DNA with DNA ladders produced from Sanger's enzymatic method. The other is to do direct sequencing without DNA ladders. The need for quick DNA typing for identification purposes is critical for forensic application. The preliminary results indicate laser mass spectrometry can possibly be used for rapid DNA fingerprinting applications at a much lower cost than gel electrophoresis. Population screening for certain genetic diseases can be a very efficient step to reducing medical costs through prevention. Since laser mass spectrometry can provide very fast DNA analysis, the authors applied laser mass spectrometry to disease diagnosis. Clinical samples with both base deletion and point mutation have been tested with complete success.

  8. The influence of DNA sequence on epigenome-induced pathologies

    Directory of Open Access Journals (Sweden)

    Meagher Richard B

    2012-07-01

    Full Text Available Abstract Clear cause-and-effect relationships are commonly established between genotype and the inherited risk of acquiring human and plant diseases and aberrant phenotypes. By contrast, few such cause-and-effect relationships are established linking a chromatin structure (that is, the epitype) with the transgenerational risk of acquiring a disease or abnormal phenotype. It is not entirely clear how epitypes are inherited from parent to offspring as populations evolve, even though epigenetics is proposed to be fundamental to evolution and the likelihood of acquiring many diseases. This article explores the hypothesis that, for transgenerationally inherited chromatin structures, “genotype predisposes epitype”, and that epitype functions as a modifier of gene expression within the classical central dogma of molecular biology. Evidence for the causal contribution of genotype to inherited epitypes and epigenetic risk comes primarily from two different kinds of studies discussed herein. The first and direct method of research proceeds by the examination of the transgenerational inheritance of epitype and the penetrance of phenotype among genetically related individuals. The second approach identifies epitypes that are duplicated (as DNA sequences are duplicated) and evolutionarily conserved among repeated patterns in the DNA sequence. The body of this article summarizes particularly robust examples of these studies from humans, mice, Arabidopsis, and other organisms. The bulk of the data from both areas of research support the hypothesis that genotypes predispose the likelihood of displaying various epitypes, but for only a few classes of epitype. This analysis suggests that renewed efforts are needed in identifying polymorphic DNA sequences that determine variable nucleosome positioning and DNA methylation as the primary cause of inherited epigenome-induced pathologies. By contrast, there is very little evidence that DNA sequence directly

  9. Analysis of a cDNA clone expressing a human autoimmune antigen: full-length sequence of the U2 small nuclear RNA-associated B antigen

    Energy Technology Data Exchange (ETDEWEB)

    Habets, W.J.; Sillekens, P.T.G.; Hoet, M.H.; Schalken, J.A.; Roebroek, A.J.M.; Leunissen, J.A.M.; Van de Ven, W.J.M.; Van Venrooij, W.J.

    1987-04-01

    A U2 small nuclear RNA-associated protein, designated B'', was recently identified as the target antigen for autoimmune sera from certain patients with systemic lupus erythematosus and other rheumatic diseases. Such antibodies enabled the authors to isolate cDNA clone lambdaHB''-1 from a phage lambdagt11 expression library. This clone appeared to code for the B'' protein as established by in vitro translation of hybrid-selected mRNA. The identity of clone lambdaHB''-1 was further confirmed by partial peptide mapping and analysis of the reactivity of the recombinant antigen with monospecific and monoclonal antibodies. Analysis of the nucleotide sequence of the 1015-base-pair cDNA insert of clone lambdaHB''-1 revealed a large open reading frame of 800 nucleotides containing the coding sequence for a polypeptide of 25,457 daltons. In vitro transcription of the lambdaHB''-1 cDNA insert and subsequent translation resulted in a protein product with the molecular size of the B'' protein. These data demonstrate that clone lambdaHB''-1 contains the complete coding sequence of this antigen. The deduced polypeptide sequence contains three very hydrophilic regions that might constitute RNA binding sites and/or antigenic determinants. These findings might have implications both for the understanding of the pathogenesis of rheumatic diseases and for the elucidation of the biological function of autoimmune antigens.

  10. Sequence-level mechanisms of human epigenome evolution.

    Science.gov (United States)

    Prendergast, James G D; Chambers, Emily V; Semple, Colin A M

    2014-06-24

    DNA methylation and chromatin states play key roles in development and disease. However, the extent of recent evolutionary divergence in the human epigenome and the influential factors that have shaped it are poorly understood. To determine the links between genome sequence and human epigenome evolution, we examined the divergence of DNA methylation and chromatin states following segmental duplication events in the human lineage. Chromatin and DNA methylation states were found to have been generally well conserved following a duplication event, with the evolution of the epigenome largely uncoupled from the total number of genetic changes in the surrounding DNA sequence. However, the epigenome at tissue-specific, distal regulatory regions was observed to be unusually prone to diverge following duplication, with particular sequence differences, altering known sequence motifs, found to be associated with divergence in patterns of DNA methylation and chromatin. Alu elements were found to have played a particularly prominent role in shaping human epigenome evolution, and we show that human-specific AluY insertion events are strongly linked to the evolution of the DNA methylation landscape and gene expression levels, including at key neurological genes in the human brain. Studying paralogous regions within the same sample enables the study of the links between genome and epigenome evolution while controlling for biological and technical variation. We show DNA methylation and chromatin divergence between duplicated regions are linked to the divergence of particular genetic motifs, with Alu elements having played a disproportionate role in the evolution of the epigenome in the human lineage.

  11. Sequence dependent hole evolution in DNA.

    Science.gov (United States)

    Lakhno, V D

    2004-06-01

    The paper examines the dynamical behavior of a radical cation (G(+*)) generated in a double-stranded DNA for different oligonucleotide sequences. The resonance hole tunneling through an oligonucleotide sequence is studied by the method of numerical integration of self-consistent quantum-mechanical equations. The hole motion is considered quantum mechanically and nucleotide base oscillations are treated classically. The results obtained demonstrate a strong dependence of charge transfer on the type of nucleotide sequence. The rates of the hole transfer are calculated for different nucleotide sequences and compared with experimental data on the transfer from (G(+*)) to a GGG unit.

  12. Transverse Electronic Signature of DNA for Electronic Sequencing

    Science.gov (United States)

    Xu, Mingsheng; Endres, Robert G.; Arakawa, Yasuhiko

    In recent years, the proliferation of large-scale DNA sequencing projects for applications in clinical medicine and health care has driven the search for new methods that could reduce the time and cost. The commonly used Sanger sequencing method relies on the chemistry to read the bases in DNA and is far too slow and expensive for reading personal genetic codes. There were earlier attempts to sequence DNA by directly visualizing the nucleotide composition of the DNA molecules by scanning tunneling microscopy (STM). However, sequencing DNA based on directly imaging DNA's atomic structure has not yet been successful. In Chap. 9, Xu, Endres, and Arakawa report a potential physical alternative by detecting unique transverse electronic signatures of DNA bases using ultrahigh vacuum STM. Supported by the principles, calculations and statistical analyses, these authors argue that it would be possible to directly sequence DNA by the STM-based technology without any modification of the DNA.

  13. A new DNA sequence assembly program.

    Science.gov (United States)

    Bonfield, J K; Smith, K F; Staden, R

    1995-01-01

    We describe the Genome Assembly Program (GAP), a new program for DNA sequence assembly. The program is suitable for large and small projects and a variety of strategies, and can handle data from a range of sequencing instruments. It retains the useful components of our previous work, but includes many novel ideas and methods. Many of these methods have been made possible by the program's completely new, and highly interactive, graphical user interface. The program provides many visual clues to the current state of a sequencing project and allows users to interact in intuitive and graphical ways with their data. The program has tools to display and manipulate the various types of data that help to solve and check difficult assemblies, particularly those in repetitive genomes. We have introduced the following new displays: the Contig Selector, the Contig Comparator, the Template Display, the Restriction Enzyme Map and the Stop Codon Map. We have also made it possible to have any number of Contig Editors and Contig Joining Editors running simultaneously even on the same contig. The program also includes a new 'Directed Assembly' algorithm and routines for automatically detecting unfinished segments of sequence, to which it suggests experimental solutions. PMID:8559656

  14. Understanding Long-Range Correlations in DNA sequences

    CERN Document Server

    Li, W; Kaneko, K; Wentian Li; Thomas G Marr; Kunihiko Kaneko

    1994-01-01

    In this paper, we review the literature on statistical long-range correlation in DNA sequences. We examine the current evidence for these correlations, and conclude that a mixture of many length scales (including some relatively long ones) in DNA sequences is responsible for the observed 1/f-like spectral component. We note the complexity of the correlation structure in DNA sequences. The observed complexity often makes it hard, or impossible, to decompose the sequence into a few statistically stationary regions. We suggest that, based on the complexity of DNA sequences, a fruitful approach to understanding long-range correlation is to model duplication, and other rearrangement processes, in DNA sequences. One model, called the "expansion-modification system", contains only point duplication and point mutation. Though simplistic, this model is able to generate sequences with 1/f spectra. We emphasize the importance of DNA duplication in its contribution to the observed long-range correlation in DNA sequen...

  15. Improved taboo search algorithm for designing DNA sequences

    Institute of Scientific and Technical Information of China (English)

    Kai Zhang; Jin Xu; Xiutang Geng; Jianhua Xiao; Linqiang Pan

    2008-01-01

    The design of DNA sequences is one of the most practical and important research topics in DNA computing. We adopt the taboo search algorithm and improve the method for the systematic design of equal-length DNA sequences that satisfy certain combinatorial and thermodynamic constraints. Using the taboo search algorithm, our method avoids becoming trapped in local optima and can find a set of good DNA sequences satisfying the required constraints.

  16. Chimeric proteins for detection and quantitation of DNA mutations, DNA sequence variations, DNA damage and DNA mismatches

    Science.gov (United States)

    McCutchen-Maloney, Sandra L.

    2002-01-01

    Chimeric proteins having both DNA mutation binding activity and nuclease activity are synthesized by recombinant technology. The proteins are of the general formula A-L-B and B-L-A where A is a peptide having DNA mutation binding activity, L is a linker and B is a peptide having nuclease activity. The chimeric proteins are useful for detection and identification of DNA sequence variations including DNA mutations (including DNA damage and mismatches) by binding to the DNA mutation and cutting the DNA once the DNA mutation is detected.

  17. Use of subgenic 18S ribosomal DNA PCR and sequencing for genus and genotype identification of acanthamoebae from humans with keratitis and from sewage sludge.

    Science.gov (United States)

    Schroeder, J M; Booton, G C; Hay, J; Niszl, I A; Seal, D V; Markus, M B; Fuerst, P A; Byers, T J

    2001-05-01

    This study identified subgenic PCR amplimers from 18S rDNA that were (i) highly specific for the genus Acanthamoeba, (ii) obtainable from all known genotypes, and (iii) useful for identification of individual genotypes. A 423- to 551-bp Acanthamoeba-specific amplimer ASA.S1 obtained with primers JDP1 and JDP2 was the most reliable for purposes i and ii. A variable region within this amplimer also identified genotype clusters, but purpose iii was best achieved with sequencing of the genotype-specific amplimer GTSA.B1. Because this amplimer could be obtained from any eukaryote, axenic Acanthamoeba cultures were required for its study. GTSA.B1, produced with primers CRN5 and 1137, extended between reference bp 1 and 1475. Genotypic identification relied on three segments: bp 178 to 355, 705 to 926, and 1175 to 1379. ASA.S1 was obtained from single amoeba, from cultures of all known 18S rDNA genotypes, and from corneal scrapings of Scottish patients with suspected Acanthamoeba keratitis (AK). The AK PCR findings were consistent with culture results for 11 of 15 culture-positive specimens and detected Acanthamoeba in one of nine culture-negative specimens. ASA.S1 sequences were examined for 6 of the 11 culture-positive isolates and were most closely associated with genotypic cluster T3-T4-T11. A similar distance analysis using GTSA.B1 sequences identified nine South African AK-associated isolates as genotype T4 and three isolates from sewage sludge as genotype T5. Our results demonstrate the usefulness of 18S ribosomal DNA PCR amplimers ASA.S1 and GTSA.B1 for Acanthamoeba-specific detection and reliable genotyping, respectively, and provide further evidence that T4 is the predominant genotype in AK.

  18. Identification of human chromosome 22 transcribed sequences with ORF expressed sequence tags

    DEFF Research Database (Denmark)

    de Souza, S J; Camargo, A A; Briones, M R;

    2000-01-01

    by EST or full length cDNA sequences available in GenBank but not utilized in the initial annotation of the first human chromosome sequence. Thus despite representing less than 15% of all expressed human sequences in the public databases at the time of the present analysis, ORESTES sequences defined 48......Transcribed sequences in the human genome can be identified with confidence only by alignment with sequences derived from cDNAs synthesized from naturally occurring mRNAs. We constructed a set of 250,000 cDNAs that represent partial expressed gene sequences and that are biased toward the central...... coding regions of the resulting transcripts. They are termed ORF expressed sequence tags (ORESTES). The 250,000 ORESTES were assembled into 81,429 contigs. Of these, 1,181 (1.45%) were found to match sequences in chromosome 22 with at least one ORESTES contig for 162 (65.6%) of the 247 known genes...

  19. ProteDNA: a sequence-based predictor of sequence-specific DNA-binding residues in transcription factors

    OpenAIRE

    2009-01-01

    This article presents the design of a sequence-based predictor named ProteDNA for identifying the sequence-specific binding residues in a transcription factor (TF). Concerning protein–DNA interactions, there are two types of binding mechanisms involved, namely sequence-specific binding and nonspecific binding. Sequence-specific bindings occur between protein sidechains and nucleotide bases and correspond to sequence-specific recognition of genes. Therefore, sequence-specific bindings are esse...

  20. DNA Sequence Optimization Based on Continuous Particle Swarm Optimization for Reliable DNA Computing and DNA Nanotechnology

    Directory of Open Access Journals (Sweden)

    N. K. Khalid

    2008-01-01

    Problem statement: In DNA-based computation and DNA nanotechnology, the design of good DNA sequences has turned out to be an essential problem and one of the most practical and important research topics. The DNA sequence design problem is a multi-objective problem that can be evaluated using four objective functions, namely H-measure, similarity, continuity and hairpin. Approach: There are several ways to solve a multi-objective problem; however, in order to evaluate the correctness of the PSO algorithm in DNA sequence design, the problem is converted into a single-objective problem. Particle Swarm Optimization (PSO) is proposed to minimize the objective, subject to two constraints: melting temperature and GC content. A model is developed to present the DNA sequence design based on PSO computation. Results: Twenty particles are used in the implementation of the optimization process, and the average values and standard deviations over 100 runs are shown along with a comparison to other existing methods. Conclusion: The results verify that PSO can suitably solve the DNA sequence design problem using the proposed method and model, and that it performs comparatively better than other approaches.
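
    The two constraints named in this abstract, melting temperature and GC content, can be illustrated with a minimal sketch. The abstract does not say how melting temperature is computed, so the Wallace rule (2 degrees C per A/T, 4 degrees C per G/C) is used here purely as an assumption, and the function names and threshold ranges are hypothetical.

```python
def gc_content(seq: str) -> float:
    """Fraction of bases that are G or C (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq: str) -> int:
    """Rough melting temperature in degrees C by the Wallace rule.

    Assumption: the abstract does not name a Tm model; this rule is a
    common approximation for short oligonucleotides only.
    """
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def satisfies_constraints(seq, gc_range=(0.4, 0.6), tm_range=(50, 70)):
    """Check a candidate against hypothetical GC and Tm windows."""
    return (gc_range[0] <= gc_content(seq) <= gc_range[1]
            and tm_range[0] <= wallace_tm(seq) <= tm_range[1])


if __name__ == "__main__":
    candidate = "ATGCATGCATGCATGCATGC"
    print(gc_content(candidate), wallace_tm(candidate),
          satisfies_constraints(candidate))
```

    In an optimizer such as PSO, a check of this kind would typically be applied to each candidate before or alongside the objective evaluation; how the cited work does this is not stated in the abstract.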

  1. A DNA Structure-Based Bionic Wavelet Transform and Its Application to DNA Sequence Analysis

    Directory of Open Access Journals (Sweden)

    Fei Chen

    2003-01-01

    DNA sequence analysis is of great significance for increasing our understanding of genomic functions. An important task facing us is the exploration of hidden structural information stored in the DNA sequence. This paper introduces a DNA structure-based adaptive wavelet transform (WT), the bionic wavelet transform (BWT), for DNA sequence analysis. The symbolic DNA sequence can be separated into four channels of indicator sequences. An adaptive symbol-to-number mapping, determined from the structural feature of the DNA sequence, was introduced into WT. It can adjust the weight value of each channel to maximise the useful energy distribution of the whole BWT output. The performance of the proposed BWT was examined by analysing synthetic and real DNA sequences. Results show that BWT performs better than traditional WT in presenting greater energy distribution. This new BWT method should be useful for the detection of the latent structural features in future DNA sequence analysis.
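
    The separation of a symbolic sequence into four indicator channels can be sketched as follows. This is only an illustration of the indicator-sequence idea; the weighted combination uses fixed, hypothetical weights and is not the adaptive mapping of the cited paper, whose details the abstract does not give.

```python
from typing import Dict, List


def indicator_sequences(seq: str) -> Dict[str, List[int]]:
    """Split a DNA string into four binary channels, one per base.

    Channel X holds 1 at every position where the sequence has base X.
    """
    seq = seq.upper()
    return {base: [1 if s == base else 0 for s in seq] for base in "ACGT"}


def weighted_signal(seq: str, weights: Dict[str, float]) -> List[float]:
    """Combine the channels into one numeric signal.

    The weights are hypothetical placeholders; an adaptive scheme would
    choose them from the data instead.
    """
    channels = indicator_sequences(seq)
    return [sum(weights[b] * channels[b][i] for b in "ACGT")
            for i in range(len(seq))]


if __name__ == "__main__":
    print(indicator_sequences("ACGTA"))
    print(weighted_signal("ACGTA", {"A": 1.0, "C": 2.0, "G": 3.0, "T": 4.0}))
```

    The resulting numeric signal is the kind of input a wavelet transform would then operate on.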

  2. The DNA methylome of human peripheral blood mononuclear cells

    DEFF Research Database (Denmark)

    Li, Yingrui; Zhu, Jingde; Tian, Geng;

    2010-01-01

    DNA methylation plays an important role in biological processes in human health and disease. Recent technological advances allow unbiased whole-genome DNA methylation (methylome) analysis to be carried out on human cells. Using whole-genome bisulfite sequencing at 24.7-fold coverage (12.3-fold pe...

  3. Mixed Sequence Reader: A Program for Analyzing DNA Sequences with Heterozygous Base Calling

    Science.gov (United States)

    Chang, Chun-Tien; Tsai, Chi-Neu; Tang, Chuan Yi; Chen, Chun-Houh; Lian, Jang-Hau; Hu, Chi-Yu; Tsai, Chia-Lung; Chao, Angel; Lai, Chyong-Huey; Wang, Tzu-Hao; Lee, Yun-Shien

    2012-01-01

    The direct sequencing of PCR products generates heterozygous base-calling fluorescence chromatograms that are useful for identifying single-nucleotide polymorphisms (SNPs), insertion-deletions (indels), short tandem repeats (STRs), and paralogous genes. Indels and STRs can be easily detected using the currently available Indelligent or ShiftDetector programs, which do not search reference sequences. However, the detection of other genomic variants remains a challenge due to the lack of appropriate tools for heterozygous base-calling fluorescence chromatogram data analysis. In this study, we developed a free web-based program, Mixed Sequence Reader (MSR), which can directly analyze heterozygous base-calling fluorescence chromatogram data in .abi file format using comparisons with reference sequences. The heterozygous sequences are identified as two distinct sequences and aligned with reference sequences. Our results showed that MSR may be used to (i) physically locate indel and STR sequences and determine STR copy number by searching NCBI reference sequences; (ii) predict combinations of microsatellite patterns using the Federal Bureau of Investigation Combined DNA Index System (CODIS); (iii) determine human papilloma virus (HPV) genotypes by searching current viral databases in cases of double infections; (iv) estimate the copy number of paralogous genes, such as β-defensin 4 (DEFB4) and its paralog HSPDP3. PMID:22778697

  4. Mixed Sequence Reader: A Program for Analyzing DNA Sequences with Heterozygous Base Calling

    Directory of Open Access Journals (Sweden)

    Chun-Tien Chang

    2012-01-01

    The direct sequencing of PCR products generates heterozygous base-calling fluorescence chromatograms that are useful for identifying single-nucleotide polymorphisms (SNPs), insertion-deletions (indels), short tandem repeats (STRs), and paralogous genes. Indels and STRs can be easily detected using the currently available Indelligent or ShiftDetector programs, which do not search reference sequences. However, the detection of other genomic variants remains a challenge due to the lack of appropriate tools for heterozygous base-calling fluorescence chromatogram data analysis. In this study, we developed a free web-based program, Mixed Sequence Reader (MSR), which can directly analyze heterozygous base-calling fluorescence chromatogram data in .abi file format using comparisons with reference sequences. The heterozygous sequences are identified as two distinct sequences and aligned with reference sequences. Our results showed that MSR may be used to (i) physically locate indel and STR sequences and determine STR copy number by searching NCBI reference sequences; (ii) predict combinations of microsatellite patterns using the Federal Bureau of Investigation Combined DNA Index System (CODIS); (iii) determine human papilloma virus (HPV) genotypes by searching current viral databases in cases of double infections; (iv) estimate the copy number of paralogous genes, such as β-defensin 4 (DEFB4) and its paralog HSPDP3.

  5. Mixed sequence reader: a program for analyzing DNA sequences with heterozygous base calling.

    Science.gov (United States)

    Chang, Chun-Tien; Tsai, Chi-Neu; Tang, Chuan Yi; Chen, Chun-Houh; Lian, Jang-Hau; Hu, Chi-Yu; Tsai, Chia-Lung; Chao, Angel; Lai, Chyong-Huey; Wang, Tzu-Hao; Lee, Yun-Shien

    2012-01-01

    The direct sequencing of PCR products generates heterozygous base-calling fluorescence chromatograms that are useful for identifying single-nucleotide polymorphisms (SNPs), insertion-deletions (indels), short tandem repeats (STRs), and paralogous genes. Indels and STRs can be easily detected using the currently available Indelligent or ShiftDetector programs, which do not search reference sequences. However, the detection of other genomic variants remains a challenge due to the lack of appropriate tools for heterozygous base-calling fluorescence chromatogram data analysis. In this study, we developed a free web-based program, Mixed Sequence Reader (MSR), which can directly analyze heterozygous base-calling fluorescence chromatogram data in .abi file format using comparisons with reference sequences. The heterozygous sequences are identified as two distinct sequences and aligned with reference sequences. Our results showed that MSR may be used to (i) physically locate indel and STR sequences and determine STR copy number by searching NCBI reference sequences; (ii) predict combinations of microsatellite patterns using the Federal Bureau of Investigation Combined DNA Index System (CODIS); (iii) determine human papilloma virus (HPV) genotypes by searching current viral databases in cases of double infections; (iv) estimate the copy number of paralogous genes, such as β-defensin 4 (DEFB4) and its paralog HSPDP3.

  6. On-Demand Indexing for Referential Compression of DNA Sequences.

    Directory of Open Access Journals (Sweden)

    Fernando Alves

    The decreasing cost of genome sequencing is creating a demand for scalable storage and processing tools and techniques to deal with the large amounts of generated data. Referential compression is one of these techniques, in which the similarity between the DNA of organisms of the same or an evolutionarily close species is exploited to reduce the storage demands of genome sequences by up to 700 times. The general idea is to store in the compressed file only the differences between the to-be-compressed sequence and a well-known reference sequence. In this paper, we propose a method for improving the performance of referential compression by removing the most costly phase of the process, the complete reference indexing. Our approach, called On-Demand Indexing (ODI), compresses human chromosomes five to ten times faster (on average) than other state-of-the-art tools, while achieving similar compression ratios.
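
    The general idea of referential compression, storing only how the target differs from a reference, can be sketched as below. This is a toy illustration and not the ODI method: it builds a complete k-mer index of the reference up front, which is exactly the phase the abstract says ODI removes. All names and the choice of k are hypothetical.

```python
def build_index(reference, k):
    """Map every k-mer of the reference to the positions where it starts."""
    index = {}
    for i in range(len(reference) - k + 1):
        index.setdefault(reference[i:i + k], []).append(i)
    return index


def compress(target, reference, k=4):
    """Encode target as ('copy', ref_pos, length) and ('lit', base) operations."""
    index = build_index(reference, k)
    ops, i = [], 0
    while i < len(target):
        best_pos, best_len = -1, 0
        # Try every reference position sharing the next k-mer; keep the longest match.
        for pos in index.get(target[i:i + k], []):
            length = 0
            while (i + length < len(target) and pos + length < len(reference)
                   and target[i + length] == reference[pos + length]):
                length += 1
            if length > best_len:
                best_pos, best_len = pos, length
        if best_len >= k:
            ops.append(("copy", best_pos, best_len))
            i += best_len
        else:
            ops.append(("lit", target[i]))  # no usable match: store the base itself
            i += 1
    return ops


def decompress(ops, reference):
    """Rebuild the target from the operations and the same reference."""
    parts = []
    for op in ops:
        if op[0] == "copy":
            parts.append(reference[op[1]:op[1] + op[2]])
        else:
            parts.append(op[1])
    return "".join(parts)


if __name__ == "__main__":
    ref = "ACGTACGTTTGACCA"
    tgt = "ACGTACGATTGACCA"  # one substitution relative to ref
    ops = compress(tgt, ref)
    print(ops)
    print(decompress(ops, ref) == tgt)
```

    The closer the target is to the reference, the fewer and longer the copy operations become, which is where the storage saving comes from.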

  7. Solid-Phase Purification of Synthetic DNA Sequences.

    Science.gov (United States)

    Grajkowski, Andrzej; Cieslak, Jacek; Beaucage, Serge L

    2016-08-05

    Although high-throughput methods for solid-phase synthesis of DNA sequences are currently available for synthetic biology applications and technologies for large-scale production of nucleic acid-based drugs have been exploited for various therapeutic indications, little has been done to develop high-throughput procedures for the purification of synthetic nucleic acid sequences. An efficient process for purification of phosphorothioate and native DNA sequences is described herein. This process consists of functionalizing commercial aminopropylated silica gel with aminooxyalkyl functions to enable capture of DNA sequences carrying a 5'-siloxyl ether linker with a "keto" function through an oximation reaction. Deoxyribonucleoside phosphoramidites functionalized with the 5'-siloxyl ether linker were prepared in yields of 75-83% and incorporated last into the solid-phase assembly of DNA sequences. Capture of nucleobase- and phosphate-deprotected DNA sequences released from the synthesis support is demonstrated to proceed near quantitatively. After shorter than full-length DNA sequences were washed from the capture support, the purified DNA sequences were released from this support upon treatment with tetra-n-butylammonium fluoride in dry DMSO. The purity of released DNA sequences exceeds 98%. The scalability and high-throughput features of the purification process are demonstrated without sacrificing purity of the DNA sequences.

  8. Rapid sequencing of DNA based on single-molecule detection

    Science.gov (United States)

    Soper, Steven A.; Davis, Lloyd M.; Fairfield, Frederick R.; Hammond, Mark L.; Harger, Carol A.; Jett, James H.; Keller, Richard A.; Marrone, Babetta L.; Martin, John C.; Nutter, Harvey L.; Shera, E. Brooks; Simpson, Daniel J.

    1991-07-01

    Sequencing the human genome is a major undertaking considering the large number of nucleotides present in the genome and the slow methods currently available to perform the task. The authors have recently reported on a scheme to sequence DNA rapidly using a non-gel based technique. The concept is based upon the incorporation of fluorescently labeled nucleotides into a strand of DNA, isolation and manipulation of a labeled DNA fragment and the detection of single nucleotides using ultra-sensitive laser-induced fluorescence detection following their cleavage from the fragment. Detection of individual fluorophores in the liquid phase was accomplished with time-gated detection following pulsed-laser excitation. The photon bursts from individual rhodamine 6G (R6G) molecules travelling through a laser beam have been observed, as have bursts from single fluorescently modified nucleotides. Using two different biotinylated nucleotides as a model system for fluorescently labeled nucleotides, the authors have observed synthesis of the complementary copy of M13 bacteriophage. Work with fluorescently labeled nucleotides is underway. Individual molecules of DNA attached to a microbead have been observed and manipulated with an epifluorescence microscope.

  9. Affinity purification of sequence-specific DNA binding proteins.

    OpenAIRE

    1986-01-01

    We describe a method for affinity purification of sequence-specific DNA binding proteins that is fast and effective. Complementary chemically synthesized oligodeoxynucleotides that contain a recognition site for a sequence-specific DNA binding protein are annealed and ligated to give oligomers. This DNA is then covalently coupled to Sepharose CL-2B with cyanogen bromide to yield the affinity resin. A partially purified protein fraction is combined with competitor DNA and subsequently passed t...

  10. Beyond DNA Sequencing in Space: Current and Future Omics Capabilities of the Biomolecule Sequencer Payload

    Science.gov (United States)

    Wallace, Sarah

    2017-01-01

    Why do we need a DNA sequencer to support the human exploration of space? (A) Operational environmental monitoring; (1) Identification of contaminating microbes, (2) Infectious disease diagnosis, (3) Reduce down mass (sample return for environmental monitoring, crew health, etc.). (B) Research; (1) Human, (2) Animal, (3) Microbes/Cell lines, (4) Plant. (C) Med Ops; (1) Response to countermeasures, (2) Radiation, (3) Real-time analysis can influence medical intervention. (D) Support astrobiology science investigations; (1) Technology superiorly suited to in situ nucleic acid-based life detection, (2) Functional testing for integration into robotics for extraplanetary exploration mission.

  11. DNA Sequencing via Quantum Mechanics and Machine Learning

    CERN Document Server

    Yuen, Henry; Zhang, Kevin J; Nomura, Ken-ichi; Kalia, Rajiv K; Nakano, Aiichiro; Vashishta, Priya

    2010-01-01

    Rapid sequencing of individual human genomes is a prerequisite to genomic medicine, where diseases will be prevented by preemptive cures. Quantum-mechanical tunneling through single-stranded DNA in a solid-state nanopore has been proposed for rapid DNA sequencing, but unfortunately the tunneling current alone cannot distinguish the four nucleotides due to large fluctuations in molecular conformation and solvent. Here, we propose a machine-learning approach applied to the tunneling current-voltage (I-V) characteristic for efficient discrimination between the four nucleotides. We first combine principal component analysis (PCA) and fuzzy c-means (FCM) clustering to learn the "fingerprints" of the electronic density-of-states (DOS) of the four nucleotides, which can be derived from the I-V data. We then apply the hidden Markov model and the Viterbi algorithm to sequence a time series of DOS data (i.e., to solve the sequencing problem). Numerical experiments show that the PCA-FCM approach can classify unlabeled DOS ...
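
    The decoding step mentioned in this abstract, a hidden Markov model read out with the Viterbi algorithm, can be sketched generically. The sketch below is a textbook Viterbi decoder over discrete observations; the states, the observation labels (standing in for cluster assignments) and all probabilities are hypothetical and are not taken from the cited work.

```python
import math


def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return the most probable hidden-state path for a list of observations.

    Works in log space; all probabilities used must be strictly positive.
    """
    if not obs:
        return []
    scores = [{s: math.log(start_p[s]) + math.log(emit_p[s][obs[0]])
               for s in states}]
    backptrs = []
    for o in obs[1:]:
        current, ptr = {}, {}
        for s in states:
            # Best predecessor for state s at this step.
            prev = max(states,
                       key=lambda r: scores[-1][r] + math.log(trans_p[r][s]))
            current[s] = (scores[-1][prev] + math.log(trans_p[prev][s])
                          + math.log(emit_p[s][o]))
            ptr[s] = prev
        scores.append(current)
        backptrs.append(ptr)
    # Trace back from the best final state.
    path = [max(states, key=lambda s: scores[-1][s])]
    for ptr in reversed(backptrs):
        path.append(ptr[path[-1]])
    return path[::-1]


if __name__ == "__main__":
    states = list("ACGT")
    start = {s: 0.25 for s in states}
    trans = {r: {s: 0.25 for s in states} for r in states}
    # Hypothetical emissions: each base mostly produces its own cluster label.
    emit = {s: {o: (0.7 if o == s.lower() else 0.1) for o in "acgt"}
            for s in states}
    print("".join(viterbi(list("acgtta"), states, start, trans, emit)))
```

    With non-uniform transition probabilities the decoder can correct isolated misclassified observations, which is the usual motivation for using an HMM over per-sample classification.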

  12. Artificial intelligence approach in analysis of DNA sequences.

    Science.gov (United States)

    Brézillon, P J; Zaraté, P; Saci, F

    1993-01-01

    We present an approach for designing a knowledge-based system, called Sequence Acquisition In Context (SAIC), that will be able to cooperate with a biologist in the analysis of DNA sequences. The main task of the system is the acquisition of the expert knowledge that the biologist uses for solving ambiguities from gel autoradiograms, with the aim of re-using it later for solving similar ambiguities. The various types of expert knowledge constitute what we call the contextual knowledge of the sequence analysis. Contextual knowledge deals with the unavoidable problems that are common in the study of living material (e.g., noise in data, difficulties of observation). Indeed, the analysis of DNA sequences from autoradiograms belongs to an emerging and promising area of investigation, namely reasoning with images. The SAIC project is developed in a theoretical framework that is shared with other applications. Not all tasks have the same importance in each application. We use this observation for designing an intelligent assistant system with three applications. In the SAIC project, we focus on knowledge acquisition, human-computer interaction and explanation. The project will benefit research in the two other applications. We also discuss our SAIC project in the context of large international projects that aim to re-use and share knowledge in a repository.

  13. [Mapping and human genome sequence program].

    Science.gov (United States)

    Weissenbach, J

    1997-03-01

    Until recently, human genome programs focused primarily on establishing maps that would provide signposts to researchers seeking to identify genes responsible for inherited diseases, as well as a basis for genome sequencing studies. Preestablished gene mapping goals have been reached. The over 7,000 microsatellite markers identified to date provide a map of sufficient density to allow localization of the gene of a monogenic disease with a precision of 1 to 2 million base pairs. The physical map, based on systematically arranged overlapping sets of yeast artificial chromosomes (YACs), has also made considerable headway during the last few years. The most recently published map covers more than 90% of the genome. However, currently available physical maps cannot be used for sequencing studies because multiple rearrangements occur in YACs. The recently developed sets of radioinduced hybrids are extremely useful for incorporating genes into existing maps. A network of American and European laboratories has successfully used these radioinduced hybrids to map 15,000 gene tags from large-scale cDNA library sequencing programs. There are increasingly pressing reasons for initiating large-scale human genome sequencing studies.

  14. SWORDS: A statistical tool for analysing large DNA sequences

    Indian Academy of Sciences (India)

    Probal Chaudhuri; Sandip Das

    2002-02-01

    In this article, we present some simple yet effective statistical techniques for analysing and comparing large DNA sequences. These techniques are based on frequency distributions of DNA words in a large sequence, and have been packaged into a software tool called SWORDS. Using sequences available in public-domain databases on the Internet, we demonstrate how SWORDS can be conveniently used by molecular biologists and geneticists to unmask biologically important features hidden in large sequences and assess their statistical significance.
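
    Frequency distributions of DNA words can be illustrated with a short sketch that counts overlapping words of length k and compares two sequences by the distance between their distributions. This is not the SWORDS implementation; the function names and the choice of an L1 distance are assumptions made for illustration only.

```python
from collections import Counter


def word_frequencies(seq, k):
    """Relative frequencies of all overlapping length-k words in seq."""
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {word: n / total for word, n in counts.items()}


def l1_distance(p, q):
    """Sum of absolute frequency differences over every word seen in p or q."""
    return sum(abs(p.get(w, 0.0) - q.get(w, 0.0)) for w in set(p) | set(q))


if __name__ == "__main__":
    a = word_frequencies("ACGTACGTACGT", 3)
    b = word_frequencies("ACGTTTTTACGT", 3)
    print(a)
    print(l1_distance(a, b))
```

    The distance is 0 for identical distributions and at most 2 when the two sequences share no words at all.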

  15. DNA sequencing technology, walking with modular primers. Final report

    Energy Technology Data Exchange (ETDEWEB)

    Ulanovsky, L.

    1996-12-31

    The success of the Human Genome Project depends on the development of adequate technology for rapid and inexpensive DNA sequencing, which will also benefit biomedical research in general. The authors are working on DNA technologies that eliminate primer synthesis, the main bottleneck in sequencing by primer walking. They have developed modular primers that are assembled from three 5-mer, 6-mer or 7-mer modules selected from a presynthesized library of as few as 1,000 oligonucleotides ({double_bond}4, {double_bond}5, {double_bond}7). The three modules anneal contiguously at the selected template site and prime there uniquely, even though each is not unique for the most part when used alone. This technique is expected to speed up primer walking 30 to 50 fold, and reduce the sequencing cost by a factor of 5 to 15. Time and expense will be saved on primer synthesis itself and even more so due to closed-loop automation of primer walking, made possible by the instant availability of primers. Apart from saving time and cost, closed-loop automation would also minimize the errors and complications associated with human intervention between the walks. The author has also developed two additional approaches to primer-library based sequencing. One involves a branched structure of modular primers which has a distinctly different mechanism of achieving priming specificity. The other introduces the concept of "Differential Extension with Nucleotide Subsets" as an approach increasing priming specificity, priming strength and allowing cycle sequencing. These approaches are expected to be more robust than the original version of the modular primer technique.
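
    The arithmetic behind a small presynthesized library can be sketched as follows. A complete set of k-mers over four bases has 4^k members, and under the simplifying assumption of a random template with equal base frequencies, a given k-mer is expected at roughly (L - k + 1) / 4^k positions in a template of length L. The template length used below is a hypothetical example, not a figure from the report.

```python
def library_size(k: int) -> int:
    """Number of distinct oligonucleotides of length k over A, C, G, T."""
    return 4 ** k


def expected_sites(k: int, template_length: int) -> float:
    """Expected occurrences of one fixed k-mer in a random template.

    Assumes independent, equally likely bases, which real DNA only
    approximates.
    """
    positions = max(template_length - k + 1, 0)
    return positions / library_size(k)


if __name__ == "__main__":
    template_length = 40_000  # hypothetical template size
    for k in (5, 6, 7):
        print(k, library_size(k), expected_sites(k, template_length))
    # Three contiguous 5-mers behave like a single 15-mer.
    print(15, library_size(15), expected_sites(15, template_length))
```

    Under these assumptions a single short module matches many sites, while three modules annealing contiguously specify a site about as selectively as one long primer, which is consistent with the abstract's statement that each module is mostly not unique when used alone.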

  16. Yeast DNA sequences initiating gene expression in Escherichia coli.

    Science.gov (United States)

    Lewin, Astrid; Tran, Thi Tuyen; Jacob, Daniela; Mayer, Martin; Freytag, Barbara; Appel, Bernd

    2004-01-01

    DNA transfer between pro- and eukaryotes occurs either during natural horizontal gene transfer or as a result of the employment of gene technology. We analysed the capacity of DNA sequences from a eukaryotic donor organism (Saccharomyces cerevisiae) to serve as promoter region in a prokaryotic recipient (Escherichia coli) by creating fusions between promoterless luxAB genes from Vibrio harveyi and random DNA sequences from S. cerevisiae and measuring the luminescence of transformed E. coli. Fifty-four out of 100 randomly analysed S. cerevisiae DNA sequences caused considerable gene expression in E. coli. Determination of transcription start sites within six selected yeast sequences in E. coli confirmed the existence of bacterial -10 and -35 consensus sequences at appropriate distances upstream from transcription initiation sites. Our results demonstrate that the probability of transcription of transferred eukaryotic DNA in bacteria is extremely high and does not require the insertion of the transferred DNA behind a promoter of the recipient genome.

  17. A novel constraint for thermodynamically designing DNA sequences.

    Directory of Open Access Journals (Sweden)

    Qiang Zhang

    Biotechnological and biomolecular advances have introduced novel uses for DNA such as DNA computing, storage, and encryption. For these applications, DNA sequence design requires maximal desired (and minimal undesired) hybridizations, which are the product of a single new DNA strand from 2 single DNA strands. Here, we propose a novel constraint to design DNA sequences based on thermodynamic properties. Existing constraints for DNA design are based on the Hamming distance, a constraint that does not address the thermodynamic properties of the DNA sequence. Using a unique, improved genetic algorithm, we designed DNA sequence sets which satisfy different distance constraints and employ a free energy gap based on a minimum free energy (MFE) to gauge DNA sequences based on set thermodynamic properties. When compared to the best constraints of the Hamming distance, our method yielded better thermodynamic qualities. We then used our improved genetic algorithm to obtain lower-bound DNA sequence sets. Here, we discuss the effects of novel constraint parameters on the free energy gap.

  18. A novel constraint for thermodynamically designing DNA sequences.

    Science.gov (United States)

    Zhang, Qiang; Wang, Bin; Wei, Xiaopeng; Zhou, Changjun

    2013-01-01

    Biotechnological and biomolecular advances have introduced novel uses for DNA such as DNA computing, storage, and encryption. For these applications, DNA sequence design requires maximal desired (and minimal undesired) hybridizations, which are the product of a single new DNA strand from 2 single DNA strands. Here, we propose a novel constraint to design DNA sequences based on thermodynamic properties. Existing constraints for DNA design are based on the Hamming distance, a constraint that does not address the thermodynamic properties of the DNA sequence. Using a unique, improved genetic algorithm, we designed DNA sequence sets which satisfy different distance constraints and employ a free energy gap based on a minimum free energy (MFE) to gauge DNA sequences based on set thermodynamic properties. When compared to the best constraints of the Hamming distance, our method yielded better thermodynamic qualities. We then used our improved genetic algorithm to obtain lower-bound DNA sequence sets. Here, we discuss the effects of novel constraint parameters on the free energy gap.
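
    For readers unfamiliar with the Hamming-distance constraint that the abstract contrasts with its thermodynamic one, a minimal sketch follows. It only checks that every pair of equal-length sequences in a set differs in at least d positions; the function names are hypothetical, and the free-energy-gap constraint proposed by the authors is not implemented here.

```python
from itertools import combinations

def hamming(a, b):
    """Number of mismatched positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))

def satisfies_hamming_constraint(sequences, d):
    """True if every pair in the set is at Hamming distance >= d."""
    return all(hamming(a, b) >= d for a, b in combinations(sequences, 2))

candidate_set = ["AAAA", "TTTT", "AATT"]
print(satisfies_hamming_constraint(candidate_set, 2))
print(satisfies_hamming_constraint(candidate_set, 3))
```

    The check treats all mismatches alike, regardless of which bases are involved or where they sit, which is the limitation the abstract points to when it says the Hamming distance does not address thermodynamic properties.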

  19. Stability of capillary gels for automated sequencing of DNA.

    Science.gov (United States)

    Swerdlow, H; Dew-Jager, K E; Brady, K; Grey, R; Dovichi, N J; Gesteland, R

    1992-08-01

    Recent interest in capillary gel electrophoresis has been fueled by the Human Genome Project and other large-scale sequencing projects. Advances in gel polymerization techniques and detector design have enabled sequencing of DNA directly in capillaries. Efforts to exploit this technology have been hampered by problems with the reproducibility and stability of gels. Gel instability manifests itself during electrophoresis as a decrease in the current passing through the capillary under a constant voltage. Upon subsequent microscopic examination, bubbles are often visible at or near the injection (cathodic) end of the capillary gel. Gels have been prepared with the polyacrylamide matrix covalently attached to the silica walls of the capillary. These gels, although more stable, still suffer from problems with bubbles. The use of actual DNA sequencing samples also adversely affects gel stability. We examined the mechanisms underlying these disruptive processes by employing polyacrylamide gel-filled capillaries in which the gel was not attached to the capillary wall. Three sources of gel instability were identified. Bubbles occurring in the absence of sample introduction were attributed to electroosmotic force; replacing the denaturant urea with formamide was shown to reduce the frequency of these bubbles. The slow, steady decline in current through capillary sequencing gels interferes with the ability to detect other gel problems. This phenomenon was shown to be a result of ionic depletion at the gel-liquid interface. The decline was ameliorated by adding denaturant and acrylamide monomers to the buffer reservoirs. Sample-induced problems were shown to be due to the presence of template DNA; elimination of the template allowed sample loading to occur without complications.(ABSTRACT TRUNCATED AT 250 WORDS)

  20. Preparing DNA libraries for multiplexed paired-end deep sequencing for Illumina GA sequencers.

    Science.gov (United States)

    Son, Mike S; Taylor, Ronald K

    2011-02-01

    Whole-genome sequencing, also known as deep sequencing, is becoming a more affordable and efficient way to identify SNP mutations, deletions, and insertions in DNA sequences across several different strains. Two major obstacles preventing the widespread use of deep sequencers are the costs involved in services used to prepare DNA libraries for sequencing and the overall accuracy of the sequencing data. This unit describes the preparation of DNA libraries for multiplexed paired-end sequencing using the Illumina GA series sequencer. Self-preparation of DNA libraries can help reduce overall expenses, especially if optimization is required for the different samples, and use of the Illumina GA Sequencer can improve the quality of the data.

  1. New method to study DNA sequences: the languages of evolution.

    Science.gov (United States)

    Spinelli, Gino; Mayer-Foulkes, David

    2008-04-01

    Recently, several authors have reported statistical evidence for deterministic dynamics in the flux of genetic information, suggesting that evolution involves the emergence and maintenance of a fractal landscape in DNA chains. Here we examine the idea that motif repetition lies at the origin of these statistical properties of DNA. To analyse repetition patterns we apply a modification of the BDS statistic, devised to analyze complex economic dynamics and adapted here to DNA sequence analysis. This provides a new method to detect structured signals in genetic information. We compare naturally occurring DNA sequences along the evolutionary tree with randomly generated sequences and also with simulated sequences with repetition motifs. For easier understanding, we also define a new statistic for a DNA sequence that constitutes a specific fingerprint. The new methods are applied to exon and intron DNA sequences, finding specific statistical differences. Moreover, by analysing DNA sequences of different species from Bacteria to Man, we explore the evolution of these linguistic DNA features along the evolutionary tree. The results are consistent with the idea that all the flux of DNA information need not be random, but may be structured along the evolutionary tree. The implications for evolutionary theory are discussed.

  2. Genetic variability of Taenia saginata inferred from mitochondrial DNA sequences.

    Science.gov (United States)

    Rostami, Sima; Salavati, Reza; Beech, Robin N; Babaei, Zahra; Sharbatkhori, Mitra; Harandi, Majid Fasihi

    2015-04-01

    Taenia saginata is an important tapeworm, infecting humans in many parts of the world. The present study was undertaken to identify inter- and intraspecific variation of T. saginata isolated from cattle in different parts of Iran using two mitochondrial genes, CO1 and 12S rRNA. Up to 105 bovine specimens of T. saginata were collected from 20 slaughterhouses in three provinces of Iran. DNA was extracted from the metacestode Cysticercus bovis. After PCR amplification, sequencing of the CO1 and 12S rRNA genes was carried out and two phylogenetic analyses of the sequence data were generated by Bayesian inference on CO1 and 12S rRNA sequences. Sequence analyses of CO1 and 12S rRNA genes showed 11 and 29 representative profiles respectively. The level of pairwise nucleotide variation between individual haplotypes of the CO1 gene was 0.3-2.4% while the overall nucleotide variation among all 11 haplotypes was 4.6%. For 12S rRNA sequence data, the level of pairwise nucleotide variation was 0.2-2.5% and the overall nucleotide variation was determined as 5.8% among 29 haplotypes of the 12S rRNA gene. Considerable genetic diversity was found in both mitochondrial genes, particularly in the 12S rRNA gene.
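
    The pairwise variation figures quoted above are percentages of differing sites between aligned haplotypes. A minimal sketch of that calculation is given below; the three short haplotypes are invented placeholders, not data from the study, and the function names are hypothetical.

```python
from itertools import combinations

def percent_difference(a, b):
    """Percentage of aligned sites at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    diffs = sum(x != y for x, y in zip(a, b))
    return 100.0 * diffs / len(a)

def pairwise_range(haplotypes):
    """Minimum and maximum pairwise percent difference in a set."""
    values = [percent_difference(a, b)
              for a, b in combinations(haplotypes, 2)]
    return min(values), max(values)

# Invented 10-nt haplotypes, for illustration only.
haplotypes = ["ACGTACGTAC", "ACGTACGTAA", "ACGTTCGTAA"]
print(pairwise_range(haplotypes))
```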

  3. Levenshtein error-correcting barcodes for multiplexed DNA sequencing

    NARCIS (Netherlands)

    Buschmann, Tilo; Bystrykh, Leonid V.

    2013-01-01

    Background: High-throughput sequencing technologies are improving in quality, capacity and costs, providing versatile applications in DNA and RNA research. For small genomes or fraction of larger genomes, DNA samples can be mixed and loaded together on the same sequencing track. This so-called multi
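
    The title refers to the Levenshtein (edit) distance, which counts insertions and deletions as well as substitutions. The sketch below is the textbook dynamic-programming computation, shown only to make the term concrete; it is not the barcode-design procedure of the paper, whose abstract is truncated here.

```python
def levenshtein(a, b):
    """Minimum number of substitutions, insertions and deletions
    needed to turn string a into string b."""
    previous = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # delete ca
                current[j - 1] + 1,            # insert cb
                previous[j - 1] + (ca != cb),  # substitute or match
            ))
        previous = current
    return previous[-1]

def hamming(a, b):
    """Substitution-only distance, for comparison."""
    return sum(x != y for x, y in zip(a, b))

# A single-base shift looks like four errors to Hamming but two to Levenshtein.
print(hamming("ACGT", "CGTA"), levenshtein("ACGT", "CGTA"))
```

    The comparison illustrates why an edit-distance view matters for barcodes: one deleted base shifts every downstream position, which a substitution-only metric reads as many independent errors.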

  4. Food Fish Identification from DNA Extraction through Sequence Analysis

    Science.gov (United States)

    Hallen-Adams, Heather E.

    2015-01-01

    This experiment exposed 3rd and 4th y undergraduates and graduate students taking a course in advanced food analysis to DNA extraction, polymerase chain reaction (PCR), and DNA sequence analysis. Students provided their own fish sample, purchased from local grocery stores, and the class as a whole extracted DNA, which was then subjected to PCR,…

  6. Mesoscopic Model for Free Energy Landscape Analysis of DNA sequences

    CERN Document Server

    Tapia-Rojo, R; Mazo, J J; Falo, F; 10.1103/PhysRevE.86.021908

    2012-01-01

    A mesoscopic model which allows us to identify and quantify the strength of binding sites in DNA sequences is proposed. The model is based on the Peyrard-Bishop-Dauxois model for the DNA chain coupled to a Brownian particle which explores the sequence interacting more importantly with open base pairs of the DNA chain. We apply the model to promoter sequences of different organisms. The free energy landscape obtained for these promoters shows a complex structure that is strongly connected to their biological behavior. The analysis method used is able to quantify free energy differences of sites within genome sequences.

  7. Cloning and sequencing of mouse GABA transporter complementary DNA

    Institute of Scientific and Technical Information of China (English)

    TAMANTHONYC.W.; LIHEGUO; et al.

    1994-01-01

    A cDNA encoding the mouse GABA transporter has been isolated and sequenced. The results show that the mouse GABA transporter cDNA differs from that of the rat by 60 base pairs at the open reading frame region but the deduced amino acid sequences of the two cDNAs are identical and both composed of 599 amino acids. However, the amino acid sequence is different from the sequence deduced from a recently published mouse GABA transporter cDNA.

  8. A 28,000 years old Cro-Magnon mtDNA sequence differs from all potentially contaminating modern sequences.

    Directory of Open Access Journals (Sweden)

    David Caramelli

    BACKGROUND: DNA sequences from ancient specimens may in fact result from undetected contamination of the ancient specimens by modern DNA, and the problem is particularly challenging in studies of human fossils. Doubts on the authenticity of the available sequences have so far hampered genetic comparisons between anatomically archaic (Neandertal) and early modern (Cro-Magnoid) Europeans. METHODOLOGY/PRINCIPAL FINDINGS: We typed the mitochondrial DNA (mtDNA) hypervariable region I in a 28,000 years old Cro-Magnoid individual from the Paglicci cave, in Italy (Paglicci 23) and in all the people who had contact with the sample since its discovery in 2003. The Paglicci 23 sequence, determined through the analysis of 152 clones, is the Cambridge reference sequence, and cannot possibly reflect contamination because it differs from all potentially contaminating modern sequences. CONCLUSIONS/SIGNIFICANCE: The Paglicci 23 individual carried a mtDNA sequence that is still common in Europe, and which radically differs from those of the almost contemporary Neandertals, demonstrating a genealogical continuity across 28,000 years, from Cro-Magnoid to modern Europeans. Because all potential sources of modern DNA contamination are known, the Paglicci 23 sample will offer a unique opportunity to get insight for the first time into the nuclear genes of early modern Europeans.

  9. Sequence determinants in human polyadenylation site selection

    Directory of Open Access Journals (Sweden)

    Gautheret Daniel

    2003-02-01

    Background: Differential polyadenylation is a widespread mechanism in higher eukaryotes producing mRNAs with different 3' ends in different contexts. This involves several alternative polyadenylation sites in the 3' UTR, each with its specific strength. Here, we analyze the vicinity of human polyadenylation signals in search of patterns that would help discriminate strong and weak polyadenylation sites, or true sites from randomly occurring signals. Results: We used human genomic sequences to retrieve the region downstream of polyadenylation signals, usually absent from cDNA or mRNA databases. Analyzing 4956 EST-validated polyadenylation sites and their -300/+300 nt flanking regions, we clearly visualized the upstream (USE) and downstream (DSE) sequence elements, both characterized by U-rich (not GU-rich) segments. The presence of a USE and a DSE is the main feature distinguishing true polyadenylation sites from randomly occurring A(A/U)UAAA hexamers. While USEs are indifferently associated with strong and weak poly(A) sites, DSEs are more conspicuous near strong poly(A) sites. We then used the region encompassing the hexamer and DSE as a training set for poly(A) site identification by the ERPIN program and achieved a prediction specificity of 69 to 85% for a sensitivity of 56%. Conclusion: The availability of complete genomes and large EST sequence databases now permits large-scale observation of polyadenylation sites. U-rich sequences flanking both sides of poly(A) signals contribute to the definition of "true" sites. However, the downstream U-rich sequences may also play an enhancing role. Based on this information, poly(A) site prediction accuracy was moderately but consistently improved compared to the best previously available algorithm.
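
    The hexamer class named in the abstract, AAUAAA or AUUAAA, is easy to locate in genomic DNA (where U is written as T), which is exactly why so many matches occur by chance. The sketch below lists every match and the T content of a window downstream of it as a crude stand-in for a U-rich DSE; the window length and function names are assumptions, and this is not the ERPIN-based method used by the author.

```python
import re

# A(A/U)UAAA written for DNA; the lookahead lets overlapping matches be found.
HEXAMER = re.compile(r"(?=(A[AT]TAAA))")

def find_polya_signals(dna):
    """Return (position, hexamer) for every AATAAA or ATTAAA in dna."""
    return [(m.start(), m.group(1)) for m in HEXAMER.finditer(dna.upper())]

def downstream_t_fraction(dna, hexamer_pos, window=30):
    """Fraction of T in the window after the hexamer.

    The 30-nt window is an illustrative choice, not a value from the abstract.
    """
    region = dna.upper()[hexamer_pos + 6:hexamer_pos + 6 + window]
    return region.count("T") / len(region) if region else 0.0

toy = "GGAATAAACCATTAAAGG"
for pos, hexamer in find_polya_signals(toy):
    print(pos, hexamer, round(downstream_t_fraction(toy, pos), 2))
```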

  10. Affordable Hands-On DNA Sequencing and Genotyping: An Exercise for Teaching DNA Analysis to Undergraduates

    Science.gov (United States)

    Shah, Kushani; Thomas, Shelby; Stein, Arnold

    2013-01-01

    In this report, we describe a 5-week laboratory exercise for undergraduate biology and biochemistry students in which students learn to sequence DNA and to genotype their DNA for selected single nucleotide polymorphisms (SNPs). Students use miniaturized DNA sequencing gels that require approximately 8 min to run. The students perform G, A, T, C…

  11. Effects of Sequence on Transmission Properties of DNA Molecules

    Institute of Scientific and Technical Information of China (English)

    DONG Rui-Xin; YAN Xun-Ling; YANG Bing

    2008-01-01

    A double helix model of charge transport in the DNA molecule is given and the transmission spectra of four DNA sequences are obtained. The calculated results show that the transmission characteristics of DNA are related not only to the longitudinal transport but also to the transverse transport of the molecule. The periodic sequence with the same composition has stronger conduction ability. As the base composition increases, the conductive ability decreases, but the weight of the θ direction in charge transfer rises.

  12. DNA Polymerases Drive DNA Sequencing-by-Synthesis Technologies: Both Past and Present

    Directory of Open Access Journals (Sweden)

    Cheng-Yao Chen

    2014-06-01

    Next-generation sequencing (NGS) technologies have revolutionized modern biological and biomedical research. The engines responsible for this innovation are DNA polymerases; they catalyze the biochemical reaction for deriving template sequence information. In fact, DNA polymerase has been a cornerstone of DNA sequencing from the very beginning. E. coli DNA polymerase I proteolytic (Klenow) fragment was originally utilized in Sanger's dideoxy chain terminating DNA sequencing chemistry. From these humble beginnings followed an explosion of organism-specific, genome sequence information accessible via public database. Family A/B DNA polymerases from mesophilic/thermophilic bacteria/archaea were modified and tested in today's standard capillary electrophoresis (CE) and NGS sequencing platforms. These enzymes were selected for their efficient incorporation of bulky dye-terminator and reversible dye-terminator nucleotides respectively. Third generation, real-time single molecule sequencing platform requires slightly different enzyme properties. Enterobacterial phage φ29 DNA polymerase copies long stretches of DNA and possesses a unique capability to efficiently incorporate terminal phosphate-labeled nucleoside polyphosphates. Furthermore, φ29 enzyme has also been utilized in emerging DNA sequencing technologies including nanopore-, and protein-transistor-based sequencing. DNA polymerase is, and will continue to be, a crucial component of sequencing technologies.

  13. High-speed automated DNA sequencing utilizing from-the-side laser excitation

    Science.gov (United States)

    Westphall, Michael S.; Brumley, Robert L., Jr.; Buxton, Erin C.; Smith, Lloyd M.

    1995-04-01

    The Human Genome Initiative is an ambitious international effort to map and sequence the three billion bases of DNA encoded in the human genome. If successfully completed, the resultant sequence database will be a tool of unparalleled power for biomedical research. One of the major challenges of this project is in the area of DNA sequencing technology. At this time, virtually all DNA sequencing is based upon the separation of DNA fragments in high resolution polyacrylamide gels. This method, as generally practiced, is one to two orders of magnitude too slow and expensive for the successful completion of the Human Genome Project. One reasonable approach to improved sequencing of DNA fragments is to increase the performance of such gel-based sequencing methods. Decreased sequencing times may be obtained by increasing the magnitude of the electric field employed. This is not possible with conventional sequencing, because the additional heat associated with the increased electric field cannot be adequately dissipated. Recent developments in the use of thin gels have addressed this problem. Performing electrophoresis in ultrathin (50 to 100 microns) gels greatly increases the heat transfer efficiency, thus allowing the benefits of larger electric fields to be obtained. An increase in separation speed of about an order of magnitude is readily achieved. Thin gels have successfully been used in capillary and slab formats. A detection system has been designed for use with a multiple fluorophore sequencing strategy in horizontal ultrathin slab gels. The system employs laser through-the-side excitation and a cooled CCD detector; this allows for the parallel detection of up to 24 sets of four fluorescently labeled DNA sequencing reactions during their electrophoretic separation in ultrathin (115 micrometers) denaturing polyacrylamide gels. Four hundred bases of sequence information is obtained from 100 ng of M13 template DNA in an hour, corresponding to an

  14. PCR master mixes harbour murine DNA sequences. Caveat emptor!

    Directory of Open Access Journals (Sweden)

    Philip W Tuke

    BACKGROUND: XMRV is the most recently described retrovirus to be found in Man, firstly in patients with prostate cancer (PC) and secondly in 67% of patients with chronic fatigue syndrome (CFS) and 3.7% of controls. Both disease associations remain contentious. Indeed, a recent publication has concluded that "XMRV is unlikely to be a human pathogen". Subsequently related but different polytropic MLV (pMLV) sequences were also reported from the blood of 86.5% of patients with CFS and 6.8% of controls. Consequently we decided to investigate blood donors for evidence of XMRV/pMLV. METHODOLOGY/PRINCIPAL FINDINGS: Testing of cDNA prepared from the whole blood of 80 random blood donors generated gag PCR signals from two samples (7C and 9C). These had previously tested negative for XMRV by two other PCR based techniques. To test whether the PCR mix was the source of these sequences, 88 replicates of water were amplified using Invitrogen Platinum Taq (IPT) and Applied Biosystems Taq Gold LD (ABTG). Four gag sequences (2D, 3F, 7H, 12C) were generated with the IPT, a further sequence (12D) by ABTG re-amplification of an IPT first round product. Sequence comparisons revealed remarkable similarities between these sequences, endogenous MLVs and the pMLV sequences reported in patients with CFS. CONCLUSIONS/SIGNIFICANCE: Methodologies for the detection of viruses highly homologous to endogenous murine viruses require special caution as the very reagents used in the detection process can be a source of contamination and at a level where it is not immediately apparent. It is suggested that such contamination is likely to explain the apparent presence of pMLV in CFS.

  15. Human cellular protein patterns and their link to genome DNA mapping and sequencing data: towards an integrated approach to the study of gene expression

    DEFF Research Database (Denmark)

    Celis, J E; Rasmussen, H H; Leffers, H

    1993-01-01

    two-dimensional gel protein databases will provide an integrated picture of the expression levels and properties of the thousands of protein components of organelles, pathways, and cytoskeletal systems, both under physiological and abnormal conditions, and are expected to lead to the identification...... mapping and sequence information and that offer an integrated approach to the study of gene expression. With the integrated approach offered by two-dimensional gel protein databases it is now possible to reveal phenotype-specific protein(s), to microsequence them, to search for homology with previous...... of new regulatory networks. So far, about 20% (600 out of 2,980) of the total number of proteins recorded in the human keratinocyte protein database have been identified and we are actively gathering qualitative and quantitative biological data on all resolved proteins. Given the current improvements...

  16. Z-DNA-forming sequences generate large-scale deletions in mammalian cells.

    Science.gov (United States)

    Wang, Guliang; Christensen, Laura A; Vasquez, Karen M

    2006-02-21

    Spontaneous chromosomal breakages frequently occur at genomic hot spots in the absence of DNA damage and can result in translocation-related human disease. Chromosomal breakpoints are often mapped near purine-pyrimidine Z-DNA-forming sequences in human tumors. However, it is not known whether Z-DNA plays a role in the generation of these chromosomal breakages. Here, we show that Z-DNA-forming sequences induce high levels of genetic instability in both bacterial and mammalian cells. In mammalian cells, the Z-DNA-forming sequences induce double-strand breaks nearby, resulting in large-scale deletions in 95% of the mutants. These Z-DNA-induced double-strand breaks in mammalian cells are not confined to a specific sequence but rather are dispersed over a 400-bp region, consistent with chromosomal breakpoints in human diseases. This observation is in contrast to the mutations generated in Escherichia coli that are predominantly small deletions within the repeats. We found that the frequency of small deletions is increased by replication in mammalian cell extracts. Surprisingly, the large-scale deletions generated in mammalian cells are, at least in part, replication-independent and are likely initiated by repair processing cleavages surrounding the Z-DNA-forming sequence. These results reveal that mammalian cells process Z-DNA-forming sequences in a strikingly different fashion from that used by bacteria. Our data suggest that Z-DNA-forming sequences may be causative factors for gene translocations found in leukemias and lymphomas and that certain cellular conditions such as active transcription may increase the risk of Z-DNA-related genetic instability.
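
    The "purine-pyrimidine Z-DNA-forming sequences" mentioned above are stretches in which purines (A, G) and pyrimidines (C, T) strictly alternate. The sketch below finds such runs in a sequence; alternation alone is only a crude proxy for Z-DNA propensity, and the minimum run length and function name are assumptions made for illustration.

```python
PURINES = set("AG")

def alternating_runs(seq, min_len=6):
    """Return (start, run) for maximal runs of alternating purine/pyrimidine.

    Any character outside A/G (including N) is treated as a pyrimidine,
    which is a simplification. min_len is an illustrative threshold.
    """
    seq = seq.upper()
    runs = []
    start = 0
    for i in range(1, len(seq) + 1):
        # A run ends at the sequence end or where two neighbours share a class.
        if i == len(seq) or (seq[i] in PURINES) == (seq[i - 1] in PURINES):
            if i - start >= min_len:
                runs.append((start, seq[start:i]))
            start = i
    return runs

print(alternating_runs("AAACGCGCGTTT"))
```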

  17. cDNA sequencing improves the detection of P53 missense mutations in colorectal cancer

    Directory of Open Access Journals (Sweden)

    Jesionek-Kupnicka Dorota

    2009-08-01

    Background: Recently published data showed discrepancies between P53 cDNA and DNA sequencing in glioblastomas. We hypothesised that similar discrepancies may be observed in other human cancers. Methods: To this end, we analyzed 23 colorectal cancers for P53 mutations and gene expression using both DNA and cDNA sequencing, real-time PCR and immunohistochemistry. Results: We found P53 gene mutations in 16 cases (15 missense and 1 nonsense). Two of the 15 cases with missense mutations showed alterations based only on cDNA, and not DNA sequencing. Moreover, in 6 of the 15 cases with a cDNA mutation those mutations were difficult to detect in the DNA sequencing, so the results of DNA analysis alone could be misinterpreted if the cDNA sequencing results had not also been available. In all those 15 cases, we observed a higher ratio of the mutated to the wild type template by cDNA analysis, but not by the DNA analysis. Interestingly, a similar overexpression of P53 mRNA was present in samples with and without P53 mutations. Conclusion: In terms of colorectal cancer, those discrepancies might be explained under three conditions: 1, overexpression of mutated P53 mRNA in cancer cells as compared with normal cells; 2, a higher content of cells without P53 mutation (normal cells and cells showing K-RAS and/or APC but not P53 mutation) in samples presenting P53 mutation; 3, heterozygous or hemizygous mutations of P53 gene. Additionally, for heterozygous mutations unknown mechanism(s) causing selective overproduction of mutated allele should also be considered. Our data offer new clues for studying discrepancy in P53 cDNA and DNA sequencing analysis.

  18. Discovering motifs in ranked lists of DNA sequences.

    Directory of Open Access Journals (Sweden)

    Eran Eden

    2007-03-01

    Computational methods for discovery of sequence elements that are enriched in a target set compared with a background set are fundamental in molecular biology research. One example is the discovery of transcription factor binding motifs that are inferred from ChIP-chip (chromatin immuno-precipitation on a microarray) measurements. Several major challenges in sequence motif discovery still require consideration: (i) the need for a principled approach to partitioning the data into target and background sets; (ii) the lack of rigorous models and of an exact p-value for measuring motif enrichment; (iii) the need for an appropriate framework for accounting for motif multiplicity; (iv) the tendency, in many of the existing methods, to report presumably significant motifs even when applied to randomly generated data. In this paper we present a statistical framework for discovering enriched sequence elements in ranked lists that resolves these four issues. We demonstrate the implementation of this framework in a software application, termed DRIM (discovery of rank imbalanced motifs), which identifies sequence motifs in lists of ranked DNA sequences. We applied DRIM to ChIP-chip and CpG methylation data and obtained the following results. (i) Identification of 50 novel putative transcription factor (TF) binding sites in yeast ChIP-chip data. The biological function of some of them was further investigated to gain new insights on transcription regulation networks in yeast. For example, our discoveries enable the elucidation of the network of the TF ARO80. Another finding concerns a systematic TF binding enhancement to sequences containing CA repeats. (ii) Discovery of novel motifs in human cancer CpG methylation data. Remarkably, most of these motifs are similar to DNA sequence elements bound by the Polycomb complex that promotes histone methylation. Our findings thus support a model in which histone methylation and CpG methylation are mechanistically linked

  19. A MapReduce Framework for DNA Sequencing Data Processing

    Directory of Open Access Journals (Sweden)

    Samy Ghoneimy

    2016-12-01

    Genomics and Next Generation Sequencers (NGS) like Illumina HiSeq produce data in the order of 200 billion base pairs in a single one-week run for a 60x human genome coverage, which requires modern high-throughput experimental technologies that can only be tackled with high performance computing (HPC) and specialized software algorithms called “short read aligners”. This paper focuses on the implementation of the DNA sequencing as a set of MapReduce programs that will accept a DNA data set as a FASTQ file and finally generate a VCF (variant call format) file, which has variants for a given DNA data set. In this paper MapReduce/Hadoop along with Burrows-Wheeler Aligner (BWA) and Sequence Alignment/Map (SAM) tools are fully utilized to provide various utilities for manipulating alignments, including sorting, merging, indexing, and generating alignments. The Map-Sort-Reduce process is designed to be suited for a Hadoop framework in which each cluster is a traditional N-node Hadoop cluster to utilize all of the Hadoop features like HDFS, program management and fault tolerance. The Map step performs multiple instances of the short read alignment algorithm (Bowtie) that run in parallel in Hadoop. The ordered list of the sequence reads is used as input tuples and the output tuples are the alignments of the short reads. In the Reduce step many parallel instances of the Short Oligonucleotide Analysis Package for SNP (SOAPsnp) algorithm run in the cluster. Input tuples are sorted alignments for a partition and the output tuples are SNP calls. Results are stored via HDFS, and then archived in SOAPsnp format. The proposed framework enables extremely fast discovery of somatic mutations, inference of population genetic parameters, and association tests directly based on sequencing data without explicit genotyping or linkage-based imputation. It also demonstrates that this method achieves comparable
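
    The Map-Sort-Reduce flow described above can be mimicked in a few lines of plain Python to show how the pieces fit together. This is a toy under loud assumptions: a brute-force fewest-mismatch search stands in for the short read aligner, a majority vote over the pileup stands in for the SNP caller, everything runs in one process with no Hadoop or HDFS, and the reference, reads and function names are invented.

```python
from collections import Counter, defaultdict

def map_align(read, reference):
    """Map step (toy): place the read at the offset with fewest mismatches."""
    def cost(pos):
        window = reference[pos:pos + len(read)]
        return sum(a != b for a, b in zip(read, window))
    best = min(range(len(reference) - len(read) + 1), key=cost)
    return best, read

def reduce_call_snps(sorted_alignments, reference):
    """Reduce step (toy): report positions whose majority base differs."""
    pileup = defaultdict(Counter)
    for pos, read in sorted_alignments:
        for offset, base in enumerate(read):
            pileup[pos + offset][base] += 1
    snps = []
    for pos in sorted(pileup):
        base, _count = pileup[pos].most_common(1)[0]
        if base != reference[pos]:
            snps.append((pos, reference[pos], base))
    return snps

# Invented reference and reads; the sample carries an A->T change at index 10.
reference = "ACGTACGGTCATTGCA"
reads = ["TCTTTGCA", "ACGTACGG", "GGTCTTTG", "ACGGTCTT"]

alignments = [map_align(r, reference) for r in reads]  # map
alignments.sort(key=lambda pair: pair[0])              # sort by position
print(reduce_call_snps(alignments, reference))         # reduce
```

    In a real deployment the sort step is what lets each reducer receive a contiguous genomic partition, so SNP calling can proceed independently per partition.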

  20. Mitochondrial DNA sequence variation in the Anatolian Peninsula (Turkey)

    Indian Academy of Sciences (India)

    Hatice Mergen; Reyhan Öner; Cihan Öner

    2004-04-01

    Throughout human history, the region known today as the Anatolian peninsula (Turkey) has served as a junction connecting the Middle East, Europe and Central Asia, and, thus, has been subject to major population movements. The present study is undertaken to obtain information about the distribution of the existing mitochondrial D-loop sequence variations in the Turkish population of Anatolia. A few studies have previously reported mtDNA sequences in Turks. We attempted to extend these results by analysing a cohort that is not only larger, but also more representative of the Turkish population living in Anatolia. In order to obtain a descriptive picture for the phylogenetic distribution of the mitochondrial genome within Turkey, we analysed mitochondrial D-loop region sequence variations in 75 individuals from different parts of Anatolia by direct sequencing. Analysis of the two hypervariable segments within the noncoding region of the mitochondrial genome revealed the existence of 81 nucleotide mutations at 79 sites. The neighbour-joining tree of Kimura’s distance matrix has revealed the presence of six main clusters, of which H and U are the most common. The data obtained are also compared with several European and Turkic Central Asian populations.
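
    The abstract refers to "Kimura's distance matrix" without naming the variant. Assuming the widely used Kimura two-parameter (K2P) model, the distance between two aligned sequences is d = -(1/2) ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitional and transversional differences. The sketch below computes it for a pair of sequences; the function name and the toy sequences are hypothetical, and gaps or ambiguous bases are not handled.

```python
import math

# Transitions are purine<->purine or pyrimidine<->pyrimidine changes.
TRANSITIONS = {frozenset("AG"), frozenset("CT")}

def k2p_distance(a, b):
    """Kimura two-parameter distance between two aligned DNA sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    n = len(a)
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper()) if x != y]
    p = sum(frozenset(pair) in TRANSITIONS for pair in pairs) / n      # transitions
    q = sum(frozenset(pair) not in TRANSITIONS for pair in pairs) / n  # transversions
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))

# One transition (A<->G) in ten sites: P = 0.1, Q = 0.
print(k2p_distance("AAAAAAAAAA", "GAAAAAAAAA"))
```

    A matrix of such pairwise values is the usual input to neighbour-joining tree construction.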

  1. Quantification of human mitochondrial DNA using synthesized DNA standards.

    Science.gov (United States)

    Kavlick, Mark F; Lawrence, Helen S; Merritt, R Travis; Fisher, Constance; Isenberg, Alice; Robertson, James M; Budowle, Bruce

    2011-11-01

    Successful mitochondrial DNA (mtDNA) forensic analysis depends on sufficient quantity and quality of mtDNA. A real-time quantitative PCR assay was developed to assess such characteristics in a DNA sample, which utilizes a duplex, synthetic DNA to ensure optimal quality assurance and quality control. The assay's 105-base pair target sequence facilitates amplification of degraded DNA and is minimally homologous to nonhuman mtDNA. The primers and probe hybridize to a region that has relatively few sequence polymorphisms. The assay can also identify the presence of PCR inhibitors and thus indicate the need for sample repurification. The results show that the assay provides information down to 10 copies and provides a dynamic range spanning seven orders of magnitude. Additional experiments demonstrated that as few as 300 mtDNA copies resulted in successful hypervariable region amplification, information that permits sample conservation and optimized downstream PCR testing. The assay described is rapid, reliable, and robust.

  2. Code domains in tandem repetitive DNA sequence structures.

    Science.gov (United States)

    Vogt, P

    1992-10-01

    Traditionally, many people doing research in molecular biology attribute coding properties to a given DNA sequence if this sequence contains an open reading frame for translation into a sequence of amino acids. This protein coding capability of DNA was detected about 30 years ago. The underlying genetic code is highly conserved and present in every biological species studied so far. Today, it is obvious that DNA has a much larger coding potential for other important tasks. Apart from coding for specific RNA molecules such as rRNA, snRNA and tRNA molecules, specific structural and sequence patterns of the DNA chain itself express distinct codes for the regulation and expression of its genetic activity. A chromatin code has been defined for phasing of the histone-octamer protein complex in the nucleosome. A translation frame code has been shown to exist that determines correct triplet counting at the ribosome during protein synthesis. A loop code seems to organize the single stranded interaction of the nascent RNA chain with proteins during the splicing process, and a splicing code phases successive 5' and 3' splicing sites. Most of these DNA codes are not exclusively based on the primary DNA sequence itself, but also seem to include specific features of the corresponding higher order structures. Based on the view that these various DNA codes are genetically instructive for specific molecular interactions or processes, important in the nucleus during interphase and during cell division, the coding capability of tandem repetitive DNA sequences has recently been reconsidered.

  3. ProteDNA: a sequence-based predictor of sequence-specific DNA-binding residues in transcription factors.

    Science.gov (United States)

    Chu, Wen-Yi; Huang, Yu-Feng; Huang, Chun-Chin; Cheng, Yi-Sheng; Huang, Chien-Kang; Oyang, Yen-Jen

    2009-07-01

    This article presents the design of a sequence-based predictor named ProteDNA for identifying the sequence-specific binding residues in a transcription factor (TF). Concerning protein-DNA interactions, there are two types of binding mechanisms involved, namely sequence-specific binding and nonspecific binding. Sequence-specific bindings occur between protein sidechains and nucleotide bases and correspond to sequence-specific recognition of genes. Therefore, sequence-specific bindings are essential for correct gene regulation. In this respect, ProteDNA is distinctive since it has been designed to identify sequence-specific binding residues. In order to accommodate users with different application needs, ProteDNA has been designed to operate under two modes, namely, the high-precision mode and the balanced mode. According to the experiments reported in this article, under the high-precision mode, ProteDNA has been able to deliver precision of 82.3%, specificity of 99.3%, sensitivity of 49.8% and accuracy of 96.5%. Meanwhile, under the balanced mode, ProteDNA has been able to deliver precision of 60.8%, specificity of 97.6%, sensitivity of 60.7% and accuracy of 95.4%. ProteDNA is available at the following websites: http://protedna.csbb.ntu.edu.tw/, http://protedna.csie.ntu.edu.tw/, http://bio222.esoe.ntu.edu.tw/ProteDNA/.
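
    The four figures quoted for each mode are standard confusion-matrix metrics. The sketch below shows how they relate, using hypothetical residue counts that are not taken from the article. It also illustrates why accuracy and specificity can stay high while sensitivity is modest: non-binding residues greatly outnumber binding residues.

```python
def binary_metrics(tp, fp, tn, fn):
    """Return precision, specificity, sensitivity and accuracy
    from true/false positive and negative counts."""
    return {
        "precision": tp / (tp + fp),      # predicted binders that are real
        "specificity": tn / (tn + fp),    # non-binders correctly rejected
        "sensitivity": tp / (tp + fn),    # real binders that were found
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }

# Hypothetical counts: 90 true binding residues among 1000 residues.
m = binary_metrics(tp=50, fp=10, tn=900, fn=40)
```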

  4. A DNA-binding protein factor in K562 nuclear extract interacts with positive control region (PCR) in the 5'-flanking sequence of human β-globin gene

    Institute of Scientific and Technical Information of China (English)

    HU YULONG; YADI CHEN; TONG SUN; RUOLAN QIAN

    1993-01-01

    It has been known that there are at least three regulatory regions (NCR1, NCR2 and PCR) in the 5'-flanking sequence (from -610 bp to +1 bp) of the human β-globin gene and that the function of PCR is unique to human erythroleukemia (K562) cells. Here we have detected a DNA-binding protein factor (termed NFEa) in K562 cells, which can bind specifically to the PCR of the human β-globin gene. The sequence of the binding site is 5'ACTGATG3' (between -222 bp and -216 bp). NFEa is erythroid-specific and perhaps specific for K562 cells. A competition assay suggested that this factor differs from the erythroid-specific transcriptional factor NFE-1. The presence of NFEa further supports the view that the function of the cis-acting element PCR is specific for K562 cells, and helps us to understand the mechanism regulating expression of the human β-globin gene in human K562 cells.

  5. DNA Shape Dominates Sequence Affinity in Nucleosome Formation

    Science.gov (United States)

    Freeman, Gordon S.; Lequieu, Joshua P.; Hinckley, Daniel M.; Whitmer, Jonathan K.; de Pablo, Juan J.

    2014-10-01

    Nucleosomes provide the basic unit of compaction in eukaryotic genomes, and the mechanisms that dictate their position at specific locations along a DNA sequence are of central importance to genetics. In this Letter, we employ molecular models of DNA and proteins to elucidate various aspects of nucleosome positioning. In particular, we show how DNA's histone affinity is encoded in its sequence-dependent shape, including subtle deviations from the ideal straight B-DNA form and local variations of minor groove width. By relying on high-precision simulations of the free energy of nucleosome complexes, we also demonstrate that, depending on DNA's intrinsic curvature, histone binding can be dominated by bending interactions or electrostatic interactions. More generally, the results presented here explain how sequence, manifested as the shape of the DNA molecule, dominates molecular recognition in the problem of nucleosome positioning.

  6. Next Generation Sequencing of Ancient DNA: Requirements, Strategies and Perspectives

    Directory of Open Access Journals (Sweden)

    Michael Knapp

    2010-07-01

    Full Text Available The invention of next-generation sequencing has revolutionized almost all fields of genetics, but few have profited from it as much as the field of ancient DNA research. From its beginnings as an interesting but rather marginal discipline, ancient DNA research is now on its way into the centre of evolutionary biology. In less than a year from its invention, next-generation sequencing had increased the amount of DNA sequence data available from extinct organisms by several orders of magnitude. Ancient DNA research is now not only adding a temporal aspect to evolutionary studies and allowing for the observation of evolution in real time, it also provides important data to help understand the origins of our own species. Here we review progress that has been made in next-generation sequencing of ancient DNA over the past five years and evaluate sequencing strategies and future directions.

  7. DNA methylation profiling of human chromosomes 6, 20 and 22

    OpenAIRE

    Eckhardt, Florian; Lewin, Joern; Cortese, Rene; Rakyan, Vardhman K.; Attwood, John; Burger, Matthias; Burton, John; Cox, Tony V.; Davies, Rob; Down, Thomas A; Haefliger, Carolina; Horton, Roger; Howe, Kevin; Jackson, David K.; Kunde, Jan

    2006-01-01

    DNA methylation constitutes the most stable type of epigenetic modification modulating the transcriptional plasticity of mammalian genomes. Using bisulfite DNA sequencing, we report high-resolution methylation reference profiles of human chromosomes 6, 20 and 22, providing a resource of about 1.9 million CpG methylation values derived from 12 different tissues. Analysis of 6 annotation categories revealed evolutionarily conserved regions to be the predominant sites for differential DNA methyl...

  8. An Optimal Seed Based Compression Algorithm for DNA Sequences

    Directory of Open Access Journals (Sweden)

    Pamela Vinitha Eric

    2016-01-01

    Full Text Available This paper proposes a seed-based lossless compression algorithm for DNA sequences that uses a substitution method similar to the Lempel-Ziv compression scheme. The proposed method exploits the repetition structures inherent in DNA sequences by creating an offline dictionary that contains all such repeats along with the details of their mismatches. By ensuring that only promising mismatches are allowed, the method achieves a compression ratio that is on par with or better than existing lossless DNA sequence compression algorithms.

  9. DNA splice site sequences clustering method for conservativeness analysis

    Institute of Scientific and Technical Information of China (English)

    Quanwei Zhang; Qinke Peng; Tao Xu

    2009-01-01

    DNA sequences near splice sites are remarkably conserved, and many researchers have contributed to the prediction of splice sites. In order to mine the underlying biological knowledge, we analyze the conservativeness of sequences adjacent to DNA splice sites by clustering. Firstly, we propose a clustering method for DNA splice site sequences that is based on DBSCAN and uses four kinds of dissimilarity calculation. Then, we analyze the conservative features of the clustering results and the experimental data set.
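
    As a rough illustration of density-based clustering on short sequences, the sketch below implements a minimal DBSCAN over equal-length strings. The abstract does not name its four dissimilarity measures, so Hamming distance is used here purely as an assumption, and the example sequences, `eps` and `min_pts` values are hypothetical.

```python
def hamming(a, b):
    """Number of mismatching positions between equal-length sequences."""
    return sum(x != y for x, y in zip(a, b))

def dbscan(seqs, eps, min_pts):
    """Minimal DBSCAN. Returns one label per sequence; -1 marks noise."""
    labels = [None] * len(seqs)

    def neighbours(i):
        # Includes i itself, as in the usual DBSCAN definition.
        return [j for j in range(len(seqs))
                if hamming(seqs[i], seqs[j]) <= eps]

    cluster = -1
    for i in range(len(seqs)):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = -1            # provisionally noise
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster   # noise point becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nbrs = neighbours(j)
            if len(nbrs) >= min_pts:  # core point: keep expanding
                queue.extend(nbrs)
    return labels

# Hypothetical 8-base windows around splice sites.
windows = ["AGGTAAGT", "AGGTAAGA", "AGGTGAGT",
           "CTTCCCAG", "CTTCCCAA", "TTTCCCAG", "GCGCGCGC"]
labels = dbscan(windows, eps=1, min_pts=2)
```

    Sequences that fall in the same cluster can then be compared position by position to see which sites are conserved.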

  10. Thermodynamics of sequence-specific binding of PNA to DNA

    DEFF Research Database (Denmark)

    Ratilainen, T; Holmén, A; Tuite, E

    2000-01-01

    For further characterization of the hybridization properties of peptide nucleic acids (PNAs), the thermodynamics of hybridization of mixed sequence PNA-DNA duplexes have been studied. We have characterized the binding of PNA to DNA in terms of binding affinity (perfectly matched duplexes) and seq...

  11. PNA Directed Sequence Addressed Self-Assembly of DNA Nanostructures

    Science.gov (United States)

    Nielsen, Peter E.

    2008-10-01

    Peptide nucleic acids (PNA) can be designed to target duplex DNA with very high sequence specificity and efficiency via various binding modes. We have designed three domain PNA clamps, that bind stably to predefined decameric homopurine targets in large dsDNA molecules and via a third PNA domain sequence specifically recognize another PNA oligomer. We describe how such three domain PNAs have utility for assembling dsDNA grid and clover leaf structures, and in combination with SNAP-tag technology of protein dsDNA structures.

  12. Characteristics of alternating current hopping conductivity in DNA sequences

    Institute of Scientific and Technical Information of China (English)

    Ma Song-Shan; Xu Hui; Wang Huan-You; Guo Rui

    2009-01-01

    This paper presents a model to describe the alternating current (AC) conductivity of DNA sequences, in which DNA is considered as a one-dimensional (1D) disordered system and electrons are transported via hopping between localized states. It finds that the AC conductivity of DNA sequences increases as the frequency of the external electric field rises, and that it takes the form $\sigma_{ac}(\omega) \sim \omega^{2} \ln^{2}(1/\omega)$. The AC conductivity of DNA sequences also increases with temperature, although the temperature dependence is weak. Meanwhile, at low temperatures the AC conductivity in the off-diagonally correlated case is much larger than that in the uncorrelated case of the Anderson limit, which indicates that the off-diagonal correlations in DNA sequences have a great effect on the AC conductivity, while at high temperature the off-diagonal correlations no longer play a vital role in electric transport. In addition, the proportion of nucleotide pairs p also plays an important role in AC electron transport in DNA sequences. For p < 0.5, the conductivity of a DNA sequence decreases as p increases, while for p > 0.5, the conductivity increases as p increases.
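
    The frequency scaling quoted in this abstract can be evaluated numerically. The sketch below is only an illustration: the abstract gives a proportionality, so the prefactor is set to 1, and the frequency is treated as a dimensionless quantity between 0 and 1, both of which are assumptions.

```python
import math

def sigma_ac(omega):
    """Scaling form omega^2 * ln^2(1/omega), with unit prefactor.

    omega is a dimensionless frequency with 0 < omega < 1 (assumed).
    """
    if not 0 < omega < 1:
        raise ValueError("omega must lie strictly between 0 and 1")
    return omega ** 2 * math.log(1 / omega) ** 2

# The scaling form grows with frequency in the low-frequency regime.
values = [sigma_ac(w) for w in (0.001, 0.01, 0.1)]
```

    Differentiating the expression shows that it rises monotonically only for omega below 1/e, so the form is best read as a low-frequency law.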

  13. Structural biology of disease-associated repetitive DNA sequences and protein-DNA complexes involved in DNA damage and repair

    Energy Technology Data Exchange (ETDEWEB)

    Gupta, G.; Santhana Mariappan, S.V.; Chen, X.; Catasti, P.; Silks, L.A. III; Moyzis, R.K.; Bradbury, E.M.; Garcia, A.E.

    1997-07-01

    This project is aimed at formulating the sequence-structure-function correlations of various microsatellites in the human (and other eukaryotic) genomes. Here the authors have been able to develop and apply structural biology tools to understand the following: the molecular mechanism of length polymorphism in microsatellites; the molecular mechanism by which microsatellites in noncoding regions alter the regulation of the associated gene; and finally, the molecular mechanism by which the expansion of these microsatellites impairs gene expression and causes disease. Their multidisciplinary structural biology approach is quantitative and can be applied to all coding and noncoding DNA sequences associated with any gene. Both NIH and DOE are interested in developing quantitative tools for understanding the function of various human genes, for the prevention of diseases caused by genetic and environmental effects.

  14. Sequence-specific activation of the DNA sensor cGAS by Y-form DNA structures as found in primary HIV-1 cDNA.

    Science.gov (United States)

    Herzner, Anna-Maria; Hagmann, Cristina Amparo; Goldeck, Marion; Wolter, Steven; Kübler, Kirsten; Wittmann, Sabine; Gramberg, Thomas; Andreeva, Liudmila; Hopfner, Karl-Peter; Mertens, Christina; Zillinger, Thomas; Jin, Tengchuan; Xiao, Tsan Sam; Bartok, Eva; Coch, Christoph; Ackermann, Damian; Hornung, Veit; Ludwig, Janos; Barchet, Winfried; Hartmann, Gunther; Schlee, Martin

    2015-10-01

    Cytosolic DNA that emerges during infection with a retrovirus or DNA virus triggers antiviral type I interferon responses. So far, only double-stranded DNA (dsDNA) over 40 base pairs (bp) in length has been considered immunostimulatory. Here we found that unpaired DNA nucleotides flanking short base-paired DNA stretches, as in stem-loop structures of single-stranded DNA (ssDNA) derived from human immunodeficiency virus type 1 (HIV-1), activated the type I interferon-inducing DNA sensor cGAS in a sequence-dependent manner. DNA structures containing unpaired guanosines flanking short (12- to 20-bp) dsDNA (Y-form DNA) were highly stimulatory and specifically enhanced the enzymatic activity of cGAS. Furthermore, we found that primary HIV-1 reverse transcripts represented the predominant viral cytosolic DNA species during early infection of macrophages and that these ssDNAs were highly immunostimulatory. Collectively, our study identifies unpaired guanosines in Y-form DNA as a highly active, minimal cGAS recognition motif that enables detection of HIV-1 ssDNA.

  15. ATRF Houses the Latest DNA Sequencing Technologies | Poster

    Science.gov (United States)

    By Ashley DeVine, Staff Writer. By the end of October, the Advanced Technology Research Facility (ATRF) will be one of the few facilities in the world to house all of the latest DNA sequencing technologies.

  16. Sequence and expression analysis of gaps in human chromosome 20

    DEFF Research Database (Denmark)

    Minocherhomji, Sheroy; Seemann, Stefan; Mang, Yuan;

    2012-01-01

    The finished human genome assemblies comprise several hundred un-sequenced euchromatic gaps, which may be rich in long polypurine/polypyrimidine stretches. Human chromosome 20 (chr 20) currently has three unfinished gaps remaining on its q-arm. All three gaps are within gene-dense regions and/or overlap disease-associated loci, including the DLGAP4 locus. In this study, we sequenced ~99% of all three unfinished gaps on human chr 20, determined their complete genomic sizes and assessed epigenetic profiles using a combination of Sanger sequencing, mate pair paired-end high-throughput sequencing and chromatin, methylation and expression analyses. We found histone 3 trimethylated at Lysine 27 to be distributed across all three gaps in immortalized B-lymphocytes. In one gap, five novel CpG islands were predominantly hypermethylated in genomic DNA from peripheral blood lymphocytes and human cerebellum...

  17. Neural network predicts sequence of TP53 gene based on DNA chip

    DEFF Research Database (Denmark)

    Spicker, J.S.; Wikman, F.; Lu, M.L.;

    2002-01-01

    We have trained an artificial neural network to predict the sequence of the human TP53 tumor suppressor gene based on a p53 GeneChip. The trained neural network uses as input the fluorescence intensities of DNA hybridized to oligonucleotides on the surface of the chip and makes between zero and four errors in the predicted 1300 bp sequence when tested on wild-type TP53 sequence.

  18. [cDNA cloning and sequence analysis of pluripotency genes in tree shrews (Tupaia belangeri)].

    Science.gov (United States)

    Wang, Cai-Yun; Ma, Yun-Han; He, Da-Jian; Yang, Shi-Hua

    2013-04-01

    In this paper, partial sequences of the tree shrew (Tupaia belangeri) Klf4, Sox2, and c-Myc genes were cloned and sequenced; they were 382, 612, and 485 bp in length and encoded 127, 204, and 161 amino acids, respectively. Their cDNA sequence identities with the corresponding human sequences were 89%, 98%, and 89%, respectively. Their phylogenetic trees showed different topologies, suggesting separate evolutionary pathways. These results can facilitate further functional studies.
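
    The reported amino-acid counts follow from the fragment lengths, since each codon is three bases. The short check below assumes, for illustration only, that the reading frame starts at the first base of each partial fragment; the helper name is hypothetical.

```python
def codons_and_remainder(length_bp):
    """Return (complete codons, leftover bases) for a fragment length."""
    return divmod(length_bp, 3)

# Fragment lengths as reported in the abstract.
fragments = {"Klf4": 382, "Sox2": 612, "c-Myc": 485}
summary = {gene: codons_and_remainder(bp) for gene, bp in fragments.items()}
```

    The leftover bases for Klf4 and c-Myc are consistent with these being partial rather than complete coding sequences.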

  19. Rapid detection and purification of sequence specific DNA binding proteins using magnetic separation

    Directory of Open Access Journals (Sweden)

    TIJANA SAVIC

    2006-02-01

    Full Text Available In this paper, a method for the rapid identification and purification of sequence-specific DNA binding proteins based on magnetic separation is presented. This method was applied to confirm the binding of the human recombinant USF1 protein to its putative binding site (E-box) within the human SOX3 promoter. It has been shown that biotinylated DNA attached to streptavidin magnetic particles specifically binds the USF1 protein in the presence of competitor DNA. It has also been demonstrated that the protein could be successfully eluted from the beads, in high yield and with restored DNA binding activity. The advantage of these procedures is that they could be applied for the identification and purification of any high-affinity sequence-specific DNA binding protein with only minor modifications.

  20. Nonneutral mitochondrial DNA variation in humans and chimpanzees

    Energy Technology Data Exchange (ETDEWEB)

    Nachman, M.W.; Aquadro, C.F. [Cornell Univ., Ithaca, NY (United States); Brown, W.M. [Univ. of Michigan, Ann Arbor, MI (United States)] [and others

    1996-03-01

    We sequenced the NADH dehydrogenase subunit 3 (ND3) gene from a sample of 61 humans, five common chimpanzees, and one gorilla to test whether patterns of mitochondrial DNA (mtDNA) variation are consistent with a neutral model of molecular evolution. Within humans and within chimpanzees, the ratio of replacement to silent nucleotide substitutions was higher than observed in comparisons between species, contrary to neutral expectations. To test the generality of this result, we reanalyzed published human RFLP data from the entire mitochondrial genome. Gains of restriction sites relative to a known human mtDNA sequence were used to infer unambiguous nucleotide substitutions. We also compared the complete mtDNA sequences of three humans. Both the RFLP data and the sequence data reveal a higher ratio of replacement to silent nucleotide substitutions within humans than is seen between species. This pattern is observed at most or all human mitochondrial genes and is inconsistent with a strictly neutral model. These data suggest that many mitochondrial protein polymorphisms are slightly deleterious, consistent with studies of human mitochondrial diseases. 59 refs., 2 figs., 8 tabs.
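
    The comparison of replacement-to-silent ratios within and between species is often summarised as a neutrality index. The sketch below shows that calculation as one possible summary; the abstract does not state which statistic was used, and the counts are hypothetical, not the ND3 data.

```python
def neutrality_index(pn, ps, dn, ds):
    """Ratio of (replacement/silent) within species to the same ratio
    between species.

    pn, ps: replacement and silent polymorphisms within a species
    dn, ds: replacement and silent fixed differences between species
    A value above 1 means an excess of replacement polymorphism,
    the pattern expected for slightly deleterious variants.
    """
    return (pn / ps) / (dn / ds)

# Hypothetical counts for illustration only.
ni = neutrality_index(pn=10, ps=5, dn=4, ds=16)
```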

  1. Repetitive DNA Sequences in Wheat and Its Relatives

    Institute of Scientific and Technical Information of China (English)

    ZHANG Xue-yong; LI Da-yong

    2001-01-01

    Repetitive DNA sequences form a large portion of eukaryote genomes. Using wheat (Triticum) as a model, the classification, features and functions of repetitive DNA sequences in the Triticeae grass tribe are reviewed, as well as the role of these sequences in genome differentiation and in the control and regulation of homologous chromosome synapsis and pairing. Transposable elements, as an important portion of dispersed repetitive sequences, may play an essential role in gene mutation of the host. Dynamic models for changes in the copy number and sequences of a repetitive family are also presented, following the models of Charlesworth et al. Applications of repetitive DNA sequences in the study of evolution, chromosome fingerprinting, and marker-assisted gene transfer and breeding are described, taking wheat as an example.

  2. Which Are More Random: Coding or Noncoding DNA Sequences?

    Institute of Scientific and Technical Information of China (English)

    WU Fang; ZHENG Wei-Mou

    2002-01-01

    Evidence seems to show that coding DNA is more random than noncoding DNA, but other conflicting evidence also exists. Based on the third-base degeneracy of codons, we regard the third position of codons as a 'noisy' position. By deleting one fixed position of non-overlapping triplets in a given sequence, three masked sequences may be deduced from the sequence. We have investigated the block-to-site mutual information functions of coding and noncoding sequences in yeast without and with the masking. Characteristics that distinguish coding from noncoding DNA have been found. It is observed that the strong correlations in the coding regions may be blocked by the third base of codons, and that proper masking can extract the correlations. The distribution of dimeric tandem repeats in unmasked sequences is also compared with that in masked sequences.
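
    The masking step described in this abstract is straightforward to sketch: delete the same position from every non-overlapping triplet, once for each of the three positions. The function name and the example sequence are hypothetical.

```python
def masked_sequences(seq):
    """Return the three sequences obtained by deleting position 0, 1 or 2
    of every non-overlapping triplet of seq."""
    return [
        "".join(base for i, base in enumerate(seq) if i % 3 != k)
        for k in range(3)
    ]

# Hypothetical in-frame sequence: ATG GCC TAA
masks = masked_sequences("ATGGCCTAA")
```

    When the sequence is read in frame, the last of the three results is the one with the 'noisy' third codon position removed.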

  3. Effects of sequence on DNA wrapping around histones

    Science.gov (United States)

    Ortiz, Vanessa

    2011-03-01

    A central question in biophysics is whether the sequence of a DNA strand affects its mechanical properties. In epigenetics, these are thought to influence nucleosome positioning and gene expression. Theoretical and experimental attempts to answer this question have been hindered by an inability to directly resolve DNA structure and dynamics at the base-pair level. In our previous studies we used a detailed model of DNA to measure the effects of sequence on the stability of naked DNA under bending. Sequence was shown to influence DNA's ability to form kinks, which arise when certain motifs slide past others to form non-native contacts. Here, we have now included histone-DNA interactions to see if the results obtained for naked DNA are transferable to the problem of nucleosome positioning. Different DNA sequences interacting with the histone protein complex are studied, and their equilibrium and mechanical properties are compared among themselves and with the naked case. NLM training grant to the Computation and Informatics in Biology and Medicine Training Program (NLM T15LM007359).

  4. PNA Directed Sequence Addressed Self-Assembly of DNA Nanostructures

    DEFF Research Database (Denmark)

    Nielsen, Peter E.

    2008-01-01

    sequence specifically recognize another PNA oligomer. We describe how such three domain PNAs have utility for assembling dsDNA grid and clover leaf structures, and in combination with SNAP-tag technology of protein dsDNA structures. (c) 2008 American Institute of Physics.

  6. Sequence dependence of electron-induced DNA strand breakage revealed by DNA nanoarrays

    DEFF Research Database (Denmark)

    Keller, Adrian; Rackwitz, Jenny; Cauët, Emilie

    2014-01-01

    The electronic structure of DNA is determined by its nucleotide sequence, which is for instance exploited in molecular electronics. Here we demonstrate that also the DNA strand breakage induced by low-energy electrons (18 eV) depends on the nucleotide sequence. To determine the absolute cross...

  7. Biometric Authentication Using ElGamal Cryptosystem And DNA Sequence

    Directory of Open Access Journals (Sweden)

    V.SAMUEL SUSAN

    2010-06-01

    Full Text Available Biometrics are automated methods of identifying a person or verifying the identity of a person based on a physiological or behavioral characteristic. Physiological characteristics include hand or finger images, facial characteristics and iris recognition. Behavioral characteristics include dynamic signature verification, speaker verification and keystroke dynamics. DNA is a unique feature among individuals. DNA provides high security level, long term stability, user acceptance and is intrusive. Combining the ElGamal cryptosystem and DNA sequence, a novel biometric authentication scheme is proposed.

  8. Anti-DNA antibodies: Sequencing, cloning, and expression

    Energy Technology Data Exchange (ETDEWEB)

    Barry, M.M.

    1992-01-01

    To gain some insight into the mechanism of systemic lupus erythematosus (SLE), and into the interactions involved in proteins binding to DNA, four anti-DNA antibodies have been investigated. Two of the antibodies, Hed 10 and Jel 242, have previously been prepared from female NZB/NZW mice, which develop an autoimmune disease resembling human SLE. The remaining two antibodies, Jel 72 and Jel 318, have previously been produced via immunization of C57BL/6 mice. The isotypes of the four antibodies investigated in this thesis were determined by an enzyme-linked immunosorbent assay. All four antibodies contained κ light chains and γ2a heavy chains, except Jel 318, which contains a γ2b heavy chain. The complete variable regions of the heavy and light chains of these four antibodies were sequenced from their respective mRNAs. The gene segments and variable gene families expressed in each antibody were identified. Analysis of the genes used in the autoimmune anti-DNA antibodies and those produced by immunization indicated no obvious differences to account for their different origins. Examination of the amino acid residues present in the complementarity-determining regions of these four antibodies indicates a preference for aromatic amino acids. Jel 72 and Jel 242 contain three arginine residues in the third complementarity-determining region. A single-chain Fv and the variable region of the heavy chain of Hed 10 were expressed in Escherichia coli. Expression resulted in the production of a 26,000 M_r protein and a 15,000 M_r protein. An immunoblot indicated that the 26,000 M_r protein was the Fv for Hed 10, while the 15,000 M_r protein was shown to bind poly(dT). The contribution of the heavy chain to DNA binding was assessed.

  9. Protein sequence for clustering DNA based on Artificial Neural Networks

    Directory of Open Access Journals (Sweden)

    Gamal F. Elhadi

    2012-01-01

    Full Text Available DNA is a nucleic acid that contains the genetic instructions used in the development and functioning of all known living organisms and some viruses. Clustering is a process that groups a set of objects into clusters so that the similarity among objects in the same cluster is high, while that among objects in different clusters is low. In this paper, we propose an approach for clustering DNA sequences using the Self-Organizing Map (SOM) algorithm and protein sequence. The main objective is to analyze biological data and to group DNA into clusters more easily and efficiently. We use the proposed approach to analyze both large and small numbers of input DNA sequences. The results show that the similarity of the sequences does not depend on the number of input sequences. Our approach evaluates the degree of similarity between DNA sequences using a hierarchical dendrogram representation. Representing a large amount of data as a hierarchical tree makes it possible to compare large sequences efficiently.

  10. Sequencing and Analysis of Neanderthal Genomic DNA

    OpenAIRE

    Noonan, James P.; Coop, Graham; Kudaravalli, Sridhar; Smith, Doug; Krause, Johannes; Alessi, Joe; Chen, Feng; Platt, Darren; Paabo, Svante; Pritchard, Jonathan K; Rubin, Edward M.

    2006-01-01

    Our knowledge of Neanderthals is based on a limited number of remains and artifacts from which we must make inferences about their biology, behavior, and relationship to ourselves. Here, we describe the characterization of these extinct hominids from a new perspective, based on the development of a Neanderthal metagenomic library and its high-throughput sequencing and analysis. Several lines of evidence indicate that the 65,250 base pairs of hominid sequence so far identified in the library a...

  11. DNA fingerprinting, DNA barcoding, and next generation sequencing technology in plants.

    Science.gov (United States)

    Sucher, Nikolaus J; Hennell, James R; Carles, Maria C

    2012-01-01

    DNA fingerprinting of plants has become an invaluable tool in forensic, scientific, and industrial laboratories all over the world. PCR has become part of virtually every variation of the plethora of approaches used for DNA fingerprinting today. DNA sequencing is increasingly used either in combination with or as a replacement for traditional DNA fingerprinting techniques. A prime example is the use of short, standardized regions of the genome as taxon barcodes for biological identification of plants. Rapid advances in "next generation sequencing" (NGS) technology are driving down the cost of sequencing and bringing large-scale sequencing projects into the reach of individual investigators. We present an overview of recent publications that demonstrate the use of "NGS" technology for DNA fingerprinting and DNA barcoding applications.

  12. Retrieval of human DNA from rodent-human genomic libraries by a recombination process.

    Science.gov (United States)

    Neve, R L; Bruns, G A; Dryja, T P; Kurnit, D M

    1983-09-01

    Human Alu repeat ("BLUR") sequences have been cloned into the mini-plasmid vector piVX. The resulting piBLUR clones have been used to rescue selectively, by recombination, bacteriophage carrying human DNA sequences from genomic libraries constructed using DNA from rodent-human somatic cell hybrids. piBLUR clones are able to retrieve human clones from such libraries because at least one Alu family repeat is present on most 15 to 20 kb fragments of human DNA and because of the relative species-specificity of the sequences comprising the Alu family. The rapid, selective plaque purification achieved results in the construction of a collection of recombinant phage carrying diverse human DNA inserts from a specific subset of the human karyotype. Subfragments of two recombinants rescued from a mouse-human somatic cell hybrid containing human chromosomes X, 10, 13, and 22 were mapped to human chromosomes X and 13, respectively, demonstrating the utility of this protocol for the isolation of human chromosome-specific DNA sequences from appropriate somatic cell hybrids.

  13. Sequence dependence of electron-induced DNA strand breakage revealed by DNA nanoarrays

    DEFF Research Database (Denmark)

    Keller, Adrian; Rackwitz, Jenny; Cauët, Emilie;

    2014-01-01

    sections for electron induced single strand breaks in specific 13 mer oligonucleotides we used atomic force microscopy analysis of DNA origami based DNA nanoarrays. We investigated the DNA sequences 5'-TT(XYX)3TT with X = A, G, C and Y = T, BrU 5-bromouracil and found absolute strand break cross sections...
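    The oligonucleotide notation 5'-TT(XYX)3TT above can be expanded mechanically: two flanking T residues, three copies of the triplet XYX, and two more T residues, giving 2 + 9 + 2 = 13 residues. The following Python sketch enumerates the combinations of X and Y named in the abstract. The function name `expand_oligo` is hypothetical, and the assumption that every X/Y combination was actually studied is ours, not the abstract's.

```python
from itertools import product

def expand_oligo(x: str, y: str) -> list[str]:
    """Return the residues of 5'-TT(XYX)3TT as a list.

    A list is used instead of a string because BrU (5-bromouracil)
    is written with more than one character.
    """
    return ["T", "T"] + [x, y, x] * 3 + ["T", "T"]

# Choices for X and Y as listed in the abstract.
X_CHOICES = ["A", "G", "C"]
Y_CHOICES = ["T", "BrU"]

# Assumption: all X/Y combinations are enumerated for illustration only.
oligos = {(x, y): expand_oligo(x, y) for x, y in product(X_CHOICES, Y_CHOICES)}

for (x, y), residues in oligos.items():
    print(f"X={x}, Y={y}: 5'-{'-'.join(residues)}-3' ({len(residues)} nt)")
```

    Each expanded sequence has 13 residues, which matches the "13 mer" stated in the abstract.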

  14. In vitro nucleosome positioning features of DNA repeats sequence associated with human genetic disease

    Institute of Scientific and Technical Information of China (English)

    柴荣; 赵宏宇; 蔡禄

    2013-01-01

    Objective: To investigate the in vitro nucleosome positioning of DNA repeat sequences that can cause human genetic disease. Methods: Recombinant plasmids containing (GAA)42, (ATTCT)43, (GCCT)18 and 601 sequences were constructed. The plasmids were assembled with histone octamers into chromatin in vitro by salt dialysis, and the chromatin structure was analyzed by agarose gel electrophoresis after micrococcal nuclease digestion. Results: The plasmid containing ATTCT repeats formed nucleosomes more readily in vitro than the plasmid containing GAA repeats. Conclusions: Differences in the nucleosome-forming ability of the inserted repeat fragments affect the local chromatin structure of the recombinant plasmids.

  15. Algorithms for mapping high-throughput DNA sequences

    DEFF Research Database (Denmark)

    Frellsen, Jes; Menzel, Peter; Krogh, Anders

    2014-01-01

    Abstract High-throughput sequencing (HTS) technologies revolutionized the field of molecular biology by enabling large scale whole genome sequencing as well as a broad range of experiments for studying the cell's inner workings directly on DNA or RNA level. Given the dramatically increased rate...

  16. Ancient DNA sequence revealed by error-correcting codes.

    Science.gov (United States)

    Brandão, Marcelo M; Spoladore, Larissa; Faria, Luzinete C B; Rocha, Andréa S L; Silva-Filho, Marcio C; Palazzo, Reginaldo

    2015-07-10

    A previously described DNA sequence generator algorithm (DNA-SGA) using error-correcting codes has been employed as a computational tool to address the evolutionary pathway of the genetic code. The code-generated sequence alignment demonstrated that a residue mutation revealed by the code can be found in the same position in sequences of distantly related taxa. Furthermore, the code-generated sequences do not promote amino acid changes in the deviant genomes through codon reassignment. A Bayesian evolutionary analysis of both code-generated and homologous sequences of the Arabidopsis thaliana malate dehydrogenase gene indicates an approximately 1 MYA divergence time from the MDH code-generated sequence node to its paralogous sequences. The DNA-SGA helps to determine the plesiomorphic state of DNA sequences because a single nucleotide alteration often occurs in distantly related taxa and can be found in the alternative codon patterns of noncanonical genetic codes. As a consequence, the algorithm may reveal an earlier stage of the evolution of the standard code.

  17. Complete Genome Sequence of Human Respiratory Syncytial Virus from Lanzhou, China

    OpenAIRE

    Zhu, Chuanfeng; Fu, Shengfang; Zhou, Xv; Yu, Li

    2017-01-01

    ABSTRACT A complete genome of human respiratory syncytial virus was sequenced and analyzed. Phylogenetic analysis showed that the full-length human respiratory syncytial virus (HRSV) genome sequence belongs to gene type NA1. We sequenced the genome in order to create the full-length cDNA infectious clone and develop vaccines against HRSV.

  18. Sequencing of chloroplast genome using whole cellular DNA and Solexa sequencing technology

    Directory of Open Access Journals (Sweden)

    Jian Wu

    2012-11-01

    Full Text Available Sequencing of the chloroplast genome using traditional sequencing methods has been difficult because of its size (>120 kb) and the complicated procedures required to prepare templates. To explore the feasibility of sequencing the chloroplast genome using DNA extracted from whole cells and Solexa sequencing technology, we sequenced whole cellular DNA isolated from leaves of three Brassica rapa accessions with one lane per accession. In total, 246 Mb, 362 Mb and 361 Mb of sequence data were generated for the three accessions Chiifu-401-42, Z16 and FT, respectively. Microreads were assembled by reference-guided assembly using the cpDNA sequences of B. rapa, Arabidopsis thaliana, and Nicotiana tabacum. We achieved coverage of more than 99.96% of the cp genome in the three tested accessions using the B. rapa sequence as the reference. When A. thaliana or N. tabacum sequences were used as references, 99.7–99.8% or 95.5–99.7% of the B. rapa chloroplast genome was covered, respectively. These results demonstrated that sequencing of whole cellular DNA isolated from young leaves using the Illumina Genome Analyzer is an efficient method for high-throughput sequencing of the chloroplast genome.

  19. DNA methylation and healthy human aging.

    Science.gov (United States)

    Jones, Meaghan J; Goodman, Sarah J; Kobor, Michael S

    2015-12-01

    The process of aging results in a host of changes at the cellular and molecular levels, which include senescence, telomere shortening, and changes in gene expression. Epigenetic patterns also change over the lifespan, suggesting that epigenetic changes may constitute an important component of the aging process. The epigenetic mark that has been most highly studied is DNA methylation, the presence of methyl groups at CpG dinucleotides. These dinucleotides are often located near gene promoters and associate with gene expression levels. Early studies indicated that global levels of DNA methylation increase over the first few years of life and then decrease beginning in late adulthood. Recently, with the advent of microarray and next-generation sequencing technologies, increases in variability of DNA methylation with age have been observed, and a number of site-specific patterns have been identified. It has also been shown that certain CpG sites are highly associated with age, to the extent that prediction models using a small number of these sites can accurately predict the chronological age of the donor. Together, these observations point to the existence of two phenomena that both contribute to age-related DNA methylation changes: epigenetic drift and the epigenetic clock. In this review, we focus on healthy human aging throughout the lifetime and discuss the dynamics of DNA methylation as well as how interactions between the genome, environment, and the epigenome influence aging rates. We also discuss the impact of determining 'epigenetic age' for human health and outline some important caveats to existing and future studies.

  20. Zinc finger recombinases with adaptable DNA sequence specificity.

    Directory of Open Access Journals (Sweden)

    Chris Proudfoot

    Full Text Available Site-specific recombinases have become essential tools in genetics and molecular biology for the precise excision or integration of DNA sequences. However, their utility is currently limited to circumstances where the sites recognized by the recombinase enzyme have been introduced into the DNA being manipulated, or natural 'pseudosites' are already present. Many new applications would become feasible if recombinase activity could be targeted to chosen sequences in natural genomic DNA. Here we demonstrate efficient site-specific recombination at several sequences taken from a 1.9 kilobasepair locus of biotechnological interest (in the bovine β-casein gene), mediated by zinc finger recombinases (ZFRs), chimaeric enzymes with linked zinc finger (DNA recognition) and recombinase (catalytic) domains. In the "Z-sites" tested here, 22 bp casein gene sequences are flanked by 9 bp motifs recognized by zinc finger domains. Asymmetric Z-sites were recombined by the concomitant action of two ZFRs with different zinc finger DNA-binding specificities, and could be recombined with a heterologous site in the presence of a third recombinase. Our results show that engineered ZFRs may be designed to promote site-specific recombination at many natural DNA sequences.

  1. cDNA cloning and sequencing of ostrich Growth hormone

    Directory of Open Access Journals (Sweden)

    Doosti Abbas

    2012-01-01

    Full Text Available In recent years, industrial breeding of ostrich (Struthio camelus) has been widely developed in Iran. Growth hormone (GH) is a peptide hormone that stimulates growth and cell reproduction in different animals. The aim of this study was to clone and sequence the ostrich growth hormone gene in E. coli, done for the first time in Iran. The cDNA that encodes ostrich growth hormone was isolated from total mRNA of the pituitary gland and amplified by RT-PCR using GH-specific PCR primers. Then GH cDNA was cloned by the T/A cloning technique and the construct was transformed into E. coli. Finally, the GH cDNA sequence was submitted to GenBank (Accession number: JN559394). The results of the present study showed that GH cDNA was successfully cloned in E. coli. Sequencing confirmed that GH cDNA was cloned and that the length of ostrich GH cDNA was 672 bp; a BLAST search showed that the sequence of growth hormone cDNA of the ostrich from Iran has 100% homology with other records existing in GenBank.

  2. Polyamide platinum anticancer complexes designed to target specific DNA sequences.

    Science.gov (United States)

    Jaramillo, David; Wheate, Nial J; Ralph, Stephen F; Howard, Warren A; Tor, Yitzhak; Aldrich-Wright, Janice R

    2006-07-24

    Two new platinum complexes, trans-chlorodiammine[N-(2-aminoethyl)-4-[4-(N-methylimidazole-2-carboxamido)-N-methylpyrrole-2-carboxamido]-N-methylpyrrole-2-carboxamide]platinum(II) chloride (DJ1953-2) and trans-chlorodiammine[N-(6-aminohexyl)-4-[4-(N-methylimidazole-2-carboxamido)-N-methylpyrrole-2-carboxamido]-N-methylpyrrole-2-carboxamide]platinum(II) chloride (DJ1953-6), have been synthesized as proof-of-concept molecules in the design of agents that can specifically target genes in DNA. Coordinate covalent binding to DNA was demonstrated with electrospray ionization mass spectrometry. Using circular dichroism, these complexes were found to show greater DNA binding affinity to the target sequence, d(CATTGTCAGAC)(2), than toward either d(GTCTGTCAATG)(2), which contains different flanking sequences, or d(CATTGAGAGAC)(2), which contains a double base pair mismatch sequence. DJ1953-2 unwinds the DNA helix by around 13 degrees, but neither metal complex significantly affects the DNA melting temperature. Unlike simple DNA minor groove binders, DJ1953-2 is able to inhibit, in vitro, RNA synthesis. The cytotoxicity of both metal complexes in the L1210 murine leukaemia cell line was also determined, with DJ1953-6 (34 microM) more active than DJ1953-2 (>50 microM). These results demonstrate the potential of polyamide platinum complexes and provide the structural basis for designer agents that are able to recognize biologically relevant sequences and prevent DNA transcription and replication.

  3. Selective binding of anti-DNA antibodies to native dsDNA fragments of differing sequence.

    Science.gov (United States)

    Uccellini, Melissa B; Busto, Patricia; Debatis, Michelle; Marshak-Rothstein, Ann; Viglianti, Gregory A

    2012-03-30

    Systemic autoimmune diseases are characterized by the development of autoantibodies directed against a limited subset of nuclear antigens, including DNA. DNA-specific B cells take up mammalian DNA through their B cell receptor, and this DNA is subsequently transported to an endosomal compartment where it can potentially engage TLR9. We have previously shown that ssDNA-specific B cells preferentially bind to particular DNA sequences, and antibody specificity for short synthetic oligodeoxynucleotides (ODNs). Since CpG-rich DNA, the ligand for TLR9, is found in low abundance in mammalian DNA, we sought to determine whether antibodies derived from DNA-reactive B cells showed binding preference for CpG-rich native dsDNA, and thereby select immunostimulatory DNA for delivery to TLR9. We examined a panel of anti-DNA antibodies for binding to CpG-rich and CpG-poor DNA fragments. We show that a number of anti-DNA antibodies do show preference for binding to certain native dsDNA fragments of differing sequence, but this does not correlate directly with the presence of CpG dinucleotides. An antibody with preference for binding to a fragment containing optimal CpG motifs was able to promote B cell proliferation to this fragment at 10-fold lower antibody concentrations than an antibody that did not selectively bind to this fragment, indicating that antibody binding preference can influence autoreactive B cell responses.

  4. Sequencing 16S rRNA gene fragments using the PacBio SMRT DNA sequencing system.

    Science.gov (United States)

    Schloss, Patrick D; Jenior, Matthew L; Koumpouras, Charles C; Westcott, Sarah L; Highlander, Sarah K

    2016-01-01

    Over the past 10 years, microbial ecologists have largely abandoned sequencing 16S rRNA genes by the Sanger sequencing method and have instead adopted highly parallelized sequencing platforms. These new platforms, such as 454 and Illumina's MiSeq, have allowed researchers to obtain millions of high quality but short sequences. The result of the added sequencing depth has been significant improvements in experimental design. The tradeoff has been the decline in the number of full-length reference sequences that are deposited into databases. To overcome this problem, we tested the ability of the PacBio Single Molecule, Real-Time (SMRT) DNA sequencing platform to generate sequence reads from the 16S rRNA gene. We generated sequencing data from the V4, V3-V5, V1-V3, V1-V5, V1-V6, and V1-V9 variable regions from within the 16S rRNA gene using DNA from a synthetic mock community and natural samples collected from human feces, mouse feces, and soil. The mock community allowed us to assess the actual sequencing error rate and how that error rate changed when different curation methods were applied. We developed a simple method based on sequence characteristics and quality scores to reduce the observed error rate for the V1-V9 region from 0.69 to 0.027%. This error rate is comparable to what has been observed for the shorter reads generated by 454 and Illumina's MiSeq sequencing platforms. Although the per base sequencing cost is still significantly more than that of MiSeq, the prospect of supplementing reference databases with full-length sequences from organisms below the limit of detection from the Sanger approach is exciting.

  5. Sequencing 16S rRNA gene fragments using the PacBio SMRT DNA sequencing system

    Directory of Open Access Journals (Sweden)

    Patrick D. Schloss

    2016-03-01

    Full Text Available Over the past 10 years, microbial ecologists have largely abandoned sequencing 16S rRNA genes by the Sanger sequencing method and have instead adopted highly parallelized sequencing platforms. These new platforms, such as 454 and Illumina’s MiSeq, have allowed researchers to obtain millions of high quality but short sequences. The result of the added sequencing depth has been significant improvements in experimental design. The tradeoff has been the decline in the number of full-length reference sequences that are deposited into databases. To overcome this problem, we tested the ability of the PacBio Single Molecule, Real-Time (SMRT) DNA sequencing platform to generate sequence reads from the 16S rRNA gene. We generated sequencing data from the V4, V3–V5, V1–V3, V1–V5, V1–V6, and V1–V9 variable regions from within the 16S rRNA gene using DNA from a synthetic mock community and natural samples collected from human feces, mouse feces, and soil. The mock community allowed us to assess the actual sequencing error rate and how that error rate changed when different curation methods were applied. We developed a simple method based on sequence characteristics and quality scores to reduce the observed error rate for the V1–V9 region from 0.69 to 0.027%. This error rate is comparable to what has been observed for the shorter reads generated by 454 and Illumina’s MiSeq sequencing platforms. Although the per base sequencing cost is still significantly more than that of MiSeq, the prospect of supplementing reference databases with full-length sequences from organisms below the limit of detection from the Sanger approach is exciting.

  6. Mitochondrial DNA sequence analysis of two mouse hepatocarcinoma cell lines

    Institute of Scientific and Technical Information of China (English)

    Ji-Gang Dai; Xia Lei; Jia-Xin Min; Guo-Qiang Zhang; Hong Wei

    2005-01-01

    AIM: To study the genetic difference of mitochondrial DNA (mtDNA) between two hepatocarcinoma cell lines (Hca-F and Hca-P) with diverse metastatic characteristics and the relationship between mtDNA changes in cancer cells and their oncogenic phenotype. METHODS: Mitochondrial DNA D-loop, tRNAMet+Glu+Ile and ND3 gene fragments from the hepatocarcinoma cell lines, 1100, 1126 and 534 bp in length respectively, were analysed by PCR amplification and restriction fragment length polymorphism techniques. The D-loop 3' end sequence of the hepatocarcinoma cell lines was determined by sequencing. RESULTS: No amplification fragment length polymorphism or restriction fragment length polymorphism was observed in tRNAMet+Glu+Ile, ND3 and D-loop of mitochondrial DNA of the hepatocarcinoma cells. Sequence differences between Hca-F and Hca-P were found in the mtDNA D-loop. CONCLUSION: Deletion mutations of mitochondrial DNA restriction fragments may not play a significant role in carcinogenesis. The genetic difference of the mtDNA D-loop between Hca-F and Hca-P, which may reflect the environmental and genetic influences during tumor progression, could be linked to their tumorigenic phenotypes.

  7. [Patentability of DNA sequences: the debate remains open].

    Science.gov (United States)

    Martín Uranga, Amelia

    2013-01-01

    The patentability of human genes has been, from the beginning of the discussion concerning the Directive on the legal protection of biotechnological inventions, an issue that provoked debate among politicians, scientists, lawyers and civil society itself. Directive 98/44 tried to settle the matter by stating that, to support the patentability of a human gene, it must be known what role the gene fulfills and which protein it encodes, as an essential requirement for demonstrating its industrial application. However, following the judgment of 13 June 2013 (Supreme Court of the United States of America in the case of Association for Molecular Pathology et al. versus Myriad Genetics Inc.), the debate on this issue has been reopened. There are several issues to be considered, taking into account that patents on DNA and gene sequences have been an important incentive for increasing interest in biotechnology applied to human health. In addition, there has been a paradigm shift in the R&D of biopharmaceutical companies, which has moved from an in-house research model to a model of open innovation, in which large corporations collaborate with biotech SMEs and public and private research centers. This model of innovation affects industrial property, and it will therefore be necessary to define clearly what each party brings to the relationship and how the parties are expected to share the results. All of this has the ultimate goal of giving patients access to the most innovative, safe and effective treatments and medications.

  8. Properties of CENP-B and its target sequence in a satellite DNA

    Energy Technology Data Exchange (ETDEWEB)

    Masumoto, H.; Yoda, K.; Ikeno, M.; Kitagawa, K.; Muro, Y.; Okazaki, T. [Nagoya Univ. (Japan)

    1993-12-31

    The centromere plays an essential role in the proper segregation of eukaryotic chromosomes at mitosis and meiosis. The centromere is the multifunctional domain of the chromosome responsible for sister chromatid association at the inner site and for microtubule attachment at the outer surface. It also acts as a mechanochemical motor for chromosome movement. These multiple centromere functions must, in some way, be directed by a cis-acting DNA sequence located in the centromere region. Indeed, specific centromere DNA sequences (CEN-DNA) were identified in two yeast species. In Saccharomyces cerevisiae, CEN-DNA consists of a roughly 125 bp sequence composed of three conserved elements. In contrast, the centromere sequence of S. pombe is quite different from that of S. cerevisiae in length and sequence organization. The molecular bases for understanding the structure and function of the centromere/kinetochore domain have not been elucidated in higher eukaryotes. In mammalian cells, satellite DNAs are localized in the centromeric heterochromatin or heterochromatic arm. In all human chromosomes, the alpha satellite or alphoid DNA family, a highly repetitive DNA composed of about 170 bp fundamental monomer repeating units, is found at the primary constriction. Its function, however, has not been established.

  9. Palindromic sequence artifacts generated during next generation sequencing library preparation from historic and ancient DNA.

    Directory of Open Access Journals (Sweden)

    Bastiaan Star

    Full Text Available Degradation-specific processes and variation in laboratory protocols can bias the DNA sequence composition from samples of ancient or historic origin. Here, we identify a novel artifact in sequences from historic samples of Atlantic cod (Gadus morhua), which forms interrupted palindromes consisting of reverse complementary sequence at the 5' and 3'-ends of sequencing reads. The palindromic sequences themselves have specific properties: the bases at the 5'-end align well to the reference genome, whereas extensive misalignments exist among the bases at the terminal 3'-end. The terminal 3' bases are artificial extensions likely caused by the occurrence of hairpin loops in single stranded DNA (ssDNA), which can be ligated and amplified in particular library creation protocols. We propose that such hairpin loops allow the inclusion of erroneous nucleotides, specifically at the 3'-end of DNA strands, with the 5'-end of the same strand providing the template. We also find these palindromes in previously published ancient DNA (aDNA) datasets, albeit at varying and substantially lower frequencies. This artifact can negatively affect the yield of endogenous DNA in these types of samples and introduces sequence bias.
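    The artifact described above, a read whose 3'-end is the reverse complement of its own 5'-end with unrelated bases in between, can be screened for with a simple string comparison. The Python sketch below is an illustration only: the function names, the exact-match criterion and the `min_len` cutoff are assumptions and are not taken from the study, which may well have used alignment-based detection that tolerates mismatches.

```python
# Translation table for complementing DNA bases (N stays N).
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def terminal_palindrome_length(read: str, min_len: int = 4) -> int:
    """Length of the longest 3'-terminal stretch that is the exact reverse
    complement of the 5'-terminal stretch of the same length.

    Returns 0 when no stretch of at least `min_len` bases qualifies.
    `min_len` is a hypothetical cutoff chosen for illustration.
    """
    read = read.upper()
    # The two terminal stretches must not overlap, hence at most half the read.
    for k in range(len(read) // 2, min_len - 1, -1):
        if read[-k:] == reverse_complement(read[:k]):
            return k
    return 0

# Synthetic example: an 8-base 5'-end, a 5-base spacer, then the reverse
# complement of the 5'-end appended at the 3'-end.
five_prime = "AAGGCTTC"
artifact_read = five_prime + "TTTTT" + reverse_complement(five_prime)
print(terminal_palindrome_length(artifact_read))
print(terminal_palindrome_length("AAAAAAAACCCCCCCC"))
```

    Reads flagged this way could then have the suspect 3'-terminal stretch trimmed before alignment; whether exact matching is strict enough for real data is an open design choice.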

  10. PCR primers for metazoan mitochondrial 12S ribosomal DNA sequences.

    Directory of Open Access Journals (Sweden)

    Ryuji J Machida

    Full Text Available BACKGROUND: Assessment of the biodiversity of communities of small organisms is most readily done using PCR-based analysis of environmental samples consisting of mixtures of individuals. Known as metagenetics, this approach has transformed understanding of microbial communities and is beginning to be applied to metazoans as well. Unlike microbial studies, where analysis of the 16S ribosomal DNA sequence is standard, the best gene for metazoan metagenetics is less clear. In this study we designed a set of PCR primers for the mitochondrial 12S ribosomal DNA sequence based on 64 complete mitochondrial genomes and then tested their efficacy. METHODOLOGY/PRINCIPAL FINDINGS: A total of 64 complete mitochondrial genome sequences representing all metazoan classes available in GenBank were downloaded using the NCBI Taxonomy Browser. Alignment of sequences was performed for the excised mitochondrial 12S ribosomal DNA sequences, and conserved regions were identified for all 64 mitochondrial genomes. These regions were used to design a primer pair that flanks a more variable region in the gene. Then all of the complete metazoan mitochondrial genomes available in NCBI's Organelle Genome Resources database were used to determine the percentage of taxa that would likely be amplified using these primers. Results suggest that these primers will amplify target sequences for many metazoans. CONCLUSIONS/SIGNIFICANCE: Newly designed 12S ribosomal DNA primers have considerable potential for metazoan metagenetic analysis because of their ability to amplify sequences from many metazoans.

  11. Recognizing a Single Base in an Individual DNA Strand: A Step Toward Nanopore DNA Sequencing

    Science.gov (United States)

    Ashkenasy, N.; Sánchez-Quesada, J.; Ghadiri, M. R.; Bayley, H.

    2007-01-01

    Functional supramolecular chemistry at the single-molecule level. Single strands of DNA can be captured inside α-hemolysin transmembrane pore protein to form single-species α-HL·DNA pseudorotaxanes. This process can be used to identify a single adenine nucleotide at a specific location on a strand of DNA by the characteristic reductions in the α-HL ion conductance. This study suggests that α-HL-mediated single-molecule DNA sequencing might be fundamentally feasible. PMID:15666419

  12. Analysis of sequence variation in Gnathostoma spinigerum mitochondrial DNA by single-strand conformation polymorphism analysis and DNA sequence.

    Science.gov (United States)

    Ngarmamonpirat, Charinthon; Waikagul, Jitra; Petmitr, Songsak; Dekumyoy, Paron; Rojekittikhun, Wichit; Anantapruti, Malinee T

    2005-03-01

    Morphological variations were observed in the advanced third-stage larvae of Gnathostoma spinigerum collected from swamp eel (Fluta alba), the second intermediate host. Larvae of the typical type and of three atypical types were chosen for partial cytochrome c oxidase subunit I (COI) gene sequence analysis. A 450 bp polymerase chain reaction product of the COI gene was amplified from mitochondrial DNA. The variations were analyzed by single-strand conformation polymorphism and DNA sequencing. The nucleotide variations of the COI gene in the four types of larvae indicated the presence of intra-specific variation of mitochondrial DNA in the G. spinigerum population.

  13. A next generation semiconductor based sequencing approach for the identification of meat species in DNA mixtures.

    Science.gov (United States)

    Bertolini, Francesca; Ghionda, Marco Ciro; D'Alessandro, Enrico; Geraci, Claudia; Chiofalo, Vincenzo; Fontanesi, Luca

    2015-01-01

    The identification of the species of origin of meat and meat products is an important issue to prevent and detect frauds that might have economic, ethical and health implications. In this paper we evaluated the potential of the next generation semiconductor based sequencing technology (Ion Torrent Personal Genome Machine) for the identification of DNA from meat species (pig, horse, cattle, sheep, rabbit, chicken, turkey, pheasant, duck, goose and pigeon) as well as from human and rat in DNA mixtures through the sequencing of PCR products obtained from different pairs of universal primers that amplify 12S and 16S rRNA mitochondrial DNA genes. Six libraries were produced including PCR products obtained separately from 13 species or from DNA mixtures containing DNA from all species or only avian or only mammalian species at equimolar concentration or at 1:10 or 1:50 ratios for pig and horse DNA. Sequencing obtained a total of 33,294,511 called nucleotides, of which 29,109,688 had Q20 (87.43%), in a total of 215,944 reads. Different alignment algorithms were used to assign the species based on sequence data. Error rate calculated after confirmation of the obtained sequences by Sanger sequencing ranged from 0.0003 to 0.02 for the different species. Correlation of the number of reads per species between different libraries was high for mammalian species (0.97) and lower for avian species (0.70). PCR competition limited the efficiency of amplification and sequencing for avian species for some primer pairs. Detection of low levels of pig and horse DNA was possible with reads obtained from different primer pairs. The sequencing of the products obtained from different universal PCR primers could be a useful strategy to overcome potential problems of amplification. Based on these results, the Ion Torrent technology can be applied for the identification of meat species in DNA mixtures.

  14. A next generation semiconductor based sequencing approach for the identification of meat species in DNA mixtures.

    Directory of Open Access Journals (Sweden)

    Francesca Bertolini

    Full Text Available The identification of the species of origin of meat and meat products is an important issue to prevent and detect frauds that might have economic, ethical and health implications. In this paper we evaluated the potential of the next generation semiconductor based sequencing technology (Ion Torrent Personal Genome Machine) for the identification of DNA from meat species (pig, horse, cattle, sheep, rabbit, chicken, turkey, pheasant, duck, goose and pigeon) as well as from human and rat in DNA mixtures through the sequencing of PCR products obtained from different pairs of universal primers that amplify 12S and 16S rRNA mitochondrial DNA genes. Six libraries were produced including PCR products obtained separately from 13 species or from DNA mixtures containing DNA from all species or only avian or only mammalian species at equimolar concentration or at 1:10 or 1:50 ratios for pig and horse DNA. Sequencing obtained a total of 33,294,511 called nucleotides, of which 29,109,688 had Q20 (87.43%), in a total of 215,944 reads. Different alignment algorithms were used to assign the species based on sequence data. Error rate calculated after confirmation of the obtained sequences by Sanger sequencing ranged from 0.0003 to 0.02 for the different species. Correlation of the number of reads per species between different libraries was high for mammalian species (0.97) and lower for avian species (0.70). PCR competition limited the efficiency of amplification and sequencing for avian species for some primer pairs. Detection of low levels of pig and horse DNA was possible with reads obtained from different primer pairs. The sequencing of the products obtained from different universal PCR primers could be a useful strategy to overcome potential problems of amplification. Based on these results, the Ion Torrent technology can be applied for the identification of meat species in DNA mixtures.

  15. Probabilistic models for semisupervised discriminative motif discovery in DNA sequences.

    Science.gov (United States)

    Kim, Jong Kyoung; Choi, Seungjin

    2011-01-01

    Methods for discriminative motif discovery in DNA sequences identify transcription factor binding sites (TFBSs), searching only for patterns that differentiate two sets (positive and negative sets) of sequences. On one hand, discriminative methods increase the sensitivity and specificity of motif discovery, compared to generative models. On the other hand, generative models can easily exploit unlabeled sequences to better detect functional motifs when labeled training samples are limited. In this paper, we develop a hybrid generative/discriminative model which enables us to make use of unlabeled sequences in the framework of discriminative motif discovery, leading to semisupervised discriminative motif discovery. Numerical experiments on yeast ChIP-chip data for discovering DNA motifs demonstrate that the best performance is obtained between the purely generative and the purely discriminative extremes, and that semisupervised learning improves performance when labeled sequences are limited.

  16. Applications of recursive segmentation to the analysis of DNA sequences.

    Science.gov (United States)

    Li, Wentian; Bernaola-Galván, Pedro; Haghighi, Fatameh; Grosse, Ivo

    2002-07-01

    Recursive segmentation is a procedure that partitions a DNA sequence into domains with a homogeneous composition of the four nucleotides A, C, G and T. This procedure can also be applied to any sequence converted from a DNA sequence, such as to a binary strong(G + C)/weak(A + T) sequence, to a binary sequence indicating the presence or absence of the dinucleotide CpG, or to a sequence indicating both the base and the codon position information. We apply various conversion schemes in order to address the following five DNA sequence analysis problems: isochore mapping, CpG island detection, locating the origin and terminus of replication in bacterial genomes, finding complex repeats in telomere sequences, and delineating coding and noncoding regions. We find that the recursive segmentation procedure can successfully detect isochore borders, CpG islands, and the origin and terminus of replication, but it needs improvement for detecting complex repeats as well as borders between coding and noncoding regions.
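
    The procedure described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: it assumes Jensen-Shannon divergence as the homogeneity criterion and a fixed divergence threshold as the stopping rule, neither of which is specified in the abstract, and all function names are hypothetical.

```python
import math
from collections import Counter


def to_strong_weak(seq: str) -> str:
    """Convert a DNA sequence to a binary strong (G/C) / weak (A/T) sequence."""
    return "".join("S" if base in "GC" else "W" for base in seq.upper())


def entropy(seq: str) -> float:
    """Shannon entropy (bits) of the symbol composition of seq."""
    n = len(seq)
    return -sum((c / n) * math.log2(c / n) for c in Counter(seq).values())


def best_split(seq: str, min_len: int = 1):
    """Return (position, divergence) of the cut maximising the
    Jensen-Shannon divergence between the left and right parts."""
    n = len(seq)
    h_all = entropy(seq)
    best_pos, best_d = None, 0.0
    for i in range(min_len, n - min_len + 1):
        d = (h_all
             - (i / n) * entropy(seq[:i])
             - ((n - i) / n) * entropy(seq[i:]))
        if d > best_d:
            best_pos, best_d = i, d
    return best_pos, best_d


def segment(seq: str, threshold: float = 0.3, min_len: int = 4, offset: int = 0):
    """Recursively cut seq; returns sorted cut positions in the original sequence.
    The threshold value is an arbitrary assumption for illustration."""
    if len(seq) < 2 * min_len:
        return []
    pos, d = best_split(seq, min_len)
    if pos is None or d < threshold:
        return []
    return (segment(seq[:pos], threshold, min_len, offset)
            + [offset + pos]
            + segment(seq[pos:], threshold, min_len, offset + pos))


print(segment("A" * 20 + "G" * 20))
```

    The same `segment` function can be run on the output of `to_strong_weak` to look for G + C domain borders; the quadratic-time split search is kept for clarity and would need prefix counts for genome-scale input.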

  17. Chaos game representation (CGR)-walk model for DNA sequences

    Institute of Scientific and Technical Information of China (English)

    Gao Jie; Xu Zhen-Yuan

    2009-01-01

    Chaos game representation (CGR) is an iterative mapping technique that processes sequences of units, such as nucleotides in a DNA sequence or amino acids in a protein, in order to determine the coordinates of their positions in a continuous space. This distribution of positions has two features: it is unique, and the source sequence can be recovered from the coordinates, so that the distance between positions may serve as a measure of similarity between the corresponding sequences. A CGR-walk model is proposed based on CGR coordinates for the DNA sequences. The CGR coordinates are converted into a time series, and a long-memory ARFIMA (p, d, q) model, where ARFIMA stands for autoregressive fractionally integrated moving average, is introduced into the DNA sequence analysis. This model is applied to simulating real CGR-walk sequence data of ten genomic sequences. Remarkably long-range correlations are uncovered in the data, and the results from these models are reasonably fitted with those from the ARFIMA (p, d, q) model.
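
    The iterative mapping behind CGR is short enough to sketch. The corner assignment and the starting point at the centre of the unit square used below are one common convention and are assumptions here, since the abstract does not state them; the function names are hypothetical.

```python
# Assumed corner assignment on the unit square (a common convention).
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_coordinates(seq, start=(0.5, 0.5)):
    """Each nucleotide moves the current point halfway towards its corner."""
    x, y = start
    points = []
    for base in seq.upper():
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2, (y + cy) / 2
        points.append((x, y))
    return points


def recover_sequence(points):
    """Read each base back from the quadrant its point falls in."""
    by_corner = {corner: base for base, corner in CORNERS.items()}
    return "".join(
        by_corner[(1.0 if x >= 0.5 else 0.0, 1.0 if y >= 0.5 else 0.0)]
        for x, y in points
    )


points = cgr_coordinates("ACGT")
print(points)
print(recover_sequence(points))
```

    Because each step halves the distance to a corner, the quadrant of a point identifies the most recent base, which is what makes the source sequence recoverable from the coordinates. Floating-point precision limits this for very long sequences.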

  18. Sequencing and Analysis of Neanderthal Genomic DNA

    Energy Technology Data Exchange (ETDEWEB)

    Noonan, James P.; Coop, Graham; Kudaravalli, Sridhar; Smith,Doug; Krause, Johannes; Alessi, Joe; Chen, Feng; Platt, Darren; Paabo,Svante; Pritchard, Jonathan K.; Rubin, Edward M.

    2006-06-13

    Recovery and analysis of multiple Neanderthal autosomal sequences using a metagenomic approach reveals that modern humans and Neanderthals split ~400,000 years ago, without significant evidence of subsequent admixture.

  19. Improved Algorithm for Analysis of DNA Sequences Using Multiresolution Transformation

    Directory of Open Access Journals (Sweden)

    T. M. Inbamalar

    2015-01-01

    Bioinformatics and genomic signal processing use computational techniques to solve various biological problems. They aim to study the information allied with genetic materials such as the deoxyribonucleic acid (DNA), the ribonucleic acid (RNA), and the proteins. Fast and precise identification of the protein coding regions in DNA sequence is one of the most important tasks in analysis. Existing digital signal processing (DSP) methods provide less accurate and computationally complex solutions with greater background noise. Hence, improvements in accuracy, computational complexity, and reduction in background noise are essential in identification of the protein coding regions in the DNA sequences. In this paper, a new DSP based method is introduced to detect the protein coding regions in DNA sequences. Here, the DNA sequences are converted into numeric sequences using electron ion interaction potential (EIIP) representation. Then the discrete wavelet transform is taken. The absolute value of the energy is computed, followed by a suitable threshold. The test is conducted using the databases available at the National Centre for Biotechnology Information (NCBI) site. A comparative analysis confirms the efficiency of the proposed system.

  20. Improved algorithm for analysis of DNA sequences using multiresolution transformation.

    Science.gov (United States)

    Inbamalar, T M; Sivakumar, R

    2015-01-01

    Bioinformatics and genomic signal processing use computational techniques to solve various biological problems. They aim to study the information allied with genetic materials such as the deoxyribonucleic acid (DNA), the ribonucleic acid (RNA), and the proteins. Fast and precise identification of the protein coding regions in DNA sequence is one of the most important tasks in analysis. Existing digital signal processing (DSP) methods provide less accurate and computationally complex solutions with greater background noise. Hence, improvements in accuracy, computational complexity, and reduction in background noise are essential in identification of the protein coding regions in the DNA sequences. In this paper, a new DSP based method is introduced to detect the protein coding regions in DNA sequences. Here, the DNA sequences are converted into numeric sequences using electron ion interaction potential (EIIP) representation. Then the discrete wavelet transform is taken. The absolute value of the energy is computed, followed by a suitable threshold. The test is conducted using the databases available at the National Centre for Biotechnology Information (NCBI) site. A comparative analysis confirms the efficiency of the proposed system.
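
    The numeric conversion and transform steps named above can be sketched as follows. The EIIP values are the ones commonly tabulated for the four nucleotides; the choice of a single-level Haar wavelet, the padding rule and the threshold value are assumptions made for illustration, since the abstract does not specify them, and the function names are hypothetical.

```python
import math

# Commonly tabulated EIIP values for the four nucleotides.
EIIP = {"A": 0.1260, "C": 0.1340, "G": 0.0806, "T": 0.1335}


def to_eiip(seq):
    """Map a DNA string to its EIIP numeric sequence."""
    return [EIIP[base] for base in seq.upper()]


def haar_dwt(signal):
    """One level of the orthonormal Haar transform (assumed wavelet).
    An odd-length signal is padded by repeating its last sample."""
    if len(signal) % 2:
        signal = list(signal) + [signal[-1]]
    root2 = math.sqrt(2)
    approx = [(a + b) / root2 for a, b in zip(signal[0::2], signal[1::2])]
    detail = [(a - b) / root2 for a, b in zip(signal[0::2], signal[1::2])]
    return approx, detail


def energy_mask(coefficients, threshold):
    """Flag coefficients whose energy (squared magnitude) reaches the threshold."""
    return [abs(c) ** 2 >= threshold for c in coefficients]


numeric = to_eiip("ATGGCCATTGTAATGGGCCGC")
approx, detail = haar_dwt(numeric)
print(energy_mask(detail, threshold=1e-4))  # threshold is an arbitrary example
```

    A practical detector would apply the transform over sliding windows of a long sequence and tune the threshold on annotated data; this sketch only shows how the pieces connect.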

  1. How effective is graphene nanopore geometry on DNA sequencing?

    CERN Document Server

    Satarifard, Vahid; Ejtehadi, Mohammad Reza

    2015-01-01

    In this paper we investigate the effects of graphene nanopore geometry on homopolymer ssDNA pulling process through nanopore using steered molecular dynamic (SMD) simulations. Different graphene nanopores are examined including axially symmetric and asymmetric monolayer graphene nanopores as well as five layer graphene polyhedral crystals (GPC). The pulling force profile, moving fashion of ssDNA, work done in irreversible DNA pulling and orientations of DNA bases near the nanopore are assessed. Simulation results demonstrate the strong effect of the pore shape as well as geometrical symmetry on free energy barrier, orientations and dynamic of DNA translocation through graphene nanopore. Our study proposes that the symmetric circular geometry of monolayer graphene nanopore with high pulling velocity can be used for DNA sequencing.

  2. Qualitatively predicting acetylation and methylation areas in DNA sequences.

    Science.gov (United States)

    Pham, Tho Hoan; Tran, Dang Hung; Ho, Tu Bao; Satou, Kenji; Valiente, Gabriel

    2005-01-01

    Eukaryotic genomes are packaged by the wrapping of DNA around histone octamers to form nucleosomes. Nucleosome occupancy, acetylation, and methylation, which have a major impact on all nuclear processes involving DNA, have been recently mapped across the yeast genome using chromatin immunoprecipitation and DNA microarrays. However, this experimental protocol is laborious and expensive. Moreover, experimental methods often produce noisy results. In this paper, we introduce a computational approach to the qualitative prediction of nucleosome occupancy, acetylation, and methylation areas in DNA sequences. Our method uses support vector machines to discriminate between DNA areas with high and low relative occupancy, acetylation, or methylation, and rank k-gram features based on their support for these DNA modifications. Experimental results on the yeast genome reveal genetic area preferences of nucleosome occupancy, acetylation, and methylation that are consistent with previous studies. Supplementary files are available from http://www.jaist.ac.jp/~tran/nucleosome/.
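
    The k-gram features mentioned above can be illustrated with a short counting routine. This is only a sketch of the feature-extraction idea: the normalisation to relative frequencies and the function name are assumptions, and the support vector machine training and feature ranking described in the abstract are not shown.

```python
from collections import Counter
from itertools import product


def kgram_features(seq, k=2, alphabet="ACGT"):
    """Relative frequency of every possible k-gram, in a fixed lexicographic order."""
    seq = seq.upper()
    windows = max(len(seq) - k + 1, 1)
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return [counts["".join(p)] / windows for p in product(alphabet, repeat=k)]


features = kgram_features("ACGTACGTTTGA", k=2)
print(len(features))   # 4**2 possible 2-grams
print(features[:4])    # frequencies of AA, AC, AG, AT
```

    Fixed-length vectors of this kind can be passed to any standard classifier; the fixed ordering keeps each feature position tied to one k-gram, so per-feature rankings remain interpretable.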

  3. PIMS sequencing extension: a laboratory information management system for DNA sequencing facilities

    Science.gov (United States)

    2011-01-01

    Background: Facilities that provide a service for DNA sequencing typically support large numbers of users and experiment types. The cost of services is often reduced by the use of liquid handling robots but the efficiency of such facilities is hampered because the software for such robots does not usually integrate well with the systems that run the sequencing machines. Accordingly, there is a need for software systems capable of integrating different robotic systems and managing sample information for DNA sequencing services. In this paper, we describe an extension to the Protein Information Management System (PIMS) that is designed for DNA sequencing facilities. The new version of PIMS has a user-friendly web interface and integrates all aspects of the sequencing process, including sample submission, handling and tracking, together with capture and management of the data. Results: The PIMS sequencing extension has been in production since July 2009 at the University of Leeds DNA Sequencing Facility. It has completely replaced manual data handling and simplified the tasks of data management and user communication. Samples from 45 groups have been processed with an average throughput of 10000 samples per month. The current version of the PIMS sequencing extension works with Applied Biosystems 3130XL 96-well plate sequencer and MWG 4204 or Aviso Theonyx liquid handling robots, but is readily adaptable for use with other combinations of robots. Conclusions: PIMS has been extended to provide a user-friendly and integrated data management solution for DNA sequencing facilities that is accessed through a normal web browser and allows simultaneous access by multiple users as well as facility managers. The system integrates sequencing and liquid handling robots, manages the data flow, and provides remote access to the sequencing results. The software is freely available, for academic users, from http://www.pims-lims.org/. PMID:21385349

  4. PIMS sequencing extension: a laboratory information management system for DNA sequencing facilities

    Directory of Open Access Journals (Sweden)

    Baldwin Stephen A

    2011-03-01

    Background: Facilities that provide a service for DNA sequencing typically support large numbers of users and experiment types. The cost of services is often reduced by the use of liquid handling robots but the efficiency of such facilities is hampered because the software for such robots does not usually integrate well with the systems that run the sequencing machines. Accordingly, there is a need for software systems capable of integrating different robotic systems and managing sample information for DNA sequencing services. In this paper, we describe an extension to the Protein Information Management System (PIMS) that is designed for DNA sequencing facilities. The new version of PIMS has a user-friendly web interface and integrates all aspects of the sequencing process, including sample submission, handling and tracking, together with capture and management of the data. Results: The PIMS sequencing extension has been in production since July 2009 at the University of Leeds DNA Sequencing Facility. It has completely replaced manual data handling and simplified the tasks of data management and user communication. Samples from 45 groups have been processed with an average throughput of 10000 samples per month. The current version of the PIMS sequencing extension works with Applied Biosystems 3130XL 96-well plate sequencer and MWG 4204 or Aviso Theonyx liquid handling robots, but is readily adaptable for use with other combinations of robots. Conclusions: PIMS has been extended to provide a user-friendly and integrated data management solution for DNA sequencing facilities that is accessed through a normal web browser and allows simultaneous access by multiple users as well as facility managers. The system integrates sequencing and liquid handling robots, manages the data flow, and provides remote access to the sequencing results. The software is freely available, for academic users, from http://www.pims-lims.org/.

  5. VoSeq: a voucher and DNA sequence web application.

    Science.gov (United States)

    Peña, Carlos; Malm, Tobias

    2012-01-01

    There is an ever growing number of molecular phylogenetic studies published, due to, in part, the advent of new techniques that allow cheap and quick DNA sequencing. Hence, the demand for relational databases with which to manage and annotate the amassing DNA sequences, genes, voucher specimens and associated biological data is increasing. In addition, a user-friendly interface is necessary for easy integration and management of the data stored in the database back-end. Available databases allow management of a wide variety of biological data. However, most database systems are not specifically constructed with the aim of being an organizational tool for researchers working in phylogenetic inference. We here report a new software facilitating easy management of voucher and sequence data, consisting of a relational database as back-end for a graphic user interface accessed via a web browser. The application, VoSeq, includes tools for creating molecular datasets of DNA or amino acid sequences ready to be used in commonly used phylogenetic software such as RAxML, TNT, MrBayes and PAUP, as well as for creating tables ready for publishing. It also has inbuilt BLAST capabilities against all DNA sequences stored in VoSeq as well as sequences in NCBI GenBank. By using mash-ups and calls to web services, VoSeq allows easy integration with public services such as Yahoo! Maps, Flickr, Encyclopedia of Life (EOL) and GBIF (by generating data-dumps that can be processed with GBIF's Integrated Publishing Toolkit).

  6. Label-free DNA sequencing using Millikan detection.

    Science.gov (United States)

    Dettloff, Roger; Leiske, Danielle; Chow, Andrea; Farinas, Javier

    2015-10-15

    A label-free method for DNA sequencing based on the principle of the Millikan oil drop experiment was developed. This sequencing-by-synthesis approach sensed increases in bead charge as nucleotides were added by a polymerase to DNA templates attached to beads. The balance between an electrical force, which was dependent on the number of nucleotide charges on a bead, and opposing hydrodynamic drag and restoring tether forces resulted in a bead velocity that was a function of the number of nucleotides attached to the bead. The velocity of beads tethered via a polymer to a microfluidic channel and subjected to an oscillating electric field was measured using dark-field microscopy and used to determine how many nucleotides were incorporated during each sequencing-by-synthesis cycle. Increases in bead velocity of approximately 1% were reliably detected during DNA polymerization, allowing for sequencing of short DNA templates. The method could lead to a low-cost, high-throughput sequencing platform that could enable routine sequencing in medical applications.

  7. Hiding message into DNA sequence through DNA coding and chaotic maps.

    Science.gov (United States)

    Liu, Guoyan; Liu, Hongjun; Kadir, Abdurahman

    2014-09-01

    The paper proposes an improved reversible substitution method to hide data in a deoxyribonucleic acid (DNA) sequence. Four measures are taken to enhance the robustness and enlarge the hiding capacity: encoding the secret message by DNA coding, encrypting it with a pseudo-random sequence, generating the relative hiding locations with a piecewise linear chaotic map, and embedding the encoded and encrypted message into a randomly selected DNA sequence using the complementary rule. The key space and the hiding capacity are analyzed. Experimental results indicate that the proposed method performs better than the competing methods with respect to robustness and capacity.
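
    The first of these measures, DNA coding, amounts to writing a message two bits per nucleotide. The sketch below illustrates only that step under an assumed mapping (00 to A, 01 to C, 10 to G, 11 to T); the paper's actual coding rule, the encryption, the chaotic-map location selection and the complementary-rule embedding are not reproduced, and the function names are hypothetical.

```python
# Assumed 2-bit-per-base coding rule; the paper's rule may differ.
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}


def dna_encode(message: bytes) -> str:
    """Encode bytes as a DNA string, four bases per byte."""
    bits = "".join(f"{byte:08b}" for byte in message)
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def dna_decode(dna: str) -> bytes:
    """Invert dna_encode; the input length must be a multiple of four bases."""
    bits = "".join(BASE_TO_BITS[base] for base in dna)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))


encoded = dna_encode(b"hide me")
print(encoded)
print(dna_decode(encoded))
```

    At two bits per base, each byte of the secret message occupies four nucleotides before any encryption or embedding is applied.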

  8. Identification of genes in anonymous DNA sequences. Annual performance report, February 1, 1991--January 31, 1992

    Energy Technology Data Exchange (ETDEWEB)

    Fields, C.A.

    1996-06-01

    The objective of this project is the development of practical software to automate the identification of genes in anonymous DNA sequences from the human, and other higher eukaryotic genomes. A software system for automated sequence analysis, gm (gene modeler) has been designed, implemented, tested, and distributed to several dozen laboratories worldwide. A significantly faster, more robust, and more flexible version of this software, gm 2.0 has now been completed, and is being tested by operational use to analyze human cosmid sequence data. A range of efforts to further understand the features of eukaryotic gene sequences are also underway. This progress report also contains papers coming out of the project including the following: gm: a Tool for Exploratory Analysis of DNA Sequence Data; The Human THE-LTR(O) and MstII Interspersed Repeats are subfamilies of a single widely distributed highly variable repeat family; Information contents and dinucleotide compositions of plant intron sequences vary with evolutionary origin; Splicing signals in Drosophila: intron size, information content, and consensus sequences; Integration of automated sequence analysis into mapping and sequencing projects; Software for the C. elegans genome project.

  9. Mitochondrial DNA sequence of Onychostoma rara.

    Science.gov (United States)

    Zeng, Chun-Fang; Li, Xiao-Ling; Li, Chuan-Wu; Huang, Xiang-Rong; Wan, Yi-Wen

    2015-01-01

    The complete mitochondrial genome sequence of Onychostoma rara was determined to be 16,590 bp in length and contains 13 protein-coding genes (PCGs), 22 tRNA genes, large (rrnL) and small (rrnS) rRNA and the non-coding control region. Its total A + T content is 55.65%. We also analyzed the structure of the control region; 6 CSBs (CSB-1, CSB-2, CSB-3, CSB-D, CSB-E and CSB-F) and a 2 bp tandem repeat were detected.

  10. DNA sequence alignment by microhomology sampling during homologous recombination.

    Science.gov (United States)

    Qi, Zhi; Redding, Sy; Lee, Ja Yil; Gibb, Bryan; Kwon, YoungHo; Niu, Hengyao; Gaines, William A; Sung, Patrick; Greene, Eric C

    2015-02-26

    Homologous recombination (HR) mediates the exchange of genetic information between sister or homologous chromatids. During HR, members of the RecA/Rad51 family of recombinases must somehow search through vast quantities of DNA sequence to align and pair single-strand DNA (ssDNA) with a homologous double-strand DNA (dsDNA) template. Here, we use single-molecule imaging to visualize Rad51 as it aligns and pairs homologous DNA sequences in real time. We show that Rad51 uses a length-based recognition mechanism while interrogating dsDNA, enabling robust kinetic selection of 8-nucleotide (nt) tracts of microhomology, which kinetically confines the search to sites with a high probability of being a homologous target. Successful pairing with a ninth nucleotide coincides with an additional reduction in binding free energy, and subsequent strand exchange occurs in precise 3-nt steps, reflecting the base triplet organization of the presynaptic complex. These findings provide crucial new insights into the physical and evolutionary underpinnings of DNA recombination. Copyright © 2015 Elsevier Inc. All rights reserved.

  11. Rapid DNA sequencing by horizontal ultrathin gel electrophoresis.

    Science.gov (United States)

    Brumley, R L; Smith, L M

    1991-01-01

    A horizontal polyacrylamide gel electrophoresis apparatus has been developed that decreases the time required to separate the DNA fragments produced in enzymatic sequencing reactions. The configuration of this apparatus and the use of circulating coolant directly under the glass plates result in heat exchange that is approximately nine times more efficient than passive thermal transfer methods commonly used. Bubble-free gels as thin as 25 microns can be routinely cast on this device. The application to these ultrathin gels of electric fields up to 250 volts/cm permits the rapid separation of multiple DNA sequencing reactions in parallel. When used in conjunction with 32P-based autoradiography, the DNA bands appear substantially sharper than those obtained in conventional electrophoresis. This increased sharpness permits shorter autoradiographic exposure times and longer sequence reads. PMID:1870968

  12. Noninvasive prenatal paternity testing (NIPAT) through maternal plasma DNA sequencing

    DEFF Research Database (Denmark)

    Jiang, Haojun; Xie, Yifan; Li, Xuchao

    2016-01-01

    Short tandem repeats (STRs) and single nucleotide polymorphisms (SNPs) have been already used to perform noninvasive prenatal paternity testing from maternal plasma DNA. The frequently used technologies were PCR followed by capillary electrophoresis and SNP typing array, respectively. Here, we...... developed a noninvasive prenatal paternity testing (NIPAT) based on SNP typing with maternal plasma DNA sequencing. We evaluated the influence factors (minor allele frequency (MAF), the number of total SNP, fetal fraction and effective sequencing depth) and designed three different selective SNP panels...... paternity test using STR multiplex system. Our study here proved that the maternal plasma DNA sequencing-based technology is feasible and accurate in determining paternity, which may provide an alternative in forensic application in the future....

  13. Accelerating Computation of DNA Sequence Alignment in Distributed Environment

    Science.gov (United States)

    Guo, Tao; Li, Guiyang; Deaton, Russel

    Sequence similarity and alignment are among the most important operations in computational biology. However, analyzing large sets of DNA sequences is impractical on a regular PC. Using multiple threads with the JavaParty mechanism, this project has successfully extended the capabilities of regular Java to a distributed environment for simulation of DNA computation. With the aid of JavaParty and a multithreaded design, the results of this study demonstrate that the modified regular Java program can perform parallel computing without using RMI or socket communication. In this paper, an efficient method for modeling and comparing DNA sequences with dynamic programming and JavaParty is proposed. Additionally, results of this method in a distributed environment are discussed.
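
    The dynamic-programming comparison mentioned above can be illustrated with a global alignment score. This sketch is written in Python rather than the Java/JavaParty setting of the paper, uses the standard Needleman-Wunsch recurrence as a stand-in (the abstract does not name the exact algorithm), and the scoring values and function name are assumptions.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment score using two rolling rows.

    prev[j] holds the best score for aligning the processed prefix of a
    with the first j characters of b.
    """
    prev = [j * gap for j in range(len(b) + 1)]
    for i, char_a in enumerate(a, start=1):
        curr = [i * gap]  # aligning a[:i] against an empty prefix of b
        for j, char_b in enumerate(b, start=1):
            diagonal = prev[j - 1] + (match if char_a == char_b else mismatch)
            curr.append(max(diagonal, prev[j] + gap, curr[j - 1] + gap))
        prev = curr
    return prev[-1]


print(global_alignment_score("ACGT", "AGT"))
```

    Each cell depends only on its left, upper and upper-left neighbours, so cells on the same anti-diagonal are independent of one another; that independence is what makes the computation a natural candidate for the multithreaded, distributed execution the abstract describes.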

  14. Facilitated diffusion on mobile DNA: configurational traps and sequence heterogeneity

    CERN Document Server

    Brackley, C A; Marenduzzo, D; 10.1103/PhysRevLett.109.168103

    2012-01-01

    We present Brownian dynamics simulations of the facilitated diffusion of a protein, modelled as a sphere with a binding site on its surface, along DNA, modelled as a semi-flexible polymer. We consider both the effect of DNA organisation in 3D, and of sequence heterogeneity. We find that in a network of DNA loops, as are thought to be present in bacterial DNA, the search process is very sensitive to the spatial location of the target within such loops. Therefore, specific genes might be repressed or promoted by changing the local topology of the genome. On the other hand, sequence heterogeneity creates traps which normally slow down facilitated diffusion. When suitably positioned, though, these traps can, surprisingly, render the search process much more efficient.

  15. The future of human DNA vaccines.

    Science.gov (United States)

    Li, Lei; Saade, Fadi; Petrovsky, Nikolai

    2012-12-31

    DNA vaccines have evolved greatly over the last 20 years since their invention, but have yet to become a competitive alternative to conventional protein or carbohydrate based human vaccines. Whilst safety concerns were an initial barrier, the Achilles heel of DNA vaccines remains their poor immunogenicity when compared to protein vaccines. A wide variety of strategies have been developed to optimize DNA vaccine immunogenicity, including codon optimization, genetic adjuvants, electroporation and sophisticated prime-boost regimens, with each of these methods having its advantages and limitations. Whilst each of these methods has contributed to incremental improvements in DNA vaccine efficacy, more is still needed if human DNA vaccines are to succeed commercially. This review foresees a final breakthrough in human DNA vaccines will come from application of the latest cutting-edge technologies, including "epigenetics" and "omics" approaches, alongside traditional techniques to improve immunogenicity such as adjuvants and electroporation, thereby overcoming the current limitations of DNA vaccines in humans.

  16. Vector sequences - Budding yeast cDNA sequencing project | LSDB Archive [Life Science Database Archive metadata

    Lifescience Database Archive (English)

    ...od - Number of data entries 7 entries - ...

  17. Comment on "Linguistic features of noncoding DNA sequences"

    CERN Document Server

    Israeloff, N E; Kagalenko, M; Chan, K

    1995-01-01

    In a recent Physical Review Letter, Mantegna et al. report that certain statistical signatures of natural language can be found in non-coding DNA sequences. In this comment we show that random noise with power-law correlation, similar to 1/f noise, exhibits the same "linguistic" signatures as those found in non-coding DNA. We conclude that these signatures cannot distinguish languages from noise.

  18. Sequence dependence of transcription factor-mediated DNA looping.

    Science.gov (United States)

    Johnson, Stephanie; Lindén, Martin; Phillips, Rob

    2012-09-01

    DNA is subject to large deformations in a wide range of biological processes. Two key examples illustrate how such deformations influence the readout of the genetic information: the sequestering of eukaryotic genes by nucleosomes and DNA looping in transcriptional regulation in both prokaryotes and eukaryotes. These kinds of regulatory problems are now becoming amenable to systematic quantitative dissection with a powerful dialogue between theory and experiment. Here, we use a single-molecule experiment in conjunction with a statistical mechanical model to test quantitative predictions for the behavior of DNA looping at short length scales and to determine how DNA sequence affects looping at these lengths. We calculate and measure how such looping depends upon four key biological parameters: the strength of the transcription factor binding sites, the concentration of the transcription factor, and the length and sequence of the DNA loop. Our studies lead to the surprising insight that sequences that are thought to be especially favorable for nucleosome formation because of high flexibility lead to no systematically detectable effect of sequence on looping, and begin to provide a picture of the distinctions between the short length scale mechanics of nucleosome formation and looping.

  19. Label-Free DNA Sequencing Using Millikan Detection

    OpenAIRE

    Dettloff, Roger; Leiske, Danielle; Chow, Andrea; Farinas, Javier

    2015-01-01

    A label-free method for DNA sequencing based on the principle of the Millikan oil drop experiment was developed. This sequencing-by-synthesis approach sensed increases in bead charge as nucleotides were added by a polymerase to DNA templates attached to beads. The balance between an electrical force, which was dependent on the number of nucleotide charges on a bead, and opposing hydrodynamic drag and restoring tether forces resulted in a bead velocity that was a function of the number of nucl...

  20. Anaplasma phagocytophilum in Danish sheep: confirmation by DNA sequencing

    Directory of Open Access Journals (Sweden)

    Thamsborg Stig M

    2009-12-01

    Background: The presence of Anaplasma phagocytophilum, an Ixodes ricinus transmitted bacterium, was investigated in two flocks of Danish grazing lambs. Direct PCR detection was performed on DNA extracted from blood and serum with subsequent confirmation by DNA sequencing. Methods: 31 samples obtained from clinically normal lambs in 2000 from Fussingø, Jutland and 12 samples from ten lambs and two ewes from a clinical outbreak at Feddet, Zealand in 2006 were included in the study. Some of the animals from Feddet had shown clinical signs of polyarthritis and general unthriftiness prior to sampling. DNA extraction was optimized from blood and serum and detection achieved by a 16S rRNA targeted PCR with verification of the product by DNA sequencing. Results: Five DNA extracts were found positive by PCR, including two samples from 2000 and three from 2006. For both series of samples the product was verified as A. phagocytophilum by DNA sequencing. Conclusions: A. phagocytophilum was detected by molecular methods for the first time in Danish grazing lambs during the two seasons investigated (2000 and 2006).

  1. Sequence-selective DNA recognition with peptide-bisbenzamidine conjugates.

    Science.gov (United States)

    Sánchez, Mateo I; Vázquez, Olalla; Vázquez, M Eugenio; Mascareñas, José L

    2013-07-22

    Transcription factors (TFs) are specialized proteins that play a key role in the regulation of genetic expression. Their mechanism of action involves the interaction with specific DNA sequences, which usually takes place through specialized domains of the protein. However, achieving an efficient binding usually requires the presence of the full protein. This is the case for bZIP and zinc finger TF families, which cannot interact with their target sites when the DNA binding fragments are presented as isolated monomers. Herein it is demonstrated that the DNA binding of these monomeric peptides can be restored when conjugated to aza-bisbenzamidines, which are readily accessible molecules that interact with A/T-rich sites by insertion into their minor groove. Importantly, the fluorogenic properties of the aza-benzamidine unit provide details of the DNA interaction that are eluded in electrophoresis mobility shift assays (EMSA). The hybrids based on the GCN4 bZIP protein preferentially bind to composite sequences containing tandem bisbenzamidine-GCN4 binding sites (TCAT⋅AAATT). Fluorescence reverse titrations show an interesting multiphasic profile consistent with the formation of competitive nonspecific complexes at low DNA/peptide ratios. On the other hand, the conjugate with the DNA binding domain of the zinc finger protein GAGA binds with high affinity (KD≈12 nM) and specificity to a composite AATTT⋅GAGA sequence containing both the bisbenzamidine and the TF consensus binding sites.

  2. Preparation of next-generation sequencing libraries from damaged DNA.

    Science.gov (United States)

    Briggs, Adrian W; Heyn, Patricia

    2012-01-01

    Next-generation sequencing (NGS) has revolutionized ancient DNA research, especially when combined with high-throughput target enrichment methods. However, attaining high sequencing depth and accuracy from samples often remains problematic due to the damaged state of ancient DNA, in particular the extremely low copy number of ancient DNA and the abundance of uracil residues derived from cytosine deamination that lead to miscoding errors. It is therefore critical to use a highly efficient procedure for conversion of a raw DNA extract into an adaptor-ligated sequencing library, and equally important to reduce errors from uracil residues. We present a protocol for NGS library preparation that allows highly efficient conversion of DNA fragments into an adaptor-ligated form. The protocol incorporates an option to remove the vast majority of uracil miscoding lesions as part of the library preparation process. The procedure requires only two spin column purification steps and no gel purification or bead handling. Starting from an aliquot of DNA extract, a finished, highly amplified library can be generated in 5 h, or under 3 h if uracil removal is not required.

  3. Perspectives of DNA microarray and next-generation DNA sequencing technologies

    Institute of Scientific and Technical Information of China (English)

    TENG XiaoKun; XIAO HuaSheng

    2009-01-01

    DNA microarray and next-generation DNA sequencing technologies are important tools for high-throughput genome research, in revealing both the structural and functional characteristics of genomes. In the past decade the DNA microarray technologies have been widely applied in the studies of functional genomics, systems biology and pharmacogenomics. The next-generation DNA sequencing method was first introduced by the 454 Company in 2003, immediately followed by the establishment of the Solexa and SOLiD techniques by other biotech companies. Though it has not been long since the first emergence of this technology, with the fast and impressive improvement, the application of this technology has extended to almost all fields of genomics research, as a rival challenging the existing DNA microarray technology. This paper briefly reviews the working principles of these two technologies as well as their application and perspectives in genome research.

  4. Real-time DNA sequencing from single polymerase molecules.

    Science.gov (United States)

    Eid, John; Fehr, Adrian; Gray, Jeremy; Luong, Khai; Lyle, John; Otto, Geoff; Peluso, Paul; Rank, David; Baybayan, Primo; Bettman, Brad; Bibillo, Arkadiusz; Bjornson, Keith; Chaudhuri, Bidhan; Christians, Frederick; Cicero, Ronald; Clark, Sonya; Dalal, Ravindra; Dewinter, Alex; Dixon, John; Foquet, Mathieu; Gaertner, Alfred; Hardenbol, Paul; Heiner, Cheryl; Hester, Kevin; Holden, David; Kearns, Gregory; Kong, Xiangxu; Kuse, Ronald; Lacroix, Yves; Lin, Steven; Lundquist, Paul; Ma, Congcong; Marks, Patrick; Maxham, Mark; Murphy, Devon; Park, Insil; Pham, Thang; Phillips, Michael; Roy, Joy; Sebra, Robert; Shen, Gene; Sorenson, Jon; Tomaney, Austin; Travers, Kevin; Trulson, Mark; Vieceli, John; Wegener, Jeffrey; Wu, Dawn; Yang, Alicia; Zaccarin, Denis; Zhao, Peter; Zhong, Frank; Korlach, Jonas; Turner, Stephen

    2009-01-02

    We present single-molecule, real-time sequencing data obtained from a DNA polymerase performing uninterrupted template-directed synthesis using four distinguishable fluorescently labeled deoxyribonucleoside triphosphates (dNTPs). We detected the temporal order of their enzymatic incorporation into a growing DNA strand with zero-mode waveguide nanostructure arrays, which provide optical observation volume confinement and enable parallel, simultaneous detection of thousands of single-molecule sequencing reactions. Conjugation of fluorophores to the terminal phosphate moiety of the dNTPs allows continuous observation of DNA synthesis over thousands of bases without steric hindrance. The data report directly on polymerase dynamics, revealing distinct polymerization states and pause sites corresponding to DNA secondary structure. Sequence data were aligned with the known reference sequence to assay biophysical parameters of polymerization for each template position. Consensus sequences were generated from the single-molecule reads at 15-fold coverage, showing a median accuracy of 99.3%, with no systematic error beyond fluorophore-dependent error rates.
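
    The consensus step mentioned above (combining roughly 15 single-molecule reads per position) can be illustrated with a simple per-column majority vote. This is only a minimal sketch, not the authors' consensus method: it assumes the reads are already aligned to equal length with '-' marking gaps, and the function names are hypothetical.

```python
from collections import Counter


def consensus(aligned_reads):
    """Majority-vote consensus over equal-length, pre-aligned reads.

    '-' marks a gap; a column with no called base yields 'N'.
    Ties are broken by first occurrence, which is arbitrary.
    """
    if not aligned_reads:
        return ""
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("reads must be aligned to the same length")
    called = []
    for column in zip(*aligned_reads):
        counts = Counter(base for base in column if base != "-")
        called.append(counts.most_common(1)[0][0] if counts else "N")
    return "".join(called)


def accuracy(sequence, reference):
    """Fraction of reference positions that the sequence matches."""
    matches = sum(a == b for a, b in zip(sequence, reference))
    return matches / len(reference)


if __name__ == "__main__":
    reference = "ACGTACGT"
    reads = ["ACGTACGT", "ACGAACGT", "ACGTACGT", "TCGTACGT", "ACGTAC-T"]
    called = consensus(reads)
    print(called, accuracy(called, reference))
```

    Individual reads carry errors at different positions, so the vote recovers the reference even though three of the five toy reads are imperfect.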

  5. Identification of Bacterial Species in Kuwaiti Waters Through DNA Sequencing

    Science.gov (United States)

    Chen, K.

    2017-01-01

    With the objective of characterizing the bacterial diversity associated with the ecosystems of various Kuwaiti seas, bacteria were cultured and isolated from 3 water samples. Because fecal coliforms proved difficult to culture and isolate on the selective agar plates, bacterial isolates from marine agar plates were selected for molecular identification. 16S rRNA genes were successfully amplified from the genomes of the selected isolates using universal eubacterial 16S rRNA primers. The resulting amplification products were subjected to automated DNA sequencing. The partial 16S rDNA sequences obtained were compared directly with sequences in the NCBI database using BLAST, as well as with the sequences available from the Ribosomal Database Project (RDP).

  6. Reduced representation bisulphite sequencing of the ten bovine somatic tissues reveals DNA methylation patterns

    Science.gov (United States)

    As a major component of epigenetics, DNA methylation has been shown to function widely in individual development and various diseases. It has been well studied in model organisms and humans, but data for economically important animals are limited. Using reduced representation bisulphite sequencing (RRBS),...

  7. DNA qualification workflow for next generation sequencing of histopathological samples.

    Directory of Open Access Journals (Sweden)

    Michele Simbolo

    Histopathological samples are a treasure-trove of DNA for clinical research. However, the quality of DNA can vary depending on the source or extraction method applied. Thus a standardized and cost-effective workflow for the qualification of DNA preparations is essential to guarantee interlaboratory reproducible results. The qualification process consists of the quantification of double strand DNA (dsDNA) and the assessment of its suitability for downstream applications, such as high-throughput next-generation sequencing. We tested the two most frequently used instrumentations to define their role in this process: NanoDrop, based on UV spectroscopy, and Qubit 2.0, which uses fluorochromes specifically binding dsDNA. Quantitative PCR (qPCR) was used as the reference technique as it simultaneously assesses DNA concentration and suitability for PCR amplification. We used 17 genomic DNAs from 6 fresh-frozen (FF) tissues, 6 formalin-fixed paraffin-embedded (FFPE) tissues, 3 cell lines, and 2 commercial preparations. Intra- and inter-operator variability was negligible, and intra-methodology variability was minimal, while consistent inter-methodology divergences were observed. In fact, NanoDrop measured DNA concentrations higher than Qubit and its consistency with dsDNA quantification by qPCR was limited to high molecular weight DNA from FF samples and cell lines, where total DNA and dsDNA quantity virtually coincide. In partially degraded DNA from FFPE samples, only Qubit proved highly reproducible and consistent with qPCR measurements. Multiplex PCR amplifying 191 regions of 46 cancer-related genes was designated the downstream application, using 40 ng dsDNA from FFPE samples calculated by Qubit. All but one sample produced amplicon libraries suitable for next-generation sequencing. NanoDrop UV-spectrum verified contamination of the unsuccessful sample. In conclusion, as qPCR has high costs and is labor intensive, an alternative effective standard

  8. DNA qualification workflow for next generation sequencing of histopathological samples.

    Science.gov (United States)

    Simbolo, Michele; Gottardi, Marisa; Corbo, Vincenzo; Fassan, Matteo; Mafficini, Andrea; Malpeli, Giorgio; Lawlor, Rita T; Scarpa, Aldo

    2013-01-01

    Histopathological samples are a treasure-trove of DNA for clinical research. However, the quality of DNA can vary depending on the source or extraction method applied. Thus a standardized and cost-effective workflow for the qualification of DNA preparations is essential to guarantee interlaboratory reproducible results. The qualification process consists of the quantification of double strand DNA (dsDNA) and the assessment of its suitability for downstream applications, such as high-throughput next-generation sequencing. We tested the two most frequently used instrumentations to define their role in this process: NanoDrop, based on UV spectroscopy, and Qubit 2.0, which uses fluorochromes specifically binding dsDNA. Quantitative PCR (qPCR) was used as the reference technique as it simultaneously assesses DNA concentration and suitability for PCR amplification. We used 17 genomic DNAs from 6 fresh-frozen (FF) tissues, 6 formalin-fixed paraffin-embedded (FFPE) tissues, 3 cell lines, and 2 commercial preparations. Intra- and inter-operator variability was negligible, and intra-methodology variability was minimal, while consistent inter-methodology divergences were observed. In fact, NanoDrop measured DNA concentrations higher than Qubit and its consistency with dsDNA quantification by qPCR was limited to high molecular weight DNA from FF samples and cell lines, where total DNA and dsDNA quantity virtually coincide. In partially degraded DNA from FFPE samples, only Qubit proved highly reproducible and consistent with qPCR measurements. Multiplex PCR amplifying 191 regions of 46 cancer-related genes was designated the downstream application, using 40 ng dsDNA from FFPE samples calculated by Qubit. All but one sample produced amplicon libraries suitable for next-generation sequencing. NanoDrop UV-spectrum verified contamination of the unsuccessful sample. In conclusion, as qPCR has high costs and is labor intensive, an alternative effective standard workflow for
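
    The arithmetic behind the quantification step can be sketched as follows. The conversion factor of 50 ng/µL per A260 unit at a 1 cm path length is the widely used convention for dsDNA in UV spectroscopy; it is not stated in the abstract, and the function names and sample values below are hypothetical. The 40 ng input is taken from the abstract.

```python
def dsdna_conc_from_a260(a260, dilution=1.0, path_cm=1.0):
    """Estimate dsDNA concentration (ng/uL) from absorbance at 260 nm.

    Assumes the conventional factor of 50 ng/uL per A260 unit at 1 cm.
    UV absorbance reports total nucleic acid, which is one reason a
    UV-based reading can exceed a dsDNA-specific dye measurement.
    """
    return a260 * 50.0 * dilution / path_cm


def volume_for_mass(target_ng, conc_ng_per_ul):
    """Volume (uL) needed to deliver target_ng at the given concentration."""
    if conc_ng_per_ul <= 0:
        raise ValueError("concentration must be positive")
    return target_ng / conc_ng_per_ul


if __name__ == "__main__":
    uv_estimate = dsdna_conc_from_a260(0.5)  # hypothetical UV reading
    dye_estimate = 20.0                      # hypothetical dye-based reading
    # Volume needed for the 40 ng input under each estimate.
    print(volume_for_mass(40, uv_estimate), volume_for_mass(40, dye_estimate))
```

    If the UV estimate is inflated, the volume computed from it delivers less than the intended 40 ng of dsDNA, which illustrates why the choice of quantification method matters for degraded samples.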

  9. Extracellular DNA affects NO content in human endothelial cells.

    Science.gov (United States)

    Efremova, L V; Alekseeva, A Yu; Konkova, M S; Kostyuk, S V; Ershova, E S; Smirnova, T D; Konorova, I L; Veiko, N N

    2010-08-01

    Fragments of extracellular DNA are permanently released into the blood flow due to cell apoptosis and possible de novo DNA synthesis. To find out whether extracellular DNA can affect the synthesis of nitric oxide (NO), one of the key vascular tone regulators, we studied the in vitro effects of three artificial DNA probes with different sequences and 10 samples of extracellular DNA (obtained from healthy people and patients with hypertension and atherosclerosis) on NO synthesis in endothelial cell culture (HUVEC). For detection of NO in live cells and culture medium, we used the NO-specific agent CuFL, which penetrates the cells and forms a fluorescent product, FL-NO, upon interaction with NO. Human genome DNA fragments affected the content of NO in endothelial cells; this effect depended on both the base sequence and the concentration of DNA fragments. Addition of artificial DNA and extracellular DNA from healthy people into the cell culture at a low concentration (5 ng/ml) increased the detected NO concentration by 4-fold at most. A cytosine-guanine (CG)-rich fragment of the transcribed sequence of the ribosomal repeat was the most powerful NO inducer. The effect of DNA fragments on NO synthesis was comparable with that of low doses of oxidizing agents, H2O2 and 17β-estradiol. Extracellular DNA samples obtained from patients with hypertension and atherosclerosis decreased NO content in cells and medium by 1.3-28 times compared to the control; the effect correlated with the content of CG-rich sequences.

  10. Correcting sequencing errors in DNA coding regions using a dynamic programming approach.

    Science.gov (United States)

    Xu, Y; Mural, R J; Uberbacher, E C

    1995-04-01

    This paper presents an algorithm for detecting and 'correcting' sequencing errors that occur in DNA coding regions. The types of sequencing errors addressed are insertions and deletions (indels) of DNA bases. The goal is to provide a capability which makes single-pass or low-redundancy sequence data more informative, reducing the need for high-redundancy sequencing for gene identification and characterization purposes. This would permit improved sequencing efficiency and reduce genome sequencing costs. The algorithm detects sequencing errors by discovering changes in the statistically preferred reading frame within a putative coding region and then inserts a number of 'neutral' bases at a perceived reading frame transition point to make the putative exon candidate frame consistent. We have implemented the algorithm as a front-end subsystem of the GRAIL DNA sequence analysis system to construct a version which is very error tolerant and also intend to use this as a testbed for further development of sequencing error-correction technology. Preliminary test results have shown the usefulness of this algorithm and also exhibited some of its weaknesses, providing possible directions for further improvement. On a test set consisting of 68 human DNA sequences with 1% randomly generated indels in coding regions, the algorithm detected and corrected 76% of the indels. The average distance between the position of an indel and the predicted one was 9.4 bases. With this subsystem in place, GRAIL correctly predicted 89% of the coding messages with 10% false message on the 'corrected' sequences, compared to 69% correctly predicted coding messages and 11% falsely predicted messages on the 'corrupted' sequences using standard GRAIL II method (version 1.2).(ABSTRACT TRUNCATED AT 250 WORDS)
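
    The idea of detecting a frame change and padding with neutral bases can be sketched in a toy form. This is not the GRAIL subsystem or its dynamic programming formulation: as a stand-in for the statistical coding measure, the sketch simply assumes that the preferred frame in a window is the one with the fewest stop codons, and all names and the window size are hypothetical.

```python
STOPS = {"TAA", "TAG", "TGA"}


def preferred_frame(seq, start, end):
    """Absolute frame (0, 1 or 2) with the fewest stop codons in seq[start:end].

    Ties resolve to the lowest frame number, which is arbitrary.
    """
    counts = [0, 0, 0]
    for i in range(start, min(end, len(seq)) - 2):
        if seq[i:i + 3] in STOPS:
            counts[i % 3] += 1
    return counts.index(min(counts))


def find_transitions(seq, window=30):
    """Return (position, old_frame, new_frame) where adjacent windows disagree.

    The position is the start of the window where the new frame is first seen,
    so it only approximates the true indel location.
    """
    frames = [(s, preferred_frame(seq, s, s + window))
              for s in range(0, len(seq), window)]
    return [(pos, prev, cur)
            for (_, prev), (pos, cur) in zip(frames, frames[1:])
            if prev != cur]


def repair(seq, transitions):
    """Insert neutral 'N' bases at each transition to restore the upstream frame."""
    out = seq
    # Work right to left so earlier positions stay valid.
    for pos, old, new in sorted(transitions, reverse=True):
        out = out[:pos] + "N" * ((old - new) % 3) + out[pos:]
    return out


if __name__ == "__main__":
    # Synthetic 'coding' repeat with stop codons in both off-frames.
    original = "CTGACTAAG" * 10
    corrupted = original[:40] + original[41:]  # delete one base
    found = find_transitions(corrupted)
    print(found)
    print(find_transitions(repair(corrupted, found)))
```

    In this toy case the deletion at position 40 is reported at the window boundary at position 30, which mirrors the abstract's observation that the predicted position only approximates the true indel position.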

  11. DNA methylation and transcription in HERV (K, W, E) and LINE sequences remain unchanged upon foreign DNA insertions.

    Science.gov (United States)

    Weber, Stefanie; Jung, Susan; Doerfler, Walter

    2016-02-01

    DNA methylation and transcriptional profiles were determined in the regulatory sequences of the human endogenous retroviral (HERV-K, -W, -E) and LINE-1.2 elements and were compared between non-transgenomic and plasmid-transgenomic cells. DNA methylation profiles in the HERV (K, W, E) and LINE sequences were determined by bisulfite genomic sequencing. The transcription of these genome segments was assessed by quantitative real-time PCR. In HERV-K, HERV-W and LINE-1.2 the levels of DNA methylation ranged between 75 and 98%, while in HERV-E they were around 60%. Nevertheless, the HERV and LINE-1.2 sequences were actively transcribed. No differences were found in comparisons of HERV and LINE-1.2 CpG methylation and transcription patterns between non-transgenomic and plasmid-transgenomic HCT116 cells. The insertion of a 5.6 kbp plasmid into the HCT116 genome had no effect on the HERV and LINE-1.2 methylation and transcription profiles, although other parts of the HCT116 genome had shown marked changes. These repetitive sequences are transcribed, probably because the large number of HERV and LINE-1.2 elements harbor copies with non- or hypo-methylated long terminal repeat sequences.
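
    Per-site methylation percentages of the kind reported above can be derived from bisulfite reads with a short calculation. The sketch below assumes the standard bisulfite principle (unmethylated cytosines read as T after conversion and amplification, methylated cytosines remain C), forward-strand reads already aligned to the reference, and hypothetical function names and toy data; it is not the authors' analysis pipeline.

```python
def cpg_positions(reference):
    """0-based positions of the C in every CpG dinucleotide of the reference."""
    return [i for i in range(len(reference) - 1) if reference[i:i + 2] == "CG"]


def methylation_levels(reference, reads):
    """Percent methylation per CpG from aligned, bisulfite-converted reads.

    A 'C' at a CpG position counts as methylated and a 'T' as unmethylated;
    any other character at that position is ignored.
    """
    levels = {}
    for pos in cpg_positions(reference):
        calls = [read[pos] for read in reads
                 if len(read) > pos and read[pos] in "CT"]
        if calls:
            levels[pos] = 100.0 * calls.count("C") / len(calls)
    return levels


if __name__ == "__main__":
    reference = "ACGTTCGA"
    reads = ["ACGTTTGA", "ACGTTCGA", "ATGTTTGA", "ACGTTTGA"]
    print(methylation_levels(reference, reads))
```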

  12. Ancient DNA in human bone remains from Pompeii archaeological site.

    Science.gov (United States)

    Cipollaro, M; Di Bernardo, G; Galano, G; Galderisi, U; Guarino, F; Angelini, F; Cascino, A

    1998-06-29

    Ancient DNA (aDNA) extraction and amplification procedures have been optimized for Pompeian human bone remains whose diagenesis has been determined by histological analysis. Amplifications of single-copy genes (X and Y amelogenin loci and Y-specific alphoid repeat sequences) have been performed and compared with anthropometric data on sexing.

  13. Quality standards for DNA sequence variation databases to improve clinical management under development in Australia

    Directory of Open Access Journals (Sweden)

    B. Bennetts

    2014-09-01

    Despite the routine nature of comparing sequence variations identified during clinical testing to database records, few databases meet quality requirements for clinical diagnostics. To address this issue, The Royal College of Pathologists of Australasia (RCPA), in collaboration with the Human Genetics Society of Australasia (HGSA) and the Human Variome Project (HVP), is developing standards for DNA sequence variation databases intended for use in the Australian clinical environment. The outputs of this project will be promoted to other health systems and accreditation bodies by the Human Variome Project to support the development of similar frameworks in other jurisdictions.

  14. The human DNA-activated protein kinase, DNA-PK: Substrate specificity

    Energy Technology Data Exchange (ETDEWEB)

    Anderson, C.W.; Connelly, M.A.; Zhang, H.; Sipley, J.A. [Brookhaven National Lab., Upton, NY (United States). Biology Dept.; Lees-Miller, S.P.; Lintott, L.G. [Univ. of Calgary, Alberta (Canada). Dept. of Biological Sciences; Sakaguchi, Kazuyasu; Appella, E. [National Institutes of Health, Bethesda, MD (United States). Lab. of Cell Biology

    1994-11-05

    Although much has been learned about the structure and function of p53 and the probable sequence of subsequent events that lead to cell cycle arrest, little is known about how DNA damage is detected and the nature of the signal that is generated by DNA damage. Circumstantial evidence suggests that protein kinases may be involved. In vitro, human DNA-PK phosphorylates a variety of nuclear DNA-binding, regulatory proteins including the tumor suppressor protein p53, the single-stranded DNA binding protein RPA, the heat shock protein hsp90, the large tumor antigen (TAg) of simian virus 40, a variety of transcription factors including Fos, Jun, serum response factor (SRF), Myc, Sp1, Oct-1, TFIID, E2F, the estrogen receptor, and the large subunit of RNA polymerase II (reviewed in Anderson, 1993; Jackson et al., 1993). However, for most of these proteins, the sites that are phosphorylated by DNA-PK are not known. To determine if the sites that were phosphorylated in vitro also were phosphorylated in vivo and if DNA-PK recognized a preferred protein sequence, the authors identified the sites phosphorylated by DNA-PK in several substrates by direct protein sequence analysis. Each phosphorylated serine or threonine is followed immediately by glutamine in the polypeptide chain; at no other positions are the amino acid residues obviously constrained.
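
    The sequence preference reported above (a phosphorylated serine or threonine followed immediately by glutamine) is easy to scan for in a protein sequence. The following is a minimal sketch with a hypothetical function name and an invented test peptide; a match only marks a candidate motif and does not show that the site is phosphorylated.

```python
import re


def sq_tq_sites(protein):
    """Return (1-based position, residue) for every S or T directly followed by Q."""
    # The lookahead (?=Q) checks the next residue without consuming it.
    return [(m.start() + 1, m.group())
            for m in re.finditer(r"[ST](?=Q)", protein.upper())]


if __name__ == "__main__":
    # Invented peptide for illustration only.
    print(sq_tq_sites("MSQETQAST"))
```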

  15. The human DNA-activated protein kinase, DNA-PK: Substrate specificity

    Energy Technology Data Exchange (ETDEWEB)

    Anderson, C.W.; Connelly, M.A.; Zhang, H.; Sipley, J.A. [Brookhaven National Lab., Upton, NY (United States). Biology Dept.; Lees-Miller, S.P.; Lintott, L.G. [Univ. of Calgary, Alberta (Canada). Dept. of Biological Sciences; Sakaguchi, Kazuyasu; Appella, E. [National Institutes of Health, Bethesda, MD (United States). Lab. of Cell Biology

    1994-11-05

    Although much has been learned about the structure and function of p53 and the probable sequence of subsequent events that lead to cell cycle arrest, little is known about how DNA damage is detected and the nature of the signal that is generated by DNA damage. Circumstantial evidence suggests that protein kinases may be involved. In vitro, human DNA-PK phosphorylates a variety of nuclear DNA-binding, regulatory proteins including the tumor suppressor protein p53, the single-stranded DNA binding protein RPA, the heat shock protein hsp90, the large tumor antigen (TAg) of simian virus 40, a variety of transcription factors including Fos, Jun, serum response factor (SRF), Myc, Sp1, Oct-1, TFIID, E2F, the estrogen receptor, and the large subunit of RNA polymerase II (reviewed in Anderson, 1993; Jackson et al., 1993). However, for most of these proteins, the sites that are phosphorylated by DNA-PK are not known. To determine if the sites that were phosphorylated in vitro also were phosphorylated in vivo and if DNA-PK recognized a preferred protein sequence, the authors identified the sites phosphorylated by DNA-PK in several substrates by direct protein sequence analysis. Each phosphorylated serine or threonine is followed immediately by glutamine in the polypeptide chain; at no other positions are the amino acid residues obviously constrained.

  16. Isolation and sequence analysis of a cDNA clone encoding the fifth complement component

    DEFF Research Database (Denmark)

    Lundwall, Åke B; Wetsel, Rick A; Kristensen, Torsten

    1985-01-01

    We have used available protein sequence data for the anaphylatoxin (C5a) portion of the fifth component of human complement (residues 19-25) to synthesize a mixed-sequence oligonucleotide probe. The labeled oligonucleotide was then used to screen a human liver cDNA library, and a single candidate cDNA clone of 1.85 kilobase pairs was isolated. Hybridization of the mixed-sequence probe to the complementary strand of the plasmid insert and sequence analysis by the dideoxy method predicted the expected protein sequence of C5a (positions 1-12), amino-terminal to the anticipated priming site. The sequence obtained further predicted an arginine-rich sequence (RPRR) immediately upstream of the N-terminal threonine of C5a, indicating that the promolecule form of C5 is synthesized with a beta alpha-chain orientation as previously shown for pro-C3 and pro-C4. The C5 cDNA clone was sheared randomly by sonication...

  17. Templated sequence insertion polymorphisms in the human genome

    Science.gov (United States)

    Onozawa, Masahiro; Aplan, Peter

    2016-11-01

    Templated Sequence Insertion Polymorphism (TSIP) is a recently described form of polymorphism recognized in the human genome, in which a sequence that is templated from a distant genomic region is inserted into the genome, seemingly at random. TSIPs can be grouped into two classes based on nucleotide sequence features at the insertion junctions; Class 1 TSIPs show features of insertions that are mediated via the LINE-1 ORF2 protein, including 1) target-site duplication (TSD), 2) polyadenylation 10-30 nucleotides downstream of a “cryptic” polyadenylation signal, and 3) preference for insertion at a 5’-TTTT/A-3’ sequence. In contrast, class 2 TSIPs show features consistent with repair of a DNA double-strand break via insertion of a DNA “patch” that is derived from a distant genomic region. Survey of a large number of normal human volunteers demonstrates that most individuals have 25-30 TSIPs, and that these TSIPs track with specific geographic regions. Similar to other forms of human polymorphism, we suspect that these TSIPs may be important for the generation of human diversity and genetic diseases.

  18. Temporal stability of epigenetic markers: sequence characteristics and predictors of short-term DNA methylation variations.

    Directory of Open Access Journals (Sweden)

    Hyang-Min Byun

    BACKGROUND: DNA methylation is an epigenetic mechanism that has been increasingly investigated in observational human studies, particularly on blood leukocyte DNA. Characterizing the degree and determinants of DNA methylation stability can provide critical information for the design and conduct of human epigenetic studies. METHODS: We measured DNA methylation in 12 gene-promoter regions (APC, p16, p53, RASSF1A, CDH13, eNOS, ET-1, IFNγ, IL-6, TNFα, iNOS, and hTERT) and 2 non-long terminal repeat elements, i.e., L1 and Alu, in blood samples obtained from 63 healthy individuals at baseline (Day 1) and after three days (Day 4). DNA methylation was measured by bisulfite-PCR-Pyrosequencing. We calculated intraclass correlation coefficients (ICCs) to measure the within-individual stability of DNA methylation between Day 1 and 4, subtracted of pyrosequencing error and adjusted for multiple covariates. RESULTS: Methylation markers showed different temporal behaviors ranging from high (IL-6, ICC = 0.89) to low stability (APC, ICC = 0.08) between Day 1 and 4. Multiple sequence and marker characteristics were associated with the degree of variation. Density of CpG dinucleotides near the sequence analyzed (measured as CpG(o/e) or G+C content within ±200 bp) was positively associated with DNA methylation stability. The 3' proximity to repeat elements and range of DNA methylation on Day 1 were also positively associated with methylation stability. An inverted U-shaped correlation was observed between mean DNA methylation on Day 1 and stability. CONCLUSIONS: The degree of short-term DNA methylation stability is marker-dependent and associated with sequence characteristics and methylation levels.
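
    The two sequence-density measures named in the abstract can be computed as below. The sketch assumes the commonly used definition of the CpG observed/expected ratio, (number of CpG × length) / (number of C × number of G); the abstract does not spell out the exact formula the authors used, and the function names are hypothetical.

```python
def gc_content(seq):
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_obs_exp(seq):
    """CpG observed/expected ratio: (#CpG * N) / (#C * #G)."""
    seq = seq.upper()
    c_count, g_count = seq.count("C"), seq.count("G")
    if c_count == 0 or g_count == 0:
        return 0.0
    return seq.count("CG") * len(seq) / (c_count * g_count)


def flank(seq, center, radius=200):
    """Window of +/- radius bases around a 0-based center, clipped at the ends."""
    return seq[max(0, center - radius):center + radius + 1]


if __name__ == "__main__":
    toy = "CGCGATAT"
    print(gc_content(toy), cpg_obs_exp(toy), flank("ACGTACGT", 4, radius=2))
```

    For a real marker, the window would be `flank(sequence, position)` with the default radius of 200 bp, matching the ±200 bp range mentioned in the abstract.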

  19. Assignment of casein kinase 2 alpha sequences to two different human chromosomes

    DEFF Research Database (Denmark)

    Boldyreff, B; Klett, C; Göttert, E

    1992-01-01

    Human casein kinase 2 alpha gene (CK-2-alpha) sequences have been localized within the human genome by in situ hybridization and somatic cell hybrid analysis using a CK-2 alpha cDNA as a probe. By in situ hybridization, the CK-2 alpha cDNA could be assigned to two different loci, one on 11p15.1-ter...

  20. Isolation and characterization of DNA probes for human chromosome 21.

    Science.gov (United States)

    Watkins, P C

    1990-01-01

    A coordinated effort to map and sequence the human genome has recently become a national priority. Chromosome 21, the smallest human chromosome accounting for less than 2% of the human genome, is an attractive model system for developing and evaluating genome mapping technology. Several strategies are currently being explored including the development of chromosome 21 libraries from somatic cell hybrids as reported here, the cloning of chromosome 21 in yeast artificial chromosomes (McCormick et al., 1989b), and the construction of chromosome 21 libraries using chromosome flow-sorting techniques (Fuscoe et al., 1989). This report describes the approaches used to identify DNA probes that are useful for mapping chromosome 21. Probes were successfully isolated from both phage and cosmid libraries made from two somatic cell hybrids that contain human chromosome 21 as the only human chromosome. The 15 cosmid clones from the WA17 library, reduced to cloned DNA sequences of an average size of 3 kb, total 525 kb of DNA which is approximately 1% of chromosome 21. From these clones, a set of polymorphic DNA markers that span the length of the long arm of chromosome 21 has been generated. All of the probes thus far analyzed from the WA17 libraries have been mapped to chromosome 21 both by physical and genetic mapping methods. It is therefore likely that the WA17 hybrid cell line contains human chromosome 21 as the only human component, in agreement with cytogenetic observation. The 153E7b cosmid libraries will provide an alternative source of cloned chromosome 21 DNA. Library screening techniques can be employed to obtain cloned DNA sequences from the same genetic loci of the two different chromosome 21s. Comparative analysis will allow direct estimation of DNA sequence variation for different regions of chromosome 21. 
    Mapped DNA probes make possible the molecular analysis of chromosome 21 at a level of resolution not achievable by classical cytogenetic techniques (Graw et al

  1. High-throughput DNA sequencing: a genomic data manufacturing process.

    Science.gov (United States)

    Huang, G M

    1999-01-01

    The progress trends in automated DNA sequencing operation are reviewed. Technological development in sequencing instruments, enzymatic chemistry and robotic stations has resulted in ever-increasing capacity of sequence data production. This progress leads to a higher demand on laboratory information management and data quality assessment. High-throughput laboratories face the challenge of organizational management, as well as technology management. Engineering principles of process control should be adopted in this biological data manufacturing procedure. While various systems attempt to provide solutions to automate different parts of, or even the entire process, new technical advances will continue to change the paradigm and provide new challenges.

  2. Functionalized nanopore-embedded electrodes for rapid DNA sequencing

    CERN Document Server

    He, Haiying; Pandey, Ravindra; Rocha, Alexandre Reily; Sanvito, Stefano; Grigoriev, Anton; Ahuja, Rajeev; Karna, Shashi P

    2007-01-01

    The determination of a patient's DNA sequence can, in principle, reveal an increased risk to fall ill with particular diseases [1,2] and help to design "personalized medicine" [3]. Moreover, statistical studies and comparison of genomes [4] of a large number of individuals are crucial for the analysis of mutations [5] and hereditary diseases, paving the way to preventive medicine [6]. DNA sequencing is, however, currently still a vastly time-consuming and very expensive task [4], consisting of pre-processing steps, the actual sequencing using the Sanger method, and post-processing in the form of data analysis [7]. Here we propose a new approach that relies on functionalized nanopore-embedded electrodes to achieve an unambiguous distinction of the four nucleic acid bases in the DNA sequencing process. This represents a significant improvement over previously studied designs [8,9] which cannot reliably distinguish all four bases of DNA. The transport properties of the setup investigated by us, employing state-o...

  3. POSA : Perl objects for DNA sequencing data analysis

    NARCIS (Netherlands)

    Aerts, JA; Jungerius, BJ; Groenen, MA

    2004-01-01

    Background: Capillary DNA sequencing machines allow the generation of vast amounts of data with little hands-on time. With this expansion of data generation, there is a growing need for automated data processing. Most available software solutions, however, still require user intervention or provide

  4. POSA: perl objects for DNA sequencing data analysis

    NARCIS (Netherlands)

    Aerts, J.A.; Jungerius, B.J.; Groenen, M.A.M.

    2004-01-01

    Background - Capillary DNA sequencing machines allow the generation of vast amounts of data with little hands-on time. With this expansion of data generation, there is a growing need for automated data processing. Most available software solutions, however, still require user intervention or provide

  5. DNA sequence handling programs in BASIC for home computers.

    OpenAIRE

    Biro, P A

    1984-01-01

    This paper describes a DNA sequence handling program written entirely in BASIC and designed to be run on an Atari home computer. Many of the features common to more sophisticated programs have been included. The advantages of this program are its convenience, its transportability and its potential for user modification. The disadvantages are lack of sophistication and speed.

  6. Decoding long nanopore sequencing reads of natural DNA.

    Science.gov (United States)

    Laszlo, Andrew H; Derrington, Ian M; Ross, Brian C; Brinkerhoff, Henry; Adey, Andrew; Nova, Ian C; Craig, Jonathan M; Langford, Kyle W; Samson, Jenny Mae; Daza, Riza; Doering, Kenji; Shendure, Jay; Gundlach, Jens H

    2014-08-01

    Nanopore sequencing of DNA is a single-molecule technique that may achieve long reads, low cost and high speed with minimal sample preparation and instrumentation. Here, we build on recent progress with respect to nanopore resolution and DNA control to interpret the procession of ion current levels observed during the translocation of DNA through the pore MspA. As approximately four nucleotides affect the ion current of each level, we measured the ion current corresponding to all 256 four-nucleotide combinations (quadromers). This quadromer map is highly predictive of ion current levels of previously unmeasured sequences derived from the bacteriophage phi X 174 genome. Furthermore, we show nanopore sequencing reads of phi X 174 up to 4,500 bases in length, which can be unambiguously aligned to the phi X 174 reference genome, and demonstrate proof-of-concept utility with respect to hybrid genome assembly and polymorphism detection. This work provides a foundation for nanopore sequencing of long, natural DNA strands.
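
    The quadromer map described above has 4^4 = 256 entries, one per four-nucleotide combination. The sketch below enumerates them and shows how a sequence could be turned into an expected series of current levels by table lookup. It assumes one level per single-nucleotide step, and the map values are placeholders, not measured MspA currents; all names are hypothetical.

```python
from itertools import product


def all_quadromers():
    """All 256 four-nucleotide combinations in lexicographic order."""
    return ["".join(p) for p in product("ACGT", repeat=4)]


def quadromers_of(seq):
    """Overlapping four-nucleotide windows of a sequence, one per step."""
    return [seq[i:i + 4] for i in range(len(seq) - 3)]


def predict_levels(seq, quadromer_map):
    """Look up the expected current level for each successive quadromer."""
    return [quadromer_map[q] for q in quadromers_of(seq)]


if __name__ == "__main__":
    # Placeholder map: the index stands in for a measured current value.
    toy_map = {q: index for index, q in enumerate(all_quadromers())}
    print(len(toy_map))
    print(quadromers_of("ACGTAC"))
    print(predict_levels("AAAAC", toy_map))
```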

  7. POSA : Perl objects for DNA sequencing data analysis

    NARCIS (Netherlands)

    Aerts, JA; Jungerius, BJ; Groenen, MA

    2004-01-01

    Background: Capillary DNA sequencing machines allow the generation of vast amounts of data with little hands-on time. With this expansion of data generation, there is a growing need for automated data processing. Most available software solutions, however, still require user intervention or provide

  8. Cloning and sequencing of a DNA fragment encoding N37 apoptotic peptide derived from p53

    Institute of Scientific and Technical Information of China (English)

    2009-01-01

    Objective: It was reported that the p53 apoptotic peptide (N37) could inhibit the p73 gene by binding to iASPP, which could induce tumor cell apoptosis. To further explore the function of N37, we constructed the cloning plasmid of the DNA fragment encoding the p53 (N37) apoptotic peptide by using DNA synthesis and molecular biology methods. Methods: According to the human p53 sequence from the GenBank database, the primer of the p53 (N37) gene was designed using Primer V7.0 software. The DNA fragment encoding p53 (N37) apopto...

  9. ADN-Viewer: a 3D approach for bioinformatic analyses of large DNA sequences.

    Science.gov (United States)

    Hérisson, Joan; Ferey, Nicolas; Gros, Pierre-Emmanuel; Gherbi, Rachid

    2007-01-20

    Most biologists work on textual DNA sequences, which are limited to the linear representation of DNA. In this paper, we address the potential offered by Virtual Reality for 3D modeling and immersive visualization of large genomic sequences. The representation of the 3D structure of naked DNA allows biologists to observe and analyze genomes in an interactive way at different levels. We developed a powerful software platform that provides a new point of view for sequence analysis: ADN-Viewer. Nevertheless, a classical eukaryotic chromosome of 40 million base pairs requires about 6 Gbytes of 3D data. In order to manage these huge amounts of data in real time, we designed various scene management algorithms and immersive human-computer interaction for user-friendly data exploration. In addition, one bioinformatics study scenario is proposed.

  10. More of an Art than a Science: Using Microbial DNA Sequences to Compose Music

    Science.gov (United States)

    Larsen, Peter E.

    2016-01-01

    Bacteria are everywhere. Microbial ecology is emerging as a critical field for understanding the relationships between these ubiquitous bacterial communities, the environment, and human health. Next generation DNA sequencing technology provides us a powerful tool to indirectly observe the communities by sequencing and analyzing all of the bacterial DNA present in an environment. The results of the DNA sequencing experiments can generate gigabytes to terabytes of information, however, making it difficult for the citizen scientist to grasp and the educator to convey this data. Here, we present a method for interpreting massive amounts of microbial ecology data as musical performances, easily generated on any computer and using only commonly available or freely available software and the ‘Microbial Bebop’ algorithm. Using this approach, citizen scientists and biology educators can sonify complex data in a fun and interactive format, making it easier to communicate both the importance and the excitement of exploring the planet earth’s largest ecosystem. PMID:27047609

  11. More of an Art than a Science: Using Microbial DNA Sequences to Compose Music.

    Science.gov (United States)

    Larsen, Peter E

    2016-03-01

    Bacteria are everywhere. Microbial ecology is emerging as a critical field for understanding the relationships between these ubiquitous bacterial communities, the environment, and human health. Next generation DNA sequencing technology provides us a powerful tool to indirectly observe the communities by sequencing and analyzing all of the bacterial DNA present in an environment. The results of the DNA sequencing experiments can generate gigabytes to terabytes of information, however, making it difficult for the citizen scientist to grasp and the educator to convey this data. Here, we present a method for interpreting massive amounts of microbial ecology data as musical performances, easily generated on any computer and using only commonly available or freely available software and the 'Microbial Bebop' algorithm. Using this approach, citizen scientists and biology educators can sonify complex data in a fun and interactive format, making it easier to communicate both the importance and the excitement of exploring the planet earth's largest ecosystem.

  12. More of an Art than a Science: Using Microbial DNA Sequences to Compose Music

    Directory of Open Access Journals (Sweden)

    Peter E. Larsen

    2015-12-01

    Bacteria are everywhere. Microbial ecology is emerging as a critical field for understanding the relationships between these ubiquitous bacterial communities, the environment, and human health. Next generation DNA sequencing technology provides us a powerful tool to indirectly observe the communities by sequencing and analyzing all of the bacterial DNA present in an environment. The results of the DNA sequencing experiments can generate gigabytes to terabytes of information, however, making it difficult for the citizen scientist to grasp and the educator to convey this data. Here, we present a method for interpreting massive amounts of microbial ecology data as musical performances, easily generated on any computer and using only commonly available or freely available software and the ‘Microbial Bebop’ algorithm. Using this approach, citizen scientists and biology educators can sonify complex data in a fun and interactive format, making it easier to communicate both the importance and the excitement of exploring the planet earth’s largest ecosystem.

  13. MAGIC-SPP: a database-driven DNA sequence processing package with associated management tools

    Directory of Open Access Journals (Sweden)

    Qu Junfeng

    2006-03-01

    Background: Processing raw DNA sequence data is an especially challenging task for relatively small laboratories and core facilities that produce as many as 5000 or more DNA sequences per week from multiple projects in widely differing species. To meet this challenge, we have developed the flexible, scalable, and automated sequence processing package described here. Results: MAGIC-SPP is a DNA sequence processing package consisting of an Oracle 9i relational database, a Perl pipeline, and user interfaces implemented either as JavaServer Pages (JSP) or as a Java graphical user interface (GUI). The database not only serves as a data repository, but also controls processing of trace files. MAGIC-SPP includes an administrative interface, a laboratory information management system, and interfaces for exploring sequences, monitoring quality control, and troubleshooting problems related to sequencing activities. In the sequence trimming algorithm it employs new features designed to improve performance with respect to concerns such as concatenated linkers, identification of the expected start position of a vector insert, and extending the useful length of trimmed sequences by bridging short regions of low quality when the following high quality segment is sufficiently long to justify doing so. Conclusion: MAGIC-SPP has been designed to minimize human error, while simultaneously being robust, versatile, flexible and automated. It offers a unique combination of features that permit administration by a biologist with little or no informatics background. It is well suited to both individual research programs and core facilities.
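    The bridging step mentioned in this abstract can be sketched roughly as follows. This is not MAGIC-SPP's actual algorithm, which the abstract does not specify: the function name `trim_with_bridging`, the thresholds `min_q`, `max_gap` and `min_next`, and the choice to start from the first high-quality run are all assumptions made for illustration.

```python
def trim_with_bridging(quals, min_q=20, max_gap=5, min_next=20):
    """Return a half-open (start, end) interval of bases to keep.

    Starts at the first high-quality run and extends across a low-quality
    gap only if the gap is short and the next high-quality run is long.
    """
    # Split the quality scores into alternating high/low runs.
    runs = []  # each entry: (is_high, start, end)
    i, n = 0, len(quals)
    while i < n:
        high = quals[i] >= min_q
        j = i
        while j < n and (quals[j] >= min_q) == high:
            j += 1
        runs.append((high, i, j))
        i = j

    # Locate the first high-quality run; nothing to keep if there is none.
    idx = next((r for r, run in enumerate(runs) if run[0]), None)
    if idx is None:
        return (0, 0)
    start, end = runs[idx][1], runs[idx][2]

    # Runs alternate, so idx + 1 is a low-quality gap and idx + 2 is high.
    while idx + 2 < len(runs):
        _, gap_start, gap_end = runs[idx + 1]
        _, next_start, next_end = runs[idx + 2]
        if gap_end - gap_start <= max_gap and next_end - next_start >= min_next:
            end = next_end  # bridge the gap and absorb the next segment
            idx += 2
        else:
            break
    return (start, end)
```

    With the default thresholds, a 2-base dip followed by 20 good bases is bridged, while a 4-base dip followed by only 5 good bases ends the trimmed region.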

  14. A simple method encoding linear single strain DNA sequence with natural numbers

    Institute of Scientific and Technical Information of China (English)

    LI Jiye; XU Yuan; ZHANG Wang

    2008-01-01

    A simple method for representing a linear single strain DNA (LssDNA) sequence with natural numbers is introduced in this paper. The method maps the bases of the LssDNA to the numerals 1, 2, 3 and 4. After calculation, the sequence can be coded as a natural number, which can in turn be decoded into the DNA sequence. Thus, an LssDNA sequence can be expressed as a natural number and as a dot on coordinate axes. In the future, a new LssDNA sequence database termed "DotBank" could be realized in which each LssDNA sequence is determined as a dot.
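    The abstract does not state the calculation, so the following is only one possible sketch of a reversible encoding with the numerals 1 to 4: bijective base-4 numeration. The assignment A=1, C=2, G=3, T=4 and the function names `encode` and `decode` are assumptions, not the authors' definitions.

```python
# Assumed digit assignment; the abstract only says the numerals 1-4 are used.
DIGIT = {"A": 1, "C": 2, "G": 3, "T": 4}
BASE = {v: k for k, v in DIGIT.items()}


def encode(seq):
    """Map a DNA string to a natural number (bijective base-4)."""
    n = 0
    for base in seq:
        n = n * 4 + DIGIT[base]
    return n


def decode(n):
    """Invert encode(): recover the DNA string from its natural number."""
    bases = []
    while n > 0:
        d = n % 4 or 4  # digits run 1..4, so a remainder of 0 means digit 4
        bases.append(BASE[d])
        n = (n - d) // 4
    return "".join(reversed(bases))
```

    Because the digits run from 1 to 4 rather than 0 to 3, leading bases are never lost: every sequence, including ones that start with A, maps to a distinct natural number.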

  15. Inhibition of hepatitis B virus replication with linear DNA sequences expressing antiviral micro-RNA shuttles

    Energy Technology Data Exchange (ETDEWEB)

    Chattopadhyay, Saket; Ely, Abdullah; Bloom, Kristie; Weinberg, Marc S. [Antiviral Gene Therapy Research Unit, University of the Witwatersrand (South Africa); Arbuthnot, Patrick, E-mail: Patrick.Arbuthnot@wits.ac.za [Antiviral Gene Therapy Research Unit, University of the Witwatersrand (South Africa)

    2009-11-20

    RNA interference (RNAi) may be harnessed to inhibit viral gene expression and this approach is being developed to counter chronic infection with hepatitis B virus (HBV). Compared to synthetic RNAi activators, DNA expression cassettes that generate silencing sequences have advantages of sustained efficacy and ease of propagation in plasmid DNA (pDNA). However, the large size of pDNAs and inclusion of sequences conferring antibiotic resistance and immunostimulation limit delivery efficiency and safety. To develop use of alternative DNA templates that may be applied for therapeutic gene silencing, we assessed the usefulness of PCR-generated linear expression cassettes that produce anti-HBV micro-RNA (miR) shuttles. We found that silencing of HBV markers of replication was efficient (>75%) in cell culture and in vivo. miR shuttles were processed to form anti-HBV guide strands and there was no evidence of induction of the interferon response. Modification of terminal sequences to include flanking human adenoviral type-5 inverted terminal repeats was easily achieved and did not compromise silencing efficacy. These linear DNA sequences should have utility in the development of gene silencing applications where modifications of terminal elements with elimination of potentially harmful and non-essential sequences are required.

  16. Sequence heterogeneity accelerates protein search for targets on DNA

    Energy Technology Data Exchange (ETDEWEB)

    Shvets, Alexey A.; Kolomeisky, Anatoly B., E-mail: tolya@rice.edu [Department of Chemistry and Center for Theoretical Biological Physics, Rice University, Houston, Texas 77005 (United States)

    2015-12-28

    The process of protein search for specific binding sites on DNA is fundamentally important since it marks the beginning of all major biological processes. We present a theoretical investigation that probes the role of DNA sequence symmetry, heterogeneity, and chemical composition in the protein search dynamics. Using a discrete-state stochastic approach with a first-passage events analysis, which takes into account the most relevant physical-chemical processes, a full analytical description of the search dynamics is obtained. It is found that, contrary to existing views, the protein search is generally faster on DNA with more heterogeneous sequences. In addition, the search dynamics might be affected by the chemical composition near the target site. The physical origins of these phenomena are discussed. Our results suggest that biological processes might be effectively regulated by modifying chemical composition, symmetry, and heterogeneity of a genome.

  17. Solid-State Nanopore-Based DNA Sequencing Technology

    Directory of Open Access Journals (Sweden)

    Zewen Liu

    2016-01-01

    Solid-state nanopore-based DNA sequencing technology is becoming increasingly attractive for its promising future in the field of gene detection. The challenges that need to be addressed are diverse: effective methods to detect base-specific signatures, control of the nanopore’s size and surface properties, and modulation of the translocation velocity and behavior of the DNA molecules. Among these challenges, the realization of high-quality nanopores with the help of modern micro/nanofabrication technologies is a crucial one. In this paper, typical technologies applied in the field of solid-state nanopore-based DNA sequencing are reviewed.

  18. The linguistics of DNA. [HUMAN GENOME PROJECT]

    Energy Technology Data Exchange (ETDEWEB)

    Searls, D.B. (Univ. of Pennsylvania, Philadelphia (United States))

    Discusses the structure of DNA and RNA and the mechanisms of transcription and translation in relation to the grammatical rules of language. The ultimate purpose is to design a grammar which can be used to write flexible, adaptive computer programs for searching nucleotide sequences, with the goal of being able to search large sequences for gene-coding regions. 11 refs., 16 figs.

  19. A blind testing design for authenticating ancient DNA sequences.

    Science.gov (United States)

    Yang, H; Golenberg, E M; Shoshani, J

    1997-04-01

    Reproducibility is a serious concern among researchers of ancient DNA. We designed a blind testing procedure to evaluate laboratory accuracy and authenticity of ancient DNA obtained from closely related extant and extinct species. Soft tissue and bones of fossil and contemporary museum proboscideans were collected and identified based on morphology by one researcher, and other researchers carried out DNA testing on the samples, which were assigned anonymous numbers. DNA extracted using three principal isolation methods served as template in PCR amplifications of a segment of the cytochrome b gene (mitochondrial genome), and the PCR product was directly sequenced and analyzed. The results show that such a blind testing design performed in one laboratory, when coupled with phylogenetic analysis, can nonarbitrarily test the consistency and reliability of ancient DNA results. Such reproducible results obtained from the blind testing can increase confidence in the authenticity of ancient sequences obtained from postmortem specimens and avoid bias in phylogenetic analysis. A blind testing design may be applicable as an alternative to confirm ancient DNA results in one laboratory when independent testing by two laboratories is not available.

  20. POSA: Perl Objects for DNA Sequencing Data Analysis

    Directory of Open Access Journals (Sweden)

    Jungerius Bart J

    2004-08-01

    Background: Capillary DNA sequencing machines allow the generation of vast amounts of data with little hands-on time. With this expansion of data generation, there is a growing need for automated data processing. Most available software solutions, however, still require user intervention or provide modules that need advanced informatics skills to allow implementation in pipelines. Results: Here we present POSA, a pair of new Perl objects that describe DNA sequence traces and Phrap contig assemblies in detail. Methods in POSA include basecalling with quality scores (by Phred), contig assembly (by Phrap), generation of primer3 input and automated SNP annotation (by PolyPhred). Although easily implemented by users with only limited programming experience, these objects considerably reduce hands-on analysis time compared to using the Staden package for extracting sequence information from raw sequencing files and for SNP discovery. Conclusions: The POSA objects allow a flexible and easy design, implementation and usage of Perl-based pipelines to handle and analyze DNA sequencing data, while requiring only minor programming skills.

  1. DNA watermarks in non-coding regulatory sequences

    Directory of Open Access Journals (Sweden)

    Pyka Martin

    2009-07-01

    Background: DNA watermarks can be applied to identify the unauthorized use of genetically modified organisms. It has been shown that coding regions can be used to encrypt information into living organisms by using the DNA-Crypt algorithm. Yet, if the sequence of interest presents a non-coding DNA sequence, either the function of a resulting functional RNA molecule or a regulatory sequence, such as a promoter, could be affected. For our studies we used the small cytoplasmic RNA 1 in yeast and the lac promoter region of Escherichia coli. Findings: The lac promoter was deactivated by the integrated watermark. In addition, the RNA molecules displayed altered configurations after introducing a watermark, but surprisingly were functionally intact, which has been verified by analyzing the growth characteristics of both wild type and watermarked scR1 transformed yeast cells. In a third approach we introduced a second overlapping watermark into the lac promoter, which did not affect the promoter activity. Conclusion: Even though the watermarked RNA and one of the watermarked promoters did not show any significant differences compared to the wild type RNA and wild type promoter region, respectively, it cannot be generalized that other RNA molecules or regulatory sequences behave accordingly. Therefore, we do not recommend integrating watermark sequences into regulatory regions.

  2. DNA sequence analysis of newly formed telomeres in yeast.

    Science.gov (United States)

    Wang, S S; Pluta, A F; Zakian, V A

    1989-01-01

    A plasmid can be maintained in linear form in baker's yeast if it bears telomeric sequences at each end. Linear plasmids bearing cloned telomeric C4A4 repeats at one end (test end) and a natural DNA terminus with approximately 300 bps of C4A2 repeats at the other or control end were introduced by transformation into yeast. Test-end termini of 28 to 112 bps supported telomere formation. During telomere formation, C4A2 repeats were often transferred to test-end termini. To determine in greater detail the fate of test-end sequences on these plasmids after propagation in yeast, test-end telomeres were subcloned into E. coli and sequenced. DNA sequencing established a number of points about the molecular events involved in telomere formation in yeast. The results suggest that there are at least two mechanisms for telomere formation in yeast. One is mediated by a recombination event that requires neither a long stretch of homology nor the RAD52 gene product. The other mechanism is by addition of C1-3A repeats to the termini of linear DNA molecules. The telomeric sequence required to support C1-3A addition need not be at the very end of a molecule for telomere formation.

  3. A Nano-Biosensor for DNA Sequence Detection Using Absorption Spectra of SWNT-DNA Composite

    Directory of Open Access Journals (Sweden)

    J. Bansal

    2011-01-01

    A biosensor based on a Single Walled Carbon Nanotube (SWNT)-Poly(GT)n ssDNA hybrid has been developed for medical diagnostics. The absorption spectrum of this assay is determined with the help of a Shimadzu UV-VIS-NIR spectrophotometer. Two distinct bands, each containing three peaks corresponding to the first and second van Hove singularities in the density of states of the nanotubes, were observed in the absorption spectrum. When a single-stranded DNA (ssDNA) having a sequence complementary to the probe DNA is added to the ssDNA-SWNT conjugates, hybridization takes place, which causes a red shift of the absorption spectrum of the nanotubes. On the other hand, when the DNA is noncomplementary, no shift in the absorption spectrum occurs since hybridization between the DNA and probe does not take place. The red shifting of the spectrum is considered to be due to a change in the dielectric environment around the nanotubes.

  4. Detection of papillomaviral DNA sequences in a feline oral squamous cell carcinoma.

    Science.gov (United States)

    Munday, J S; Howe, L; French, A; Squires, R A; Sugiarto, H

    2009-04-01

    Oral squamous cell carcinomas (OSCCs) are common and often fatal feline neoplasms. Factors that predispose to neoplasm development in cats are poorly defined. Around 25% of human OSCCs are caused by papillomaviruses (PVs). To determine if PVs are associated with OSCCs in cats, three sets of consensus primers were used to evaluate 20 feline OSCCs and 20 non-neoplastic feline oral lesions for the presence of PV DNA. Papillomaviral sequences were detected within one OSCC, but no non-neoplastic lesion. Sequencing of the amplified DNA revealed a previously unreported PV that was most similar to human PV type 76. This is the first time PV DNA has been amplified from the oral cavity of a cat. However, while these results suggest that feline gingival epithelial cells can be infected by PVs, they do not support a causal association between viral infection and the development of feline OSCCs.

  5. The DNA methylome of human peripheral blood mononuclear cells

    DEFF Research Database (Denmark)

    Li, Yingrui; Zhu, Jingde; Tian, Geng

    2010-01-01

    DNA methylation plays an important role in biological processes in human health and disease. Recent technological advances allow unbiased whole-genome DNA methylation (methylome) analysis to be carried out on human cells. Using whole-genome bisulfite sequencing at 24.7-fold coverage (12.3-fold per...... strand), we report a comprehensive (92.62%) methylome and analysis of the unique sequences in human peripheral blood mononuclear cells (PBMC) from the same Asian individual whose genome was deciphered in the YH project. PBMC constitute an important source for clinical blood tests world-wide. We found...... that 68.4% of CpG sites and 80% displayed allele-specific expression (ASE). These data demonstrate that ASM is a recurrent phenomenon and is highly correlated with ASE in human PBMCs. Together with recently reported similar studies, our study provides a comprehensive resource for future epigenomic...

  6. Environmental DNA sequencing primers for eutardigrades and bdelloid rotifers

    Directory of Open Access Journals (Sweden)

    Martin Andrew P

    2009-12-01

    Background: The time it takes to isolate individuals from environmental samples and then extract DNA from each individual is one of the problems with generating molecular data from meiofauna such as eutardigrades and bdelloid rotifers. The lack of consistent morphological information and the extreme abundance of these classes makes morphological identification of rare, or even common cryptic taxa a large and unwieldy task. This limits the ability to perform large-scale surveys of the diversity of these organisms. Here we demonstrate a culture-independent molecular survey approach that enables the generation of large amounts of eutardigrade and bdelloid rotifer sequence data directly from soil. Our PCR primers, specific to the 18S small-subunit rRNA gene, were developed for both eutardigrades and bdelloid rotifers. Results: The developed primers successfully amplified DNA of their target organism from various soil DNA extracts. This was confirmed by both the BLAST similarity searches and phylogenetic analyses. Tardigrades showed much better phylogenetic resolution than bdelloids. Both groups of organisms exhibited varying levels of endemism. Conclusion: The development of clade-specific primers for characterizing eutardigrades and bdelloid rotifers from environmental samples should greatly increase our ability to characterize the composition of these taxa in environmental samples. Environmental sequencing as shown here differs from other molecular survey methods in that there is no need to pre-isolate the organisms of interest from soil in order to amplify their DNA. The DNA sequences obtained from methods that do not require culturing can be identified post-hoc and placed phylogenetically as additional closely related sequences are obtained from morphologically identified conspecifics. Our non-cultured environmental sequence based approach will be able to provide a rapid and large-scale screening of the presence, absence and diversity of

  7. Environmental DNA sequencing primers for eutardigrades and bdelloid rotifers

    Science.gov (United States)

    2009-01-01

    Background: The time it takes to isolate individuals from environmental samples and then extract DNA from each individual is one of the problems with generating molecular data from meiofauna such as eutardigrades and bdelloid rotifers. The lack of consistent morphological information and the extreme abundance of these classes makes morphological identification of rare, or even common cryptic taxa a large and unwieldy task. This limits the ability to perform large-scale surveys of the diversity of these organisms. Here we demonstrate a culture-independent molecular survey approach that enables the generation of large amounts of eutardigrade and bdelloid rotifer sequence data directly from soil. Our PCR primers, specific to the 18S small-subunit rRNA gene, were developed for both eutardigrades and bdelloid rotifers. Results: The developed primers successfully amplified DNA of their target organism from various soil DNA extracts. This was confirmed by both the BLAST similarity searches and phylogenetic analyses. Tardigrades showed much better phylogenetic resolution than bdelloids. Both groups of organisms exhibited varying levels of endemism. Conclusion: The development of clade-specific primers for characterizing eutardigrades and bdelloid rotifers from environmental samples should greatly increase our ability to characterize the composition of these taxa in environmental samples. Environmental sequencing as shown here differs from other molecular survey methods in that there is no need to pre-isolate the organisms of interest from soil in order to amplify their DNA. The DNA sequences obtained from methods that do not require culturing can be identified post-hoc and placed phylogenetically as additional closely related sequences are obtained from morphologically identified conspecifics. Our non-cultured environmental sequence based approach will be able to provide a rapid and large-scale screening of the presence, absence and diversity of Bdelloidea and Eutardigrada in

  8. Spectral sum rules and search for periodicities in DNA sequences

    Science.gov (United States)

    Chechetkin, V. R.

    2011-04-01

    Periodic patterns play important regulatory and structural roles in genomic DNA sequences. Commonly, the underlying periodicities should be understood in a broad statistical sense, since the corresponding periodic patterns have been strongly distorted by random point mutations and insertions/deletions during molecular evolution. The latent periodicities in DNA sequences can be efficiently displayed by the Fourier transform. The criteria of significance for observed periodicities are obtained by comparison with the corresponding characteristics of reference random sequences. We show that the restrictions imposed on the significance criteria by the rigorous spectral sum rules can be rationally described with the De Finetti distribution. This distribution provides a convenient intermediate asymptotic form between the Rayleigh distribution and exact combinatoric theory.
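    As a rough illustration of displaying periodicities by Fourier transform, the sketch below computes a discrete Fourier transform of the 0/1 indicator sequence of each nucleotide and sums the squared amplitudes per harmonic. It does not implement the paper's significance criteria or sum rules; the function names and the normalization by sequence length are assumptions.

```python
import cmath


def structure_factors(seq, alphabet="ACGT"):
    """Return {k: summed spectral power} for harmonics k = 1 .. N // 2."""
    n = len(seq)
    power = {}
    for k in range(1, n // 2 + 1):
        total = 0.0
        for base in alphabet:
            # DFT of the indicator sequence of this base at harmonic k.
            amp = sum(cmath.exp(-2j * cmath.pi * k * m / n)
                      for m, s in enumerate(seq) if s == base)
            total += abs(amp) ** 2
        power[k] = total / n  # assumed normalization
    return power


def strongest_period(seq):
    """Period (in bases) of the harmonic with the largest summed power."""
    power = structure_factors(seq)
    k = max(power, key=power.get)
    return len(seq) / k
```

    For a perfect tandem repeat such as `"ACG" * 20`, all power falls on the harmonic corresponding to a period of 3 bases. In real sequences the peak would have to be judged against the spectrum of random reference sequences, as the abstract describes.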

  9. DNA sequencing leads to genomics progress in China

    Institute of Scientific and Technical Information of China (English)

    WU JiaYan; XIAO JingFa; ZHANG RuoSi; YU Jun

    2011-01-01

    1 Science in the large-scale sequencing era Ten years ago, the first draft sequence assembly of the human genome was completed [1], bringing biomedical research one step closer toward the goal of revolutionizing diagnosis, prevention, and treatment of human diseases. Recently, journalists from the journal Nature surveyed more than 1000 life scientists regarding this laudable aim [2], obtaining substantially negative responses [3]. However, almost all of those surveyed had been influenced, in one way or another, by the availability of the human genome sequence, and they also agreed with the notion that the "sequence is the start." The complexity of genome biology and almost every aspect of human biology is far greater than previously thought [4].

  10. VoSeq: a voucher and DNA sequence web application.

    Directory of Open Access Journals (Sweden)

    Carlos Peña

    There is an ever growing number of molecular phylogenetic studies published, due, in part, to the advent of new techniques that allow cheap and quick DNA sequencing. Hence, the demand for relational databases with which to manage and annotate the amassing DNA sequences, genes, voucher specimens and associated biological data is increasing. In addition, a user-friendly interface is necessary for easy integration and management of the data stored in the database back-end. Available databases allow management of a wide variety of biological data. However, most database systems are not specifically constructed with the aim of being an organizational tool for researchers working in phylogenetic inference. We here report a new software facilitating easy management of voucher and sequence data, consisting of a relational database as back-end for a graphic user interface accessed via a web browser. The application, VoSeq, includes tools for creating molecular datasets of DNA or amino acid sequences ready to be used in commonly used phylogenetic software such as RAxML, TNT, MrBayes and PAUP, as well as for creating tables ready for publishing. It also has inbuilt BLAST capabilities against all DNA sequences stored in VoSeq as well as sequences in NCBI GenBank. By using mash-ups and calls to web services, VoSeq allows easy integration with public services such as Yahoo! Maps, Flickr, Encyclopedia of Life (EOL) and GBIF (by generating data-dumps that can be processed with GBIF's Integrated Publishing Toolkit).

  11. Cloning, characterization, and properties of seven triplet repeat DNA sequences.

    Science.gov (United States)

    Ohshima, K; Kang, S; Larson, J E; Wells, R D

    1996-07-12

    Several neuromuscular and neurodegenerative diseases are caused by genetically unstable triplet repeat sequences (CTG.CAG, CGG.CCG, or AAG.CTT) in or near the responsible genes. We implemented novel cloning strategies with chemically synthesized oligonucleotides to clone seven of the triplet repeat sequences (GTA.TAC, GAT.ATC, GTT.AAC, CAC.GTG, AGG.CCT, TCG.CGA, and AAG.CTT), and the adjoining paper (Ohshima, K., Kang, S., Larson, J. E., and Wells, R. D. (1996) J. Biol. Chem. 271, 16784-16791) describes studies on TTA.TAA. This approach in conjunction with in vivo expansion studies in Escherichia coli enabled the preparation of at least 81 plasmids containing the repeat sequences with lengths of approximately 16 up to 158 triplets in both orientations with varying extents of polymorphisms. The inserts were characterized by DNA sequencing as well as DNA polymerase pausings, two-dimensional agarose gel electrophoresis, and chemical probe analyses to evaluate the capacity to adopt negative supercoil induced non-B DNA conformations. AAG.CTT and AGG.CCT form intramolecular triplexes, and the other five repeat sequences do not form any previously characterized non-B structures. However, long tracts of TCG.CGA showed strong inhibition of DNA synthesis at specific loci in the repeats as seen in the cases of CTG.CAG and CGG.CCG (Kang, S., Ohshima, K., Shimizu, M., Amirhaeri, S., and Wells, R. D. (1995) J. Biol. Chem. 270, 27014-27021). This work along with other studies (Wells, R. D. (1996) J. Biol. Chem. 271, 2875-2878) on CTG.CAG, CGG.CCG, and TTA.TAA makes available long inserts of all 10 triplet repeat sequences for a variety of physical, molecular biological, genetic, and medical investigations. A model to explain the reduction in mRNA abundance in Friedreich's ataxia based on intermolecular triplex formation is proposed.
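    The count of "all 10 triplet repeat sequences" can be checked with a short sketch. It assumes that two triplets describe the same duplex repeat when one is a cyclic shift of the other or of its reverse complement, and that homopolymer triplets (AAA, CCC, GGG, TTT) are excluded; the function names are hypothetical.

```python
from itertools import product

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(s):
    """Reverse complement of a DNA string."""
    return s.translate(COMPLEMENT)[::-1]


def canonical(triplet):
    """Smallest string among all rotations of the triplet on both strands."""
    candidates = []
    for strand in (triplet, revcomp(triplet)):
        candidates += [strand[i:] + strand[:i] for i in range(3)]
    return min(candidates)


def triplet_repeat_classes():
    """Canonical representatives of all non-homopolymer triplet repeats."""
    classes = set()
    for t in map("".join, product("ACGT", repeat=3)):
        if len(set(t)) > 1:  # skip AAA, CCC, GGG, TTT
            classes.add(canonical(t))
    return sorted(classes)
```

    Under these assumptions the 60 non-homopolymer triplets collapse into 10 classes of six, and CTG and CAG fall into the same class, consistent with the paired notation CTG.CAG used in the abstract.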

  12. Perspectives of DNA microarray and next-generation DNA sequencing technologies

    Institute of Scientific and Technical Information of China (English)

    2009-01-01

    DNA microarray and next-generation DNA sequencing technologies are important tools for high-throughput genome research, in revealing both the structural and functional characteristics of genomes. In the past decade the DNA microarray technologies have been widely applied in the studies of functional genomics, systems biology and pharmacogenomics. The next-generation DNA sequencing method was first introduced by the 454 Company in 2003, immediately followed by the establishment of the Solexa and SOLiD techniques by other biotech companies. Though it has not been long since the first emergence of this technology, with the fast and impressive improvement, the application of this technology has extended to almost all fields of genomics research, as a rival challenging the existing DNA microarray technology. This paper briefly reviews the working principles of these two technologies as well as their application and perspectives in genome research.

  13. Detection of genetically modified DNA sequences in milk from the Italian market.

    Science.gov (United States)

    Agodi, Antonella; Barchitta, Martina; Grillo, Agata; Sciacca, Salvatore

    2006-01-01

    The possible transfer and accumulation of novel DNA and/or proteins in food for human consumption derived from animals receiving genetically modified (GM) feed is at present the object of scientific dispute. A number of studies failed to identify GM DNA in milk, meat, or eggs derived from livestock receiving GM feed ingredients. The present study was performed in order to: (i) develop a valid protocol by PCR and multicomponent analysis for the detection of specific DNA sequences in milk, focused on GM maize and GM soybean; (ii) assess the stability of transgenic DNA after pasteurization treatment and (iii) determine the presence of GM DNA sequences in milk samples collected from the Italian market. Results from the screening of 60 samples of 12 different milk brands demonstrated the presence of GM maize sequences in 15 (25%) and of GM soybean sequences in 7 samples (11.7%). Our screening methodology shows a very high sensitivity and the use of an automatic identification of the amplified products increases its specificity and reliability. Moreover, we demonstrated that the pasteurization process is not able to degrade the DNA sequences in spiked milk samples. The detection of GM DNA in milk can be interpreted as an indicator of fecal or airborne contamination, respectively, with feed DNA or feed particles, although an alternative source of contamination, possibly recognizable in the natural environment can be suggested. Further studies, performed on a larger number of milk samples, are needed to understand the likely source of contamination of milk collected from the Italian market.

  14. Origin and diversification of minisatellites derived from human Alu sequences.

    Science.gov (United States)

    Jurka, Jerzy; Gentles, Andrew J

    2006-01-01

    We analyze minisatellites derived from Alu fragments corresponding approximately to the first 44 bases of human Alu consensus sequences from different subfamilies. The origin of Alu-derived minisatellites appears to have been mediated by short flanking repeats, as first proposed by Haber and Louis [Haber, J.E., Louis, E.J., 1998. Minisatellite origins in yeast and humans. Genomics 48, 132-135.]. We also present evidence for base substitutions and deletions introduced to minisatellites by gene conversion with partially similar but unrelated flanking regions. Segments flanked by short direct repeats are relatively common in different regions of Alu and other repetitive sequences. Our analysis shows that they can be effectively used in comparative studies of the overall sequence context which may contribute to instability of DNA segments flanked by short direct repeats.

  15. Recent developments in sequence selective minor groove DNA effectors.

    Science.gov (United States)

    Reddy, B S; Sharma, S K; Lown, J W

    2001-04-01

    DNA is a well characterized intracellular target but its large size and sequential nature make it an elusive target for selective drug action. Binding of low molecular weight ligands to DNA causes a wide variety of potential biological responses. In this respect the main consideration is given to recent developments in DNA sequence selective binding agents bearing conjugated effectors because of their potential application in diagnosis and treatment of cancers as well as in molecular biology. Recent progress in the development of cross linked lexitropsin oligopeptides and hairpins, which bind selectively to the minor groove of duplex DNA, is discussed. Bis-distamycins and related lexitropsins show inhibitory activity against HIV-1 and HIV-2 integrases at low nanomolar concentrations. Benzoyl nitrogen mustard analogs of lexitropsins are active against a variety of tumor models. Certain of the bis-benzimidazoles show altered DNA sequence preference and bind to DNA at 5'CG and TG sequences rather than at the preferred AT sites of the parent drug. A comparison of bifunctional bizelesin with monoalkylating adozelesin shows that it appears to have an increased sequence selectivity such that monoalkylating compounds react at more than one site but bizelesin reacts only at sites where there are two suitably positioned alkylation sites. Adozelesin, bizelesin and carzelesin are far more potent as cytotoxic agents than cisplatin or doxorubicin. A new class of 1,2,9,9a-tetrahydrocyclopropa[c]benz[e]indole-4-one (CBI) analogs, i.e., CBI-lexitropsin conjugates arising from the latter leads, is also discussed. A number of cyclopropylpyrroloindole (CPI) and CBI-lexitropsin conjugates related to CC-1065 alkylate at the N3 position of adenine in the minor groove of DNA in a sequence specific manner, and also show cytotoxicities in the femtomolar range. The cross linking efficiency of PBD dimers is much greater than that of other cross linkers including cisplatin and melphalan.
A new

  16. Complete cDNA sequence of the preproform of human pregnancy-associated plasma protein-A. Evidence for expression in the brain and induction by cAMP

    DEFF Research Database (Denmark)

    Haaning, Jesper; Oxvig, Claus; Overgaard, Michael Toft

    1996-01-01

    A cDNA that encodes the prepropeptide of pregnancy-associated plasma protein-A (preproPAPP-A), a putative metalloproteinase, has been cloned and sequenced. PAPP-A is synthesized in the placenta as a 1627-residue precursor preproprotein with a putative 22-residue signal peptide and a highly basic ...

  17. [Characterization and modification of phage T7 DNA polymerase for use in DNA sequencing]: Progress report

    Energy Technology Data Exchange (ETDEWEB)

    1992-01-01

    This project focuses on the DNA polymerase and accessory proteins of phage T7 for use in DNA sequence analysis. T7 DNA polymerase (gene 5 protein) interacts with accessory proteins for the acquisition of properties such as processivity that are necessary for DNA replication. One goal is to understand these interactions in order to modify the proteins to increase their usefulness with DNA sequence analysis. Using a genetically modified gene 5 protein lacking 3' to 5' exonuclease activity we have found that in the presence of manganese there is no discrimination against dideoxynucleotides, a property that enables novel approaches to DNA sequencing using automated technology. Pyrophosphorolysis can create problems in DNA sequence determination, a problem that can be eliminated by the addition of pyrophosphatase. Crystals of the gene 5 protein/thioredoxin complex have now been obtained and X-ray diffraction analysis will be undertaken once their quality has been improved. Amino acid changes in gene 5 protein have been identified that alter its interaction with thioredoxin. Characterization of these proteins should help determine how thioredoxin confers processivity on polymerization. We have characterized the T7 DNA binding protein, the gene 2.5 protein, and shown that it interacts with gene 5 protein and gene 4 protein. The gene 2.5 protein mediates homologous base pairing and strand uptake. Gene 5.5 protein interacts with E. coli H1 protein and affects gene expression. Biochemical and genetic studies on the T7 56-kDa gene 4 protein, the helicase, are focused on its physical interaction with T7 DNA polymerase and the mechanism by which the hydrolysis of nucleoside triphosphates fuels its unidirectional translocation on DNA.

  18. [Characterization and modification of phage T7 DNA polymerase for use in DNA sequencing]: Progress report

    Energy Technology Data Exchange (ETDEWEB)

    1992-12-31

    This project focuses on the DNA polymerase and accessory proteins of phage T7 for use in DNA sequence analysis. T7 DNA polymerase (gene 5 protein) interacts with accessory proteins for the acquisition of properties such as processivity that are necessary for DNA replication. One goal is to understand these interactions in order to modify the proteins to increase their usefulness with DNA sequence analysis. Using a genetically modified gene 5 protein lacking 3' to 5' exonuclease activity we have found that in the presence of manganese there is no discrimination against dideoxynucleotides, a property that enables novel approaches to DNA sequencing using automated technology. Pyrophosphorolysis can create problems in DNA sequence determination, a problem that can be eliminated by the addition of pyrophosphatase. Crystals of the gene 5 protein/thioredoxin complex have now been obtained and X-ray diffraction analysis will be undertaken once their quality has been improved. Amino acid changes in gene 5 protein have been identified that alter its interaction with thioredoxin. Characterization of these proteins should help determine how thioredoxin confers processivity on polymerization. We have characterized the T7 DNA binding protein, the gene 2.5 protein, and shown that it interacts with gene 5 protein and gene 4 protein. The gene 2.5 protein mediates homologous base pairing and strand uptake. Gene 5.5 protein interacts with E. coli H1 protein and affects gene expression. Biochemical and genetic studies on the T7 56-kDa gene 4 protein, the helicase, are focused on its physical interaction with T7 DNA polymerase and the mechanism by which the hydrolysis of nucleoside triphosphates fuels its unidirectional translocation on DNA.

  19. Cloning and sequencing of human lambda immunoglobulin genes by the polymerase chain reaction.

    Science.gov (United States)

    Songsivilai, S; Bye, J M; Marks, J D; Hughes-Jones, N C

    1990-12-01

    Universal oligonucleotide primers, designed for amplifying and sequencing genes encoding the rearranged human lambda immunoglobulin variable region, were validated by amplification of the lambda light chain genes from four human heterohybridoma cell lines and in the generation of a cDNA library of human V lambda sequences from Epstein-Barr virus-transformed human peripheral blood lymphocytes. This technique allows rapid cloning and sequencing of human immunoglobulin genes, and has potential applications in the rescue of unstable human antibody-producing cell lines and in the production of human monoclonal antibodies.

  20. Next generation sequencing of DNA-launched Chikungunya vaccine virus

    Energy Technology Data Exchange (ETDEWEB)

    Hidajat, Rachmat; Nickols, Brian [Medigen, Inc., 8420 Gas House Pike, Suite S, Frederick, MD 21701 (United States)]; Forrester, Naomi [Institute for Human Infections and Immunity, Sealy Center for Vaccine Development and Department of Pathology, University of Texas Medical Branch, GNL, 301 University Blvd., Galveston, TX 77555 (United States)]; Tretyakova, Irina [Medigen, Inc., 8420 Gas House Pike, Suite S, Frederick, MD 21701 (United States)]; Weaver, Scott [Institute for Human Infections and Immunity, Sealy Center for Vaccine Development and Department of Pathology, University of Texas Medical Branch, GNL, 301 University Blvd., Galveston, TX 77555 (United States)]; Pushko, Peter, E-mail: ppushko@medigen-usa.com [Medigen, Inc., 8420 Gas House Pike, Suite S, Frederick, MD 21701 (United States)]

    2016-03-15

    Chikungunya virus (CHIKV) represents a pandemic threat with no approved vaccine available. Recently, we described a novel vaccination strategy based on iDNA® infectious clone designed to launch a live-attenuated CHIKV vaccine from plasmid DNA in vitro or in vivo. As a proof of concept, we prepared iDNA plasmid pCHIKV-7 encoding the full-length cDNA of the 181/25 vaccine. The DNA-launched CHIKV-7 virus was prepared and compared to the 181/25 virus. Illumina HiSeq2000 sequencing revealed that with the exception of the 3′ untranslated region, CHIKV-7 viral RNA consistently showed a lower frequency of single-nucleotide polymorphisms than the 181/25 RNA including at the E2-12 and E2-82 residues previously identified as attenuating mutations. In the CHIKV-7, frequencies of reversions at E2-12 and E2-82 were 0.064% and 0.086%, while in the 181/25, frequencies were 0.179% and 0.133%, respectively. We conclude that the DNA-launched virus has a reduced probability of reversion mutations, thereby enhancing vaccine safety. - Highlights: • Chikungunya virus (CHIKV) is an emerging pandemic threat. • In vivo DNA-launched attenuated CHIKV is a novel vaccine technology. • DNA-launched virus was sequenced using HiSeq2000 and compared to the 181/25 virus. • DNA-launched virus has lower frequency of SNPs at E2-12 and E2-82 attenuation loci.

  1. Combining two technologies for full genome sequencing of human.

    Science.gov (United States)

    Skryabin, K G; Prokhortchouk, E B; Mazur, A M; Boulygina, E S; Tsygankova, S V; Nedoluzhko, A V; Rastorguev, S M; Matveev, V B; Chekanov, N N; D A, Goranskaya; Teslyuk, A B; Gruzdeva, N M; Velikhov, V E; Zaridze, D G; Kovalchuk, M V

    2009-10-01

    New DNA sequencing technologies are developing rapidly, allowing quick and efficient characterisation of organisms at the level of genome structure. In this study, whole-genome sequencing of a human (a Russian man) was performed using two technologies currently on the market: Sequencing by Oligonucleotide Ligation and Detection (SOLiD™) (Applied Biosystems) and sequencing of molecular clusters using fluorescently labeled precursors (Illumina). In total, 108.3 billion base pairs of data were generated (60.2 billion from Illumina technology and 48.1 billion from SOLiD technology). Statistics on the reads generated by GAII and SOLiD showed that they covered 75% and 96% of the genome, respectively. Short polymorphic regions were detected with comparable accuracy; however, the absolute number revealed by SOLiD was several times lower than that revealed by GAII. An optimal algorithm for using the latest sequencing methods was established for the analysis of individual human genomes. The study is the first Russian effort towards whole human genome sequencing.

  2. DNA-binding specificities of human transcription factors.

    Science.gov (United States)

    Jolma, Arttu; Yan, Jian; Whitington, Thomas; Toivonen, Jarkko; Nitta, Kazuhiro R; Rastas, Pasi; Morgunova, Ekaterina; Enge, Martin; Taipale, Mikko; Wei, Gonghong; Palin, Kimmo; Vaquerizas, Juan M; Vincentelli, Renaud; Luscombe, Nicholas M; Hughes, Timothy R; Lemaire, Patrick; Ukkonen, Esko; Kivioja, Teemu; Taipale, Jussi

    2013-01-17

    Although the proteins that read the gene regulatory code, transcription factors (TFs), have been largely identified, it is not well known which sequences TFs can recognize. We have analyzed the sequence-specific binding of human TFs using high-throughput SELEX and ChIP sequencing. A total of 830 binding profiles were obtained, describing 239 distinctly different binding specificities. The models represent the majority of human TFs, approximately doubling the coverage compared to existing systematic studies. Our results reveal additional specificity determinants for a large number of factors for which a partial specificity was known, including a commonly observed A- or T-rich stretch that flanks the core motifs. Global analysis of the data revealed that homodimer orientation and spacing preferences, and base-stacking interactions, have a larger role in TF-DNA binding than previously appreciated. We further describe a binding model incorporating these features that is required to understand binding of TFs to DNA.

  3. Comparison of levels of human immunodeficiency virus type 1 RNA in plasma as measured by the NucliSens nucleic acid sequence-based amplification and Quantiplex branched-DNA assays.

    Science.gov (United States)

    Ginocchio, C C; Tetali, S; Washburn, D; Zhang, F; Kaplan, M H

    1999-04-01

    This study compared levels of human immunodeficiency virus type 1 RNA in plasma as measured by the Quantiplex branched-DNA and NucliSens nucleic acid sequence-based amplification assays. RNA was detectable in 118 of 184 samples (64.13%) by the Quantiplex assay and in 171 of 184 samples (92.94%) by the NucliSens assay. Regression analysis indicated that a linear relationship existed between the two sets of values (P < 0.0001), although the Quantiplex and NucliSens values were significantly different (P < 0.001), with the NucliSens values being approximately 0.323 log higher. Spearman correlation analysis indicated that the overall changes in patient viral load patterns were highly correlative between the two assays: r = 0.912, P < 0.0001. The lower limits of sensitivity were determined to be approximately 100 copies/ml and 1,200 to 1,400 copies/ml for the NucliSens and Quantiplex assays, respectively.
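
    A difference reported on a logarithmic scale can be restated as a fold difference in copies/ml. The lines below are a small worked conversion, assuming that the "0.323 log" quoted above is a base-10 logarithm (the usual convention for viral load, but not stated in the abstract). The function name is illustrative.

```python
def log10_difference_to_fold(log10_difference: float) -> float:
    """Convert a difference between two log10 values into a ratio of the raw values."""
    return 10 ** log10_difference


# Assumption: the abstract's "0.323 log higher" refers to log10(copies/ml).
nuclisens_over_quantiplex = log10_difference_to_fold(0.323)
print(f"NucliSens values are about {nuclisens_over_quantiplex:.1f} times the Quantiplex values")
```

    Under that assumption, the mean offset between the two assays corresponds to roughly a twofold difference in reported copies/ml.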

  4. CLONING AND SEQUENCING OF MATURED FRAGMENT OF HUMAN NERVE GROWTH FACTOR GENE

    Institute of Scientific and Technical Information of China (English)

    马巍; 吴玲; 王德利; 刘淼; 任惠民; 杨广笑; 王全颖

    2003-01-01

    Objective: To clone and sequence the mature fragment of the human nerve growth factor (NGF) gene. Methods: Human genomic DNA extracted from white blood cells was used as the template, and the NGF gene was cloned by PCR and T-vector cloning. Positive clones were screened and identified by restriction enzyme digestion, and the cloned amplified fragment was then sequenced and analyzed. Results: Comparison of the cloned NGF gene with the GenBank sequence (V01511) showed that the two sequences were identical, each 354 bp in length. Conclusion: Cloning the NGF gene from human genomic DNA paves the way for further study of gene therapy for nervous system injury.

  5. Distribution patterns of postmortem damage in human mitochondrial DNA

    DEFF Research Database (Denmark)

    Gilbert, M Thomas P; Willerslev, Eske; Hansen, Anders J

    2002-01-01

    The distribution of postmortem damage in mitochondrial DNA retrieved from 37 ancient human DNA samples was analyzed by cloning and was compared with a selection of published animal data. A relative rate of damage (rho(v)) was calculated for nucleotide positions within the human hypervariable region......, such as MT5, have lower in vivo mutation rates and lower postmortem-damage rates. The postmortem data also identify a possible functional subregion of the HVR1, termed "low-diversity 1," through the lack of sequence damage. The amount of postmortem damage observed in mitochondrial coding regions...

  6. Generation and Analysis of Full-length cDNA Sequences from Elephant Shark (Callorhinchus milii)

    KAUST Repository

    Kodzius, Rimantas

    2009-03-17

    Cartilaginous fishes are the oldest living group of jawed vertebrates and are therefore an important group for understanding the evolution of vertebrate genomes, including the human genome. Our laboratory has proposed the elephant shark (C. milii) as a model cartilaginous fish genome because of its relatively small genome size (910 Mb). The whole genome of C. milii is being sequenced (the first cartilaginous fish genome to be sequenced completely). To characterize the transcriptome of C. milii and to assist in annotating exon-intron boundaries, transcriptional start sites and alternatively spliced transcripts, we are generating full-length cDNA sequences from C. milii.

  7. Cloning and sequencing of Indian Water buffalo (Bubalus bubalis) interleukin-3 cDNA

    KAUST Repository

    Sugumar, Thennarasu

    2011-12-12

    Full-length cDNA (435 bp) of the interleukin-3 (IL-3) gene of the Indian water buffalo was amplified by reverse transcriptase-polymerase chain reaction and sequenced. This sequence had 96% nucleotide identity and 92% amino acid identity with bovine IL-3. There are 10 amino acid substitutions in buffalo IL-3 compared with bovine IL-3. The amino acid sequence of buffalo IL-3 also showed very high identity with that of other ruminants, indicating functional cross-reactivity. Structural homology modelling of buffalo IL-3 protein with human IL-3 showed the presence of five helical structures.

  8. Automated parallel DNA sequencing on multiple channel microchips.

    Science.gov (United States)

    Liu, S; Ren, H; Gao, Q; Roach, D J; Loder, R T; Armstrong, T M; Mao, Q; Blaga, I; Barker, D L; Jovanovich, S B

    2000-05-09

    We report automated DNA sequencing in 16-channel microchips. A microchip prefilled with sieving matrix is aligned on a heating plate affixed to a movable platform. Samples are loaded into sample reservoirs by using an eight-tip pipetting device, and the chip is docked with an array of electrodes in the focal plane of a four-color scanning detection system. Under computer control, high voltage is applied to the appropriate reservoirs in a programmed sequence that injects and separates the DNA samples. An integrated four-color confocal fluorescent detector automatically scans all 16 channels. The system routinely yields more than 450 bases in 15 min in all 16 channels. In the best case using an automated base-calling program, 543 bases have been called at an accuracy of >99%. Separations, including automated chip loading and sample injection, normally are completed in less than 18 min. The advantages of DNA sequencing on capillary electrophoresis chips include uniform signal intensity and tolerance of high DNA template concentration. To understand the fundamentals of these unique features we developed a theoretical treatment of cross-channel chip injection that we call the differential concentration effect. We present experimental evidence consistent with the predictions of the theory.

  9. Impacts of degraded DNA on restriction enzyme associated DNA sequencing (RADSeq).

    Science.gov (United States)

    Graham, Carly F; Glenn, Travis C; McArthur, Andrew G; Boreham, Douglas R; Kieran, Troy; Lance, Stacey; Manzon, Richard G; Martino, Jessica A; Pierson, Todd; Rogers, Sean M; Wilson, Joanna Y; Somers, Christopher M

    2015-11-01

    Degraded DNA from suboptimal field sampling is common in molecular ecology. However, its impact on techniques that use restriction site associated next-generation DNA sequencing (RADSeq, GBS) is unknown. We experimentally examined the effects of in situ DNA degradation on data generation for a modified double-digest RADSeq approach (3RAD). We generated libraries using genomic DNA serially extracted from the muscle tissue of 8 individual lake whitefish (Coregonus clupeaformis) following 0-, 12-, 48- and 96-h incubation at room temperature posteuthanasia. This treatment of the tissue resulted in input DNA that ranged in quality from nearly intact to highly sheared. All samples were sequenced as a multiplexed pool on an Illumina MiSeq. Libraries created from low to moderately degraded DNA (12-48 h) performed well. In contrast, the number of RADtags per individual, number of variable sites, and percentage of identical RADtags retained were all dramatically reduced when libraries were made using highly degraded DNA (96-h group). This reduction in performance was largely due to a significant and unexpected loss of raw reads as a result of poor quality scores. Our findings remained consistent after changes in restriction enzymes, modified fold coverage values (2- to 16-fold), and additional read-length trimming. We conclude that starting DNA quality is an important consideration for RADSeq; however, the approach remains robust until genomic DNA is extensively degraded.

  10. Repeat Finding Techniques, Data Structures and Algorithms in DNA sequences: A Survey

    Directory of Open Access Journals (Sweden)

    Freeson Kaniwa

    2015-09-01

    DNA sequencing technologies keep getting faster and cheaper, leading to the massive availability of entire human genomes. This availability calls for better analysis tools with the potential to realize a shift from reactive to predictive medicine. The challenge remains, since entire human genomes need more space and processing power for their analysis than a standard desktop PC can offer. A background of key concepts surrounding the area of DNA analysis is given, together with a review of selected prominent algorithms used in this area. The significance of this paper is to survey the concepts surrounding DNA analysis so as to provide a deep-rooted understanding and knowledge transfer regarding existing approaches for DNA analysis using the Burrows-Wheeler transform and the wavelet tree, and their respective strengths and weaknesses. Following this survey, the paper attempts to provide some directions for future research.
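
    The Burrows-Wheeler transform named in this abstract can be illustrated on a short string. The sketch below is a naive, teaching-sized version that sorts all rotations and inverts via the standard last-to-first correspondence. It is not the survey's implementation, and practical genome indexes build the transform from a suffix array rather than materialising rotations. The function names and the "$" sentinel (assumed to sort before every other symbol) are illustrative choices.

```python
def bwt(text: str, sentinel: str = "$") -> str:
    """Burrows-Wheeler transform: last column of the sorted rotations of text + sentinel."""
    if sentinel in text:
        raise ValueError("sentinel must not occur in the text")
    if any(ch <= sentinel for ch in text):
        raise ValueError("sentinel must sort before every character of the text")
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def inverse_bwt(last_column: str, sentinel: str = "$") -> str:
    """Recover the original text from its BWT (sentinel assumed smallest)."""
    n = len(last_column)
    # Stable sort of positions by character: order[j] is the row whose last
    # character is the same text symbol as the first character of row j.
    order = sorted(range(n), key=lambda i: last_column[i])
    first_column = sorted(last_column)
    row = 0  # row 0 is the rotation that starts with the sentinel
    out = []
    for _ in range(n - 1):
        row = order[row]           # advance one position in the text
        out.append(first_column[row])
    return "".join(out)


dna = "GATTACAGATTACA"
transformed = bwt(dna)
restored = inverse_bwt(transformed)
print(transformed, restored == dna)
```

    Repeats in the input tend to produce runs of identical symbols in the transformed string, which is what makes the transform useful for compressed indexes; the rotation-sorting step here costs far more memory than a real genome-scale tool could afford.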

  11. Human Genome Sequencing in Health and Disease

    Science.gov (United States)

    Gonzaga-Jauregui, Claudia; Lupski, James R.; Gibbs, Richard A.

    2013-01-01

    Following the “finished,” euchromatic, haploid human reference genome sequence, the rapid development of novel, faster, and cheaper sequencing technologies is making possible the era of personalized human genomics. Personal diploid human genome sequences have been generated, and each has contributed to our better understanding of variation in the human genome. We have consequently begun to appreciate the vastness of individual genetic variation from single nucleotide to structural variants. Translation of genome-scale variation into medically useful information is, however, in its infancy. This review summarizes the initial steps undertaken in clinical implementation of personal genome information, and describes the application of whole-genome and exome sequencing to identify the cause of genetic diseases and to suggest adjuvant therapies. Better analysis tools and a deeper understanding of the biology of our genome are necessary in order to decipher, interpret, and optimize clinical utility of what the variation in the human genome can teach us. Personal genome sequencing may eventually become an instrument of common medical practice, providing information that assists in the formulation of a differential diagnosis. We outline herein some of the remaining challenges. PMID:22248320

  12. Human identification from forensic materials by amplification of a human-specific sequence in the myoglobin gene.

    Directory of Open Access Journals (Sweden)

    Ono T

    2001-06-01

    We developed a method for human identification of forensic biological materials by PCR-based detection of a human-specific sequence in exon 3 of the myoglobin gene. This human-specific DNA sequence was deduced from differences in the amino acid sequences of myoglobins between humans and other animal species. The new method enabled amplification of the target DNA fragment from 30 samples of human DNA, and the amplified sequences were identical with that already reported. Using this method, we were able to distinguish human samples from those of 21 kinds of animals: the crab-eating monkey, horse, cow, sheep, goat, pig, wild boar, dog, raccoon dog, cat, rabbit, guinea pig, hamster, rat, mouse, whale, chicken, pigeon, turtle, frog, and tuna. However, we were unable to distinguish between human and gorilla samples. This method enabled us to detect the target sequence from 25 pg of human DNA, and the target DNA fragment from blood stored at 37 degrees C for 6 months, and from bloodstains heated at 150 degrees C for 4 h or stored at room temperature for 26 years. Herein we also report a practical application of the method for human identification of a bone fragment.

  13. Assessing the fidelity of ancient DNA sequences amplified from nuclear genes

    DEFF Research Database (Denmark)

    Binladen, Jonas; Wiuf, Carsten Henrik; Gilbert, M. Thomas P.

    2006-01-01

    To date, the field of ancient DNA has relied almost exclusively on mitochondrial DNA (mtDNA) sequences. However, a number of recent studies have reported the successful recovery of ancient nuclear DNA (nuDNA) sequences, thereby allowing the characterization of genetic loci directly involved in ph...

  14. Isolation and sequence analysis of a cDNA clone encoding the fifth complement component

    DEFF Research Database (Denmark)

    Lundwall, Åke B; Wetsel, Rick A; Kristensen, Torsten;

    1985-01-01

    DNA clone of 1.85 kilobase pairs was isolated. Hybridization of the mixed-sequence probe to the complementary strand of the plasmid insert and sequence analysis by the dideoxy method predicted the expected protein sequence of C5a (positions 1-12), amino-terminal to the anticipated priming site. The sequence......, subcloned into M13 mp8, and sequenced at random by the dideoxy technique, thereby generating a contiguous sequence of 1703 base pairs. This clone contained coding sequence for the C-terminal 262 amino acid residues of the beta-chain, the entire C5a fragment, and the N-terminal 98 residues of the alpha......'-chain. The 3' end of the clone had a polyadenylated tail preceded by a polyadenylation recognition site, a 3'-untranslated region, and base pairs homologous to the human Alu consensus sequence. Comparison of the derived partial human C5 protein sequence with that previously determined for murine C3 and human...

  15. Pulling out the 1%: whole-genome capture for the targeted enrichment of ancient DNA sequencing libraries.

    Science.gov (United States)

    Carpenter, Meredith L; Buenrostro, Jason D; Valdiosera, Cristina; Schroeder, Hannes; Allentoft, Morten E; Sikora, Martin; Rasmussen, Morten; Gravel, Simon; Guillén, Sonia; Nekhrizov, Georgi; Leshtakov, Krasimir; Dimitrova, Diana; Theodossiev, Nikola; Pettener, Davide; Luiselli, Donata; Sandoval, Karla; Moreno-Estrada, Andrés; Li, Yingrui; Wang, Jun; Gilbert, M Thomas P; Willerslev, Eske; Greenleaf, William J; Bustamante, Carlos D

    2013-11-07

    Most ancient specimens contain very low levels of endogenous DNA, precluding the shotgun sequencing of many interesting samples because of cost. Ancient DNA (aDNA) libraries often contain <1% endogenous DNA, with the majority of sequencing capacity taken up by environmental DNA. Here we present a capture-based method for enriching the endogenous component of aDNA sequencing libraries. By using biotinylated RNA baits transcribed from genomic DNA libraries, we are able to capture DNA fragments from across the human genome. We demonstrate this method on libraries created from four Iron Age and Bronze Age human teeth from Bulgaria, as well as bone samples from seven Peruvian mummies and a Bronze Age hair sample from Denmark. Prior to capture, shotgun sequencing of these libraries yielded an average of 1.2% of reads mapping to the human genome (including duplicates). After capture, this fraction increased substantially, with up to 59% of reads mapped to human and enrichment ranging from 6- to 159-fold. Furthermore, we maintained coverage of the majority of regions sequenced in the precapture library. Intersection with the 1000 Genomes Project reference panel yielded an average of 50,723 SNPs (range 3,062-147,243) for the postcapture libraries sequenced with 1 million reads, compared with 13,280 SNPs (range 217-73,266) for the precapture libraries, increasing resolution in population genetic analyses. Our whole-genome capture approach makes it less costly to sequence aDNA from specimens containing very low levels of endogenous DNA, enabling the analysis of larger numbers of samples.
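
    Fold enrichment in a capture experiment is commonly expressed as the ratio of the mapped-read fraction after capture to the fraction before capture. The sketch below assumes that simple definition; the abstract does not say exactly how the authors computed their 6- to 159-fold range (for instance, whether duplicates were removed), and the post-capture fraction used here is a hypothetical value, not a figure from the study.

```python
def fold_enrichment(pre_fraction: float, post_fraction: float) -> float:
    """Ratio of mapped-read fractions after and before capture (assumed definition)."""
    if pre_fraction <= 0:
        raise ValueError("pre-capture fraction must be positive")
    return post_fraction / pre_fraction


# Pre-capture value is the average quoted in the abstract (1.2% of reads mapped).
# The post-capture value is HYPOTHETICAL: per-library figures are not given there.
example_fold = fold_enrichment(pre_fraction=0.012, post_fraction=0.24)

# Ratio of the average SNP counts quoted in the abstract (post- vs. precapture).
snp_gain = 50_723 / 13_280

print(f"example enrichment: {example_fold:.0f}-fold; average SNP gain: {snp_gain:.1f}x")
```

    Because enrichment is computed per library, dividing the best post-capture fraction by the average pre-capture fraction does not reproduce the reported maximum; a library that starts with very little endogenous DNA can show a large fold change while still ending with a modest mapped fraction.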

  16. Spectral sum rules and search for periodicities in DNA sequences

    Energy Technology Data Exchange (ETDEWEB)

    Chechetkin, V.R., E-mail: chechet@biochip.r [Theoretical Department of Division for Perspective Investigations, Troitsk Institute of Innovation and Thermonuclear Investigations (TRINITI), Troitsk, 142190 Moscow Region (Russian Federation)

    2011-04-18

    Periodic patterns play the important regulatory and structural roles in genomic DNA sequences. Commonly, the underlying periodicities should be understood in a broad statistical sense, since the corresponding periodic patterns have been strongly distorted by the random point mutations and insertions/deletions during molecular evolution. The latent periodicities in DNA sequences can be efficiently displayed by Fourier transform. The criteria of significance for observed periodicities are obtained via the comparison versus the counterpart characteristics of the reference random sequences. We show that the restrictions imposed on the significance criteria by the rigorous spectral sum rules can be rationally described with De Finetti distribution. This distribution provides the convenient intermediate asymptotic form between Rayleigh distribution and exact combinatoric theory. - Highlights: We study the significance criteria for latent periodicities in DNA sequences. The constraints imposed by sum rules can be described with De Finetti distribution. It is intermediate between Rayleigh distribution and exact combinatoric theory. Theory is applicable to the study of correlations between different periodicities. The approach can be generalized to the arbitrary discrete Fourier transform.
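
    The use of the Fourier transform to display latent periodicities can be sketched on a toy sequence. The code below computes, for each harmonic, the sum over the four nucleotides of the squared modulus of the discrete Fourier transform of that nucleotide's indicator sequence, and then reports the strongest harmonic. The normalisation by sequence length and the function name are assumptions for illustration, and the sketch does not implement the paper's significance criteria or the De Finetti distribution.

```python
import cmath


def structure_factors(seq: str) -> list[tuple[int, float]]:
    """Return (harmonic k, summed spectral power) for k = 1 .. len(seq) // 2."""
    n = len(seq)
    spectrum = []
    for k in range(1, n // 2 + 1):
        total = 0.0
        for base in "ACGT":
            # DFT of the 0/1 indicator sequence of this base at harmonic k.
            amplitude = sum(
                cmath.exp(-2j * cmath.pi * k * m / n)
                for m, symbol in enumerate(seq)
                if symbol == base
            )
            total += abs(amplitude) ** 2 / n  # assumed normalisation
        spectrum.append((k, total))
    return spectrum


# Toy input: a perfect period-3 pattern of length 90 (real sequences are noisy).
toy_sequence = "ATG" * 30
spectrum = structure_factors(toy_sequence)
peak_harmonic, peak_power = max(spectrum, key=lambda item: item[1])
peak_period = len(toy_sequence) / peak_harmonic

print(f"strongest harmonic k={peak_harmonic}, period={peak_period:.1f} bases")
```

    For a perfect repeat all the power sits at the harmonic matching the period. In genomic DNA the pattern is distorted by mutations and indels, so a peak has to be judged against the spectrum expected for random sequences of the same composition, which is the question the abstract addresses.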

  17. Early Lyme disease with spirochetemia - diagnosed by DNA sequencing

    Directory of Open Access Journals (Sweden)

    Jones William

    2010-11-01

    Background: A sensitive and analytically specific nucleic acid amplification test (NAAT) is valuable in confirming the diagnosis of early Lyme disease at the stage of spirochetemia. Findings: Venous blood drawn from patients with clinical presentations of Lyme disease was tested by the standard 2-tier screen and Western Blot serology assay for Lyme disease, and also by a nested polymerase chain reaction (PCR) for B. burgdorferi sensu lato 16S ribosomal DNA. The PCR amplicon was sequenced for B. burgdorferi genomic DNA validation. A total of 130 patients visiting the emergency room (ER) or walk-in clinic (WALKIN), and 333 patients referred through private physicians' offices, were studied. While 5.4% of the ER/WALKIN patients showed DNA evidence of spirochetemia, none (0%) of the patients referred from private physicians' offices were DNA-positive. In contrast, while 8.4% of the patients referred from private physicians' offices were positive for the 2-tier Lyme serology assay, only 1.5% of the ER/WALKIN patients were positive for this antibody test. The 2-tier serology assay missed 85.7% of the cases of early Lyme disease with spirochetemia. The latter diagnosis was confirmed by DNA sequencing. Conclusion: Nested PCR followed by automated DNA sequencing is a valuable supplement to the standard 2-tier antibody assay in the diagnosis of early Lyme disease with spirochetemia. The best time to test for Lyme spirochetemia is when patients living in Lyme disease endemic areas develop unexplained symptoms or clinical manifestations that are consistent with Lyme disease early in the course of their illness.

  18. MEME: discovering and analyzing DNA and protein sequence motifs.

    Science.gov (United States)

    Bailey, Timothy L; Williams, Nadya; Misleh, Chris; Li, Wilfred W

    2006-07-01

    MEME (Multiple EM for Motif Elicitation) is one of the most widely used tools for searching for novel 'signals' in sets of biological sequences. Applications include the discovery of new transcription factor binding sites and protein domains. MEME works by searching for repeated, ungapped sequence patterns that occur in the DNA or protein sequences provided by the user. Users can perform MEME searches via the web server hosted by the National Biomedical Computation Resource (http://meme.nbcr.net) and several mirror sites. Through the same web server, users can also access the Motif Alignment and Search Tool to search sequence databases for matches to motifs encoded in several popular formats. By clicking on buttons in the MEME output, users can compare the motifs discovered in their input sequences with databases of known motifs, search sequence databases for matches to the motifs and display the motifs in various formats. This article describes the freely accessible web server and its architecture, and discusses ways to use MEME effectively to find new sequence patterns in biological sequences and analyze their significance.

  19. The dynamic DNA methylomes of double-stranded DNA viruses associated with human cancer

    Science.gov (United States)

    Fernandez, Agustin F.; Rosales, Cecilia; Lopez-Nieva, Pilar; Graña, Osvaldo; Ballestar, Esteban; Ropero, Santiago; Espada, Jesus; Melo, Sonia A.; Lujambio, Amaia; Fraga, Mario F.; Pino, Irene; Javierre, Biola; Carmona, Francisco J.; Acquadro, Francesco; Steenbergen, Renske D.M.; Snijders, Peter J.F.; Meijer, Chris J.; Pineau, Pascal; Dejean, Anne; Lloveras, Belen; Capella, Gabriel; Quer, Josep; Buti, Maria; Esteban, Juan-Ignacio; Allende, Helena; Rodriguez-Frias, Francisco; Castellsague, Xavier; Minarovits, Janos; Ponce, Jordi; Capello, Daniela; Gaidano, Gianluca; Cigudosa, Juan Cruz; Gomez-Lopez, Gonzalo; Pisano, David G.; Valencia, Alfonso; Piris, Miguel Angel; Bosch, Francesc X.; Cahir-McFarland, Ellen; Kieff, Elliott; Esteller, Manel

    2009-01-01

    The natural history of cancers associated with virus exposure is intriguing, since only a minority of human tissues infected with these viruses inevitably progress to cancer. However, the molecular reasons why the infection is controlled or instead progresses to subsequent stages of tumorigenesis are largely unknown. In this article, we provide the first complete DNA methylomes of double-stranded DNA viruses associated with human cancer that might provide important clues to help us understand the described process. Using bisulfite genomic sequencing of multiple clones, we have obtained the DNA methylation status of every CpG dinucleotide in the genome of the Human Papilloma Viruses 16 and 18 and Human Hepatitis B Virus, and in all the transcription start sites of the Epstein-Barr Virus. These viruses are associated with infectious diseases (such as hepatitis B and infectious mononucleosis) and the development of human tumors (cervical, hepatic, and nasopharyngeal cancers, and lymphoma), and are responsible for 1 million deaths worldwide every year. The DNA methylomes presented provide evidence of the dynamic nature of the epigenome in contrast to the genome. We observed that the DNA methylome of these viruses evolves from an unmethylated to a highly methylated genome in association with the progression of the disease, from asymptomatic healthy carriers, through chronically infected tissues and pre-malignant lesions, to the full-blown invasive tumor. The observed DNA methylation changes have a major functional impact on the biological behavior of the viruses. PMID:19208682

  20. A CLIQUE algorithm using DNA computing techniques based on closed-circle DNA sequences.

    Science.gov (United States)

    Zhang, Hongyan; Liu, Xiyu

    2011-07-01

    DNA computing has been applied in broad fields such as graph theory, finite state problems, and combinatorial problems. DNA computing approaches are well suited to solving many combinatorial problems because of their vast parallelism and high-density storage. The CLIQUE algorithm is one of the grid-based clustering techniques for spatial data. It is a combinatorial problem over the dense cells. Therefore, we utilize DNA computing with closed-circle DNA sequences to execute the CLIQUE algorithm on two-dimensional data. In our study, the process of clustering becomes a parallel biochemical reaction, and the DNA sequences representing the marked cells can be combined to form closed-circle DNA sequences. This strategy is a new application of DNA computing. Although the strategy is only for two-dimensional data, it provides a new idea: to consider the grid cells as vertices in a graph and transform the search problem into a combinatorial problem.
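
    For orientation, the grid-based step that the abstract maps onto DNA reactions can be sketched on a conventional computer as follows. This is a simplified, assumption-laden illustration for two-dimensional data only: cells holding at least `min_points` points are marked dense, and dense cells that share an edge are joined into clusters by treating cells as graph vertices. The function names, `cell_size`, and `min_points` are hypothetical, and the subspace search of the full CLIQUE algorithm and the DNA encoding itself are not shown.

```python
from collections import defaultdict, deque

def dense_cells(points, cell_size, min_points):
    """Return the set of grid cells (ix, iy) holding at least min_points points."""
    counts = defaultdict(int)
    for x, y in points:
        counts[(int(x // cell_size), int(y // cell_size))] += 1
    return {cell for cell, count in counts.items() if count >= min_points}

def connect_cells(cells):
    """Group dense cells that share an edge into clusters (breadth-first search)."""
    unvisited = set(cells)
    clusters = []
    while unvisited:
        start = unvisited.pop()
        cluster = {start}
        queue = deque([start])
        while queue:
            cx, cy = queue.popleft()
            for neighbour in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                if neighbour in unvisited:
                    unvisited.remove(neighbour)
                    cluster.add(neighbour)
                    queue.append(neighbour)
        clusters.append(cluster)
    return clusters

if __name__ == "__main__":
    # Hypothetical toy data: two dense regions and one isolated point.
    pts = [(0.1, 0.1), (0.2, 0.3), (1.1, 0.2), (1.5, 0.5),
           (5.2, 5.1), (5.3, 5.4), (9.0, 0.5)]
    print(connect_cells(dense_cells(pts, cell_size=1.0, min_points=2)))
```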

  1. Short sequence effect of ancient DNA on mammoth phylogenetic analyses

    Institute of Scientific and Technical Information of China (English)

    Guilian SHENG; Lianjuan WU; Xindong HOU; Junxia YUAN; Shenghong CHENG; Bojian ZHONG; Xulong LAI

    2009-01-01

    The evolution of Elephantidae has been intensively studied in the past few years, especially after 2006. Molecular approaches have contributed greatly to the assumption that the extinct woolly mammoth is more closely related to the Asian elephant than to the African elephant. In this study, partial ancient DNA sequences of the cytochrome b (cyt b) gene in the mitochondrial genome were successfully retrieved from Late Pleistocene Mammuthus primigenius bones collected from Heilongjiang Province in Northeast China. Both the partial and complete homologous cyt b gene sequences and the whole mitochondrial genome sequences extracted from GenBank were aligned and used as datasets for phylogenetic analyses. All of the phylogenetic trees based on either the partial or the complete cyt b gene reject the relationship constructed from the whole mitochondrial genome, showing that the sequence length of the cyt b gene affects mammoth phylogenetic analyses.

  2. No evidence of Neandertal mtDNA contribution to early modern humans.

    Directory of Open Access Journals (Sweden)

    David Serre

    2004-03-01

    The retrieval of mitochondrial DNA (mtDNA) sequences from four Neandertal fossils from Germany, Russia, and Croatia has demonstrated that these individuals carried closely related mtDNAs that are not found among current humans. However, these results do not definitively resolve the question of a possible Neandertal contribution to the gene pool of modern humans since such a contribution might have been erased by genetic drift or by the continuous influx of modern human DNA into the Neandertal gene pool. A further concern is that if some Neandertals carried mtDNA sequences similar to contemporaneous humans, such sequences may be erroneously regarded as modern contaminations when retrieved from fossils. Here we address these issues by the analysis of 24 Neandertal and 40 early modern human remains. The biomolecular preservation of four Neandertals and of five early modern humans was good enough to suggest the preservation of DNA. All four Neandertals yielded mtDNA sequences similar to those previously determined from Neandertal individuals, whereas none of the five early modern humans contained such mtDNA sequences. In combination with current mtDNA data, this excludes any large genetic contribution by Neandertals to early modern humans, but does not rule out the possibility of a smaller contribution.

  3. Viral discovery and sequence recovery using DNA microarrays.

    Directory of Open Access Journals (Sweden)

    David Wang

    2003-11-01

    Because of the constant threat posed by emerging infectious diseases and the limitations of existing approaches used to identify new pathogens, there is a great demand for new technological methods for viral discovery. We describe herein a DNA microarray-based platform for novel virus identification and characterization. Central to this approach was a DNA microarray designed to detect a wide range of known viruses as well as novel members of existing viral families; this microarray contained the most highly conserved 70mer sequences from every fully sequenced reference viral genome in GenBank. During an outbreak of severe acute respiratory syndrome (SARS) in March 2003, hybridization to this microarray revealed the presence of a previously uncharacterized coronavirus in a viral isolate cultivated from a SARS patient. To further characterize this new virus, approximately 1 kb of the unknown virus genome was cloned by physically recovering viral sequences hybridized to individual array elements. Sequencing of these fragments confirmed that the virus was indeed a new member of the coronavirus family. This combination of array hybridization followed by direct viral sequence recovery should prove to be a general strategy for the rapid identification and characterization of novel viruses and emerging infectious diseases.

  4. Phylogenetic relationships of the Gomphales based on nuc-25S-rDNA, mit-12S-rDNA, and mit-atp6-DNA combined sequences

    Science.gov (United States)

    Admir J. Giachini; Kentaro Hosaka; Eduardo Nouhra; Joseph Spatafora; James M. Trappe

    2010-01-01

    Phylogenetic relationships among Geastrales, Gomphales, Hysterangiales, and Phallales were estimated via combined sequences: nuclear large subunit ribosomal DNA (nuc-25S-rDNA), mitochondrial small subunit ribosomal DNA (mit-12S-rDNA), and mitochondrial atp6 DNA (mit-atp6-DNA). Eighty-one taxa comprising 19 genera and 58 species...

  5. Phylogenetic analysis of the genus Hordeum using repetitive DNA sequences

    DEFF Research Database (Denmark)

    Svitashev, S.; Bryngelsson, T.; Vershinin, A.

    1994-01-01

    A set of six cloned barley (Hordeum vulgare) repetitive DNA sequences was used for the analysis of phylogenetic relationships among 31 species (46 taxa) of the genus Hordeum, using molecular hybridization techniques. In situ hybridization experiments showed dispersed organization of the sequences over all chromosomes of H. vulgare and the wild barley species H. bulbosum, H. marinum and H. murinum. Southern blot hybridization revealed different levels of polymorphism among barley species and the RFLP data were used to generate a phylogenetic tree for the genus Hordeum. Our data are in a good...

  6. The implementation of bit-parallelism for DNA sequence alignment

    Science.gov (United States)

    Setyorini; Kuspriyanto; Widyantoro, D. H.; Pancoro, A.

    2017-05-01

    Dynamic programming (DP) remains the central algorithm of biological sequence alignment. Matching score computation is its most time-consuming step. Bit-parallelism is an approximate string matching technique that transforms the cell-by-cell processing of the DP matrix into word-level processing (groups of cells) and computes the scores column-wise. By adopting the word-level operations of computer hardware, this technique promises to reduce the time spent computing scores in the DP matrix. In this paper, we implement the bit-parallelism technique for DNA sequence alignment. Our bit-parallel implementation takes less time for score computation but still needs improvement in the reconstruction process.
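
    To illustrate what column-wise, word-level processing of the DP matrix can look like, here is a minimal Python sketch of a Myers-style bit-vector computation of unit-cost edit distance. It is not the implementation described in the abstract, whose scoring scheme and reconstruction step are not given; the function names are hypothetical, and a plain DP routine is included only as a cross-check. Python integers are unbounded, so one integer stands in for the machine words a compiled implementation would use.

```python
def bit_parallel_edit_distance(pattern, text):
    """Unit-cost edit distance, processing one whole DP column per text character."""
    m = len(pattern)
    if m == 0:
        return len(text)
    mask = (1 << m) - 1          # keeps only the m bits of the column
    top = 1 << (m - 1)           # bit for the last row of the DP matrix
    peq = {}                     # per-character match bit-vectors
    for i, ch in enumerate(pattern):
        peq[ch] = peq.get(ch, 0) | (1 << i)
    pv, mv, score = mask, 0, m   # vertical +1 / -1 deltas of the first column
    for ch in text:
        eq = peq.get(ch, 0)
        xv = eq | mv
        xh = (((eq & pv) + pv) ^ pv) | eq
        ph = (mv | ~(xh | pv)) & mask   # horizontal +1 deltas
        mh = pv & xh & mask             # horizontal -1 deltas
        if ph & top:
            score += 1
        elif mh & top:
            score -= 1
        ph = ((ph << 1) | 1) & mask     # row 0 always grows by 1 (global distance)
        mh = (mh << 1) & mask
        pv = (mh | ~(xv | ph)) & mask
        mv = ph & xv
    return score

def dp_edit_distance(a, b):
    """Classic cell-by-cell DP, used here only to cross-check the bit-vector version."""
    prev = list(range(len(a) + 1))
    for j, cb in enumerate(b, 1):
        cur = [j]
        for i, ca in enumerate(a, 1):
            cur.append(min(prev[i] + 1, cur[i - 1] + 1, prev[i - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

if __name__ == "__main__":
    print(bit_parallel_edit_distance("ACGTTGCA", "ACTTGGCA"))
```

    The sketch returns only the score. Recovering the alignment itself would require storing the per-column bit-vectors and tracing back through them, which is the kind of reconstruction step the abstract says still needs work.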

  7. Sequences sufficient for programming imprinted germline DNA methylation defined.

    Directory of Open Access Journals (Sweden)

    Yoon Jung Park

    Epigenetic marks are fundamental to normal development, but little is known about signals that dictate their placement. Insights have been provided by studies of imprinted loci in mammals, where monoallelic expression is epigenetically controlled. Imprinted expression is regulated by DNA methylation programmed during gametogenesis in a sex-specific manner and maintained after fertilization. At Rasgrf1 in mouse, paternal-specific DNA methylation on a differential methylation domain (DMD) requires downstream tandem repeats. The DMD and repeats constitute a binary switch regulating paternal-specific expression. Here, we define sequences sufficient for imprinted methylation using two transgenic mouse lines: one carries the entire Rasgrf1 cluster (RC); the second carries only the DMD and repeats (DR) from Rasgrf1. The RC transgene recapitulated all aspects of imprinting seen at the endogenous locus. DR underwent proper DNA methylation establishment in sperm and erasure in oocytes, indicating the DMD and repeats are sufficient to program imprinted DNA methylation in germlines. Both transgenes produce a DMD-spanning pit-RNA, previously shown to be necessary for imprinted DNA methylation at the endogenous locus. We show that when pit-RNA expression is controlled by the repeats, it regulates DNA methylation in cis only and not in trans. Interestingly, pedigree history dictated whether established DR methylation patterns were maintained after fertilization. When DR was paternally transmitted followed by maternal transmission, the unmethylated state that was properly established in the female germlines could not be maintained. This provides a model for transgenerational epigenetic inheritance in mice.

  8. Evaluation of intra- and interspecific divergence of satellite DNA sequences by nucleotide frequency calculation and pairwise sequence comparison

    Directory of Open Access Journals (Sweden)

    Kato Mikio

    2003-01-01

    Satellite DNA sequences are known to be highly variable and to have been subjected to concerted evolution that homogenizes member sequences within species. We have analyzed the mode of evolution of satellite DNA sequences in four fishes from the genus Diplodus by calculating the nucleotide frequency of the sequence array and the phylogenetic distances between member sequences. Calculation of nucleotide frequency and pairwise sequence comparison enabled us to characterize the divergence among member sequences in this satellite DNA family. The results suggest that the evolutionary rate of satellite DNA in D. bellottii is about two-fold greater than the average of the other three fishes, and that the sequence homogenization event occurred in D. puntazzo more recently than in the others. The procedures described here are effective for characterizing the mode of evolution of satellite DNA.
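
    The two calculations named in the abstract can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions: the member sequences are taken to be already aligned and of equal length, the pairwise measure is the uncorrected proportion of differing sites (p-distance) because the abstract does not say which distance was used, and all function names and the toy sequences are hypothetical.

```python
from collections import Counter
from itertools import combinations

def column_frequencies(aligned):
    """Per-position nucleotide frequencies across equal-length aligned sequences."""
    freqs = []
    for column in zip(*aligned):
        counts = Counter(column)
        freqs.append({base: counts.get(base, 0) / len(column) for base in "ACGT"})
    return freqs

def p_distance(a, b):
    """Proportion of aligned sites at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)

def mean_pairwise_distance(aligned):
    """Average p-distance over all pairs of member sequences."""
    pairs = list(combinations(aligned, 2))
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)

if __name__ == "__main__":
    members = ["ACGT", "ACGA", "ACCT"]  # hypothetical aligned satellite repeats
    print(column_frequencies(members)[3])
    print(mean_pairwise_distance(members))
```

    A low mean pairwise distance within a species, together with position frequencies close to 0 or 1, is what one would expect after a recent homogenization event; comparing these summaries between species is the kind of analysis the abstract describes.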

  9. Effect of dephasing on DNA sequencing via transverse electronic transport

    Energy Technology Data Exchange (ETDEWEB)

    Zwolak, Michael [Los Alamos National Laboratory; Krems, Matt [NON LANL; Pershin, Yuriy V [NON LANL; Di Ventra, Massimiliano [NON LANL

    2009-01-01

    We study theoretically the effects of dephasing on DNA sequencing in a nanopore via transverse electronic transport. To do this, we couple classical molecular dynamics simulations with transport calculations using scattering theory. Previous studies, which did not include dephasing, have shown that by measuring the transverse current of a particular base multiple times, one can get distributions of currents for each base that are distinguishable. We introduce a dephasing parameter into transport calculations to simulate the effects of the ions and other fluctuations. These effects lower the overall magnitude of the current, but have little effect on the current distributions themselves. The results of this work further suggest that distinguishing DNA bases via transverse electronic transport has potential as a sequencing tool.

  10. Finding human promoter groups based on DNA physical properties

    Science.gov (United States)

    Zeng, Jia; Cao, Xiao-Qin; Zhao, Hongya; Yan, Hong

    2009-10-01

    DNA rigidity is an important physical property originating from the DNA three-dimensional structure. Although the general DNA rigidity patterns in human promoters have been investigated, their distinct roles in transcription are largely unknown. In this paper, we discover four highly distinct human promoter groups based on similarity of their rigidity profiles. First, we find that all promoter groups conserve relatively rigid DNAs at the canonical TATA box [a consensus TATA(A/T)A(A/T) sequence] position, which are important physical signals in binding transcription factors. Second, we find that the genes activated by each group of promoters share significant biological functions based on their gene ontology annotations. Finally, we find that these human promoter groups correlate with the tissue-specific gene expression.

  11. Finding human promoter groups based on DNA physical properties.

    Science.gov (United States)

    Zeng, Jia; Cao, Xiao-Qin; Zhao, Hongya; Yan, Hong

    2009-10-01

    DNA rigidity is an important physical property originating from the DNA three-dimensional structure. Although the general DNA rigidity patterns in human promoters have been investigated, their distinct roles in transcription are largely unknown. In this paper, we discover four highly distinct human promoter groups based on similarity of their rigidity profiles. First, we find that all promoter groups conserve relatively rigid DNAs at the canonical TATA box [a consensus TATA(A/T)A(A/T) sequence] position, which are important physical signals in binding transcription factors. Second, we find that the genes activated by each group of promoters share significant biological functions based on their gene ontology annotations. Finally, we find that these human promoter groups correlate with the tissue-specific gene expression.

  12. Cloning of the full-length cDNA sequence of a novel human UCA1 spliced variant

    Institute of Scientific and Technical Information of China (English)

    王宇; 陈葳; 李旭

    2012-01-01

    Objective: To clone the full-length cDNA sequence of novel UCA1 spliced isoforms in order to understand the exact mechanism of this type of alternative splicing. Methods: The full-length cDNA was amplified from BLZ-211 cells by using the in silico sequence elongation technique and the 5'-RACE and 3'-RACE (rapid amplification of cDNA ends) techniques. Products of RT-PCR were sequenced and further assembled. Results: The new UCA1 spliced isoform sequence was 2,202 bp. Conclusion: A combination of in silico sequence elongation, 5'-RACE and 3'-RACE techniques is an effective way to obtain the full-length cDNA, which will guide further research on the mechanism of this type of alternative splicing.

  13. Roche genome sequencer FLX based high-throughput sequencing of ancient DNA

    DEFF Research Database (Denmark)

    Alquezar-Planas, David E; Fordyce, Sarah Louise

    2012-01-01

    Since the development of so-called "next generation" high-throughput sequencing in 2005, this technology has been applied to a variety of fields. Such applications include disease studies, evolutionary investigations, and ancient DNA. Each application requires a specialized protocol to ensure tha...

  14. Measuring cation dependent DNA polymerase fidelity landscapes by deep sequencing.

    Directory of Open Access Journals (Sweden)

    Bradley Michael Zamft

    High-throughput recording of signals embedded within inaccessible micro-environments is a technological challenge. The ideal recording device would be a nanoscale machine capable of quantitatively transducing a wide range of variables into a molecular recording medium suitable for long-term storage and facile readout in the form of digital data. We have recently proposed such a device, in which cation concentrations modulate the misincorporation rate of a DNA polymerase (DNAP) on a known template, allowing DNA sequences to encode information about the local cation concentration. In this work we quantify the cation sensitivity of DNAP misincorporation rates, making possible the indirect readout of cation concentration by DNA sequencing. Using multiplexed deep sequencing, we quantify the misincorporation properties of two DNA polymerases, Dpo4 and Klenow exo(-), obtaining the probability and base selectivity of misincorporation at all positions within the template. We find that Dpo4 acts as a DNA recording device for Mn2+ with a misincorporation rate gain of ∼2%/mM. This modulation of misincorporation rate is selective to the template base: the probability of misincorporation on template T by Dpo4 increases >50-fold over the range tested, while the other template bases are affected less strongly. Furthermore, cation concentrations act as scaling factors for misincorporation: on a given template base, Mn2+ and Mg2+ change the overall misincorporation rate but do not alter the relative frequencies of incoming misincorporated nucleotides. Characterization of the ion dependence of DNAP misincorporation serves as the first step towards repurposing it as a molecular recording device.

  15. An automated annotation tool for genomic DNA sequences using GeneScan and BLAST

    Indian Academy of Sciences (India)

    Andrew M. Lynn; Chakresh Kumar Jain; K. Kosalai; Pranjan Barman; Nupur Thakur; Harish Batra; Alok Bhattacharya

    2001-04-01

    Genomic sequence data are often available well before the annotated sequence is published. We present a method for analysis of genomic DNA to identify coding sequences using the GeneScan algorithm and characterize these resultant sequences by BLAST. The routines are used to develop a system for automated annotation of genome DNA sequences.

  16. DNA Methylation Landscapes of Human Fetal Development

    NARCIS (Netherlands)

    Slieker, Roderick C.; Roost, Matthias S.; van Iperen, Liesbeth; Suchiman, H. Eka D; Tobi, Elmar W.; Carlotti, Françoise; de Koning, Eelco J P; Slagboom, P. Eline; Heijmans, Bastiaan T.; Chuva de Sousa Lopes, Susana M.

    2015-01-01

    Remodelling the methylome is a hallmark of mammalian development and cell differentiation. However, current knowledge of DNA methylation dynamics in human tissue specification and organ development largely stems from the extrapolation of studies in vitro and animal models. Here, we report on the DNA

  17. Unscheduled DNA replication origin activation at inserted HPV 18 sequences in a HPV-18/MYC amplicon.

    Science.gov (United States)

    Conti, Chiara; Herrick, John; Bensimon, Aaron

    2007-08-01

    Oncogene amplification is a critical step leading to tumorigenesis, but the underlying mechanisms are still poorly understood. Despite data suggesting that DNA replication is a major source of genomic instability, little is known about replication origin usage and replication fork progression in rearranged regions. Using a single DNA molecule approach, we provide here the first study of replication kinetics on a previously characterized MYC/papillomavirus (HPV18) amplicon in a cervical cancer. Using this amplicon as a model, we investigated the role DNA replication control plays in generating amplifications in human cancers. The data reveal severely perturbed DNA replication kinetics in the amplified region when compared with other regions of the same genome. It was found that DNA replication is initiated from both genomic and viral sequences, resulting in a higher median frequency of origin firings. In addition, it was found that the higher initiation frequency was associated with an equivalent increase in the number of stalled replication forks. These observations raise the intriguing possibility that unscheduled replication origin activation at inserted HPV-18 viral DNA sequences triggers DNA amplification in this cancer cell line and the subsequent overexpression of the MYC oncogene.

  18. Color image encryption scheme using CML and DNA sequence operations.

    Science.gov (United States)

    Wang, Xing-Yuan; Zhang, Hui-Li; Bao, Xue-Mei

    2016-06-01

    In this paper, an encryption algorithm for color images using a chaotic system and DNA (deoxyribonucleic acid) sequence operations is proposed. The three components of the color plain image are employed to construct a matrix, and a confusion operation driven by the spatiotemporal chaos system, i.e., the CML (coupled map lattice), is then performed on the pixel matrix. DNA encoding rules and decoding rules are introduced in the permutation phase. The extended Hamming distance is proposed to generate new initial values for the CML iteration in combination with the color plain image. The rows and columns of the DNA matrix are permuted, and the color cipher image is then obtained from this matrix. Theoretical analysis and experimental results show that the cryptosystem is secure and practical, and that it is suitable for encrypting color images of any size.
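
    Two building blocks mentioned in the abstract can be sketched in Python: a DNA encoding rule that maps each pair of bits of a pixel value to one base, and one update of a coupled map lattice built from logistic maps. Everything specific here is an assumption for illustration, not the authors' scheme: the particular rule (00 to A, 01 to C, 10 to G, 11 to T), the coupling strength `eps`, the logistic parameter `mu`, the periodic boundary, and all function names. The permutation steps and the extended Hamming distance are not shown.

```python
# One possible 2-bit encoding rule (an assumption; several such rules exist).
ENCODE = {"00": "A", "01": "C", "10": "G", "11": "T"}
DECODE = {base: bits for bits, base in ENCODE.items()}

def byte_to_dna(value):
    """Encode one 8-bit pixel value as four DNA bases."""
    bits = format(value, "08b")
    return "".join(ENCODE[bits[i:i + 2]] for i in range(0, 8, 2))

def dna_to_byte(bases):
    """Decode four DNA bases back to the 8-bit pixel value."""
    return int("".join(DECODE[base] for base in bases), 2)

def logistic(x, mu):
    """Logistic map, the local chaotic map used at each lattice site."""
    return mu * x * (1.0 - x)

def cml_step(lattice, eps=0.1, mu=3.99):
    """One coupled-map-lattice update with nearest-neighbour coupling
    and periodic boundaries (eps and mu are illustrative values)."""
    n = len(lattice)
    f = [logistic(x, mu) for x in lattice]
    return [(1.0 - eps) * f[i] + (eps / 2.0) * (f[i - 1] + f[(i + 1) % n])
            for i in range(n)]

if __name__ == "__main__":
    print(byte_to_dna(27), dna_to_byte("ACGT"))
    state = [0.12, 0.34, 0.56, 0.78]  # hypothetical initial lattice values
    for _ in range(3):
        state = cml_step(state)
    print(state)
```

    In a scheme of this kind, the chaotic lattice values would typically be quantized to drive the confusion and permutation of the encoded matrix, so the initial values act as part of the key.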

  19. Similarity Estimation Between DNA Sequences Based on Local Pattern Histograms of Binary Images

    Institute of Scientific and Technical Information of China (English)

    Yusei Kobori; Satoshi Mizuta

    2016-01-01

    Graphical representation of DNA sequences is one of the most popular techniques for alignment-free sequence comparison. Here, we propose a new method for the feature extraction of DNA sequences represented by binary images, by estimating the similarity between DNA sequences using the frequency histograms of local bitmap patterns of images. Our method shows linear time complexity for the length of DNA sequences, which is practical even when long sequences, such as whole genome sequences, are compared. We tested five distance measures for the estimation of sequence similarities, and found that the histogram intersection and Manhattan distance are the most appropriate ones for phylogenetic analyses.
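
    The two distance measures singled out in the abstract are simple to state in code. The Python sketch below assumes each sequence has already been reduced to a histogram of local bitmap patterns, represented here as a dictionary from a pattern identifier to a frequency; the image construction and pattern extraction are not shown, and the function names and toy histograms are hypothetical.

```python
def normalize(hist):
    """Scale a histogram of counts so that its values sum to 1."""
    total = sum(hist.values())
    return {key: value / total for key, value in hist.items()}

def histogram_intersection(h1, h2):
    """Similarity: sum of bin-wise minima (1.0 for identical normalized histograms)."""
    keys = set(h1) | set(h2)
    return sum(min(h1.get(k, 0.0), h2.get(k, 0.0)) for k in keys)

def manhattan_distance(h1, h2):
    """Dissimilarity: sum of bin-wise absolute differences."""
    keys = set(h1) | set(h2)
    return sum(abs(h1.get(k, 0.0) - h2.get(k, 0.0)) for k in keys)

if __name__ == "__main__":
    # Hypothetical pattern counts for two sequences.
    a = normalize({"p0": 2, "p1": 2})
    b = normalize({"p0": 1, "p1": 1, "p2": 2})
    print(histogram_intersection(a, b), manhattan_distance(a, b))
```

    For histograms normalized to sum to 1, the Manhattan distance equals 2 × (1 − intersection), so the two measures rank sequence pairs identically in that setting. Whether the paper normalizes its histograms this way is not stated in the abstract.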

  20. The diploid genome sequence of an individual human.

    Directory of Open Access Journals (Sweden)

    Samuel Levy

    2007-09-01

    Presented here is a genome sequence of an individual human. It was produced from approximately 32 million random DNA fragments, sequenced by Sanger dideoxy technology and assembled into 4,528 scaffolds, comprising 2,810 million bases (Mb) of contiguous sequence with approximately 7.5-fold coverage for any given region. We developed a modified version of the Celera assembler to facilitate the identification and comparison of alternate alleles within this individual diploid genome. Comparison of this genome and the National Center for Biotechnology Information human reference assembly revealed more than 4.1 million DNA variants, encompassing 12.3 Mb. These variants (of which 1,288,319 were novel) included 3,213,401 single nucleotide polymorphisms (SNPs), 53,823 block substitutions (2-206 bp), 292,102 heterozygous insertion/deletion events (indels) (1-571 bp), 559,473 homozygous indels (1-82,711 bp), 90 inversions, as well as numerous segmental duplications and copy number variation regions. Non-SNP DNA variation accounts for 22% of all events identified in the donor; however, it involves 74% of all variant bases. This suggests an important role for non-SNP genetic alterations in defining the diploid genome structure. Moreover, 44% of genes were heterozygous for one or more variants. Using a novel haplotype assembly strategy, we were able to span 1.5 Gb of genome sequence in segments >200 kb, providing further precision to the diploid nature of the genome. These data depict a definitive molecular portrait of a diploid human genome that provides a starting point for future genome comparisons and enables an era of individualized genomic information.
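
    The headline figures in this abstract can be checked against one another with a few lines of Python. The sketch uses only the counts enumerated in the abstract; segmental duplications and copy number variation regions are described there only as "numerous", so they are left out, which is an assumption of this check.

```python
# Variant counts as enumerated in the abstract.
variant_counts = {
    "SNPs": 3_213_401,
    "block substitutions": 53_823,
    "heterozygous indels": 292_102,
    "homozygous indels": 559_473,
    "inversions": 90,
}

total = sum(variant_counts.values())
non_snp = total - variant_counts["SNPs"]
non_snp_share = non_snp / total

print(f"total enumerated variants: {total:,}")
print(f"non-SNP events: {non_snp:,} ({non_snp_share:.1%} of events)")
```

    The enumerated categories sum to 4,118,889 events, in line with "more than 4.1 million", and non-SNP events make up about 22% of them, matching the share quoted in the abstract.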

  1. Pitfalls in the analysis of ancient human mtDNA

    Institute of Scientific and Technical Information of China (English)

    2003-01-01

    The retrieval of DNA from ancient human specimens is not always successful owing to DNA deterioration and contamination, although it is vital for providing new insights into the genetic structure of ancient people and for reconstructing past history. Normally, only short DNA fragments can be retrieved from ancient specimens. Establishing the authenticity of the DNA obtained and uncovering the information it contains are both difficult. We employed the ancient mtDNAs reported from Central Asia (including Xinjiang, China) as an example to discern potentially extraneous DNA contamination based on the updated mtDNA phylogeny derived from mtDNA control region, coding region, and complete sequence information. Our results demonstrated that many of the reported mtDNAs are more or less problematic. Starting from a reliable mtDNA phylogeny and combining the available modern data into the analysis, one can ascertain the authenticity of the ancient DNA, distinguish the potential errors in a data set, and efficiently decipher the meager information it harbors. The reappraisal of the mtDNAs older than 2000 years from Central Asia supports the suggestion of extensive (pre)historical gene admixture in this region.

  2. Molecular cloning and sequence analysis of hamster CENP-A cDNA

    Science.gov (United States)

    Figueroa, Javier; Pendón, Carlos; Valdivia, Manuel M

    2002-01-01

    Background: The centromere is a specialized locus that mediates chromosome movement during mitosis and meiosis. This chromosomal domain comprises a uniquely packaged form of heterochromatin that acts as a nucleus for the assembly of the kinetochore, a trilaminar proteinaceous structure on the surface of each chromatid at the primary constriction. Kinetochores mediate interactions with the spindle fibers of the mitotic apparatus. Centromere protein A (CENP-A) is a histone H3-like protein specifically located at the inner plate of the kinetochore at active centromeres. CENP-A works as a component of specialized nucleosomes at centromeres bound to arrays of repeat satellite DNA. Results: We have cloned the hamster homologue of human and mouse CENP-A. The cDNA isolated was found to contain an open reading frame encoding a polypeptide consisting of 129 amino acid residues with a C-terminal histone fold domain highly homologous to those of CENP-A and H3 sequences previously released. However, significant sequence divergence was found at the N-terminal region of hamster CENP-A, which is five and eleven residues shorter than those of mouse and human, respectively. Further, a human serine 7 residue, a target site for Aurora B kinase phosphorylation involved in the mechanism of cytokinesis, was not found in the hamster protein. A human autoepitope at the N-terminal region of CENP-A described in autoimmune diseases is not conserved in the hamster protein. Conclusions: We have cloned the hamster cDNA for the centromeric protein CENP-A. Significant differences in protein sequence were found at the N-terminal tail of hamster CENP-A in comparison with that of human and mouse. Our results show a high degree of evolutionary divergence of kinetochore CENP-A proteins in mammals. This is related to the highly diverse nucleotide repeat sequences found at the centromere DNA among species and supports a current centromere model for kinetochore function and structural plasticity. PMID:12019018

  3. A DNA sequence alignment algorithm using quality information and a fuzzy inference method

    Institute of Scientific and Technical Information of China (English)

    Kwangbaek Kim; Minhwan Kim; Youngwoon Woo

    2008-01-01

    DNA sequence alignment algorithms in computational molecular biology have been improved by diverse methods. In this paper, we propose a DNA sequence alignment that uses quality information and a fuzzy inference method developed based on the characteristics of DNA fragments and a fuzzy logic system in order to improve conventional DNA sequence alignment methods that use DNA sequence quality information. In conventional algorithms, DNA sequence alignment scores are calculated by the global sequence alignment algorithm proposed by Needleman and Wunsch, which is established by using quality information of each DNA fragment. However, there may be errors in the process of calculating DNA sequence alignment scores when the quality of DNA fragment tips is low, because only the overall DNA sequence quality information is used. In our proposed method, an exact DNA sequence alignment can be achieved in spite of the low quality of DNA fragment tips by improvement of conventional algorithms using quality information. Mapping score parameters used to calculate DNA sequence alignment scores are dynamically adjusted by the fuzzy logic system utilizing lengths of DNA fragments and frequencies of low quality DNA bases in the fragments. From experiments applying real genome data from the National Center for Biotechnology Information, we could see that the proposed method is more efficient than conventional algorithms.
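
    The quality-aware global alignment idea in this abstract can be illustrated with a minimal sketch. It computes a Needleman-Wunsch score in which each substitution score is scaled by a per-base confidence weight. The weighting rule (the minimum of the two base weights), the parameter values, and the function name are assumptions made for illustration; the fuzzy inference rules of the paper are not reproduced here.

```python
def nw_score(a, b, qa, qb, match=1.0, mismatch=-1.0, gap=-2.0):
    """Global alignment score of a and b with quality-weighted substitutions.

    qa and qb hold one confidence weight in [0, 1] per base. This scaling
    is a hypothetical stand-in for the paper's fuzzy parameter adjustment.
    """
    n, m = len(a), len(b)
    # Row 0: aligning the empty prefix of a against b[:j] costs j gaps.
    prev = [j * gap for j in range(m + 1)]
    for i in range(1, n + 1):
        # curr[0]: aligning a[:i] against the empty prefix of b.
        curr = [i * gap] + [0.0] * m
        for j in range(1, m + 1):
            # A low-quality base contributes less, whether match or mismatch.
            w = min(qa[i - 1], qb[j - 1])
            sub = (match if a[i - 1] == b[j - 1] else mismatch) * w
            curr[j] = max(prev[j - 1] + sub,   # align a[i-1] with b[j-1]
                          prev[j] + gap,       # gap in b
                          curr[j - 1] + gap)   # gap in a
        prev = curr
    return prev[m]
```

    With this weighting, a mismatch at a low-quality fragment tip is penalized less than the same mismatch at a high-quality position, which is the behaviour the abstract motivates.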

  4. Human identification by lice: A Next Generation Sequencing challenge.

    Science.gov (United States)

    Pilli, Elena; Agostino, Alessandro; Vergani, Debora; Salata, Elena; Ciuna, Ignazio; Berti, Andrea; Caramelli, David; Lambiase, Simonetta

    2016-09-01

    Rapid and progressive advances in molecular biology techniques and the advent of Next Generation Sequencing (NGS) have opened new possibilities for analyses, including the identification of entomological matrixes. Insects and other arthropods are widespread in nature and those found at a crime scene can provide a useful contribution to forensic investigations. Entomological evidence is used by experts to define the postmortem interval (PMI), which is essentially based on morphological recognition of the insect and an estimation of its life cycle stage. However, molecular genotyping methods can also provide important support for forensic entomological investigations when the identification of species or human genetic material is required. This case study concerns a collection of insects found in the house of a woman who died from unknown causes. Initially the insects were identified morphologically as belonging to the Pediculidae family, and then human DNA was extracted from their gastrointestinal tract and analyzed. The application of the latest generation forensic DNA assays, such as the Quantifiler® Trio DNA Quantification Kit and the HID-Ion AmpliSeq™ Identity Panel (Applied Biosystems®), detected the presence of human DNA in the samples and determined the genetic profile.

  5. Model identification for DNA sequence-structure relationships.

    Science.gov (United States)

    Hawley, Stephen Dwyer; Chiu, Anita; Chizeck, Howard Jay

    2006-11-01

    We investigate the use of algebraic state-space models for the sequence dependent properties of DNA. By considering the DNA sequence as an input signal, rather than using an all atom physical model, computational efficiency is achieved. A challenge in deriving this type of model is obtaining its structure and estimating its parameters. Here we present two candidate model structures for the sequence dependent structural property Slide and a method of encoding the models so that a recursive least squares algorithm can be applied for parameter estimation. These models are based on the assumption that the value of Slide at a base-step is determined by the surrounding tetranucleotide sequence. The first model takes the four bases individually as inputs and has a median root mean square deviation of 0.90 Å. The second model takes the four bases pairwise and has a median root mean square deviation of 0.88 Å. These values indicate that the accuracy of these models is within the useful range for structure prediction. Performance is comparable to published predictions of a more physically derived model, at significantly less computational cost.
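
    As a rough illustration of the parameter-estimation step described above, the sketch below applies recursive least squares to a one-hot encoding of a tetranucleotide (four positions, four bases, 16 inputs). The encoding, the class name `RLS`, and the initial value `delta` are assumptions chosen for illustration; the abstract does not specify the authors' encoding or model structure in this detail.

```python
BASES = "ACGT"

def encode(tetra):
    """One-hot encode a tetranucleotide as 16 inputs (hypothetical encoding)."""
    x = [0.0] * 16
    for pos, base in enumerate(tetra):
        x[4 * pos + BASES.index(base)] = 1.0
    return x

class RLS:
    """Recursive least squares for y ~ w . x with forgetting factor lam."""

    def __init__(self, n, delta=1e4, lam=1.0):
        self.w = [0.0] * n
        # P starts as a large multiple of the identity (weak prior on w).
        self.P = [[delta if i == j else 0.0 for j in range(n)]
                  for i in range(n)]
        self.lam = lam

    def predict(self, x):
        return sum(wi * xi for wi, xi in zip(self.w, x))

    def update(self, x, y):
        n = len(x)
        Px = [sum(pij * xj for pij, xj in zip(row, x)) for row in self.P]
        denom = self.lam + sum(xi * pxi for xi, pxi in zip(x, Px))
        k = [pxi / denom for pxi in Px]       # gain vector
        err = y - self.predict(x)             # error before the update
        self.w = [wi + ki * err for wi, ki in zip(self.w, k)]
        # P <- (P - k (P x)^T) / lam, which relies on P being symmetric.
        self.P = [[(self.P[i][j] - k[i] * Px[j]) / self.lam
                   for j in range(n)] for i in range(n)]
```

    Each call to `update` refines the weights from one (tetranucleotide, Slide value) pair, so the fit can be built up one observation at a time without storing the full data set.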

  6. DNA sequence chromatogram browsing using JAVA and CORBA.

    Science.gov (United States)

    Parsons, J D; Buehler, E; Hillier, L

    1999-03-01

    DNA sequence chromatograms (traces) are the primary data source for all large-scale genomic and expressed sequence tags (ESTs) sequencing projects. Access to the sequencing trace assists many later analyses, for example contig assembly and polymorphism detection, but obtaining and using traces is problematic. Traces are not collected and published centrally, they are much larger than the base calls derived from them, and viewing them requires the interactivity of a local graphical client with local data. To provide efficient global access to DNA traces, we developed a client/server system based on flexible Java components integrated into other applications including an applet for use in a WWW browser and a stand-alone trace viewer. Client/server interaction is facilitated by CORBA middleware which provides a well-defined interface, a naming service, and location independence. [The software is packaged as a Jar file available from the following URL: http://www.ebi.ac.uk/jparsons. Links to working examples of the trace viewers can be found at http://corba.ebi.ac.uk/EST. All the Washington University mouse EST traces are available for browsing at the same URL.]

  7. Human Insulin from Recombinant DNA Technology

    Science.gov (United States)

    Johnson, Irving S.

    1983-02-01

    Human insulin produced by recombinant DNA technology is the first commercial health care product derived from this technology. Work on this product was initiated before there were federal guidelines for large-scale recombinant DNA work or commercial development of recombinant DNA products. The steps taken to facilitate acceptance of large-scale work and proof of the identity and safety of such a product are described. While basic studies in recombinant DNA technology will continue to have a profound impact on research in the life sciences, commercial applications may well be controlled by economic conditions and the availability of investment capital.

  8. Bisulfite sequencing of chromatin immunoprecipitated DNA (BisChIP-seq) directly informs methylation status of histone-modified DNA

    NARCIS (Netherlands)

    Statham, A.L.; Robinson, M.D.; Song, J.Z.; Coolen, M.W.; Stirzaker, C.; Clark, S. J.

    2012-01-01

    The complex relationship between DNA methylation, chromatin modification, and underlying DNA sequence is often difficult to unravel with existing technologies. Here, we describe a novel technique based on high-throughput sequencing of bisulfite-treated chromatin immunoprecipitated DNA (BisChIP-seq),

  9. From DNA sequence to transcriptional behaviour: a quantitative approach.

    Science.gov (United States)

    Segal, Eran; Widom, Jonathan

    2009-07-01

    Complex transcriptional behaviours are encoded in the DNA sequences of gene regulatory regions. Advances in our understanding of these behaviours have been recently gained through quantitative models that describe how molecules such as transcription factors and nucleosomes interact with genomic sequences. An emerging view is that every regulatory sequence is associated with a unique binding affinity landscape for each molecule and, consequently, with a unique set of molecule-binding configurations and transcriptional outputs. We present a quantitative framework based on existing methods that unifies these ideas. This framework explains many experimental observations regarding the binding patterns of factors and nucleosomes and the dynamics of transcriptional activation. It can also be used to model more complex phenomena such as transcriptional noise and the evolution of transcriptional regulation.

  10. Maternal Plasma DNA and RNA Sequencing for Prenatal Testing.

    Science.gov (United States)

    Tamminga, Saskia; van Maarle, Merel; Henneman, Lidewij; Oudejans, Cees B M; Cornel, Martina C; Sistermans, Erik A

    2016-01-01

    Cell-free DNA (cfDNA) testing has recently become indispensable in diagnostic testing and screening. In the prenatal setting, this type of testing is often called noninvasive prenatal testing (NIPT). With a number of techniques, using either next-generation sequencing or single nucleotide polymorphism-based approaches, fetal cfDNA in maternal plasma can be analyzed to screen for rhesus D genotype, common chromosomal aneuploidies, and increasingly for testing other conditions, including monogenic disorders. With regard to screening for common aneuploidies, challenges arise when implementing NIPT in current prenatal settings. Depending on the method used (targeted or nontargeted), chromosomal anomalies other than trisomy 21, 18, or 13 can be detected, either of fetal or maternal origin, also referred to as unsolicited or incidental findings. For various biological reasons, there is a small chance of having either a false-positive or false-negative NIPT result, or no result, also referred to as a "no-call." Both pre- and posttest counseling for NIPT should include discussing potential discrepancies. Since NIPT remains a screening test, a positive NIPT result should be confirmed by invasive diagnostic testing (either by chorionic villus biopsy or by amniocentesis). As the scope of NIPT is widening, professional guidelines need to discuss the ethics of what to offer and how to offer. In this review, we discuss the current biochemical, clinical, and ethical challenges of cfDNA testing in the prenatal setting and its future perspectives including novel applications that target RNA instead of DNA.

  11. Chimeric TALE recombinases with programmable DNA sequence specificity.

    Science.gov (United States)

    Mercer, Andrew C; Gaj, Thomas; Fuller, Roberta P; Barbas, Carlos F

    2012-11-01

    Site-specific recombinases are powerful tools for genome engineering. Hyperactivated variants of the resolvase/invertase family of serine recombinases function without accessory factors, and thus can be re-targeted to sequences of interest by replacing native DNA-binding domains (DBDs) with engineered zinc-finger proteins (ZFPs). However, imperfect modularity with particular domains, lack of high-affinity binding to all DNA triplets, and difficulty in construction has hindered the widespread adoption of ZFPs in unspecialized laboratories. The discovery of a novel type of DBD in transcription activator-like effector (TALE) proteins from Xanthomonas provides an alternative to ZFPs. Here we describe chimeric TALE recombinases (TALERs): engineered fusions between a hyperactivated catalytic domain from the DNA invertase Gin and an optimized TALE architecture. We use a library of incrementally truncated TALE variants to identify TALER fusions that modify DNA with efficiency and specificity comparable to zinc-finger recombinases in bacterial cells. We also show that TALERs recombine DNA in mammalian cells. The TALER architecture described herein provides a platform for insertion of customized TALE domains, thus significantly expanding the targeting capacity of engineered recombinases and their potential applications in biotechnology and medicine.

  12. Apoptosis and DNA damage in human spermatozoa

    Institute of Scientific and Technical Information of China (English)

    R John Aitken; Adam J Koppers

    2011-01-01

    DNA damage is frequently encountered in spermatozoa of subfertile males and is correlated with a range of adverse clinical outcomes including impaired fertilization, disrupted preimplantation embryonic development, increased rates of miscarriage and an enhanced risk of disease in the progeny. The etiology of DNA fragmentation in human spermatozoa is closely correlated with the appearance of oxidative base adducts and evidence of impaired spermiogenesis. We hypothesize that oxidative stress impedes spermiogenesis,resulting in the generation of spermatozoa with poorly remodelled chromatin. These defective cells have a tendency to default to an apoptotic pathway associated with motility loss, caspase activation, phosphatidylserine exteriorization and the activation of free radical generation by the mitochondria. The latter induces lipid peroxidation and oxidative DNA damage, which then leads to DNA fragmentation and cell death. The physical architecture of spermatozoa prevents any nucleases activated as a result of this apoptotic process from gaining access to the nuclear DNA and inducing its fragmentation. It is for this reason that a majority of the DNA damage encountered in human spermatozoa seems to be oxidative. Given the important role that oxidative stress seems to have in the etiology of DNA damage, there should be an important role for antioxidants in the treatment of this condition. If oxidative DNA damage in spermatozoa is providing a sensitive readout of systemic oxidative stress, the implications of these findings could stretch beyond our immediate goal of trying to minimize DNA damage in spermatozoa as a prelude to assisted conception therapy.

  13. DNA structure in human RNA polymerase II promoters

    DEFF Research Database (Denmark)

    Pedersen, Anders Gorm; Baldi, Pierre; Chauvin, Yves

    1998-01-01

    The fact that DNA three-dimensional structure is important for transcriptional regulation begs the question of whether eukaryotic promoters contain general structural features independently of what genes they control. We present an analysis of a large set of human RNA polymerase II promoters with a very low level of sequence similarity. The sequences, which include both TATA-containing and TATA-less promoters, are aligned by hidden Markov models. Using three different models of sequence-derived DNA bendability, the aligned promoters display a common structural profile with bendability being low... the high-bendability regions position nucleosomes at the downstream end of the transcriptional start point, and consider the possibility of interaction between histone-like TAFs and this area. We also propose the use of this structural signature in computational promoter-finding algorithms.

  14. High nucleosome occupancy is encoded at human regulatory sequences.

    Directory of Open Access Journals (Sweden)

    Desiree Tillo

    Active eukaryotic regulatory sites are characterized by open chromatin, and yeast promoters and transcription factor binding sites (TFBSs) typically have low intrinsic nucleosome occupancy. Here, we show that in contrast to yeast, DNA at human promoters, enhancers, and TFBSs generally encodes high intrinsic nucleosome occupancy. In most cases we examined, these elements also have high experimentally measured nucleosome occupancy in vivo. These regions typically have high G+C content, which correlates positively with intrinsic nucleosome occupancy, and are depleted for nucleosome-excluding poly-A sequences. We propose that high nucleosome preference is directly encoded at regulatory sequences in the human genome to restrict access to regulatory information that will ultimately be utilized in only a subset of differentiated cells.

  15. Analysis of DNA sequences by an optical time-integrating correlator.

    Science.gov (United States)

    Brousseau, N; Brousseau, R; Salt, J W; Gutz, L; Tucker, M D

    1992-08-10

    The analysis of the molecular structure called DNA is of particular interest for the understanding of the basic processes governing life. Correlation techniques implemented on digital computers are currently used to do this analysis, but the process is so slow that the mapping and sequencing of the entire human genome requires a computational breakthrough. This paper presents a new method of performing the analysis of DNA sequences with an optical time-integrating correlator. The method is characterized by short processing times that make the analysis of the entire human genome a tractable enterprise. A processing strategy and the resultant processing times are presented. Experimental proofs of concept for the two types of analysis specified by the strategy are also included.

  16. Bacterial DNA Sequence Compression Models Using Artificial Neural Networks

    Directory of Open Access Journals (Sweden)

    Armando J. Pinho

    2013-08-01

    It is widely accepted that the advances in DNA sequencing techniques have contributed to an unprecedented growth of genomic data. This fact has increased the interest in DNA compression, not only from the information theory and biology points of view, but also from a practical perspective, since such sequences require storage resources. Several compression methods exist, and particularly, those using finite-context models (FCMs) have received increasing attention, as they have been proven to effectively compress DNA sequences with low bits-per-base, as well as low encoding/decoding time-per-base. However, the amount of run-time memory required to store high-order finite-context models may become impractical, since a context-order as low as 16 requires a maximum of 17.2 × 10^9 memory entries. This paper presents a method to reduce such a memory requirement by using a novel application of artificial neural networks (ANN) to build such probabilistic models in a compact way and shows how to use them to estimate the probabilities. Such a system was implemented, and its performance compared against state-of-the-art compressors, such as XM-DNA (expert model) and FCM-Mx (mixture of finite-context models), as well as with general-purpose compressors. Using a combination of order-10 FCM and ANN, similar encoding results to those of FCM, up to order-16, are obtained using only 17 megabytes of memory, whereas the latter, even employing hash-tables, uses several hundreds of megabytes.
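
    The memory figure quoted above follows from counting one entry per (context, next symbol) pair: an order-16 model over four bases has at most 4^16 contexts times 4 symbols, i.e. 4^17 ≈ 17.2 × 10^9 entries. The sketch below shows that count together with a minimal adaptive finite-context model that estimates bits per base. The function names and the additive smoothing constant `alpha` are assumptions for illustration, and the sketch does not reproduce the ANN-based method of the paper.

```python
import math
from collections import defaultdict

ALPHABET = "ACGT"

def table_entries(order):
    """Maximum counters in an order-k model: 4**k contexts x 4 symbols."""
    return 4 ** (order + 1)

def fcm_bits_per_base(seq, order, alpha=1.0):
    """Average code length (bits per base) under an adaptive order-k FCM.

    Probabilities come from counts seen so far plus additive smoothing
    (alpha is a hypothetical choice); the first `order` bases are skipped.
    """
    counts = defaultdict(lambda: [0] * 4)
    total_bits = 0.0
    coded = 0
    for i in range(order, len(seq)):
        ctx = seq[i - order:i]
        sym = ALPHABET.index(seq[i])
        c = counts[ctx]
        p = (c[sym] + alpha) / (sum(c) + 4 * alpha)
        total_bits += -math.log2(p)
        c[sym] += 1                # update the model after coding the symbol
        coded += 1
    return total_bits / coded if coded else 0.0
```

    Storing counts only for contexts that actually occur, as the dictionary does here, is the same idea as the hash-table variant the abstract mentions; the worst case is still bounded by `table_entries(order)`.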

  17. Mapping and Sequencing the Human Genome

    Science.gov (United States)

    1988-01-01

    Numerous meetings have been held and a debate has developed in the biological community over the merits of mapping and sequencing the human genome. In response, a committee to examine the desirability and feasibility of mapping and sequencing the human genome was formed to suggest options for implementing the project. The committee asked many questions. Should the analysis of the human genome be left entirely to the traditionally uncoordinated, but highly successful, support systems that fund the vast majority of biomedical research? Or should a more focused and coordinated additional support system be developed that is limited to encouraging and facilitating the mapping and eventual sequencing of the human genome? If so, how can this be done without distorting the broader goals of biological research that are crucial for any understanding of the data generated in such a human genome project? As the committee became better informed on the many relevant issues, the opinions of its members coalesced, producing a shared consensus of what should be done. This report reflects that consensus.

  18. Cynomolgus monkey testicular cDNAs for discovery of novel human genes in the human genome sequence

    Directory of Open Access Journals (Sweden)

    Terao Keiji

    2002-12-01

    Abstract Background In order to contribute to the establishment of a complete map of transcribed regions of the human genome, we constructed a testicular cDNA library for the cynomolgus monkey, and attempted to find novel transcripts for identification of their human homologues. Result The full-insert sequences of 512 cDNA clones were determined. Ultimately we found 302 non-redundant cDNAs carrying open reading frames of 300 bp-length or longer. Among them, 89 cDNAs were found not to be annotated previously in the Ensembl human database. After searching against the Ensembl mouse database, we also found that 69 putative coding sequences have no homologous cDNAs in the annotated human and mouse genome sequences in Ensembl. We subsequently designed a DNA microarray including 396 non-redundant cDNAs (with and without open reading frames) to examine the expression of the full-sequenced genes. With the testicular probe and a mixture of probes of 10 other tissues, 316 of 332 effective spots showed intense hybridized signals and 75 cDNAs were shown to be expressed very highly in the cynomolgus monkey testis, but not ubiquitously. Conclusions In this report, we determined 302 full-insert sequences of cynomolgus monkey cDNAs with enough length of open reading frames to discover novel transcripts as human homologues. Among 302 cDNA sequences, human homologues of 89 cDNAs have not been predicted in the annotated human genome sequence in Ensembl. Additionally, we identified 75 dominantly expressed genes in testis among the full-sequenced clones by using a DNA microarray. Our cDNA clones and analytical results will be valuable resources for future functional genomic studies.

  19. Modeling genetic imprinting effects of DNA sequences with multilocus polymorphism data

    Directory of Open Access Journals (Sweden)

    Staud Roland

    2009-08-01

    Abstract Single nucleotide polymorphisms (SNPs) represent the most widespread type of DNA sequence variation in the human genome and they have recently emerged as valuable genetic markers for revealing the genetic architecture of complex traits in terms of nucleotide combination and sequence. Here, we extend an algorithmic model for the haplotype analysis of SNPs to estimate the effects of genetic imprinting expressed at the DNA sequence level. The model provides a general procedure for identifying the number and types of optimal DNA sequence variants that are expressed differently due to their parental origin. The model is used to analyze a genetic data set collected from a pain genetics project. We find that DNA haplotype GAC from three SNPs, OPRKG36T (with two alleles G and T), OPRKA843G (with alleles A and G), and OPRKC846T (with alleles C and T), at the kappa-opioid receptor, triggers a significant effect on pain sensitivity, but with expression significantly depending on the parent from which it is inherited (p = 0.008). With a tremendous advance in SNP identification and automated screening, the model founded on haplotype discovery and statistical inference may provide a useful tool for genetic analysis of any quantitative trait with complex inheritance.

  20. Analysis of Human Accelerated DNA Regions Using Archaic Hominin Genomes

    Science.gov (United States)

    Burbano, Hernán A.; Green, Richard E.; Maricic, Tomislav; Lalueza-Fox, Carles; de la Rasilla, Marco; Rosas, Antonio; Kelso, Janet; Pollard, Katherine S.; Lachmann, Michael; Pääbo, Svante

    2012-01-01

    Several previous comparisons of the human genome with other primate and vertebrate genomes identified genomic regions that are highly conserved in vertebrate evolution but fast-evolving on the human lineage. These human accelerated regions (HARs) may be regions of past adaptive evolution in humans. Alternatively, they may be the result of non-adaptive processes, such as biased gene conversion. We captured and sequenced DNA from a collection of previously published HARs using DNA from an Iberian Neandertal. Combining these new data with shotgun sequence from the Neandertal and Denisova draft genomes, we determine at least one archaic hominin allele for 84% of all positions within HARs. We find that 8% of HAR substitutions are not observed in the archaic hominins and are thus recent in the sense that the derived allele had not come to fixation in the common ancestor of modern humans and archaic hominins. Further, we find that recent substitutions in HARs tend to have come to fixation faster than substitutions elsewhere in the genome and that substitutions in HARs tend to cluster in time, consistent with an episodic rather than a clock-like process underlying HAR evolution. Our catalog of sequence changes in HARs will help prioritize them for functional studies of genomic elements potentially responsible for modern human adaptations. PMID:22412940

  1. Choosing the best heuristic for seeded alignment of DNA sequences

    Directory of Open Access Journals (Sweden)

    Buhler Jeremy

    2006-03-01

    Abstract Background Seeded alignment is an important component of algorithms for fast, large-scale DNA similarity search. A good seed matching heuristic can reduce the execution time of genomic-scale sequence comparison without degrading sensitivity. Recently, many types of seed have been proposed to improve on the performance of traditional contiguous seeds as used in, e.g., NCBI BLASTN. Choosing among these seed types, particularly those that use information besides the presence or absence of matching residue pairs, requires practical guidance based on a rigorous comparison, including assessment of sensitivity, specificity, and computational efficiency. This work performs such a comparison, focusing on alignments in DNA outside widely studied coding regions. Results We compare seeds of several types, including those allowing transition mutations rather than matches at fixed positions, those allowing transitions at arbitrary positions ("BLASTZ" seeds), and those using a more general scoring matrix. For each seed type, we use an extended version of our Mandala seed design software to choose seeds with optimized sensitivity for various levels of specificity. Our results show that, on a test set biased toward alignments of noncoding DNA, transition information significantly improves seed performance, while finer distinctions between different types of mismatches do not. BLASTZ seeds perform especially well. These results depend on properties of our test set that are not shared by EST-based test sets with a strong bias toward coding DNA. Conclusion Practical seed design requires careful attention to the properties of the alignments being sought. For noncoding DNA sequences, seeds that use transition information, especially BLASTZ-style seeds, are particularly useful. The Mandala seed design software can be found at http://www.cse.wustl.edu/~yanni/mandala/.
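
    The seed types compared in this abstract can be sketched as a small matcher. In the illustration below a seed is a pattern string in which '1' requires a match, 'T' accepts a match or a transition (A<->G or C<->T) at that fixed position, and '0' ignores the position; the `free_transitions` argument allows that many transitions at arbitrary '1' positions, in the spirit of the BLASTZ-style seeds mentioned above. The pattern notation and all function names are assumptions for this sketch and are not taken from the Mandala software.

```python
PURINES = {"A", "G"}

def is_transition(x, y):
    """True for A<->G or C<->T substitutions."""
    return x != y and ((x in PURINES) == (y in PURINES))

def seed_matches(pattern, u, v, free_transitions=0):
    """Check two equal-length windows against a seed pattern."""
    budget = free_transitions
    for p, x, y in zip(pattern, u, v):
        if p == "0" or x == y:
            continue                  # ignored position or exact match
        if not is_transition(x, y):
            return False              # transversions never satisfy a seed
        if p == "T":
            continue                  # transition allowed at a fixed position
        if budget == 0:
            return False              # transition at a '1' with no budget left
        budget -= 1
    return True

def seed_hits(pattern, s, t, free_transitions=0):
    """All (i, j) whose windows satisfy the pattern (brute-force scan)."""
    w = len(pattern)
    return [(i, j)
            for i in range(len(s) - w + 1)
            for j in range(len(t) - w + 1)
            if seed_matches(pattern, s[i:i + w], t[j:j + w], free_transitions)]
```

    A practical search tool would index seed hits with a hash table rather than scan all window pairs; the quadratic scan here only keeps the matching rule easy to read.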

  2. Complete genome sequence of mitochondrial DNA (mtDNA) of Chlorella sorokiniana.

    Science.gov (United States)

    Orsini, Massimiliano; Costelli, Cristina; Malavasi, Veronica; Cusano, Roberto; Concas, Alessandro; Angius, Andrea; Cao, Giacomo

    2016-01-01

    The complete sequence of mitochondrial genome of the Chlorella sorokiniana strain (SAG 111-8 k) is presented in this work. Within the Chlorella genus, it represents the second species with a complete sequenced and annotated mitochondrial genome (GenBank accession no. KM241869). The genome consists of circular chromosomes of 52,528 bp and encodes a total of 31 protein coding genes, 3 rRNAs and 26 tRNAs. The overall AT contents of the C. sorokiniana mtDNA is 70.89%, while the coding sequence is of 97.4%.

  3. Sequencing of mitochondrial HV1 and HV2 DNA with length heteroplasmy

    DEFF Research Database (Denmark)

    Rasmussen, E. Michael; Eriksen, Birthe; Larsen, Hans Jakob

    2003-01-01

    This study presents a fast method for sequencing the poly C/G regions in HV1 and HV2 in the mitochondrial DNA (mtDNA)...

  4. Organization and evolution of primate centromeric DNA from whole-genome shotgun sequence data.

    Directory of Open Access Journals (Sweden)

    Can Alkan

    2007-09-01

    The major DNA constituent of primate centromeres is alpha satellite DNA. As much as 2%-5% of sequence generated as part of primate genome sequencing projects consists of this material, which is fragmented or not assembled as part of published genome sequences due to its highly repetitive nature. Here, we develop computational methods to rapidly recover and categorize alpha-satellite sequences from previously uncharacterized whole-genome shotgun sequence data. We present an algorithm to computationally predict potential higher-order array structure based on paired-end sequence data and then experimentally validate its organization and distribution by experimental analyses. Using whole-genome shotgun data from the human, chimpanzee, and macaque genomes, we examine the phylogenetic relationship of these sequences and provide further support for a model for their evolution and mutation over the last 25 million years. Our results confirm fundamental differences in the dispersal and evolution of centromeric satellites in the Old World monkey and ape lineages of evolution.

  5. Exact Tandem Repeats Analyzer (E-TRA): A new program for DNA sequence mining

    Indian Academy of Sciences (India)

    Mehmet Karaca; Mehmet Bilgen; A. Naci Onus; Ayse Gul Ince; Safinaz Y. Elmasulu

    2005-04-01

    Exact Tandem Repeats Analyzer 1.0 (E-TRA) combines sequence motif searches with keywords such as ‘organs’, ‘tissues’, ‘cell lines’ and ‘development stages’ for finding simple exact tandem repeats as well as non-simple repeats. E-TRA has several advanced repeat search parameters/options compared to other repeat finder programs as it not only accepts GenBank, FASTA and expressed sequence tags (EST) sequence files, but also does analysis of multiple files with multiple sequences. The minimum and maximum tandem repeat motif lengths that E-TRA finds vary from one to one thousand. Advanced user defined parameters/options let the researchers use different minimum motif repeats search criteria for varying motif lengths simultaneously. One of the most interesting features of genomes is the presence of relatively short tandem repeats (TRs). These repeated DNA sequences are found in both prokaryotes and eukaryotes, distributed almost at random throughout the genome. Some of the tandem repeats play important roles in the regulation of gene expression whereas others do not have any known biological function as yet. Nevertheless, they have proven to be very beneficial in DNA profiling and genetic linkage analysis studies. To demonstrate the use of E-TRA, we used 5,465,605 human EST sequences derived from 18,814,550 GenBank EST sequences. Our results indicated that 12.44% (679,800) of the human EST sequences contained simple and non-simple repeat string patterns varying from one to 126 nucleotides in length. The results also revealed that human organs, tissues, cell lines and different developmental stages differed in number of repeats as well as repeat composition, indicating that the distribution of expressed tandem repeats among tissues or organs are not random, thus differing from the un-transcribed repeats found in genomes.
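
    To make the notion of an exact tandem repeat search concrete, the following minimal sketch scans a sequence for a motif of bounded length repeated back-to-back a minimum number of times. It is not the E-TRA implementation; the function name, the default parameter values, and the tie-breaking rule (longest covered span, then shortest motif) are assumptions for illustration.

```python
def find_tandem_repeats(seq, min_motif=1, max_motif=6, min_copies=3):
    """Return (start, motif, copies) for exact tandem repeats, left to right."""
    hits = []
    i, n = 0, len(seq)
    while i < n:
        best = None
        for m in range(min_motif, max_motif + 1):
            motif = seq[i:i + m]
            if len(motif) < m:
                break                      # motif would run past the end
            copies = 1
            # Count consecutive exact copies of the motif starting at i.
            while seq[i + copies * m:i + (copies + 1) * m] == motif:
                copies += 1
            span = copies * m
            # Keep the longest span; ties go to the shorter motif found first.
            if copies >= min_copies and (best is None or span > best[2]):
                best = (motif, copies, span)
        if best:
            hits.append((i, best[0], best[1]))
            i += best[2]                   # continue after the repeat
        else:
            i += 1
    return hits
```

    For example, `find_tandem_repeats("GGCACACACATT")` reports the dinucleotide repeat `(2, "CA", 4)`. Letting `min_copies` depend on motif length would approximate the per-length search criteria the abstract describes.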

  6. Developmentally programmed excision of internal DNA sequences in Paramecium aurelia.

    Science.gov (United States)

    Gratias, A; Bétermier, M

    2001-01-01

    The development of a new somatic nucleus (macronucleus) during sexual reproduction of the ciliate Paramecium aurelia involves reproducible chromosomal rearrangements that affect the entire germline genome. Macronuclear development can be induced experimentally, which makes P. aurelia an attractive model for the study of the mechanism and the regulation of DNA rearrangements. Two major types of rearrangements have been identified: the fragmentation of the germline chromosomes, followed by the formation of the new macronuclear chromosome ends in association with imprecise DNA elimination, and the precise excision of internal eliminated sequences (IESs). All IESs identified so far are short, A/T rich and non-coding elements. They are flanked by a direct repeat of a 5'-TA-3' dinucleotide, a single copy of which remains at the macronuclear junction after excision. The number of these single-copy sequences has been estimated to be around 60,000 per haploid genome. This review focuses on the current knowledge about the genetic and epigenetic determinants of IES elimination in P. aurelia, the analysis of excision products, and the tightly regulated timing of excision throughout macronuclear development. Several models for the molecular mechanism of IES excision will be discussed in relation to those proposed for DNA elimination in other ciliates.

  7. Mycobacterial DNA extraction for whole-genome sequencing from early positive liquid (MGIT) cultures.

    Science.gov (United States)

    Votintseva, Antonina A; Pankhurst, Louise J; Anson, Luke W; Morgan, Marcus R; Gascoyne-Binzi, Deborah; Walker, Timothy M; Quan, T Phuong; Wyllie, David H; Del Ojo Elias, Carlos; Wilcox, Mark; Walker, A Sarah; Peto, Tim E A; Crook, Derrick W

    2015-04-01

    We developed a low-cost and reliable method of DNA extraction from as little as 1 ml of early positive mycobacterial growth indicator tube (MGIT) cultures that is suitable for whole-genome sequencing to identify mycobacterial species and predict antibiotic resistance in clinical samples. The DNA extraction method is based on ethanol precipitation supplemented by pretreatment steps with a MolYsis kit or saline wash for the removal of human DNA and a final DNA cleanup step with solid-phase reversible immobilization beads. The protocol yielded ≥0.2 ng/μl of DNA for 90% (MolYsis kit) and 83% (saline wash) of positive MGIT cultures. A total of 144 (94%) of the 154 samples sequenced on the MiSeq platform (Illumina) achieved the target of 1 million reads, with 90% coverage achieved. The DNA extraction protocol, therefore, will facilitate fast and accurate identification of mycobacterial species and resistance using a range of bioinformatics tools. Copyright © 2015, Votintseva et al.

  8. Paging through history: parchment as a reservoir of ancient DNA for next generation sequencing.

    Science.gov (United States)

    Teasdale, M D; van Doorn, N L; Fiddyment, S; Webb, C C; O'Connor, T; Hofreiter, M; Collins, M J; Bradley, D G

    2015-01-19

    Parchment represents an invaluable cultural reservoir. Retrieving an additional layer of information from these abundant, dated livestock-skins via the use of ancient DNA (aDNA) sequencing has been mooted by a number of researchers. However, prior PCR-based work has indicated that this may be challenged by cross-individual and cross-species contamination, perhaps from the bulk parchment preparation process. Here we apply next generation sequencing to two parchments of seventeenth and eighteenth century northern English provenance. Following alignment to the published sheep, goat, cow and human genomes, it is clear that the only genome displaying substantial unique homology is sheep and this species identification is confirmed by collagen peptide mass spectrometry. Only 4% of sequence reads align preferentially to a different species indicating low contamination across species. Moreover, mitochondrial DNA sequences suggest an upper bound of contamination at 5%. Over 45% of reads aligned to the sheep genome, and even this limited sequencing exercise yielded 9% and 7% of each sampled sheep genome post filtering, allowing the mapping of genetic affinity to modern British sheep breeds. We conclude that parchment represents an excellent substrate for genomic analyses of historical livestock.

  9. Significance of satellite DNA revealed by conservation of a widespread repeat DNA sequence among angiosperms.

    Science.gov (United States)

    Mehrotra, Shweta; Goel, Shailendra; Raina, Soom Nath; Rajpal, Vijay Rani

    2014-08-01

    The analysis of plant genome structure and evolution requires comprehensive characterization of repetitive sequences that make up the majority of plant nuclear DNA. In the present study, we analyzed the nature of pCtKpnI-I and pCtKpnI-II tandem repeated sequences, reported earlier in Carthamus tinctorius. Interestingly, a homolog of the pCtKpnI-I repeat sequence was also found to be present in widely divergent families of angiosperms. pCtKpnI-I showed high sequence similarity but low copy number among various taxa of different families of angiosperms analyzed. In comparison, pCtKpnI-II was specific to the genus Carthamus and was not present in any other taxa analyzed. The molecular structure of pCtKpnI-I was analyzed in various unrelated taxa of angiosperms to decipher the evolutionarily conserved nature of the sequence and its possible functional role.

  10. Ribosomal DNA ITS-1 sequencing of Galba truncatula (Gastropoda, Lymnaeidae) and its potential impact on fascioliasis transmission in Mendoza, Argentina

    Directory of Open Access Journals (Sweden)

    Bargues, M. D.

    2006-12-01

    Sequencing of the rDNA ITS-1 proved that the lymnaeid snail species Galba truncatula is present in Argentina and that it belongs to the haplotype HC, the same as that responsible for the fascioliasis transmission in the human hyperendemic area with the highest human prevalences and intensities known, the Northern Bolivian Altiplano.

  11. Base composition at mtDNA boundaries suggests a DNA triple helix model for human mitochondrial DNA large-scale rearrangements.

    Science.gov (United States)

    Rocher, Christophe; Letellier, Thierry; Copeland, William C; Lestienne, Patrick

    2002-06-01

    Different mechanisms have been proposed to account for mitochondrial DNA (mtDNA) instability based on the presence of short homologous sequences (direct repeats, DR) at the potential boundaries of mtDNA rearrangements. Among them, slippage-mispairing of the replication complex during the asymmetric replication cycle of the mammalian mitochondrial DNA has been proposed to account for the preferential localization of deletions. This mechanism involves a transfer of the replication complex from the first neo-synthesized heavy (H) strand of the DR1, to the DR2, thus bypassing the intervening sequence and producing a deleted molecule. Nevertheless, the nature of the bonds between the DNA strands remains unknown as the forward sequence of DR2, beyond the replication complex, stays double-stranded. Here, we have analyzed the base composition of the DR at the boundaries of mtDNA deletions and duplications and found a skewed pyrimidine content of about 75% in the light-strand DNA template. This suggests the possible building of a DNA triple helix between the G-rich neo-synthesized DR1 and the base-paired homologous G.C-rich DR2. In vitro experiments with the purified human DNA polymerase gamma subunits enabled us to show that the third DNA strand may be used as a primer for DNA replication, using a template with the direct repeat forming a hairpin, with which the primer could initiate DNA replication. These data suggest a novel molecular basis for mitochondrial DNA rearrangements through the distributive nature of the DNA polymerase gamma, at the level of the direct repeats. A general model accounting for large-scale mitochondrial DNA deletion and duplication is proposed. These experiments extend to a DNA polymerase from a eukaryotic source the use of a DNA triple helix strand as a primer, as already shown for other DNA polymerases of phage and bacterial origin.

  12. Isolation and analysis of high quality nuclear DNA with reduced organellar DNA for plant genome sequencing and resequencing

    Directory of Open Access Journals (Sweden)

    Zdepski Anna

    2011-05-01

    Background: High throughput sequencing (HTS) technologies have revolutionized the field of genomics by drastically reducing the cost of sequencing, making it feasible for individual labs to sequence or resequence plant genomes. Obtaining high quality, high molecular weight DNA from plants poses significant challenges due to the high copy number of chloroplast and mitochondrial DNA, as well as high levels of phenolic compounds and polysaccharides. Multiple methods have been used to isolate DNA from plants; the CTAB method is commonly used to isolate total cellular DNA from plants, which contains nuclear DNA as well as chloroplast and mitochondrial DNA. Alternatively, DNA can be isolated from nuclei to minimize chloroplast and mitochondrial DNA contamination. Results: We describe optimized protocols for isolation of nuclear DNA from eight different plant species encompassing both monocot and eudicot species. These protocols use nuclei isolation to minimize chloroplast and mitochondrial DNA contamination. We also developed a protocol to determine the number of chloroplast and mitochondrial DNA copies relative to the nuclear DNA using quantitative real time PCR (qPCR). We compared DNA isolated from nuclei to total cellular DNA isolated with the CTAB method. As expected, DNA isolated from nuclei consistently yielded nuclear DNA with fewer chloroplast and mitochondrial DNA copies, as compared to the total cellular DNA prepared with the CTAB method. This protocol will allow for analysis of the quality and quantity of nuclear DNA before starting a plant whole genome sequencing or resequencing experiment. Conclusions: Extracting high quality, high molecular weight nuclear DNA in plants has the potential to be a bottleneck in the era of whole genome sequencing and resequencing. The methods that are described here provide a framework for researchers to extract and quantify nuclear DNA in multiple types of plants.
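
    A relative copy number of the kind described above is often estimated from qPCR threshold cycles (Ct) with the common delta-Ct approximation. The abstract does not state which calculation the authors used, so the sketch below is an assumption: it takes equal amplification efficiency for both amplicons (a factor of 2 per cycle by default), and the function name and Ct values are invented for illustration.

```python
def relative_copies(ct_target, ct_reference, efficiency=2.0):
    """Estimate target copies per reference copy from qPCR Ct values.

    Assumes both amplicons amplify with the same per-cycle efficiency
    (2.0 means perfect doubling). A lower Ct means more starting template.
    """
    return efficiency ** (ct_reference - ct_target)

if __name__ == "__main__":
    # Made-up Ct values: chloroplast amplicon vs. a nuclear reference gene.
    total_dna = relative_copies(ct_target=15.0, ct_reference=20.0)
    nuclei_dna = relative_copies(ct_target=19.0, ct_reference=20.0)
    print(f"total cellular DNA prep: {total_dna:.0f} chloroplast copies per nuclear copy")
    print(f"nuclei prep:             {nuclei_dna:.0f} chloroplast copies per nuclear copy")
```

    Comparing the ratio for a nuclei preparation with that for a total cellular DNA preparation gives a simple measure of how much organellar DNA was removed.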

  13. DNA Sequencing as a Tool to Monitor Marine Ecological Status

    Directory of Open Access Journals (Sweden)

    Kelly D. Goodwin

    2017-05-01

    Many ocean policies mandate integrated, ecosystem-based approaches to marine monitoring, driving a global need for efficient, low-cost bioindicators of marine ecological quality. Most traditional methods to assess biological quality rely on specialized expertise to provide visual identification of a limited set of specific taxonomic groups, a time-consuming process that can provide a narrow view of ecological status. In addition, microbial assemblages drive food webs but are not amenable to visual inspection and thus are largely excluded from detailed inventory. Molecular-based assessments of biodiversity and ecosystem function offer advantages over traditional methods and are increasingly being generated for a suite of taxa using a “microbes to mammals” or “barcodes to biomes” approach. Progress in these efforts coupled with continued improvements in high-throughput sequencing and bioinformatics pave the way for sequence data to be employed in formal integrated ecosystem evaluation, including food web assessments, as called for in the European Union Marine Strategy Framework Directive. DNA sequencing of bioindicators, both traditional (e.g., benthic macroinvertebrates, ichthyoplankton) and emerging (e.g., microbial assemblages, fish via eDNA), promises to improve assessment of marine biological quality by increasing the breadth, depth, and throughput of information and by reducing costs and reliance on specialized taxonomic expertise.

  14. Exome sequencing of a multigenerational human pedigree.

    Directory of Open Access Journals (Sweden)

    Dale J Hedges

    Over the next few years, the efficient use of next-generation sequencing (NGS) in human genetics research will depend heavily upon the effective mechanisms for the selective enrichment of genomic regions of interest. Recently, comprehensive exome capture arrays have become available for targeting approximately 33 Mb or approximately 180,000 coding exons across the human genome. Selective genomic enrichment of the human exome offers an attractive option for new experimental designs aiming to quickly identify potential disease-associated genetic variants, especially in family-based studies. We have evaluated a 2.1 M feature human exome capture array on eight individuals from a three-generation family pedigree. We were able to cover up to 98% of the targeted bases at a long-read sequence read depth of ≥3, 86% at a read depth of ≥10, and over 50% of all targets were covered with ≥20 reads. We identified up to 14,284 SNPs and small indels per individual exome, with up to 1,679 of these representing putative novel polymorphisms. Applying the conservative genotype calling approach HCDiff, the average rate of detection of a variant allele based on Illumina 1 M BeadChips genotypes was 95.2% at ≥10x sequence. Further, we propose an advantageous genotype calling strategy for low covered targets that empirically determines cut-off thresholds at a given coverage depth based on existing genotype data. Application of this method was able to detect >99% of SNPs covered ≥8x. Our results offer guidance for "real-world" applications in human genetics and provide further evidence that microarray-based exome capture is an efficient and reliable method to enrich for chromosomal regions of interest in next-generation sequencing experiments.

  15. Exome sequencing of a multigenerational human pedigree.

    Science.gov (United States)

    Hedges, Dale J; Hedges, Dale; Burges, Dan; Powell, Eric; Almonte, Cherylyn; Huang, Jia; Young, Stuart; Boese, Benjamin; Schmidt, Mike; Pericak-Vance, Margaret A; Martin, Eden; Zhang, Xinmin; Harkins, Timothy T; Züchner, Stephan

    2009-12-14

    Over the next few years, the efficient use of next-generation sequencing (NGS) in human genetics research will depend heavily upon the effective mechanisms for the selective enrichment of genomic regions of interest. Recently, comprehensive exome capture arrays have become available for targeting approximately 33 Mb or approximately 180,000 coding exons across the human genome. Selective genomic enrichment of the human exome offers an attractive option for new experimental designs aiming to quickly identify potential disease-associated genetic variants, especially in family-based studies. We have evaluated a 2.1 M feature human exome capture array on eight individuals from a three-generation family pedigree. We were able to cover up to 98% of the targeted bases at a long-read sequence read depth of ≥3, 86% at a read depth of ≥10, and over 50% of all targets were covered with ≥20 reads. We identified up to 14,284 SNPs and small indels per individual exome, with up to 1,679 of these representing putative novel polymorphisms. Applying the conservative genotype calling approach HCDiff, the average rate of detection of a variant allele based on Illumina 1 M BeadChips genotypes was 95.2% at ≥10x sequence. Further, we propose an advantageous genotype calling strategy for low covered targets that empirically determines cut-off thresholds at a given coverage depth based on existing genotype data. Application of this method was able to detect >99% of SNPs covered ≥8x. Our results offer guidance for "real-world" applications in human genetics and provide further evidence that microarray-based exome capture is an efficient and reliable method to enrich for chromosomal regions of interest in next-generation sequencing experiments.
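
    Coverage figures such as "98% of targeted bases at a read depth of 3 or more" are fractions of target positions whose depth meets a threshold. The sketch below shows that arithmetic on a toy list of per-base depths; the function name and the depth values are invented for illustration and are not taken from the study.

```python
def fraction_covered(depths, thresholds=(3, 10, 20)):
    """Return {threshold: fraction of positions with depth >= threshold}."""
    if not depths:
        raise ValueError("depths must contain at least one position")
    n = len(depths)
    return {t: sum(d >= t for d in depths) / n for t in thresholds}

if __name__ == "__main__":
    # Toy per-base read depths over ten targeted positions (made up).
    toy_depths = [0, 3, 5, 10, 12, 20, 25, 30, 2, 11]
    for threshold, frac in fraction_covered(toy_depths).items():
        print(f"depth >= {threshold}: {frac:.0%} of targeted bases")
```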

  16. DNA typing of Calliphorids collected from human corpses in Malaysia.

    Science.gov (United States)

    Kavitha, R; Tan, T C; Lee, H L; Nazni, W A; Sofian-Azirun, M

    2013-03-01

    Estimation of post-mortem interval (PMI) is crucial for time of death determination. With the advent of DNA-based identification techniques, forensic entomology saw the beginning of a proliferation of molecular studies into forensically important Calliphoridae (Diptera). The use of DNA to characterise morphologically indistinguishable immature calliphorids was recognised as a valuable molecular tool with enormous practical utility. The local entomofauna in most cases is important for the examination of entomological evidence. The survey of the local entomofauna has become a fundamental first step in forensic entomological studies, because different geographical distributions, seasonal and environmental factors may influence the decomposition process and the occurrence of different insect species on corpses. In this study, calliphorids were collected from 13 human corpses recovered from indoors, outdoors and aquatic conditions during the post-mortem examination by pathologists from the government hospitals in Malaysia. Only two species, Chrysomya megacephala and Chrysomya rufifacies, were recovered from human corpses. DNA sequencing was performed to study the mitochondrial encoded COI gene and to evaluate the suitability of the 1300 base pairs of COI fragments for identification of blow fly species collected from real crime scenes. The COI gene from blow fly specimens was sequenced and deposited in GenBank to expand local databases. The sequenced COI gene was useful in identifying calliphorids retrieved from human corpses.

  17. Mitochondrial DNA sequence analysis of patients with 'atypical psychosis'.

    Science.gov (United States)

    Kazuno, An-A; Munakata, Kae; Mori, Kanako; Tanaka, Masashi; Nanko, Shinichiro; Kunugi, Hiroshi; Umekage, Tadashi; Tochigi, Mamoru; Kohda, Kazuhisa; Sasaki, Tsukasa; Akiyama, Tsuyoshi; Washizuka, Shinsuke; Kato, Nobumasa; Kato, Tadafumi

    2005-08-01

    Although classical psychopathological studies have shown the presence of an independent diagnostic category, 'atypical psychosis', most psychotic patients are currently classified into two major diagnostic categories, schizophrenia and bipolar disorder, by the Diagnostic and Statistical Manual of Mental Disorders (4th edn; DSM-IV) criteria. 'Atypical psychosis' is characterized by acute confusion without systematic delusion, emotional instability, and psychomotor excitement or stupor. Such clinical features resemble those seen in organic mental syndrome, and differential diagnosis is often difficult. Because patients with mitochondrial myopathy, encephalopathy, lactic acidosis, and stroke-like episodes (MELAS) sometimes show organic mental disorder, 'atypical psychosis' may be caused by mutations of mitochondrial DNA (mtDNA) in some patients. In the present study whole mtDNA was sequenced for seven patients with various psychotic disorders, who could be categorized as 'atypical psychosis'. None of them had known mtDNA mutations pathogenic for mitochondrial encephalopathy. Two of seven patients belonged to a subhaplogroup F1b1a with low frequency. These results did not support the hypothesis that clinical presentation of some patients with 'atypical psychosis' is a reflection of subclinical mitochondrial encephalopathy. However, the subhaplogroup F1b1a may be a good target for association study of 'atypical psychosis'.

  18. Retroviral DNA Sequences as a Means for Determining Ancient Diets.

    Directory of Open Access Journals (Sweden)

    Jessica I Rivera-Perez

    For ages, specialists from varying fields have studied the diets of the primeval inhabitants of our planet, detecting diet remains in archaeological specimens using a range of morphological and biochemical methods. Recently, metagenomic ancient DNA studies have allowed for the comparison of the fecal and gut microbiomes associated with archaeological specimens from various regions of the world; however, the complex dynamics represented in those microbial communities still remain unclear. Theoretically, similar to eukaryote DNA, the presence of genes from key microbes or enzymes, as well as the presence of DNA from viruses specific to key organisms, may suggest the ingestion of specific diet components. In this study we demonstrate that ancient virus DNA obtained from coprolites also provides information for reconstructing the host's diet, as inferred from sequences obtained from pre-Columbian coprolites. This depicts a novel and reliable approach to determine new components as well as validate the previously suggested diets of extinct cultures and animals. Furthermore, to our knowledge this represents the first description of the eukaryotic viral diversity found in paleofaeces belonging to pre-Columbian cultures.

  19. Peptide Synthesis on a Next-Generation DNA Sequencing Platform.

    Science.gov (United States)

    Svensen, Nina; Peersen, Olve B; Jaffrey, Samie R

    2016-09-01

    Methods for displaying large numbers of peptides on solid surfaces are essential for high-throughput characterization of peptide function and binding properties. Here we describe a method for converting the >10^7 flow cell-bound clusters of identical DNA strands generated by the Illumina DNA sequencing technology into clusters of complementary RNA, and subsequently peptide clusters. We modified the flow-cell-bound primers with ribonucleotides thus enabling them to be used by poliovirus polymerase 3Dpol. The primers hybridize to the clustered DNA thus leading to RNA clusters. The RNAs fold into functional protein- or small molecule-binding aptamers. We used the mRNA-display approach to synthesize flow-cell-tethered peptides from these RNA clusters. The peptides showed selective binding to cognate antibodies. The methods described here provide an approach for using DNA clusters to template peptide synthesis on an Illumina flow cell, thus providing new opportunities for massively parallel peptide-based assays.

  20. Entropy and long-range correlations in DNA sequences.

    Science.gov (United States)

    Melnik, S S; Usatenko, O V

    2014-12-01

    We analyze the structure of DNA molecules of different organisms by using the additive Markov chain approach. Transforming nucleotide sequences into binary strings, we perform statistical analysis of the corresponding "texts". We develop the theory of N-step additive binary stationary ergodic Markov chains and analyze their differential entropy. Supposing that the correlations are weak we express the conditional probability function of the chain by means of the pair correlation function and represent the entropy as a functional of the pair correlator. Since the model uses two point correlators instead of probability of block occurring, it makes possible to calculate the entropy of subsequences at much longer distances than with the use of the standard methods. We utilize the obtained analytical result for numerical evaluation of the entropy of coarse-grained DNA texts. We believe that the entropy study can be used for biological classification of living species.
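
    For comparison, the standard block-counting estimate that the abstract contrasts with its pair-correlator method can be sketched in a few lines. This is not the authors' method. The purine/pyrimidine mapping to binary symbols is an assumption (the abstract does not say which mapping was used), and all function names are illustrative.

```python
from collections import Counter
from math import log2

PURINES = set("AG")

def to_binary(seq):
    """Map A/G -> '1' and C/T -> '0' (assumed mapping); drop other symbols."""
    return "".join("1" if b in PURINES else "0"
                   for b in seq.upper() if b in "ACGT")

def block_entropy(bits, length):
    """Shannon entropy (in bits) of overlapping blocks of the given length."""
    if length == 0:
        return 0.0
    blocks = [bits[i:i + length] for i in range(len(bits) - length + 1)]
    total = len(blocks)
    counts = Counter(blocks)
    return -sum(c / total * log2(c / total) for c in counts.values())

def conditional_entropy(bits, length):
    """Entropy of the next symbol given the previous `length` symbols."""
    return block_entropy(bits, length + 1) - block_entropy(bits, length)

if __name__ == "__main__":
    bits = to_binary("AGCTTTAGGCAACGTTAGCA")
    for k in range(4):
        print(k, round(conditional_entropy(bits, k), 3))
```

    Block counting needs the number of observed blocks to be much larger than 2^length, which is why it becomes unreliable at long distances and why an approach based on pair correlators is attractive there.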

  1. Field guide to next-generation DNA sequencers.

    Science.gov (United States)

    Glenn, Travis C

    2011-09-01

    The diversity of available 2nd and 3rd generation DNA sequencing platforms is increasing rapidly. Costs for these systems range from $10/Mb for 454 and some Ion Torrent chips). In terms of cost per nonmultiplexed sample and instrument run time, the Pacific Biosciences and Ion Torrent platforms excel, with the 454 GS Junior and Illumina MiSeq also notable in this regard. All platforms allow multiplexing of samples, but details of library preparation, experimental design and data analysis can constrain the options. The wide range of characteristics among available platforms provides opportunities both to conduct groundbreaking studies and to waste money on scales that were previously infeasible. Thus, careful thought about the desired characteristics of these systems is warranted before purchasing or using any of them. Updated information from this guide will be maintained at: http://dna.uga.edu/ and http://tomato.biol.trinity.edu/blog/. © 2011 Blackwell Publishing Ltd.

  2. New scoring schema for finding motifs in DNA Sequences

    Directory of Open Access Journals (Sweden)

    Nowzari-Dalini Abbas

    2009-03-01

    Background: Pattern discovery in DNA sequences is one of the most fundamental problems in molecular biology with important applications in finding regulatory signals and transcription factor binding sites. An important task in this problem is to search (or predict) known binding sites in a new DNA sequence. For this reason, all subsequences of the given DNA sequence are scored based on a scoring function and the prediction is done by selecting the best score. Most of the available tools for known binding site prediction are designed by assuming no dependency between binding site base positions. Recently Tomovic and Oakeley investigated the statistical basis for either a claim of dependence or independence, to determine whether such a claim is generally true, and they presented a scoring function for binding site prediction based on the dependency between binding site base positions. Our primary objective is to investigate the scoring functions which can be used in known binding site prediction based on the assumption of dependency or independency in binding site base positions. Results: We propose a new scoring function based on the dependency between all positions in binding site base positions. This scoring function uses joint information content and mutual information as a measure of dependency between positions in transcription factor binding site. Our method for modeling dependencies is simply an extension of position independency methods. We evaluate our new scoring function on the real data sets extracted from JASPAR and TRANSFAC databases, and compare the obtained results with two other well known scoring functions. Conclusion: The results demonstrate that the new approach improves known binding site discovery and show that the joint information content and mutual information provide a better and more general criterion to investigate the relationships between positions in the TFBS. Our scoring function is formulated by simple
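
    Mutual information between two positions of a set of aligned binding sites, the dependency measure named in the abstract, can be computed as in the sketch below. This shows only the textbook quantity, not the authors' scoring function; the function name and the toy sites are invented for illustration, and no pseudocounts are applied.

```python
from collections import Counter
from math import log2

def mutual_information(sites, i, j):
    """Mutual information (in bits) between columns i and j of aligned sites."""
    n = len(sites)
    col_i = Counter(s[i] for s in sites)          # base counts at position i
    col_j = Counter(s[j] for s in sites)          # base counts at position j
    pairs = Counter((s[i], s[j]) for s in sites)  # joint counts
    return sum(
        (c / n) * log2((c / n) / ((col_i[a] / n) * (col_j[b] / n)))
        for (a, b), c in pairs.items()
    )

if __name__ == "__main__":
    dependent = ["AT", "AT", "CG", "CG"]    # position 1 is fixed by position 0
    independent = ["AT", "AG", "CT", "CG"]  # all combinations equally likely
    print(mutual_information(dependent, 0, 1))
    print(mutual_information(independent, 0, 1))
```

    A value near zero supports treating the two positions as independent, while a larger value suggests that a position-independent weight matrix loses information for that pair.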

  3. Photocatalytic probing of DNA sequence by using TiO2/dopamine-DNA triads

    Energy Technology Data Exchange (ETDEWEB)

    Liu Jianqin [Center for Nanoscale Materials, Argonne National Laboratory, 9700 South Cass Avenue, Argonne, IL 60439 (United States); Garza, Linda de la [Chemistry Division, Argonne National Laboratory, 9700 South Cass Avenue, Argonne, IL 60439 (United States); Zhang Ligang; Dimitrijevic, Nada M. [Center for Nanoscale Materials, Argonne National Laboratory, 9700 South Cass Avenue, Argonne, IL 60439 (United States); Zuo Xiaobing; Tiede, David M. [Chemistry Division, Argonne National Laboratory, 9700 South Cass Avenue, Argonne, IL 60439 (United States); Rajh, Tijana [Center for Nanoscale Materials, Argonne National Laboratory, 9700 South Cass Avenue, Argonne, IL 60439 (United States); Chemistry Division, Argonne National Laboratory, 9700 South Cass Avenue, Argonne, IL 60439 (United States)], E-mail: rajh@anl.gov

    2007-10-15

    A method to control charge transfer reaction in DNA using hybrid nanometer-sized TiO2 nanoparticles was developed. In this system extended charge separation reflects the sequence of DNA and was measured using metallic silver deposition or by photocurrent response. Light-induced extended charge separation in these systems was found to be dependent on the DNA-bridge length and sequence. The yield of photocatalytic deposition of silver was studied in systems having GG accepting sites embedded in AT runs at varying distances from the TiO2 nanoparticle surface. Weak distance dependence of charge separation indicative of a hole hopping through mediating adenine (A) sites was found. The quantum yield of silver deposition in the system having a GG accepting site placed 8.5 Å from the nanoparticle surface was found to be φ = 0.70 (70%) and φ = 0.56 (56%) for (A)n and (AT)n/2 bridge, respectively. Hole injection to GG trapping sites as far as 70 Å from a nanoparticle surface in the absence of G hopping sites was measured. Introduction of G hopping sites increased the efficiency of hole injection. The efficiency of photocatalytic deposition of metallic silver was found to be sensitive to the presence of a single nucleobase mismatch in the DNA sequence.

  4. Next-generation DNA barcoding: using next-generation sequencing to enhance and accelerate DNA barcode capture from single specimens.

    Science.gov (United States)

    Shokralla, Shadi; Gibson, Joel F; Nikbakht, Hamid; Janzen, Daniel H; Hallwachs, Winnie; Hajibabaei, Mehrdad

    2014-09-01

    DNA barcoding is an efficient method to identify specimens and to detect undescribed/cryptic species. Sanger sequencing of individual specimens is the standard approach in generating large-scale DNA barcode libraries and identifying unknowns. However, the Sanger sequencing technology is, in some respects, inferior to next-generation sequencers, which are capable of producing millions of sequence reads simultaneously. Additionally, direct Sanger sequencing of DNA barcode amplicons, as practiced in most DNA barcoding procedures, is hampered by the need for relatively high-target amplicon yield, coamplification of nuclear mitochondrial pseudogenes, confusion with sequences from intracellular endosymbiotic bacteria (e.g. Wolbachia) and instances of intraindividual variability (i.e. heteroplasmy). Any of these situations can lead to failed Sanger sequencing attempts or ambiguity of the generated DNA barcodes. Here, we demonstrate the potential application of next-generation sequencing platforms for parallel acquisition of DNA barcode sequences from hundreds of specimens simultaneously. To facilitate retrieval of sequences obtained from individual specimens, we tag individual specimens during PCR amplification using unique 10-mer oligonucleotides attached to DNA barcoding PCR primers. We employ 454 pyrosequencing to recover full-length DNA barcodes of 190 specimens using 12.5% capacity of a 454 sequencing run (i.e. two lanes of a 16 lane run). We obtained an average of 143 sequence reads for each individual specimen. The sequences produced are full-length DNA barcodes for all but one of the included specimens. In a subset of samples, we also detected Wolbachia, nontarget species, and heteroplasmic sequences. Next-generation sequencing is of great value because of its protocol simplicity, greatly reduced cost per barcode read, faster throughput and added information content.
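
    Retrieving reads per specimen from a pooled run, as described above, amounts to binning each read by its leading 10-mer tag. The sketch below assumes the tag sits at the very start of the read and must match exactly; the tag sequences, specimen names and function name are invented for illustration and do not come from the study.

```python
from collections import defaultdict

TAG_LEN = 10  # length of the specimen tag, as in the 10-mer tags described

def demultiplex(reads, tag_to_specimen):
    """Group reads by their leading tag; reads with unknown tags are dropped."""
    bins = defaultdict(list)
    for read in reads:
        tag, insert = read[:TAG_LEN], read[TAG_LEN:]
        specimen = tag_to_specimen.get(tag)
        if specimen is not None:
            bins[specimen].append(insert)  # store the read with the tag removed
    return dict(bins)

if __name__ == "__main__":
    tags = {"ACGTACGTAC": "specimen_1", "TTGCATTGCA": "specimen_2"}
    reads = ["ACGTACGTACGGG", "TTGCATTGCAAAA",
             "ACGTACGTACCCC", "NNNNNNNNNNTTT"]
    for specimen, inserts in demultiplex(reads, tags).items():
        print(specimen, len(inserts), inserts)
```

    Exact matching is the simplest policy; a real pipeline would typically also tolerate sequencing errors in the tag, which this sketch leaves out.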

  5. A preliminary analysis of the DNA and diet of the extinct Beothuk: a systematic approach to ancient human DNA

    DEFF Research Database (Denmark)

    Kuch, Melanie; Gröcke, Darren R; Knyf, Martin C

    2007-01-01

    We have used a systematic protocol for extracting, quantitating, sexing and validating ancient human mitochondrial and nuclear DNA of one male and one female Beothuk, a Native American population from Newfoundland, which became extinct approximately 180 years