WorldWideScience
1

CLIPS-1D: analysis of multiple sequence alignments to deduce for residue-positions a role in catalysis, ligand-binding, or protein structure  

Directory of Open Access Journals (Sweden)

Abstract Background One aim of the in silico characterization of proteins is to identify all residue-positions that are crucial for function or structure. Several sequence-based algorithms exist that predict functionally important sites. However, with respect to sequence information, many functionally and structurally important sites are hard to distinguish, and consequently a large number of incorrectly predicted functional sites have to be expected. This is why we were interested in designing a new classifier that differentiates between functionally and structurally important sites and in assessing its performance on representative datasets. Results We have implemented CLIPS-1D, which predicts a role in catalysis, ligand-binding, or protein structure for residue-positions in a mutually exclusive manner. By analyzing a multiple sequence alignment, the algorithm scores conservation as well as abundance of residues at individual sites and their local neighborhood and categorizes by means of a multiclass support vector machine. A cross-validation confirmed that residue-positions involved in catalysis were identified with state-of-the-art quality; the mean MCC-value was 0.34. For structurally important sites, prediction quality was considerably higher (mean MCC = 0.67). For ligand-binding sites, prediction quality was lower (mean MCC = 0.12), because binding sites and structurally important residue-positions share conservation and abundance values, which makes their separation difficult. We show that classification success varies for residues in a class-specific manner. This is why our algorithm computes residue-specific p-values, which allow for the statistical assessment of each individual prediction. CLIPS-1D is available as a Web service at http://www-bioinf.uni-regensburg.de/. Conclusions CLIPS-1D is a classifier whose prediction quality has been determined separately for catalytic sites, ligand-binding sites, and structurally important sites. 
It generates hypotheses about residue-positions important for a set of homologous proteins and focuses on conservation and abundance signals. Thus, the algorithm can be applied in cases where function cannot be transferred from well-characterized proteins by means of sequence comparison.

Janda Jan-Oliver

2012-04-01
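
The MCC values quoted in the CLIPS-1D abstract above are Matthews correlation coefficients. As a minimal sketch of how such a value is obtained from a binary confusion matrix (the function name `mcc` is illustrative and not taken from the paper, and the zero-denominator convention is an assumption):

```python
import math


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient of a binary confusion matrix.

    tp, fp, tn, fn are counts of true positives, false positives,
    true negatives and false negatives for one class (for example,
    "catalytic site" versus "everything else").
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Assumed convention: return 0 when a row or column sum is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom
```

The MCC ranges from -1 (every prediction wrong) through 0 (no better than chance) to +1 (every prediction right). Unlike plain accuracy it stays informative when one class is rare, which is typically the situation for catalytic residues within a protein.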

2

Nucleotide sequence of a gene from chromosome 1D of wheat encoding a HMW-glutenin subunit.  

Science.gov (United States)

A high molecular weight glutenin gene in hexaploid wheat has been isolated by cloning in bacteriophage lambda and characterized. The gene corresponds to polypeptide 12 encoded by chromosome 1D in the variety "Chinese Spring". The predicted coding sequence contains seven cysteine residues, six of which flank a central repetitive region comprising more than 70% of the polypeptide. These findings are related to the role of high molecular weight subunits in the viscoelastic theory of gluten structure. PMID:3840588

Thompson, R D; Bartels, D; Harberd, N P

1985-10-11

3

SSViewer: Sequence Structure Viewer  

Directory of Open Access Journals (Sweden)

An important aspect of bioinformatics is sequence. A sequence is a discrete function that records the order of amino acids in proteins and of nucleotides in DNA. An important function of amino acids is to serve as the building blocks of proteins, which are linear chains of amino acids. Amino acids can be linked together in varying sequences to form a vast variety of proteins. Twenty-two amino acids are naturally incorporated into polypeptides and are called proteinogenic or standard amino acids. Of these, 20 are encoded by the universal genetic code. DNA sequences are represented using the letters A, T, G, and C. This sequence information is analysed to determine genes that encode polypeptides (proteins), RNA genes, regulatory sequences, structural motifs, and repetitive sequences. DNA sequences can be accurately analysed using computational techniques such as BLAST and FASTA, which is not possible manually. In the present study we developed a tool to visualize the 3D structure for a given sequence using the Java programming language and HTML.

Shyam Perugu

2013-09-01

4

Genome Sequence of Bovine Viral Diarrhea Virus Strain 10JJ-SKR, Belonging to Genotype 1d  

OpenAIRE

Here, we report the complete genome sequence of a bovine viral diarrhea virus (BVDV) belonging to genotype 1d, strain 10JJ-SKR, which was isolated from cattle. The complete genome is 12,267 nucleotides (nt) in length, with a single large open reading frame. This is the first report of a BVDV belonging to genotype 1d and will enable further study of the molecular and epidemiological characteristics of this virus.

Joo, Soo-kyung; Lim, Seong-in; Jeoung, Hye-young; Song, Jae-young; Oem, Jae-ku; Mun, Seong-hwan; An, Dong-jun

2013-01-01

5

Genome Sequence of Bovine Viral Diarrhea Virus Strain 10JJ-SKR, Belonging to Genotype 1d  

Science.gov (United States)

Here, we report the complete genome sequence of a bovine viral diarrhea virus (BVDV) belonging to genotype 1d, strain 10JJ-SKR, which was isolated from cattle. The complete genome is 12,267 nucleotides (nt) in length, with a single large open reading frame. This is the first report of a BVDV belonging to genotype 1d and will enable further study of the molecular and epidemiological characteristics of this virus. PMID:23929474

Joo, Soo-Kyung; Lim, Seong-In; Jeoung, Hye-Young; Song, Jae-Young; Oem, Jae-Ku; Mun, Seong-Hwan

2013-01-01

6

Dispersive Elastodynamics of 1D Banded Materials and Structures: Design  

CERN Document Server

Within periodic materials and structures, wave scattering and dispersion occur across constituent material interfaces leading to a banded frequency response. In an earlier paper, the elastodynamics of one-dimensional periodic materials and finite structures comprising these materials were examined with an emphasis on their frequency-dependent characteristics. In this work, a novel design paradigm is presented whereby periodic unit cells are designed for desired frequency band properties, and with appropriate scaling, these cells are used as building blocks for forming fully periodic or partially periodic structures with related dynamical characteristics. Through this multiscale dispersive design methodology, which is hierarchical and integrated, structures can be devised for effective vibration or shock isolation without needing to employ dissipative damping mechanisms. The speed of energy propagation in a designed structure can also be dictated through synthesis of the unit cells. Case studies are presented ...

Hussein, M I; Scott, R A

2006-01-01

7

Dispersive Elastodynamics of 1D Banded Materials and Structures: Design  

OpenAIRE

Within periodic materials and structures, wave scattering and dispersion occur across constituent material interfaces leading to a banded frequency response. In an earlier paper, the elastodynamics of one-dimensional periodic materials and finite structures comprising these materials were examined with an emphasis on their frequency-dependent characteristics. In this work, a novel design paradigm is presented whereby periodic unit cells are designed for desired frequency ban...

Hussein, M. I.; Hulbert, G. M.; Scott, R. A.

2006-01-01

8

Structure and Catalytic Mechanism of Human Steroid 5β-Reductase (AKR1D1)  

Energy Technology Data Exchange (ETDEWEB)

Human steroid 5β-reductase (aldo-keto reductase (AKR) 1D1) catalyzes reduction of Δ⁴-ene double bonds in steroid hormones and bile acid precursors. We have reported the structures of an AKR1D1-NADP⁺ binary complex, and AKR1D1-NADP⁺-cortisone, AKR1D1-NADP⁺-progesterone and AKR1D1-NADP⁺-testosterone ternary complexes at high resolutions. Recently, structures of AKR1D1-NADP⁺-5β-dihydroprogesterone complexes showed that the product is bound unproductively. Two quite different mechanisms of steroid double bond reduction have since been proposed. However, site-directed mutagenesis supports only one mechanism. In this mechanism, the 4-pro-R hydride is transferred from the re-face of the nicotinamide ring to C5 of the steroid substrate. E120, a unique substitution in the AKR catalytic tetrad, permits a deeper penetration of the steroid substrate into the active site to promote optimal reactant positioning. It participates with Y58 to create a 'superacidic' oxyanion hole for polarization of the C3 ketone. A role for K87 in the proton relay proposed using the AKR1D1-NADP⁺-5β-dihydroprogesterone structure is not supported.

Costanzo, L.; Drury, J; Christianson, D; Penning, T

2009-01-01

9

The Deuteron Spin-dependent Structure Function g1d and its First Moment  

OpenAIRE

We present a measurement of the deuteron spin-dependent structure function g1d based on the data collected by the COMPASS experiment at CERN during the years 2002-2004. The data provide an accurate evaluation for Gamma_1^d, the first moment of g1d(x), and for the matrix element of the singlet axial current, a0. The results of QCD fits in the next to leading order (NLO) on all g1 deep inelastic scattering data are also presented. They provide two solutions with the gluon spin...

Compass, The Collaboration; Alexakhin, V. Yu

2006-01-01

10

Nucleotide sequence of a gene from chromosome 1D of wheat encoding a HMW-glutenin subunit.  

OpenAIRE

A high molecular weight glutenin gene in hexaploid wheat has been isolated by cloning in bacteriophage lambda and characterized. The gene corresponds to polypeptide 12 encoded by chromosome 1D in the variety "Chinese Spring". The predicted coding sequence contains seven cysteine residues, six of which flank a central repetitive region comprising more than 70% of the polypeptide. These findings are related to the role of high molecular weight subunits in the viscoelastic theory of gluten struct...

Thompson, Rd; Bartels, D.; Harberd, Np

1985-01-01

11

Protein Structure Predicted from Sequence  

CERN Document Server

The evolutionary trajectory of a protein through sequence space is constrained by function and three-dimensional (3D) structure. Residues in spatial proximity tend to co-evolve, yet attempts to invert the evolutionary record to identify these constraints and use them to computationally fold proteins have so far been unsuccessful. Here, we show that co-variation of residue pairs, observed in a large protein family, provides sufficient information to determine 3D protein structure. Using a data-constrained maximum entropy model of the multiple sequence alignment, we identify pairs of statistically coupled residue positions which are expected to be close in the protein fold, termed contacts inferred from evolutionary information (EICs). To assess the amount of information about the protein fold contained in these coupled pairs, we evaluate the accuracy of predicted 3D structures for proteins of 50-260 residues, from 15 diverse protein families, including a G-protein coupled receptor. These structure predictions ...

Marks, Debora S; Sheridan, Robert; Hopf, Thomas A; Pagnani, Andrea; Zecchina, Riccardo; Sander, Chris

2011-01-01
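
The abstract above relies on co-variation between pairs of alignment columns. The paper itself uses a data-constrained maximum entropy model; as a much simpler stand-in, the sketch below scores column pairs by mutual information, which measures co-variation but does not separate direct from indirect couplings. The function names and the toy alignment are hypothetical.

```python
import math
from collections import Counter


def column(msa, i):
    """Return column i of an alignment given as equal-length strings."""
    return [seq[i] for seq in msa]


def mutual_information(msa, i, j):
    """Mutual information (in nats) between alignment columns i and j."""
    n = len(msa)
    col_i, col_j = column(msa, i), column(msa, j)
    count_i = Counter(col_i)
    count_j = Counter(col_j)
    count_ij = Counter(zip(col_i, col_j))
    mi = 0.0
    for (a, b), n_ab in count_ij.items():
        p_ab = n_ab / n
        p_a = count_i[a] / n
        p_b = count_j[b] / n
        mi += p_ab * math.log(p_ab / (p_a * p_b))
    return mi


# Hypothetical toy alignment: columns 0 and 1 change together,
# column 2 is fully conserved.
toy_msa = ["AKL", "AKL", "GEL", "GEL"]
coupled = mutual_information(toy_msa, 0, 1)
conserved = mutual_information(toy_msa, 0, 2)
```

In this toy case the co-varying pair scores ln 2 and the pair involving the conserved column scores 0, which illustrates why fully conserved positions carry no pairing signal.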

12

An evaluation of LSU rDNA D1-D2 sequences for their use in species identification  

Directory of Open Access Journals (Sweden)

Abstract Background Identification of species via DNA sequences is the basis for DNA taxonomy and DNA barcoding. Currently there is a strong focus on using a mitochondrial marker for this purpose, in particular a fragment from the cytochrome oxidase I gene (COI). While there is ample evidence that this marker is indeed suitable across a broad taxonomic range to delineate species, it has also become clear that complementation by a nuclear marker system could be advantageous. Ribosomal RNA genes could be suitable for this purpose, because of their global occurrence and the possibility to design universal primers. However, it has so far been assumed that these genes are too highly conserved to allow resolution at, or even beyond, the species level. On the other hand, it is known that ribosomal gene regions also harbour highly divergent parts. We explore here the information content of two adjacent divergence regions of the large subunit ribosomal gene, the D1-D2 region. Results Universal primers were designed to amplify the D1-D2 region from all metazoa. We show that amplification products in the size range of 800–1300 bp can be obtained across a broad range of animal taxa, provided some optimizations of the PCR procedure are implemented. Although the ribosomal genes occur in multiple copies in the genomes, we find generally very little intra-individual polymorphism […] Cottus and genus Aphyosemion show that the D1-D2 LSU sequence can resolve even very closely related species with the same fidelity as COI sequences. In one case we can even show that a mitochondrial transfer must have occurred, since the nuclear sequence confirms the taxonomic assignment, while the mitochondrial sequence would have led to the wrong classification. 
We have further explored whether hybrids between species can be detected with the nuclear sequence and we show for a test case of natural hybrids among cyprinid fish species (Alburnus alburnus and Rutilus rutilus) that this is indeed possible. Conclusion The D1-D2 LSU region is a suitable marker region for applications in DNA based species identification and should be considered for routine use as a marker complementing broad scale studies based on mitochondrial markers.

Tautz Diethard

2007-02-01

13

Large frequency range of negligible transmission in 1D photonic quantum well structures  

OpenAIRE

We show that it is possible to enlarge the range of low transmission in 1D photonic crystals by using photonic quantum well structures. If a defect is introduced in the photonic quantum well structures, defect modes with a very high quality factor may appear. The transmission of the defect mode is due to the coupling between the eigenmodes of the defect and those at the band edges of the constituent photonic crystals.

Zi, J.; Wan, J.; Zhang, C.

1998-01-01

14

HERMES Precision Results on g1p, g1d and g1n and the First Measurement of the Tensor Structure Function b1d  

CERN Document Server

Final HERMES results on the proton, deuteron and neutron structure function g1 are presented in the kinematic range 0.0021 […] structure function b1d are presented.

Riedl, C; Akopov, Z; Amarian, M; Ammosov, V V; Andrus, A; Aschenauer, E C; Augustyniak, W; Avakian, R; Avetisian, A; Avetissian, E; Bailey, P; Baturin, V; Baumgarten, C; Beckmann, M; Belostotskii, S; Bernreuther, S; Bianchi, N; Blok, H P; Böttcher, Helmut B; Borisov, A; Bouwhuis, M; Brack, J; Brüll, A; Bryzgalov, V V; Capitani, G P; Chiang, H C; Ciullo, G; Contalbrigo, M; Dalpiaz, P F; De Leo, R; De Nardo, L; De Sanctis, E; Devitsin, E G; Di Nezza, P; Düren, M; Ehrenfried, M; Elalaoui-Moulay, A; Elbakian, G M; Ellinghaus, F; Elschenbroich, U; Ely, J; Fabbri, R; Fantoni, A; Feshchenko, A; Felawka, L; Fox, B; Franz, J; Frullani, S; Gärber, Y; Gapienko, G; Gapienko, V; Garibaldi, F; Garrow, K; Garutti, E; Gaskell, D; Gavrilov, G E; Karibian, V; Graw, G; Grebenyuk, O; Greeniaus, L G; Hafidi, K; Hartig, M; Hasch, D; Heesbeen, D; Henoch, M; Hertenberger, R; Hesselink, W H A; Hillenbrand, A; Hoek, M; Holler, Y; Hommez, B; Iarygin, G; Ivanilov, A; Izotov, A; Jackson, H E; Jgoun, A; Kaiser, R; Kinney, E; Kiselev, A; Königsmann, K C; Kopytin, M; Korotkov, V A; Kozlov, V; Krauss, B; Krivokhizhin, V G; Lagamba, L; Lapikas, L; Laziev, A; Lenisa, P; Liebing, P; Lindemann, T; Lipka, K; Lorenzon, W; Lü, J; Maiheu, B; Makins, N C R; Marianski, B; Marukyan, H O; Masoli, F; Mexner, V; Meyners, N; Miklukho, O; Miller, C A; Miyachi, Y; Muccifora, V; Nagaitsev, A; Nappi, E; Naryshkin, Yu; Nass, A; Negodaev, M A; Nowak, Wolf-Dieter; Oganessyan, K; Ohsuga, H; Orlandi, G; Pickert, N; Potashov, S Yu; Potterveld, D H; Raithel, M; Reggiani, D; Reimer, P E; Reischl, A; Reolon, A R; Rith, K; Airapetian, A; Rosner, G; Rostomyan, A; Rubacek, L; Ryckbosch, D; Salomatin, Yu I; Sanjiev, I; Savin, I; Scarlett, C; Schäfer, A; Schill, C; Schnell, G; Schüler, K P; Schwind, A; Seele, J; Seidl, R; Seitz, B; Shanidze, R G; Shearer, C; Shibata, T A; Shutov, V B; Simani, M C; Sinram, K; Stancari, M D; Statera, M; Steffens, E; Steijger, J J M; Stewart, J; Stösslein, U; Tait, P; Tanaka, H; Taroian, S P; 
Tchuiko, B; Terkulov, A R; Tkabladze, A V; Trzcinski, A; Tytgat, M; Vandenbroucke, A; Van der Nat, P B; van der Steenhoven, G; Vetterli, Martin C; Vikhrov, V; Vincter, M G; Visser, J; Vogel, C; Vogt, M; Volmer, J; Weiskopf, C; Wendland, J; Wilbert, J; Ybeles-Smit, G V; Yen, S; Zihlmann, B; Zohrabyan, H G; Zupranski, P; Riedl, Caroline

2005-01-01

15

Ground-state spin structure of strongly interacting disordered 1D Hubbard model  

International Nuclear Information System (INIS)

We study the influence of on-site disorder on the magnetic properties of the ground state of the infinite-U one-dimensional (1D) Hubbard model. We find that the ground state is not ferromagnetic. This is analysed in terms of the algebraic structure of the spin dependence of the Hamiltonian. A simple explanation is derived for the 1/N periodicity in the persistent current for this model. (author)

16

Resistivity structure of Sumatran Fault (Aceh segment) derived from 1-D magnetotelluric modeling  

Science.gov (United States)

The Sumatran Fault Zone is the most active fault in Indonesia, a result of the strike-slip component of the oblique Indo-Australian convergence. With a length of 1900 km, the Sumatran Fault is divided into 20 segments, starting from the southernmost part of Sumatra Island, where the slip rate is small, and increasing toward the northern end of the island. Several geophysical methods can be used to analyze fault structure, depending on the physical parameter involved, such as seismology, geodesy and electromagnetics. The magnetotelluric method has been widely used in mapping and sounding resistivity distribution because it not only has the ability to detect resistivity contrasts but also has a penetration range of up to hundreds of kilometers. A magnetotelluric survey was carried out in the Aceh region at a total of 12 sites crossing the Sumatran Fault on the Aceh and Seulimeum segments. Two components of the electric and magnetic fields were recorded for 10 hours on average, in the frequency range from 320 Hz to 0.01 Hz. Analysis of the pseudosections of phase and apparent resistivity shows a vertical low-phase zone flanked on the west and east by high phase, indicating a resistivity contrast in this region. After rotating the data to the N45°E direction, the results were interpreted using three different methods of 1D MT modeling, i.e. Bostick inversion, 1D MT inversion of TM data, and 1D MT inversion of the impedance determinant. By comparison, we conclude that the use of TM data only and of the impedance determinant in 1D inversion yields a more reliable resistivity structure of the fault than the other methods. Based on this result, the Sumatran Fault is clearly characterized by a vertical resistivity contrast indicating the existence of the Aceh and Seulimeum faults, which is in good agreement with the geological data.

Nurhasan, Sutarno, D.; Bachtiar, H.; Sugiyanto, D.; Ogawa, Y.; Kimata, F.; Fitriani, D.

2012-06-01
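
The 1D MT modeling mentioned in the abstract above rests on the response of a horizontally layered earth. The following is a minimal forward-modeling sketch using the standard impedance recursion from the bottom half-space upward; it is not the authors' inversion code, and the function name, the layer values in the usage lines and the e^{+iωt} time convention are assumptions.

```python
import cmath
import math

MU0 = 4e-7 * math.pi  # magnetic permeability of free space, H/m


def mt_1d_response(resistivities, thicknesses, frequency):
    """Apparent resistivity (ohm m) and phase (degrees) of a layered earth.

    resistivities: layer resistivities from top to bottom; the last entry
                   is the terminating half-space.
    thicknesses:   layer thicknesses in metres (one fewer than resistivities).
    frequency:     sounding frequency in Hz.
    """
    if len(resistivities) != len(thicknesses) + 1:
        raise ValueError("need exactly one more resistivity than thickness")
    omega = 2 * math.pi * frequency
    # Impedance at the top of the bottom half-space.
    z = cmath.sqrt(1j * omega * MU0 * resistivities[-1])
    # Propagate the impedance upward through each finite layer.
    for rho, h in zip(reversed(resistivities[:-1]), reversed(thicknesses)):
        z0 = cmath.sqrt(1j * omega * MU0 * rho)  # intrinsic impedance
        k = cmath.sqrt(1j * omega * MU0 / rho)   # propagation constant
        t = cmath.tanh(k * h)
        z = z0 * (z + z0 * t) / (z0 + z * t)
    rho_a = abs(z) ** 2 / (omega * MU0)
    phase = math.degrees(cmath.phase(z))
    return rho_a, phase


# Hypothetical two-layer model sampled at the band edges quoted in the abstract.
high = mt_1d_response([100.0, 10.0], [1000.0], 320.0)
low = mt_1d_response([100.0, 10.0], [1000.0], 0.01)
```

At the high-frequency end the sounding mostly sees the top layer, while at the low-frequency end the apparent resistivity moves toward that of the deeper half-space. An inversion such as those named in the abstract would adjust the layer parameters until curves like these fit the measured data.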

17

Computational Study and Analysis of Structural Imperfections in 1D and 2D Photonic Crystals  

Energy Technology Data Exchange (ETDEWEB)

Dielectric reflectors that are periodic in one or two dimensions, also known as 1D and 2D photonic crystals, have been widely studied for many potential applications due to the presence of wavelength-tunable photonic bandgaps. However, the unique optical behavior of photonic crystals is based on theoretical models of perfect analogues. Little is known about the practical effects of dielectric imperfections on their technologically useful optical properties. In order to address this issue, a finite-difference time-domain (FDTD) code is employed to study the effect of three specific dielectric imperfections in 1D and 2D photonic crystals. The first imperfection investigated is dielectric interfacial roughness in quarter-wave tuned 1D photonic crystals at normal incidence. This study reveals that the reflectivity of some roughened photonic crystal configurations can change up to 50% at the center of the bandgap for RMS roughness values around 20% of the characteristic periodicity of the crystal. However, this reflectivity change can be mitigated by increasing the index contrast and/or the number of bilayers in the crystal. In order to explain these results, the homogenization approximation, which is usually applied to single rough surfaces, is applied to the quarter-wave stacks. The results of the homogenization approximation match the FDTD results extremely well, suggesting that the main role of the roughness features is to grade the refractive index profile of the interfaces in the photonic crystal rather than diffusely scatter the incoming light. This result also implies that the amount of incoherent reflection from the roughened quarterwave stacks is extremely small. This is confirmed through direct extraction of the amount of incoherent power from the FDTD calculations. Further FDTD studies are done on the entire normal incidence bandgap of roughened 1D photonic crystals. 
These results reveal a narrowing and red-shifting of the normal incidence bandgap with increasing RMS roughness. Again, the homogenization approximation is able to predict these results. The problem of surface scratches on 1D photonic crystals is also addressed. Although the reflectivity decreases are lower in this study, up to a 15% change in reflectivity is observed in certain scratched photonic crystal structures. However, this reflectivity change can be significantly decreased by adding a low index protective coating to the surface of the photonic crystal. Again, application of homogenization theory to these structures confirms its predictive power for this type of imperfection as well. Additionally, the problem of circular pores in 2D photonic crystals is investigated, showing that almost a 50% change in reflectivity can occur for some structures. Furthermore, this study reveals trends that are consistent with the 1D simulations: parameter changes that increase the absolute reflectivity of the photonic crystal will also increase its tolerance to structural imperfections. Finally, experimental reflectance spectra from roughened 1D photonic crystals are compared to the results predicted computationally in this thesis. The computed and experimental spectra correlate favorably, validating the findings presented herein.

K.R. Maskaly

2005-06-01
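
The thesis abstract above uses an FDTD code. For an ideal, perfectly smooth quarter-wave stack at normal incidence, the reflectance can also be obtained with the much simpler characteristic (transfer) matrix method, sketched below as a stand-in. The function names and the refractive indices in the usage lines are hypothetical and not taken from the thesis.

```python
import cmath
import math


def layer_matrix(n, d, wavelength):
    """Characteristic matrix of one lossless layer at normal incidence."""
    delta = 2 * math.pi * n * d / wavelength  # phase thickness
    c, s = cmath.cos(delta), cmath.sin(delta)
    return ((c, 1j * s / n), (1j * n * s, c))


def matmul2(a, b):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[r][k] * b[k][col] for k in range(2)) for col in range(2))
        for r in range(2)
    )


def reflectance(layers, wavelength, n_in=1.0, n_sub=1.5):
    """Reflectance of a stack; layers are (index, thickness), incident side first."""
    m = ((1, 0), (0, 1))
    for n, d in layers:
        m = matmul2(m, layer_matrix(n, d, wavelength))
    b = m[0][0] + m[0][1] * n_sub
    c = m[1][0] + m[1][1] * n_sub
    r = (n_in * b - c) / (n_in * b + c)
    return abs(r) ** 2


def quarter_wave_stack(n_high, n_low, pairs, wavelength0):
    """High/low bilayers, each layer a quarter wave thick at wavelength0."""
    bilayer = [(n_high, wavelength0 / (4 * n_high)),
               (n_low, wavelength0 / (4 * n_low))]
    return bilayer * pairs


# Hypothetical indices; reflectance at the design wavelength for 2 and 8 bilayers.
r2 = reflectance(quarter_wave_stack(2.3, 1.38, 2, 1.0), 1.0)
r8 = reflectance(quarter_wave_stack(2.3, 1.38, 8, 1.0), 1.0)
```

Comparing `r2` and `r8`, or raising the index contrast, shows the band-center reflectance climbing toward 1. This is the ideal-structure trend that the abstract links to greater tolerance of imperfections; modeling the roughness itself would need a graded-index profile or the FDTD approach the thesis describes.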

18

Study of phase transformation and crystal structure for 1D carbon-modified titania ribbons  

Energy Technology Data Exchange (ETDEWEB)

One-dimensional hydrogen titanate ribbons were successfully prepared with hydrothermal reaction in a highly basic solution. A series of one-dimensional carbon-modified TiO2 ribbons were prepared via calcination of the mixture of hydrogen titanate ribbons and sucrose solution under N2 flow at different temperatures. The phase transformation process of hydrogen titanate ribbons was investigated by in-situ X-ray diffraction at various temperatures. Besides, one-dimensional carbon-modified TiO2 ribbons calcined at different temperatures were characterized by X-ray diffraction, scanning electron microscopy, transmission electron microscopy, nitrogen adsorption isotherms, diffuse reflectance ultraviolet–visible spectroscopy, and so on. Carbon-modified TiO2 ribbons showed one-dimensional ribbon crystal structure and various crystal phases of TiO2. After being modified with carbon, a layer of uniform carbon film was coated on the surface of TiO2 ribbons, which improved their adsorption capacity for methyl orange as a model organic pollutant. One-dimensional carbon-modified TiO2 ribbons also exhibited enhanced visible-light absorbance with the increase of calcination temperatures. - Highlights: • The synthesis of 1D carbon-modified TiO2 ribbons. • The phase transformation of 1D carbon-modified TiO2 ribbons. • 1D carbon-modified TiO2 exhibits enhanced visible-light absorbance.

Zhou, Lihui; Zhang, Fang; Li, Jinxia

2014-02-15

19

Study of phase transformation and crystal structure for 1D carbon-modified titania ribbons  

International Nuclear Information System (INIS)

One-dimensional hydrogen titanate ribbons were successfully prepared with hydrothermal reaction in a highly basic solution. A series of one-dimensional carbon-modified TiO2 ribbons were prepared via calcination of the mixture of hydrogen titanate ribbons and sucrose solution under N2 flow at different temperatures. The phase transformation process of hydrogen titanate ribbons was investigated by in-situ X-ray diffraction at various temperatures. Besides, one-dimensional carbon-modified TiO2 ribbons calcined at different temperatures were characterized by X-ray diffraction, scanning electron microscopy, transmission electron microscopy, nitrogen adsorption isotherms, diffuse reflectance ultraviolet–visible spectroscopy, and so on. Carbon-modified TiO2 ribbons showed one-dimensional ribbon crystal structure and various crystal phases of TiO2. After being modified with carbon, a layer of uniform carbon film was coated on the surface of TiO2 ribbons, which improved their adsorption capacity for methyl orange as a model organic pollutant. One-dimensional carbon-modified TiO2 ribbons also exhibited enhanced visible-light absorbance with the increase of calcination temperatures. - Highlights: • The synthesis of 1D carbon-modified TiO2 ribbons. • The phase transformation of 1D carbon-modified TiO2 ribbons. • 1D carbon-modified TiO2 exhibits enhanced visible-light absorbance

20

The modular structure of informational sequences.  

Science.gov (United States)

It is shown that DNA sequences can be decomposed into smaller units much the same as texts can be decomposed into syllables, words, or groups of words. Those smaller units (modules) are extracted from DNA sequences according to statistical criteria. Tests with sequences of known modular structure (two novels and a FORTRAN source code) were performed. The degree to which DNA sequences can be decomposed into modules (modularity) turns out to be a very sensitive measure to distinguish DNA sequences from random sequences. PMID:8924645

Schmitt, A O; Ebeling, W; Herzel, H

1996-01-01

21

Human chromosomal centromere (AATGG)n sequence forms stable structures with unusual base pairs.  

Science.gov (United States)

Nine DNA sequences related to the purine strand of the human centromeric satellite (AATGG)n (CCATT)n repeat have been studied by two-dimensional nuclear magnetic resonance spectroscopy. Earlier studies have suggested that the structure of (AATGG)n sequence has an equilibrium between the duplex form and a fold-back form. Structural refinement of d(CAATGG) and its related sequences by an NOE-constrained simulated annealing procedure reveals that the duplex form incorporates dynamic type-I G-A base pairs. 1D exchangeable proton NMR data support this model. The reverse sequence motif (GGTAA) destabilizes the structure. PMID:8013671

Jaishree, T N; Wang, A H

1994-06-20

22

Overcoming sequence misalignments with weighted structural superposition.  

Science.gov (United States)

An appropriate structural superposition identifies similarities and differences between homologous proteins that are not evident from sequence alignments alone. We have coupled our Gaussian-weighted RMSD (wRMSD) tool with a sequence aligner and seed extension (SE) algorithm to create a robust technique for overlaying structures and aligning sequences of homologous proteins (HwRMSD). HwRMSD overcomes errors in the initial sequence alignment that would normally propagate into a standard RMSD overlay. SE can generate a corrected sequence alignment from the improved structural superposition obtained by wRMSD. HwRMSD's robust performance and its superiority over standard RMSD are demonstrated over a range of homologous proteins. Its better overlay results in corrected sequence alignments with good agreement to HOMSTRAD. Finally, HwRMSD is compared to established structural alignment methods: FATCAT, secondary-structure matching, combinatorial extension, and DaliLite. Most methods are comparable at placing residue pairs within 2 Å, but HwRMSD places many more residue pairs within 1 Å, providing a clear advantage. Such high accuracy is essential in drug design, where small distances can have a large impact on computational predictions. This level of accuracy is also needed to correct sequence alignments in an automated fashion, especially for omics-scale analysis. HwRMSD can align homologs with low sequence identity and large conformational differences, cases where both sequence-based and structure-based methods may fail. The HwRMSD pipeline overcomes the dependency of structural overlays on initial sequence pairing and removes the need to determine the best sequence-alignment method, substitution matrix, and gap parameters for each unique pair of homologs. PMID:22733542

Khazanov, Nickolay A; Damm-Ganamet, Kelly L; Quang, Daniel X; Carlson, Heather A

2012-11-01
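
The idea behind a Gaussian-weighted RMSD, as named in the abstract above, is that atom pairs that are already close dominate the score while badly matched pairs are down-weighted. The sketch below only evaluates such a score for two coordinate sets that are already superimposed; the weight form exp(-d²/c), the constant `c` and the function names are assumptions for illustration, and the iterative superposition step of the published tool is not reproduced.

```python
import math


def squared_distances(coords_a, coords_b):
    """Squared distances between paired 3D points."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    return [sum((p - q) ** 2 for p, q in zip(a, b))
            for a, b in zip(coords_a, coords_b)]


def rmsd(coords_a, coords_b):
    """Standard RMSD: every pair counts equally."""
    sq = squared_distances(coords_a, coords_b)
    return math.sqrt(sum(sq) / len(sq))


def gaussian_weighted_rmsd(coords_a, coords_b, c=5.0):
    """RMSD with per-pair weights exp(-d^2 / c); c (in A^2) is an assumed scale."""
    sq = squared_distances(coords_a, coords_b)
    weights = [math.exp(-d2 / c) for d2 in sq]
    total = sum(weights)
    if total == 0.0:
        # All weights underflowed: every pair is extremely far apart.
        return rmsd(coords_a, coords_b)
    return math.sqrt(sum(w * d2 for w, d2 in zip(weights, sq)) / total)


# Hypothetical coordinates: two pairs match exactly, one pair is 10 A apart.
a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
b = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 10.0)]
plain = rmsd(a, b)
weighted = gaussian_weighted_rmsd(a, b)
```

In this toy case the single outlier dominates the plain RMSD but contributes almost nothing to the weighted score. That behavior is what allows a weighted overlay to lock onto the well-matched core even when some residue pairs come from a faulty sequence alignment.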

23

Sequence and structural analysis of antibodies  

OpenAIRE

The work presented in this thesis focusses on the sequence and structural analysis of antibodies and has fallen into three main areas. First I developed a method to assess how typical an antibody sequence is of the expressed human antibody repertoire. My hypothesis was that the more "humanlike" an antibody sequence is (in other words how typical it is of the expressed human repertoire), the less likely it is to elicit an immune response when used in vivo in humans. In practi...

Raghavan, A. K.

2009-01-01

24

Permittivity and Permeability for Floquet-Bloch Space Harmonics in Infinite 1D Magneto-Dielectric Periodic Structures  

DEFF Research Database (Denmark)

For an infinite 1D periodic structure with unit cells consisting of two planar slabs of magneto-dielectric materials, the electric field – as well as magnetic field, electric flux density, magnetic flux density, polarization, and magnetization – can be expressed as infinite series of Floquet-Bloch space harmonics. We discuss how space harmonic permittivity and permeability can be expressed in seemingly different though equivalent forms, and we investigate these parameters of the zeroth order space harmonic for a particular 1D periodic structure that is based on a previously reported 3D periodic structure with unit cells containing a magneto-dielectric sphere.

Breinbjerg, Olav; Yaghjian, Arthur D.

2014-01-01

25

The Role of the Impedivity in the Magnetotelluric Response of 1D and 2D Structures  

Science.gov (United States)

The influence of the resistivity dispersion on the magnetotelluric (MT) response is analyzed. MT uses the natural electromagnetic (EM) field to determine the electrical resistivity of the subsoil and retrieve the geometry of lithospheric structures, revealing the presence of bodies such as metallic deposits, hydrocarbon reservoirs and geothermal fluids. The frequency range of the EM field used varies from 10⁻⁴ to 10⁴ Hz. If the soil is polarizable, the dispersion of the resistivity, whose characteristic frequency interval is between 10⁻² and 10² Hz, may affect MT responses. Resistivity dispersion is a known phenomenology, which constitutes the basis of the Induced Polarization (IP) prospecting method. In the frequency domain (FD), the dispersion consists in a variation of the resistivity parameter as the frequency of the exciting current is changed. The dispersive resistivity, called impedivity, is a complex function of the frequency. At vanishing frequency, however, the impedivity is real and coincides with the classical resistivity parameter used in DC geoelectrical methods. A real asymptote is also approached as the frequency tends to infinity. The complex physical and chemical fluid-metal-rock interactions may produce induced polarization effects, which are related to the dispersion in rocks. This is manifested in the MT response as a distortion of the experimental curves. Disregarding the distortion effect may lead to a misleading interpretation of the surveyed structures. We show the results from simulation of the MT responses, when dispersion is assumed to characterize the electrical properties of a region of the explored half-space. Initially, a 1D-layered earth is considered, with the intermediate layer assumed to be dispersive. The influence of the dispersion amplitude on the shape of the MT responses is evaluated. 
The dispersion alters the shape of the curves in a way that, without any external constraints, may make the interpretation of the curves quite ambiguous. Next, a 2D case is considered, consisting of a magma chamber at a depth of 1 km, buried in soil. The synthetic responses were computed for both the non-dispersive and the dispersive case, and the differences between the modelled MT curves are compared. As in the 1D case, the dispersion alters the resistivity values, particularly at the boundary of the buried body, leading to an ambiguous interpretation. MT data alone are not sufficient to distinguish polarization effects, and may suggest dispersion where none is present. An approach to solving this problem consists of the combined interpretation of DC geoelectrical and MT data collected at the same site. A review of real cases is also presented.

Esposito, Roberta; Giulia Di Giuseppe, Maria; Troiano, Antonio; Patella, Domenico; Mariano Castelo Branco, Raimundo

2014-05-01
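
The impedivity abstract above describes a complex, frequency-dependent resistivity that is real at zero frequency and approaches a second real asymptote at high frequency. A minimal sketch of a function with that behaviour follows. It assumes a Cole-Cole dispersion model, which the abstract does not name; the function name and the parameters `rho_dc`, `chargeability`, `tau` and `c` are hypothetical and chosen only for illustration.

```python
import math


def cole_cole_impedivity(freq_hz, rho_dc, chargeability, tau, c):
    """Complex resistivity (impedivity) under an ASSUMED Cole-Cole model.

    rho(w) = rho_dc * (1 - m * (1 - 1 / (1 + (i*w*tau)**c)))

    freq_hz        frequency in Hz
    rho_dc         real resistivity at vanishing frequency (ohm m)
    chargeability  dimensionless m, 0 <= m < 1 (dispersion amplitude)
    tau            time constant in seconds
    c              frequency exponent, 0 < c <= 1
    """
    omega = 2.0 * math.pi * freq_hz
    z = (1j * omega * tau) ** c
    return rho_dc * (1.0 - chargeability * (1.0 - 1.0 / (1.0 + z)))


if __name__ == "__main__":
    # Illustrative values only, not taken from the abstract.
    for f in (0.0, 1e-2, 1.0, 1e2, 1e12):
        rho = cole_cole_impedivity(f, rho_dc=100.0, chargeability=0.5, tau=1.0, c=0.5)
        print(f"{f:10.3g} Hz  |rho| = {abs(rho):8.3f}  phase = {math.degrees(math.atan2(rho.imag, rho.real)):7.3f} deg")
```

Under this assumed model the zero-frequency value equals `rho_dc` and the high-frequency asymptote equals `rho_dc * (1 - chargeability)`, so the chargeability plays the role of the dispersion amplitude mentioned in the abstract.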

26

1-D structured flexible supercapacitor electrodes with prominent electronic/ionic transport capabilities.  

Science.gov (United States)

A highly efficient 1-D flexible supercapacitor with a stainless steel mesh (SSM) substrate is demonstrated. Indium tin oxide (ITO) nanowires are prepared on the surface of the stainless steel fiber (SSF), and MnO2 shell layers are coated onto the ITO/SSM electrode by means of electrodeposition. The ITO NWs, which grow radially on the SSF, are single-crystalline and conductive enough for use as a current collector for MnO2-based supercapacitors. A flake-shaped, nanoporous, and uniform MnO2 shell layer with a thickness of ~130 nm and an average crystallite size of ~2 nm is obtained by electrodeposition at a constant voltage. The effect of the electrode geometry on the supercapacitor properties was investigated using electrochemical impedance spectroscopy, cyclic voltammetry, and a galvanostatic charge/discharge study. The electrodes with ITO NWs exhibit higher specific capacitance levels and good rate capability owing to the superior electronic/ionic transport capabilities resulting from the open pore structure. Moreover, the use of a porous mesh substrate (SSM) increases the specific capacitance to 667 F g(-1) at 5 mV s(-1). In addition, the electrode with ITO NWs and the SSM shows very stable cycle performance (no decrease in the specific capacitance after 5000 cycles). PMID:24397749

Kim, Ju Seong; Shin, Seong Sik; Han, Hyun Soo; Oh, Lee Seul; Kim, Dong Hoe; Kim, Jae-Hun; Hong, Kug Sun; Kim, Jin Young

2014-01-01

27

Exome Sequencing in Fetuses with Structural Malformations  

Directory of Open Access Journals (Sweden)

Prenatal diagnostic testing is a rapidly advancing field. An accurate diagnosis of structural anomalies and additional abnormalities in fetuses with structural anomalies is important to allow “triage” and designation of prognosis. This will allow parents to make an informed decision relating to the pregnancy. This review outlines the current tests used in prenatal diagnosis, focusing particularly on “new technologies” such as exome sequencing. We demonstrate the utility of exome sequencing above that of conventional karyotyping and Chromosomal Microarray (CMA) alone by outlining a recent proof of concept study investigating 30 parent-fetus trios where the fetus is known to have a structural anomaly. This may allow the identification of pathological gene anomalies and consequently improved prognostic profiling, as well as excluding anomalies and distinguishing between de novo and inherited mutations, in order to estimate the recurrence risk in future pregnancies. The potential ethical dilemmas surrounding exome sequencing are also considered, and the future of prenatal genetic diagnosis is discussed.

Fiona L. Mackie

2014-07-01

28

Hyperfine structure evolution in an electric field and determination of tensor polarizabilities in He (4 and 5 1D)  

OpenAIRE

Variations of hyperfine structure in 3He (n 1D) are calculated as a function of an applied electric field. Using a quantum beat method on a fast helium beam excited and aligned by a thin carbon foil, we observe the time evolution of the alignment for a set of static electric fields. Tensor polarizabilities of levels are deduced.

Denis, A.; Ouerdane, Y.; Docao, G.; Désesquelles, J.

1987-01-01

29

Scaling and Hierarchical Structures in DNA Sequences  

Science.gov (United States)

A method of analyzing DNA correlation structure is introduced. Density fluctuations of nucleotides are shown to display an extended self-similarity scaling when the scale varies between 100 and 8000 base pairs. The scaling is accurately described by a hierarchical structure model of She and Leveque [Phys. Rev. Lett. 72, 336 (1994)]. The derived model parameter β is able to quantify moderately large-scale correlations which exist in a true DNA sequence but are absent in its randomly shuffled sequence and in a simulated model sequence by an evolution model of Hsieh et al. [Phys. Rev. Lett. 90, 018101 (2003)]. Finally, it is shown that β varies with the evolution category and measures the organizational complexity of the genome.

Ouyang, Zhengqing; Wang, Chao; She, Zhen-Su

2004-08-01

30

Modeling alternate RNA structures in genomic sequences.  

Science.gov (United States)

We introduce the concept of RNA multistructures, which is a formal grammar-based framework specifically designed to model a set of alternate RNA secondary structures. Such alternate structures can either be a set of suboptimal foldings, or distinct stable folding states, or variants within an RNA family. We provide several such examples and propose an efficient algorithm to search for RNA multistructures within a genomic sequence. PMID:25768235

Saffarian, Azadeh; Giraud, Mathieu; Touzet, Hélène

2015-03-01

31

Inhibition of Human Steroid 5β-Reductase (AKR1D1) by Finasteride and Structure of the Enzyme-Inhibitor Complex  

Energy Technology Data Exchange (ETDEWEB)

The Δ⁴-3-ketosteroid functionality is present in nearly all steroid hormones apart from estrogens. The first step in functionalization of the A-ring is mediated in humans by steroid 5α- or 5β-reductase. Finasteride is a mechanism-based inactivator of 5α-reductase type 2 with subnanomolar affinity and is widely used as a therapeutic for the treatment of benign prostatic hyperplasia. It is also used for androgen deprivation in hormone-dependent prostate carcinoma, and it has been examined as a chemopreventive agent in prostate cancer. The effect of finasteride on steroid 5β-reductase (AKR1D1) has not been previously reported. We show that finasteride competitively inhibits AKR1D1 with low micromolar affinity but does not act as a mechanism-based inactivator. The structure of the AKR1D1·NADP⁺·finasteride complex determined at 1.7 Å resolution shows that it is not possible for NADPH to reduce the Δ¹⁻²-ene of finasteride because the cofactor and steroid are not proximal to each other. The C3-ketone of finasteride accepts hydrogen bonds from the catalytic residues Tyr-58 and Glu-120 in the active site of AKR1D1, providing an explanation for the competitive inhibition observed. This is the first reported structure of finasteride bound to an enzyme involved in steroid hormone metabolism.

Drury, J.; Di Costanzo, L; Penning, T; Christianson, D

2009-01-01

32

A revised 1-D electrical conductivity reference structure beneath North Pacific obtained by semi-global induction study  

International Nuclear Information System (INIS)

The one-dimensional (1-D) electrical conductivity structure in the mid-mantle beneath the northern Pacific is revised in order to discuss the mean state of the mantle and to obtain a credible starting model for 3-D inversions. Semi-global geomagnetic depth sounding (GDS) responses obtained at 13 stations and submarine cable magnetotelluric (MT) responses for 8 cables in the period range 1.7 to 113 days were used to obtain the revised structure. We employed an iterative scheme combining surface layer correction to remove the effect of ocean-land heterogeneity in the responses and 1-D inversion to obtain the revised structure. The validity of the scheme is examined by making synthetic tests: we confirmed that the structure obtained using this scheme not only represents the model which best explains the corrected response but also reflects the actual mean conductivity structure in the mid-mantle depths. The obtained 1-D conductivity in the transition zone by supposing jumps at 400 and 650 km depths (2-jump model) is higher than that of dry Wadsleyite and Ringwoodite measured experimentally by Yoshino et al. (2008). If the high conductivity is entirely due to the effect of water in the transition zone, the region contains 0.5 wt% of water. However, if an additional discontinuity of electrical conductivity is allowed at 500 km depth in the 1-D inversion, the obtained model has lower conductivity than the 2-jump model in the upper 100 km of the transition zone. In this case, the conductivity in the layer is rather close to that of dry Wadsleyite.

33

Global structure of integer partitions sequences  

OpenAIRE

Integer partitions are deeply related to many phenomena in statistical physics. A question naturally arises which is of interest to physics both on "purely" theoretical and on practical, computational grounds. Is it possible to apprehend the global pattern underlying integer partition sequences and to express the global pattern compactly, in the form of a "matrix" giving all of the partitions of N into exactly M parts? This paper demonstrates that the global structure of int...

Chase, N. M.

2004-01-01

34

Mechanical consequences of LOCA in PWR: Full scale coupled 1D/3D simulations with fluid–structure interaction  

International Nuclear Information System (INIS)

Highlights: • We propose an approach to analyze the transient effects of LOCA on PWR internals. • The complete primary loop is modeled using a coupled 1D-pipe/3D strategy. • Full fluid–structure interaction is considered inside the main vessel. • Impedance relations modeling the influence of small details are precisely calibrated. • The capabilities of the 1D/3D methodology are demonstrated on a significant example. - Abstract: The present paper is dedicated to the analysis of the fast transient mechanical consequences of the Loss Of Coolant Accident (LOCA) on the internal structures of a Pressurized Water Reactor (PWR). A complete methodology is described, based on a coupled 1D/3D representation of the entire primary loop of the reactor, with a robust and accurate approach for fluid–structure interaction inside the main vessel. Special attention is given to the modeling of small geometric details, such as perforated plates in the vicinity of the reactor core, through local impedance relations acting on the flow, which must be carefully calibrated for industrial purposes. The capabilities of the proposed framework are demonstrated with the application of the complete computational scheme to the simulation of the consequences of LOCA for a French 900 MW PWR, performed with the EUROPLEXUS software.

35

Synthesis, crystal structure, and properties of a 1-D terbium-substituted monolacunary Keggin-type polyoxotungstate.  

Science.gov (United States)

A new 1-D linear chainlike terbium-substituted polyoxometalate [Tb(H2O)2(α-PW11O39)](4-) (1) has been synthesized in aqueous solution and characterized by elemental analysis, inductively coupled plasma atomic emission spectrometry (ICP-AES), X-ray powder diffraction (XRPD), IR spectrum, thermal analysis, electrospray ionization mass spectrometry (ESI-MS), and X-ray single-crystal diffraction. X-ray structural analysis reveals that 1 displays a 1-D linear chain containing [Tb(H2O)2(α-PW11O39)](4-) moieties. The Tb(III) cation incorporated into the monolacunary Keggin-type [α-PW11O39](7-) unit resides in a distorted monocapped triangular prismatic geometry and acts as a linker to join two adjacent [α-PW11O39](7-) units to form a 1-D chain structure. Solid-state photoluminescent property of 1 has been investigated at room temperature and the photoluminescent emission mainly results from the synergistic effect of the Tb(III) cation and the Na7[α-PW11O39] precursor. The ESI-MS spectrum of 1 confirms that the polyanion [Tb(H2O)(HPW11O39)](3-) is stable in aqueous solution. PMID:25541394

Ma, Pengtao; Si, Yanan; Wan, Rong; Zhang, Shaowei; Wang, Jingping; Niu, Jingyang

2015-03-01

36

Far-infrared studies of 2D and 1D electrons in ultra-high mobility gated semiconductor structures  

International Nuclear Information System (INIS)

Far-infrared (FIR) photoconductivity experiments are reported for extremely high-mobility gated GaAs-AlGaAs heterostructures that are free of random disorder introduced by modulation doping, and in which the electron density and confining potential are separately adjustable by lithographically-defined surface gates. In 2D structures, unprecedented mean free paths in excess of 100 μm have been observed by ballistic transport measurements, and conductance quantisation has been observed in 5 μm long 1D quantum wires. The density of these samples is tunable over nearly two orders of magnitude, and this allows detailed studies of cyclotron resonance (CR) with differing Landau level filling factors, ν. Since the samples are undoped, the carriers being introduced by a top-gate, important comparisons can be drawn with similar studies in modulation-doped structures, in particular CR measurements in the extreme quantum limit (ν ≪ 1) where a splitting of the CR line has been used to probe correlated electron physics. The extension of this work to FIR studies of quantum wires at milli-Kelvin temperatures is expected to provide a spectroscopic probe of 2D-1D coupling and correlation effects in 1D (Luttinger liquid), where the absence of random disorder becomes increasingly important.

37

Global structure of integer partitions sequences  

CERN Document Server

Integer partitions are deeply related to many phenomena in statistical physics. A question naturally arises which is of interest to physics both on "purely" theoretical and on practical, computational grounds. Is it possible to apprehend the global pattern underlying integer partition sequences and to express the global pattern compactly, in the form of a "matrix" giving all of the partitions of N into exactly M parts? This paper demonstrates that the global structure of integer partitions sequences (IPS) is that of a complex tree. By analyzing the structure of this tree, we derive a closed form expression for a map from (N, M) to the set of all partitions of a positive integer N into exactly M positive integer summands without regard to order. The derivation is based on the use of modular arithmetic to solve an isomorphic combinatoric problem, that of describing the global organization of the sequence of all ordered placements of N indistinguishable balls into M distinguishable non-empty bins or boxes. This ...

Chase, N M

2004-01-01
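
The integer-partition abstract above concerns the map from (N, M) to all partitions of N into exactly M positive parts, and relates it to ordered placements of N indistinguishable balls into M distinguishable non-empty bins. The sketch below only makes those two objects concrete with a plain recursive enumeration; it is not the closed-form, modular-arithmetic construction the paper derives, and the function names `partitions_exact` and `count_placements` are hypothetical.

```python
from math import comb


def partitions_exact(n, m, max_part=None):
    """Yield every partition of n into exactly m positive parts.

    Each partition is a tuple in non-increasing order, so order is
    disregarded, as in the abstract. Simple recursion, for illustration only.
    """
    if max_part is None:
        max_part = n
    if m == 0:
        if n == 0:
            yield ()
        return
    # The largest part is at least ceil(n / m) (and at least 1), and must
    # leave at least 1 for each of the remaining m - 1 parts.
    lowest = max(1, -(-n // m))
    highest = min(max_part, n - (m - 1))
    for first in range(highest, lowest - 1, -1):
        for rest in partitions_exact(n - first, m - 1, first):
            yield (first,) + rest


def count_placements(n, m):
    """Ordered placements of n indistinguishable balls into m distinguishable
    non-empty bins (compositions of n into m parts): C(n - 1, m - 1)."""
    if m < 1 or n < m:
        return 0
    return comb(n - 1, m - 1)


if __name__ == "__main__":
    for p in partitions_exact(6, 3):
        print(p)
    print(count_placements(6, 3))
```

For N = 6 and M = 3 the enumeration gives the three partitions (4, 1, 1), (3, 2, 1) and (2, 2, 2); counting their distinct orderings (3 + 6 + 1) recovers the 10 ordered placements.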

38

Nucleic acid sequences encoding D1 and D1/D2 domains of human coxsackievirus and adenovirus receptor (CAR)  

Science.gov (United States)

The invention provides recombinant human CAR (coxsackievirus and adenovirus receptor) polypeptides which bind adenovirus. Specifically, polypeptides corresponding to adenovirus binding domain D1 and the entire extracellular domain of human CAR protein comprising D1 and D2 are provided. In another aspect, the invention provides nucleic acid sequences encoding these domains and expression vectors for producing the domains and bacterial cells containing such vectors. The invention also includes an isolated fusion protein comprised of the D1 polypeptide fused to a polypeptide which facilitates folding of D1 when expressed in bacteria. The functional D1 domain finds application in a therapeutic method for treating a patient infected with a CAR D1-binding virus, and also in a method for identifying an antiviral compound which interferes with viral attachment. The invention also provides a method for specifically targeting a cell for infection by a virus which binds to D1.

Freimuth, Paul I.

2010-04-06

39

Development of input structure software for MARS 1D-3D graphic user interface  

International Nuclear Information System (INIS)

A user-friendly Input Software for MARS 1D-3D GUI called MARA (MARS Adjunct Reactor Assembler) has been developed. Extension of the current MARA to the overall input system for MARS will result in an integrated commercial GUI comparable to those for computational analysis codes ANSYS, ABAQUS, FLUENT and CFX. MARA will help accelerate marketing of MARS and other potential system analysis codes to developing countries in Southeast Asia planning to put nuclear power in their electrical grids. MARS code and associated developmental technology are in the process of being disseminated to twenty-two organizations spanning the industry, academia and laboratories across the country. MARA will find its way to practical applications in a variety of engineering problems

40

Measurement of the Deuteron Spin Structure Function g1d(x) for 1 (GeV/c)² < Q² < 40 (GeV/c)²  

International Nuclear Information System (INIS)

New measurements are reported on the deuteron spin structure function g1d. These results were obtained from deep inelastic scattering of 48.3 GeV electrons on polarized deuterons in the kinematic range 0.01 < x < 0.9 and 1 < Q² < 40 (GeV/c)². These are the first high dose electron scattering data obtained using lithium deuteride (6Li2H) as the target material. Extrapolations of the data were performed to obtain moments of g1d, including Γ1d, and the net quark polarization ΔΣ.

41

Linking bed surface characteristics, near-bed flow hydraulics and sediment transport in pool-riffle sequences using a simple 1D morphodynamic model.  

Science.gov (United States)

Many gravel bed streams have a typical bed morphology consisting of pool-riffle sequences, which provides important habitat diversity both in terms of flow and substrate. Here we use a 1D unsteady multi-fraction morphodynamic model to explain the formation, self maintenance and degradation of pool-riffle sequences. Previous research has focussed almost exclusively on understanding self-maintenance on existing pool-riffle sequences, leaving formation and degradation at a speculative level. Spontaneous formation of pools and riffles has not been attempted before, even though other similar stable bedforms have been generated numerically and in the laboratory through the interaction of flow and sediment transport, like alternate bars in meanders and central bars in braided rivers. While this previous research substantially simplified flow (constant discharge), geometry (sinusoidal description of curvature or channel width) and sediment (uniform material) it showed the forcing effects of either curvature or width on the location of bedforms. For the formation of pools and riffles we use elements of the previous approach (width forcing) but we also incorporate a much more detailed geometry, flow and sediment description, since our intent is to study driving mechanisms for both the morphology and the sediment composition of the bed in a more realistic setting. We tested two hypotheses using our model. The first hypothesis states that the dynamic interaction of 1D flow and sediment processes can not only maintain but also generate a stable pool-riffle morphology with the corresponding longitudinal sorting on a stream with a non uniform bed material, variations in width and subjected to a variable flow regime. 
The second hypothesis we investigated is that the two key forming sediment processes of erosion/deposition and sorting have different time scales, which interact with the flow time scales to produce a feedback mechanism that reinforces the pool-riffle morphology, including bed geometry and composition. We performed both variable flow and constant flow simulation imposing the width variations observed in an existing river reach and did experiments starting from both a flat bed (to study formation) and from a pool-riffle sequence (to study degradation). Using measured flows on a stream in which we have removed initial bedforms and sediment sorting our model spontaneously generates pools with finer substrate at narrow sections and riffles with coarser sediment at wider sections, closely resembling the natural bed morphology. Additional experiments show that under our modelling assumptions a variable flow regime is fundamental for development and self-maintenance of the longitudinal grain sorting characteristic of pool-riffle sequences, which could not be obtained or maintained with constant discharges.

de Almeida, Gustavo; Rodriguez, Jose

2013-04-01

42

Spatially encoded phase-contrast MRI-3D MRI movies of 1D and 2D structures at millisecond resolution.  

Science.gov (United States)

This work demonstrates that the principles underlying phase-contrast MRI may be used to encode spatial rather than flow information along a perpendicular dimension, if this dimension contains an MRI-visible object at only one spatial location. In particular, the situation applies to 3D mapping of curved 2D structures which requires only two projection images with different spatial phase-encoding gradients. These phase-contrast gradients define the field of view and mean spin-density positions of the object in the perpendicular dimension by respective phase differences. When combined with highly undersampled radial fast low angle shot (FLASH) and image reconstruction by regularized nonlinear inversion, spatial phase-contrast MRI allows for dynamic 3D mapping of 2D structures in real time. First examples include 3D MRI movies of the acting human hand at a temporal resolution of 50 ms. With an even simpler technique, 3D maps of curved 1D structures may be obtained from only three acquisitions of a frequency-encoded MRI signal with two perpendicular phase encodings. Here, 3D MRI movies of a rapidly rotating banana were obtained at 5 ms resolution or 200 frames per second. In conclusion, spatial phase-contrast 3D MRI of 2D or 1D structures is respectively two or four orders of magnitude faster than conventional 3D MRI. PMID:21842502

Merboldt, Klaus-Dietmar; Uecker, Martin; Voit, Dirk; Frahm, Jens

2011-10-01

43

The bandgap structure and the transfer behavior of 1-D dielectric photonic crystals with magnetic defect layers  

International Nuclear Information System (INIS)

This work concerns the optical properties of not only 1-D photonic crystals (PCs) on a glass substrate, consisting of dielectric Ti2O3 and Al2O3, but also 1-D magnetic PCs of Bi:YIG thin layers added to the 1-D PCs. The structure of the pure dielectric multilayers was optimized, and the omni-photonic bandgap (PBG) properties were investigated. When a Bi:YIG layer with an optical thickness of λ/2 is inserted, a defect mode is obtained at the designed wavelength within the PBG. However, if the thickness is λ/4, no defect mode is observed. More magnetic defect layers produce correspondingly more defect modes. The existence of magnetic layers leads to a considerable amount of coupled light, perpendicular to the polarization direction of the linearly-polarized incident light, revealing a large magneto-optical effect, which is related to localized photon states at the defect modes within the PBG. The intensity of coupled light is found to be easily affected by absorption in the magnetic layers.

44

The Structural Phase Transition in FeSe (Fe1+dSe)  

OpenAIRE

In this letter we show that superconducting Fe1.01Se undergoes a structural transition at 90 K from a tetragonal to an orthorhombic phase but that non-superconducting Fe1.03Se does not. Further, high resolution electron microscopy study at low temperatures reveals an unexpected additional modulation of the crystal structure of the superconducting phase involving displacements of the Fe atoms, and that the non-superconducting material shows a distinct, complex nanometer-scale...

Mcqueen, T. M.; Williams, A. J.; Stephens, P. W.; Tao, J.; Zhu, Y.; Ksenofontov, V.; Casper, F.; Felser, C.; Cava, R. J.

2009-01-01

45

Properties of Floquet-Bloch space harmonics in 1D periodic magneto-dielectric structures  

DEFF Research Database (Denmark)

Recent years have witnessed a significant research interest in Floquet-Bloch analysis for determining the homogenized permittivity and permeability of metamaterials consisting of periodic structures. This work investigates fundamental properties of the Floquet-Bloch space harmonics in a 1-dimensional magneto-dielectric lossless structure supporting a transverse-electric-magnetic Floquet-Bloch wave; in particular, the space harmonic permittivity and permeability, as well as the space harmonic Poynting vector.

Breinbjerg, O.

2012-01-01

46

MSACompro: improving multiple protein sequence alignment by predicted structural features.  

Science.gov (United States)

Multiple Sequence Alignment (MSA) is an essential tool in protein structure modeling, gene and protein function prediction, DNA motif recognition, phylogenetic analysis, and many other bioinformatics tasks. Therefore, improving the accuracy of multiple sequence alignment is an important long-term objective in bioinformatics. We designed and developed a new method MSACompro to incorporate predicted secondary structure, relative solvent accessibility, and residue-residue contact information into the currently most accurate posterior probability-based MSA methods to improve the accuracy of multiple sequence alignments. Different from the multiple sequence alignment methods that use the tertiary structure information of some sequences, our method uses the structural information purely predicted from sequences. In this chapter, we first introduce some background and related techniques in the field of multiple sequence alignment. Then, we describe the detailed algorithm of MSACompro. Finally, we show that integrating predicted protein structural information improved the multiple sequence alignment accuracy. PMID:24170409

Deng, Xin; Cheng, Jianlin

2014-01-01

47

Accurate multiple sequence-structure alignment of RNA sequences using combinatorial optimization  

OpenAIRE

Abstract Background The discovery of functional non-coding RNA sequences has led to an increasing interest in algorithms related to RNA analysis. Traditional sequence alignment algorithms, however, fail at computing reliable alignments of low-homology RNA sequences. The spatial conformation of RNA sequences largely determines their function, and therefore RNA alignment algorithms have to take structural information into account. Results We present a graph-based representation for sequence-str...

Klau Gunnar W; Bauer Markus; Reinert Knut

2007-01-01

48

4,4'-bipyMnF3, a modulated hybrid layer structure with 1D magnetic properties  

Science.gov (United States)

4,4'-bipyMnF3 (bipy = bipyridine) has been crystallized from the system Mn(III)/4,4'-bipy/HF/H2O/methanol/acetone and its crystal structure was determined by single crystal X-ray diffraction at different temperatures: at 295 K the structure is orthorhombic, space group I222, Z=2, a=10.704(1), b=11.384(2) Å, c=3.9413(4) Å, wR2=0.0637, R=0.0244. Mn(III) is octahedrally coordinated by four F and two N ligands. In the c direction an inorganic F–Mn–F–Mn– trans-chain is formed, along the b axis bridging by the organic bipy ligands takes place, thus the structure can be classified as a 2D hybrid coordination polymer. The 180° Mn–F–Mn bridge angle is symmetry imposed but large anisotropic displacement ellipsoids for the F ligands indicate dynamical disorder of an angular chain. At 240 K and 153 K 1D incommensurate modulated structures (space group I222(00γ)00s) are observed with large variation of the bridge angle down to 149°. Below 100 K the structure can be described as "lock-in phase" with doubled c-axis in the orthorhombic space group P2(1)2(1)2(1) and a bridge angle of 156.4°. 4,4'-bipyMnF3 shows 1D antiferromagnetic properties with an exchange energy along the Mn–F–Mn chain of J/k = -11.5 K.

Darriet, Jacques; Massa, Werner; Pebler, Jürgen; Stief, Ronald

2002-11-01

49

A linear inside-outside algorithm for correcting sequencing errors in structured RNA sequences  

OpenAIRE

Analysis of the sequence-structure relationship in RNA molecules is essential to evolutionary studies but also to concrete applications such as error-correction methodologies in sequencing technologies. The prohibitive sizes of the mutational and conformational landscapes combined with the volume of data to proceed require efficient algorithms to compute sequence-structure properties. More specifically, here we aim to calculate which mutations increase the most the likelihood of a sequence to a...

Reinharz, Vladimir; Ponty, Yann; Waldispühl, Jérôme

2013-01-01

50

Dynamic structure factor of a Bose Einstein condensate in a 1D optical lattice  

CERN Document Server

We study the effect of a one dimensional periodic potential on the dynamic structure factor of an interacting Bose Einstein condensate at zero temperature. We show that, due to phononic correlations, the excitation strength towards the first band develops a typical oscillating behaviour as a function of the momentum transfer, and vanishes at even multiples of the Bragg momentum. The effects of interactions on the static structure factor are found to be significantly amplified by the presence of the optical potential. Our predictions can be tested in stimulated photon scattering experiments.

Menotti, C; Pitaevskii, L P; Stringari, S

2003-01-01

51

Impetus for solvothermal synthesis technique: synthesis and structure of a novel 1-D borophosphate using ionic liquid as medium  

International Nuclear Information System (INIS)

A novel borophosphate compound has been synthesized under solvothermal conditions using ionic liquid as a medium and structurally characterized by single-crystal X-ray diffraction. The compound crystallizes in the monoclinic space group P2(1)/n, a=8.089(8) Å, b=13.977(12) Å, c=8.441(8) Å, β=112.517(11)°, Z=2, V=881.7(14) Å³, R1=0.03, wR2=0.079 and S=1.01. Its structure consists of a 1-D straight chain that is built of the alternating linkage of mutually perpendicular four-member rings. Other characterizations by IR and thermal and elemental analyses are also described.

52

Statistical analysis of sequence-structure alignment scores  

OpenAIRE

The structural analysis of proteins is fundamental to the analysis of protein functions. In this context, sequence-structure alignment methods are important among the different empirical methods. In order to assess the quality of sequence-structure alignments, a statistical method using a Bayesian approach proposed by Lathrop et al. (1998) will be presented. Finally, the results of a developed statistical analysis of scores of RDP(recursive dynamic programming)-sequence-structure alignments (...

Brunnert, Marcus; Thiele, Ralf; Mevissen, Heinz-theodor; Urfer, Wolfgang

2002-01-01

53

Integrability of and differential–algebraic structures for spatially 1D hydrodynamical systems of Riemann type  

International Nuclear Information System (INIS)

Highlights: • A new differential–algebraic–geometric approach for testing integrability is described. • The approach is applied to a generalized Riemann type hydrodynamic system. • The approach is applied to a generalized Ostrovsky–Vakhnenko system. • The approach is applied to a new two-component Burgers type hydrodynamic system. -- Abstract: A differential–algebraic approach to studying the Lax integrability of a generalized Riemann type hydrodynamic hierarchy is revisited and a new Lax representation is constructed. The related bi-Hamiltonian integrability and compatible Poissonian structures of this hierarchy are also investigated using gradient-holonomic and geometric methods. The complete integrability of a new generalized Riemann hydrodynamic system is studied via a novel combination of symplectic and differential–algebraic tools. A compatible pair of polynomial Poissonian structures, a Lax representation and a related infinite hierarchy of conservation laws are obtained. In addition, the differential–algebraic approach is used to prove the complete Lax integrability of the generalized Ostrovsky–Vakhnenko and a new Burgers type system, and special cases are studied using symplectic and gradient-holonomic tools. Compatible pairs of polynomial Poissonian structures, matrix Lax representations and infinite hierarchies of conservation laws are derived

54

Tools for integrated sequence-structure analysis with UCSF Chimera  

Directory of Open Access Journals (Sweden)

Full Text Available Abstract Background Comparing related structures and viewing the structures in the context of sequence alignments are important tasks in protein structure-function research. While many programs exist for individual aspects of such work, there is a need for interactive visualization tools that: (a) provide a deep integration of sequence and structure, far beyond mapping where a sequence region falls in the structure and vice versa; (b) facilitate changing data of one type based on the other (for example, using only sequence-conserved residues to match structures, or adjusting a sequence alignment based on spatial fit); (c) can be used with a researcher's own data, including arbitrary sequence alignments and annotations, closely or distantly related sets of proteins, etc.; and (d) interoperate with each other and with a full complement of molecular graphics features. We describe enhancements to UCSF Chimera to achieve these goals. Results The molecular graphics program UCSF Chimera includes a suite of tools for interactive analyses of sequences and structures. Structures automatically associate with sequences in imported alignments, allowing many kinds of crosstalk. A novel method is provided to superimpose structures in the absence of a pre-existing sequence alignment. The method uses both sequence and secondary structure, and can match even structures with very low sequence identity. Another tool constructs structure-based sequence alignments from superpositions of two or more proteins. Chimera is designed to be extensible, and mechanisms for incorporating user-specific data without Chimera code development are also provided. Conclusion The tools described here apply to many problems involving comparison and analysis of protein structures and their sequences. Chimera includes complete documentation and is intended for use by a wide range of scientists, not just those in the computational disciplines. 
UCSF Chimera is free for non-commercial use and is available for Microsoft Windows, Apple Mac OS X, Linux, and other platforms from http://www.cgl.ucsf.edu/chimera.

Huang Conrad C

2006-07-01

55

Local duality in spin structure functions g1(p) and g1(d)  

International Nuclear Information System (INIS)

Inclusive double spin asymmetries obtained by scattering polarized electrons off polarized protons and deuterons have been analyzed to address the issue of quark-hadron duality in the polarized spin structure functions g1(p) and g1(d). A polarized electron beam, solid polarized NH3 and ND3 targets and the CEBAF Large Acceptance Spectrometer (CLAS) in Hall B were used to collect the data. The resulting g1(p) and g1(d) were averaged over the nucleon resonance energy region (M < W < 2.00 GeV), and over the three lowest-lying resonances individually, for tests of global and local duality.

56

Magnetic structure and interactions in the quasi-1D antiferromagnet CaV2O4  

International Nuclear Information System (INIS)

CaV2O4 is a spin-1 antiferromagnet where the magnetic vanadium ions are arranged on quasi-one-dimensional zig-zag chains with frustrated antiferromagnetic exchange interactions. Here we present high temperature susceptibility and single-crystal neutron diffraction measurements, which are used to deduce the magnetic structure, dominant exchange interactions and orbital configurations. The results suggest that at high temperatures the zig-zags in CaV2O4 behave as Haldane chains, but at low temperatures orbital ordering lifts the exchange frustration and the zig-zags become spin-1 ladders.

57

Syntheses, crystal structures and spectral properties of 1D tetracyanonickelate(II) complexes with 1-ethylimidazole  

Science.gov (United States)

Three new cyano-bridged tetracyanonickelate(II) complexes with 1-ethylimidazole (etim) ligand, [Ni(etim)4Ni(μ-CN)2(CN)2]n (1), [Zn(etim)4Ni(μ-CN)2(CN)2]n (2) and {[Cd(etim)4Ni(μ-CN)2(CN)2]2·2H2O}n (3), have been synthesized and characterized by spectroscopic (IR and Raman) and X-ray diffraction techniques. According to the crystallographic data, the complexes with the chain 2,2-TT structures belong to the P-1 space group of the triclinic crystal system. In the complexes, the Ni(II), Zn(II) and Cd(II) ions have distorted octahedral geometries. In the [Ni(μ-CN)2(CN)2]2- anion, the Ni(II) ion is coordinated by the four carbon atoms of the four cyano ligands, and exhibits square-planar coordination geometry. The crystallographic analyses reveal that the crystal structures of complexes 1-3 are one-dimensional linear chain polymers and that these chains are held together by CH⋯π, CH⋯Ni and hydrogen bonding interactions, forming three-dimensional networks.

Çetinkaya, Fulya; Kürkçüoğlu, Güneş Süheyla; Yeşilel, Okan Zafer; Hökelek, Tuncer; Süzen, Yasemin

2013-09-01

58

Structure elucidation of organic compounds from natural sources using 1D and 2D NMR techniques  

Science.gov (United States)

In our continuing studies on Lamiaceae family plants including Salvia, Teucrium, Ajuga, Sideritis, Nepeta and Lavandula growing in Anatolia, many terpenoids, consisting of over 50 distinct triterpenoids and steroids, and over 200 diterpenoids, several sesterterpenoids and sesquiterpenoids along with many flavonoids and other phenolic compounds have been isolated. Abietanes are characteristic for Salvia species, neo-clerodanes for Teucrium and Ajuga species, and ent-kaurane diterpenes for Sideritis species, while nepetalactones are specific for Nepeta species. In this review article, only some constituents with interesting and unusual skeletons, namely rearranged, nor- or rare diterpenes, isolated from these species will be presented. For structure elucidation of these natural diterpenoids, intensive one- and two-dimensional NMR techniques (1H, 13C, APT, DEPT, NOE/NOESY, 1H-1H COSY, HETCOR, COLOC, HMQC/HSQC, HMBC, SINEPT) were used besides mass and some other spectroscopic methods.

Topcu, Gulacti; Ulubelen, Ayhan

2007-05-01

59

Analysis of Phase Space Structure of A 1-D Discrete System Using Global and Local Symbolic Dynamics  

International Nuclear Information System (INIS)

Symbolic dynamics, in which the system trajectory is represented as a string of symbols, appears as a convenient method for the analysis of properties of chaotic attractors. In this paper, we show that, using a noncanonical coding scheme based on a moving partition point, we are able to access such properties of the phase space of a dynamical system as the localisation of unstable periodic orbits and of their stable invariant manifolds. Applying different coding schemes enables us to extract different information about the phase space structure from the chaotic trajectory. A judicious choice of the method of symbolic coding allows one to obtain information which may be missing in the symbolic dynamics from the generating partition. We present results for the 1-D case taking the logistic map as a numerical example. The extension to higher dimensions is also discussed. The theoretical background of the methods used is also given. (author)
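
The coding idea in the abstract above can be illustrated with a minimal sketch. It iterates the logistic map x -> r*x*(1-x) and turns the trajectory into a two-symbol string by comparing each point with a partition point that the caller can move. The function names, the symbols "L" and "R", and the burn-in length are assumptions made for illustration; the paper's actual coding scheme and analysis are not reproduced here.

```python
def logistic_orbit(r, x0, n, burn_in=100):
    """Return n points of the logistic map x -> r*x*(1-x) after a burn-in."""
    x = x0
    for _ in range(burn_in):          # discard the transient
        x = r * x * (1.0 - x)
    orbit = []
    for _ in range(n):
        orbit.append(x)
        x = r * x * (1.0 - x)
    return orbit


def encode(orbit, partition_point):
    """Code each point as 'L' (below the partition point) or 'R' (at or above it)."""
    return "".join("L" if x < partition_point else "R" for x in orbit)


if __name__ == "__main__":
    orbit = logistic_orbit(r=4.0, x0=0.3, n=40)
    # The generating partition of the logistic map sits at the critical point 0.5;
    # other values give the "moved" partitions the abstract refers to.
    for p in (0.3, 0.5, 0.7):
        print(p, encode(orbit, p))
```

Shifting `partition_point` away from 0.5 changes which stretches of the same trajectory map to the same symbol string, which is the degree of freedom a moving-partition scheme exploits.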

60

Synthesis, crystal structures, magnetic and luminescent properties of unique 1D p-ferrocenylbenzoate-bridged lanthanide complexes  

International Nuclear Information System (INIS)

Treatments of p-ferrocenylbenzoate [p-NaOOCH4C6Fc, Fc=(η5-C5H5)Fe(η5-C5H4)] with Ln(NO3)3·nH2O afford seven p-ferrocenylbenzoate lanthanide complexes {[Ln(OOCH4C6Fc)2(μ2-OOCH4C6Fc)2(H2O)2](H3O)}n [Ln=Ce (1), Pr (2), Sm (3), Eu (4), Gd (5), Tb (6) and Dy (7)]. X-ray crystallographic analysis reveals that the isomorphous complexes {[Ce(OOCH4C6Fc)2(μ2-OOCH4C6Fc)2(H2O)2](H3O)}n (1) and {[Pr(OOCH4C6Fc)2(μ2-OOCH4C6Fc)2(H2O)2](H3O)}n (2) form a unique 1D double-bridged infinite chain structure bridged by μ2-OOCH4C6Fc groups. Each Ln(III) ion adopts a dodecahedron coordination environment with eight coordinated oxygen atoms from two terminal monodentate coordinated FcC6H4COO- units, two terminal monodentate coordinated H2O molecules and four μ2-OOCH4C6Fc units. The luminescent spectra reveal that only 4 and 6 exhibit characteristic emissions of lanthanide ions, Eu(III) and Tb(III) ions, respectively. The variable-temperature magnetic properties of 5 and 7 suggest that a ferromagnetic coupling between spin carriers may exist in 5. - Graphical abstract: Seven p-ferrocenylbenzoate lanthanide coordination polymers were synthesized. Given is the perspective view of a unique 1D double-bridged infinite chain structure of 1, excitation and emission spectra of 6 and plots of χmT vs. T and χm-1 vs. T of 5.

61

Finding the most significant common sequence and structure motifs in a set of RNA sequences.  

OpenAIRE

We present a computational scheme to locally align a collection of RNA sequences using sequence and structure constraints. In addition, the method searches for the resulting alignments with the most significant common motifs, among all possible collections. The first part utilizes a simplified version of the Sankoff algorithm for simultaneous folding and alignment of RNA sequences, but maintains tractability by constructing multi-sequence alignments from pairwise comparisons. The algorithm fi...

Gorodkin, J.; Heyer, L. J.; Stormo, G. D.

1997-01-01

62

Computational methods in sequence and structure prediction  

Science.gov (United States)

This dissertation is organized into two parts. In the first part, we will discuss three computational methods for cis-regulatory element recognition in three different gene regulatory networks, as follows: (a) Using a comprehensive "Phylogenetic Footprinting Comparison" method, we will investigate the promoter sequence structures of three enzymes (PAL, CHS and DFR) that catalyze sequential steps in the pathway from phenylalanine to anthocyanins in plants. Our result shows there exists a putative cis-regulatory element "AC(C/G)TAC(C)" in the upstream of these enzyme genes. We propose that this cis-regulatory element is responsible for the genetic regulation of these three enzymes and that this element might also be the binding site for the MYB class transcription factor PAP1. (b) We will investigate the role of the Arabidopsis gene glutamate receptor 1.1 (AtGLR1.1) in C and N metabolism by utilizing the microarray data we obtained from AtGLR1.1 deficient lines (antiAtGLR1.1). We focus our investigation on the putatively co-regulated transcript profile of 876 genes we have collected in antiAtGLR1.1 lines. By (a) scanning the occurrence of several groups of known abscisic acid (ABA) related cis-regulatory elements in the upstream regions of 876 Arabidopsis genes; and (b) exhaustive scanning of all possible 6-10 bp motif occurrences in the upstream regions of the same set of genes, we are able to make a quantitative estimation of the enrichment level of each of the cis-regulatory element candidates. We finally conclude that one specific cis-regulatory element group, called "ABRE" elements, is statistically highly enriched within the 876-gene group as compared to its occurrence within the genome. (c) We will introduce a new general purpose algorithm, called "fuzzy REDUCE1", which we have developed recently for automated cis-regulatory element identification. In the second part, we will discuss our newly devised protein design framework. 
With this framework we have developed a software package which is capable of designing novel protein structures at the atomic resolution. This software package allows us to perform protein structure design with a flexible backbone. The backbone flexibility includes loop region relaxation as well as a secondary structure collective mode relaxation scheme. (Abstract shortened by UMI.)

Lang, Caiyi
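
The exhaustive motif scan mentioned in the dissertation abstract above (counting every 6-10 bp word in a set of upstream regions and comparing against a background) can be sketched as follows. This is only an illustration under stated assumptions: the function names, the pseudocount, and the plain frequency ratio used as an "enrichment level" are hypothetical choices, not the statistic used in the dissertation.

```python
from collections import Counter


def kmer_counts(sequences, k):
    """Count every overlapping k-mer across a collection of DNA sequences."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def enrichment(foreground, background, k, pseudocount=1.0):
    """Ratio of k-mer frequency in the foreground to that in the background.

    The pseudocount (an assumption) keeps k-mers that are absent from the
    background from producing a division by zero.
    """
    fg = kmer_counts(foreground, k)
    bg = kmer_counts(background, k)
    fg_total = sum(fg.values()) + pseudocount
    bg_total = sum(bg.values()) + pseudocount
    return {
        kmer: ((fg[kmer] + pseudocount) / fg_total)
        / ((bg[kmer] + pseudocount) / bg_total)
        for kmer in fg
    }


if __name__ == "__main__":
    upstream = ["ACGTACGT"]      # toy stand-in for co-regulated promoters
    genome = ["ACGTTTTT"]        # toy stand-in for the genomic background
    scores = enrichment(upstream, genome, k=4)
    for kmer, score in sorted(scores.items(), key=lambda kv: -kv[1]):
        print(kmer, round(score, 2))
```

A real analysis would repeat this for each k from 6 to 10 and attach a significance test to each ratio; the sketch stops at the counting step.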

63

Syntheses, structures, and photoluminescence of d 10 coordination architectures: From 1D to 3D complexes based on mixed ligands  

Science.gov (United States)

Six new compounds, namely, {[Cd3(Himpy)3(tda)2]·3H2O}n (1), {[Zn3(bipy)2(tda)2(H2O)2]·4H2O}n (2), {[Cd3(bipy)3(tda)2]·4H2O}n (3), {[Cd3(tda)2(H2O)3Cl]·H2O}n (4), {[Zn2(tz)(tda)(H2O)2]·H2O}n (5) and {[Cd7(pz)(tda)4(OAc)(H2O)7]·3H2O}n (6) [H3tda = 1H-1,2,3-triazole-4,5-dicarboxylic acid, Himpy = 2-(1H-imidazol-2-yl)pyridine, bipy = 2,2'-bipyridine, Htz = 1H-1,2,4-triazole, H2pz = piperazine] have been prepared under hydrothermal conditions and characterized by elemental analyses, infrared spectroscopy, powder X-ray diffraction and single-crystal X-ray diffraction analyses. Compound 1 has a 1D column-like structure and displays a 3D supramolecular network via π···π stacking interactions. Compounds 2 and 3 exhibit similar 2D layer-like structures, which further extend to 3D supramolecular structures through π···π stacking interactions. Compounds 4-6 all display 3D frameworks with diverse topologies constructed from the tda3- ligands in different coordination modes and secondary ligands (or bridging atoms) connecting metal ions. Furthermore, the thermal stabilities and photoluminescent properties of compounds 1-6 were studied.

Yuan, Gang; Shao, Kui-Zhan; Du, Dong-Ying; Wang, Xin-Long; Su, Zhong-Min

2011-05-01

64

Insights into the 3+1 D structure of rainfall through a multifractal analysis of 2DVD data  

Science.gov (United States)

We investigate the 3+1D (3 spatial dimensions + time) structure of the rainfall field with the help of data recorded by a 2D video disdrometer. The distribution of drop positions and sizes with its associated moments (number, rain rate, radar reflectivity) is analysed. The data was collected in Ardèche during two HyMeX campaigns in the South of France. Five intense events that occurred in September / October 2012 / 2013 are studied. Analyses are performed in the Universal Multifractal framework which has been extensively used to analyse and simulate geophysical fields extremely variable over wide ranges of scales. Only three parameters are used to characterize variability across scales: C1 the mean intermittency, alpha the multifractality index and H the non-conservative exponent. First, a 35 m column above the measuring device is reconstructed by assuming a vertical fall of drops with a constant velocity equal to the one measured at the ground level by the 2DVD. The latter assumption is very coarse, but we believe that the resulting reconstruction yields some insights; these columns are indeed the best drop by drop data available at this scale. A scaling analysis shows that the distribution of drops (and its associated moments) exhibits a scaling behaviour over scales ranging from 35 m to roughly 50 cm with alpha almost equal to 2 and C1 smaller than 0.1. The distribution within boxes of 50 cm seems homogeneous. Finally the consequences of this inhomogeneous distribution of drops on radar remote sensing through the speckle effect (coherent backscattering) are briefly discussed. Secondly, extremely high resolution (1ms) time series of the rain rate recorded at the ground level are analysed. In agreement with earlier results obtained with the help of Optical Spectrometer Pluviometer data, two scaling regimes are visible with a transition plateau in between. 
The small scale regime, between roughly 50 ms and 1 ms, exhibits a monofractal behaviour, corresponding to a flow of individual drops through the sampling area. For the large scale regime, which ranges from a few hours to a few minutes, the UM parameters differ considerably from one event to another. Estimates of alpha are in the range 1-2, whereas estimates of C1 are in the range 0.2-0.5 and exhibit less variability from one event to the other. These results highlight the need to develop a theoretical representation of a 1D temporal cut of a 3+1D field to better understand the link between the column reconstruction and time series analysis.

Gires, Auguste; Tchiguirinskaia, Ioulia; Schertzer, Daniel; Berne, Alexis

2014-05-01

65

Syntheses, structures, spectroscopic and electrochemical properties of two 1D organic-inorganic CuII-LnIII heterometallic germanotungstates  

Science.gov (United States)

Two organic-inorganic hybrid copper-lanthanide heterometallic germanotungstates KNa2H7[enH2]3[Cu(en)2(H2O)]2[Cu(en)2]2{Cu(en)2[Eu(α-GeW11O39)2]2}·13H2O (1) and Na2H4[Cu(en)2(H2O)]2[Cu(en)2]6[Cu(en)2]{Cu(en)2[La(α-GeW11O39)2]2}·12H2O (2) have been hydrothermally synthesized by reaction of K8Na2[A-α-GeW9O34]·25H2O with CuCl2·2H2O and EuCl3/LaCl3 in the presence of en (en = ethylenediamine) and structurally characterized by elemental analyses, IR spectra and single-crystal X-ray diffraction. 1 exhibits a 1D chain motif built by tetrameric {[Cu(en)2(H2O)]2[Cu(en)2]2{Cu(en)2[Eu(α-GeW11O39)2]2}}16- moieties linked through square antiprismatic K+ cations, while 2 displays a 1D architecture made by tetrameric [[Cu(en)2]6[Cu(en)2]{Cu(en)2[La(α-GeW11O39)2]2}]10- units linked via octahedral [Cu(en)2]2+ cations. Furthermore, the solid-state electrochemical and electrocatalytic properties of 1 have been investigated, and 1 shows good electrocatalytic activity for nitrite reduction. In addition, the photoluminescence property of 1 has been investigated.

Zhang, Jingli; Li, Jie; Li, Lijie; Zhao, Haozhe; Ma, Pengtao; Zhao, Junwei; Chen, Lijuan

2013-10-01

66

Bayesian Model of Protein Primary Sequence for Secondary Structure Prediction  

OpenAIRE

Determining the primary structure (i.e., amino acid sequence) of a protein has become cheaper, faster, and more accurate. Higher order protein structure provides insight into a protein’s function in the cell. Understanding a protein’s secondary structure is a first step towards this goal. Therefore, a number of computational prediction methods have been developed to predict secondary structure from just the primary amino acid sequence. The most successful methods use machine learning appr...

Li, Qiwei; Dahl, David B.; Vannucci, Marina; Joo, Hyun; Tsai, Jerry W.

2014-01-01

67

Formatt: Correcting protein multiple structural alignments by incorporating sequence alignment  

Directory of Open Access Journals (Sweden)

Full Text Available Abstract Background The quality of multiple protein structure alignments is usually computed and assessed based on geometric functions of the coordinates of the backbone atoms from the protein chains. These purely geometric methods do not directly utilize protein sequence similarity, and in fact, determining the proper way to incorporate sequence similarity measures into the construction and assessment of protein multiple structure alignments has proved surprisingly difficult. Results We present Formatt, a multiple structure alignment program based on the purely geometric multiple structure alignment program Matt, that also takes into account sequence similarity when constructing alignments. We show that Formatt outperforms Matt and other popular structure alignment programs on the popular HOMSTRAD benchmark. For the SABMark twilight zone benchmark set that captures more remote homology, Formatt and Matt outperform other programs; depending on choice of embedded sequence aligner, Formatt produces either better sequence and structural alignments with a smaller core size than Matt, or similarly sized alignments with better sequence similarity, for a small cost in average RMSD. Conclusions Considering sequence information as well as purely geometric information seems to improve quality of multiple structure alignments, though defining what constitutes the best alignment when sequence and structural measures would suggest different alignments remains a difficult open question.

Daniels Noah M

2012-10-01

68

Human renin gene: structure and sequence analysis.  

OpenAIRE

The complete protein precursor of human kidney renin has been determined from the sequence of cloned genomic DNA. The gene spans 12 kilobases of DNA and is interrupted by eight intervening sequences. The nine regions (exons) encoding the protein were mapped with a mouse renin cDNA probe, synthetic oligonucleotide probes, and by hybridization of genomic restriction fragments to a 1600-nucleotide human kidney mRNA. The predicted 403-amino acid preprorenin consists of mature renin and a 66-resid...

Hobart, P. M.; Fogliano, M.; O'Connor, B. A.; Schaefer, I. M.; Chirgwin, J. M.

1984-01-01

69

Finding the most significant common sequence and structure motifs in a set of RNA sequences  

DEFF Research Database (Denmark)

We present a computational scheme to locally align a collection of RNA sequences using sequence and structure constraints. In addition, the method searches for the resulting alignments with the most significant common motifs, among all possible collections. The first part utilizes a simplified version of the Sankoff algorithm for simultaneous folding and alignment of RNA sequences, but maintains tractability by constructing multi-sequence alignments from pairwise comparisons. The algorithm finds the multiple alignments using a greedy approach and has similarities to both CLUSTAL and CONSENSUS, but the core algorithm assures that the pairwise alignments are optimized for both sequence and structure conservation. The choice of scoring system and the method of progressively constructing the final solution are important considerations that are discussed. Example solutions, and comparisons with other approaches, are provided. The solutions include finding consensus structures identical to published ones.

Gorodkin, Jan; Heyer, L.J.

1997-01-01

70

Analysis of protein sequence/structure similarity relationships.  

OpenAIRE

Current analyses of protein sequence/structure relationships have focused on expected similarity relationships for structurally similar proteins. To survey and explore the basis of these relationships, we present a general sequence/structure map that covers all combinations of similarity/dissimilarity relationships and provide novel energetic analyses of these relationships. To aid our analysis, we divide protein relationships into four categories: expected/unexpected similarity (S and S(?)) ...

Gan, Hin Hark; Perlow, Rebecca A.; Roy, Sharmili; Ko, Joy; Wu, Min; Huang, Jing; Yan, Shixiang; Nicoletta, Angelo; Vafai, Jonathan; Sun, Ding; Wang, Lihua; Noah, Joyce E.; Pasquali, Samuela; Schlick, Tamar

2002-01-01

71

Specific alignment of structured RNA: stochastic grammars and sequence annealing  

OpenAIRE

Motivation: Whole-genome screens suggest that eukaryotic genomes are dense with non-coding RNAs (ncRNAs). We introduce a novel approach to RNA multiple alignment which couples a generative probabilistic model of sequence and structure with an efficient sequence annealing approach for exploring the space of multiple alignments. This leads to a new software program, Stemloc-AMA, that is both accurate and specific in the alignment of multiple related RNA sequences.

Bradley, Robert K.; Pachter, Lior; Holmes, Ian

2008-01-01

72

Massively Parallel Sequencing Approaches for Characterization of Structural Variation  

OpenAIRE

The emergence of next-generation sequencing (NGS) technologies offers an incredible opportunity to comprehensively study DNA sequence variation in human genomes. Commercially available platforms from Roche (454), Illumina (Genome Analyzer and Hiseq 2000), and Applied Biosystems (SOLiD) have the capability to completely sequence individual genomes to high levels of coverage. NGS data is particularly advantageous for the study of structural variation (SV) because it offers the sensitivity to de...

Koboldt, Daniel C.; Larson, David E.; Chen, Ken; Ding, Li; Wilson, Richard K.

2012-01-01

73

Formatt: Correcting protein multiple structural alignments by incorporating sequence alignment  

OpenAIRE

Abstract Background The quality of multiple protein structure alignments is usually computed and assessed based on geometric functions of the coordinates of the backbone atoms from the protein chains. These purely geometric methods do not directly utilize protein sequence similarity, and in fact, determining the proper way to incorporate sequence similarity measures into the construction and assessment of protein multiple structure alignments has proved surprisingly difficult. Results We pre...

Daniels Noah M; Nadimpalli Shilpa; Cowen Lenore J

2012-01-01

74

Relationships between Th1 or Th2 iNKT cell activity and structures of CD1d-antigen complexes: meta-analysis of CD1d-glycolipids dynamics simulations.  

Science.gov (United States)

A number of potentially bioactive molecules can be found in nature. In particular, marine organisms are a valuable source of bioactive compounds. The activity of an α-galactosylceramide was first discovered in 1993 via screening of a Japanese marine sponge (Agelas mauritianus). Very rapidly, a synthetic glycolipid analogue of this natural molecule was discovered, called KRN7000. Associated with the CD1d protein, this α-galactosylceramide 1 (KRN7000) interacts with the T-cell antigen receptor to form a ternary complex that yields T helper (Th) 1 and Th2 responses with opposing effects. In our work, we carried out molecular dynamics simulations (11.5 µs in total) involving eight different ligands (conducted in triplicate) in an effort to find a correlation at the molecular level, if any, between chemical modulation of 1 and the orientation of the known biological response, Th1 or Th2. Comparative investigations of human versus mouse and Th1 versus Th2 data have been carried out. A large set of analysis tools was employed, including free energy landscapes. One major result is the identification of a specific conformational state of the sugar polar head, which could be correlated, in the present study, to the biological Th2 biased response. These theoretical tools provide a structural basis for predicting the very different dynamical behaviors of α-glycosphingolipids in CD1d and might aid in the future design of new analogues of 1. PMID:25376021

Laurent, Xavier; Renault, Nicolas; Farce, Amaury; Chavatte, Philippe; Hénon, Eric

2014-11-01

75

Structure-guided reprogramming of serine recombinase DNA sequence specificity.  

Science.gov (United States)

Routine manipulation of cellular genomes is contingent upon the development of proteins and enzymes with programmable DNA sequence specificity. Here we describe the structure-guided reprogramming of the DNA sequence specificity of the invertase Gin from bacteriophage Mu and Tn3 resolvase from Escherichia coli. Structure-guided and comparative sequence analyses were used to predict a network of amino acid residues that mediate resolvase and invertase DNA sequence specificity. Using saturation mutagenesis and iterative rounds of positive antibiotic selection, we identified extensively redesigned and highly convergent resolvase and invertase populations in the context of engineered zinc-finger recombinase (ZFR) fusion proteins. Reprogrammed variants selectively catalyzed recombination of nonnative DNA sequences > 10,000-fold more effectively than their parental enzymes. Alanine-scanning mutagenesis revealed the molecular basis of resolvase and invertase DNA sequence specificity. When used as rationally designed ZFR heterodimers, the reprogrammed enzyme variants site-specifically modified unnatural and asymmetric DNA sequences. Early studies on the directed evolution of serine recombinase DNA sequence specificity produced enzymes with relaxed substrate specificity as a result of randomly incorporated mutations. In the current study, we focused our mutagenesis exclusively on DNA determinants, leading to redesigned enzymes that remained highly specific and directed transgene integration into the human genome with > 80% accuracy. These results demonstrate that unique resolvase and invertase derivatives can be developed to site-specifically modify the human genome in the context of zinc-finger recombinase fusion proteins. PMID:21187418

Gaj, Thomas; Mercer, Andrew C; Gersbach, Charles A; Gordley, Russell M; Barbas, Carlos F

2011-01-11

76

Sequence and Structural Analyses for Functional Non-coding RNAs  

Science.gov (United States)

Analysis and detection of functional RNAs are currently important topics in both molecular biology and bioinformatics research. Several computational methods based on stochastic context-free grammars (SCFGs) have been developed for modeling and analysing functional RNA sequences. These grammatical methods have succeeded in modeling typical secondary structures of RNAs and are used for structural alignments of RNA sequences. Such stochastic models, however, are not sufficient to discriminate member sequences of an RNA family from non-members, and hence to detect non-coding RNA regions from genome sequences. Recently, the support vector machine (SVM) and kernel function techniques have been actively studied and proposed as a solution to various problems in bioinformatics. SVMs are trained from positive and negative samples and have strong, accurate discrimination abilities, and hence are more appropriate for the discrimination tasks. A few kernel functions that extend the string kernel to measure the similarity of two RNA sequences from the viewpoint of secondary structures have been proposed. In this article, we give an overview of recent progress in SCFG-based methods for RNA sequence analysis and novel kernel functions tailored to measure the similarity of two RNA sequences and developed for use with support vector machines (SVM) in discriminating members of an RNA family from non-members.

Sakakibara, Yasubumi; Sato, Kengo

77

Pairwise local structural alignment of RNA sequences with sequence similarity less than 40%  

DEFF Research Database (Denmark)

Motivation: Searching for non-coding RNA (ncRNA) genes and structural RNA elements (eleRNA) are major challenges in gene finding today as these often are conserved in structure rather than in sequence. Even though the number of available methods is growing, it is still of interest to pairwise detect two genes with low sequence similarity, where the genes are part of a larger genomic region. Results: Here we present such an approach for pairwise local alignment which is based on FOLDALIGN and the Sankoff algorithm for simultaneous structural alignment of multiple sequences. We include the ability to conduct mutual scans of two sequences of arbitrary length while searching for common local structural motifs of some maximum length. This drastically reduces the complexity of the algorithm. The scoring scheme includes structural parameters corresponding to those available for free energy as well as for substitution matrices similar to RIBOSUM. The new FOLDALIGN implementation is tested on a dataset where the ncRNAs and eleRNAs have sequence similarity <40% and where the ncRNAs and eleRNAs are energetically indistinguishable from the surrounding genomic sequence context. The method is tested in two ways: (1) its ability to find the common structure between the genes only and (2) its ability to locate ncRNAs and eleRNAs in a genomic context. In case (1), it makes sense to compare with methods like Dynalign, and the performances are very similar, but FOLDALIGN is substantially faster. The structure prediction performance for a family is typically around 0.7 using Matthews correlation coefficient. In case (2), the algorithm is successful at locating RNA families with an average sensitivity of 0.8 and a positive predictive value of 0.9 using a BLAST-like hit selection scheme. Availability: The program is available online at http://foldalign.kvl.dk Contact: gorodkin@bioinf.kvl.dk

Havgaard, Jakob Hull; Lyngsø, Rune B.

2005-01-01
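
The three performance figures quoted in the FOLDALIGN abstract above (Matthews correlation coefficient, sensitivity, positive predictive value) follow from the four confusion-matrix counts. The sketch below uses the standard textbook definitions; how the paper counts true and false positives (for example per base pair or per hit) is not stated in the abstract, so the inputs here are assumed to be plain counts supplied by the caller.

```python
import math


def sensitivity(tp, fn):
    """Fraction of true items that were found: TP / (TP + FN)."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def positive_predictive_value(tp, fp):
    """Fraction of predictions that are correct: TP / (TP + FP)."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient, ranging from -1 to +1.

    Returns 0.0 when any marginal total is zero, a common convention
    for the otherwise undefined case.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


if __name__ == "__main__":
    # Hypothetical counts chosen only to exercise the functions.
    tp, tn, fp, fn = 8, 85, 1, 2
    print("sensitivity:", sensitivity(tp, fn))
    print("PPV:", positive_predictive_value(tp, fp))
    print("MCC:", matthews_corrcoef(tp, tn, fp, fn))
```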

78

Correlated mutations in protein sequences: Phylogenetic and structural effects  

Energy Technology Data Exchange (ETDEWEB)

Covariation analysis of sets of aligned sequences for RNA molecules is relatively successful in elucidating RNA secondary structure, as well as some aspects of tertiary structure. Covariation analysis of sets of aligned sequences for protein molecules is successful in certain instances in elucidating certain structural and functional links, but in general, pairs of sites displaying highly covarying mutations in protein sequences do not necessarily correspond to sites that are spatially close in the protein structure. In this paper the authors identify two reasons why naive use of covariation analysis for protein sequences fails to reliably indicate sequence positions that are spatially proximate. The first reason involves the bias introduced in calculation of covariation measures due to the fact that biological sequences are generally related by a non-trivial phylogenetic tree. The authors present a null-model approach to solve this problem. The second reason involves linked chains of covariation which can result in pairs of sites displaying significant covariation even though they are not spatially proximate. They present a maximum entropy solution to this classic problem of causation versus correlation. The methodologies are validated in simulation.

Lapedes, A.S. [Los Alamos National Lab., NM (United States). Theoretical Div.]|[Santa Fe Inst., NM (United States); Giraud, B.G. [C.E.N. Saclay, Gif/Yvette (France). Service Physique Theorique; Liu, L.C. [Los Alamos National Lab., NM (United States). Theoretical Div.; Stormo, G.D. [Univ. of Colorado, Boulder, CO (United States). Dept. of Molecular, Cellular and Developmental Biology

1998-12-01
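For orientation, the "naive" covariation measure that the abstract above criticizes can be sketched as the mutual information between two alignment columns. This is a minimal Python illustration under that assumption; it does not implement the authors' null-model or maximum-entropy corrections, and the toy alignment and function name are hypothetical.

```python
import math
from collections import Counter


def mutual_information(col_i, col_j):
    """Mutual information (in bits) between two alignment columns.

    col_i and col_j hold one residue per sequence, in the same sequence order.
    """
    if len(col_i) != len(col_j):
        raise ValueError("columns must have the same number of sequences")
    n = len(col_i)
    if n == 0:
        return 0.0
    freq_i = Counter(col_i)
    freq_j = Counter(col_j)
    freq_ij = Counter(zip(col_i, col_j))
    mi = 0.0
    for (a, b), count in freq_ij.items():
        p_ab = count / n
        # p_ab / (p_a * p_b), written with raw counts
        mi += p_ab * math.log2(count * n / (freq_i[a] * freq_j[b]))
    return mi


if __name__ == "__main__":
    # Hypothetical toy alignment: columns 1 and 2 covary, columns 0 and 1 do not.
    alignment = ["ACDA", "GCDC", "ATEA", "GTEC"]
    columns = list(zip(*alignment))
    print(mutual_information(columns[1], columns[2]))
    print(mutual_information(columns[0], columns[1]))
```

A high score from this measure says nothing about why two columns covary: shared phylogeny and chains of indirect coupling both inflate it, which are the two failure modes the abstract names.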

79

Stabilization of Ca1-δFe2-xMnxO4 (0.44 < x < 2) with CaFe2O4-type Structure and Ca2+ Defects in 1D Channels  

Energy Technology Data Exchange (ETDEWEB)

Solid solutions of Ca1-δFe2-xMnxO4 (0.45 ≤ x ≤ 2) were synthesized from CaCl2 as flux at 850 °C in air. The entire series, even with x = 2, crystallizes in the CaFe2O4-type structure (Pnma), rather than in the CaMn2O4-type structure (Pbcm). Rietveld refinements confirmed mixed-valency Mn3+/Mn4+ and a substantial level of Ca2+ deficiency (δ ≈ 0.25) at high x. With increasing x, the unit-cell dimensions a and b decrease, while that of c increases. Detailed structural analyses, together with Mn K-edge X-ray absorption and 57Fe Mössbauer spectroscopy studies, revealed that the stabilization of the CaFe2O4-type structure, even at high values of x, is due to the existence of non-Jahn-Teller active Mn4+ (and Fe3+), which is compensated by the formation of the Ca2+ deficiencies in the one-dimensional (1D) channels of Ca1-δFe2-xMnxO4 during the flux synthesis. Antiferromagnetic (AFM) long-range ordering is achieved for all compounds at low temperature, because of strong AFM interactions between Mn3+/Mn4+ and Fe3+. In addition, a spin (or cluster) glass component was also observed, as expected, because of the extensive Mn/Fe structural and Mn3+/Mn4+ charge disordering.

T Yang; M Croft; A Ignatov; I Nowik; R Cong; M Greenblatt

2011-12-31

80

Pairwise local structural alignment of RNA sequences with sequence similarity less than 40%.  

OpenAIRE

MOTIVATION: Searching for non-coding RNA (ncRNA) genes and structural RNA elements (eleRNA) are major challenges in gene finding today as these often are conserved in structure rather than in sequence. Even though the number of available methods is growing, it is still of interest to pairwise detect two genes with low sequence similarity, where the genes are part of a larger genomic region. RESULTS: Here we present such an approach for pairwise local alignment which is based on foldalign and ...

Havgaard, Jh; Lyngsø, Rb; Stormo, Gd; Gorodkin, J.

2005-01-01

81

Structural and magnetic properties of 1D coordination polymers constructed by isophthalate and 2,4'-bipyridine  

Science.gov (United States)

Three new complexes {[Cu(ip)(2,4'-bpy)2]·H2O}n (1), {[Co(ip)(2,4'-bpy)2]·H2O}n (2) and [Ni(ip)(2,4'-bpy)2(H2O)]n (3) were prepared by the reaction of isophthalic acid (H2ip) with different metal ions, namely Cu(II), Co(II), and Ni(II), under hydrothermal conditions in the presence of the 2,4'-bipyridine (2,4'-bpy) ligand. The metal ions are bridged by ip2- anions to form a 1D ribbon with 8-membered and 16-membered rings in complexes 1 and 2, while the nickel ions are bridged by ip2- anions and water molecules, forming a 1D ribbon with 8-membered and 20-membered rings in complex 3. It was found that antiferromagnetic interactions between metal ions can be tuned by the nature of the metal center and/or the coordinated water molecule.

Cui, Shu-Xin; Zhao, Yu-Long; Zhang, Jing-Ping; Liu, Qun

2009-06-01

82

EURDYN-1D: a computer code for the one-dimensional non-linear dynamic analysis of structural systems. Description and users' manual (release 1)  

International Nuclear Information System (INIS)

The goal of the present report is to provide a comprehensive users' manual describing the capabilities of the computer code EURDYN-1D. It includes information and examples about the types of problems that can be solved with the code, and explains how to prepare input data and how to interpret output results. The field of application of EURDYN-1D is the one-dimensional dynamic analysis of general structural systems, and the code is particularly suited for fast transient events involving propagation of longitudinal mechanical waves (subsonic) in structures. Both geometrical and physical non-linearities can be taken into account. Typical examples are impact problems, fast dynamic loading due to explosions, sudden release of initial loads due to failures, etc. Many problems encountered in the reactor safety field, as well as in more common and general technological applications, belong to these classes.

83

Evolutionarily consistent families in SCOP: sequence, structure and function  

Directory of Open Access Journals (Sweden)

Full Text Available Abstract Background SCOP is a hierarchical domain classification system for proteins of known structure. The superfamily level has a clear definition: Protein domains belong to the same superfamily if there is structural, functional and sequence evidence for a common evolutionary ancestor. Superfamilies are sub-classified into families; however, there is no such clear basis for the family level groupings. Do SCOP families group together domains with sequence similarity, do they group domains with similar structure or by common function? It is these questions we answer, but most importantly, whether each family represents a distinct phylogenetic group within a superfamily. Results Several phylogenetic trees were generated for each superfamily: one derived from a multiple sequence alignment, one based on structural distances, and the final two from presence/absence of GO terms or EC numbers assigned to domains. The topologies of the resulting trees and confidence values were compared to the SCOP family classification. Conclusions We show that SCOP family groupings are evolutionarily consistent to a very high degree with respect to classical sequence phylogenetics. The trees built from (automatically generated) structural distances correlate well, but are not always consistent with SCOP (hand annotated) groupings. Trees derived from functional data are less consistent with the family level than those from structure or sequence, though the majority still agree. Much of GO and EC annotation applies directly to one family or subset of the family; relatively few terms apply at the superfamily level. Maximum sequence diversity within a family is on average 22% but close to zero for superfamilies.

Pethica Ralph B

2012-10-01

84

Informational structure of genetic sequences and nature of gene splicing  

Science.gov (United States)

Only about 1/20 of the DNA of higher organisms codes for proteins, by means of the classical triplet code. The rest of the DNA sequences are largely silent, with unclear functions, if any. The triplet code is not the only code (message) carried by the sequences. There are three levels of molecular communication, where the same sequence "talks" to various biomolecules, while having, respectively, three different appearances: DNA, RNA and protein. Since the molecular structures and, hence, sequence-specific preferences of these are substantially different, the original DNA sequence has to carry simultaneously three types of sequence patterns (codes, messages), thus being a composite structure in which one and the same letter (nucleotide) is frequently involved in several overlapping codes of different nature. This multiplicity and overlapping of the codes is a unique feature of the Gnomic, language of genetic sequences. The coexisting codes have to be degenerate in various degrees to allow an optimal and concerted performance of all the encoded functions. There is an obvious conflict between the best possible performance of a given function and the necessity to compromise the quality of a given sequence pattern in favor of other patterns. It appears that the major role of various changes in the sequences on their "ontogenetic" way from DNA to RNA to protein, like RNA editing and splicing, or protein post-translational modifications, is to resolve such conflicts. New data are presented strongly indicating that gene splicing is such a device to resolve the conflict between the code of DNA folding in chromatin and the triplet code for protein synthesis.

Trifonov, E. N.

1991-10-01

85

Modular prediction of protein structural classes from sequences of twilight-zone identity with predicting sequences  

Directory of Open Access Journals (Sweden)

Full Text Available Abstract Background Knowledge of structural class is used by numerous methods for identification of structural/functional characteristics of proteins and could be used for the detection of remote homologues, particularly for chains that share twilight-zone similarity. In contrast to existing sequence-based structural class predictors, which target four major classes and which are designed for high identity sequences, we predict seven classes from sequences that share twilight-zone identity with the training sequences. Results The proposed MODular Approach to Structural class prediction (MODAS) method is unique as it allows for selection of any subset of the classes. MODAS is also the first to utilize a novel, custom-built feature-based sequence representation that combines evolutionary profiles and predicted secondary structure. The features quantify information relevant to the definition of the classes including conservation of residues and arrangement and number of helix/strand segments. Our comprehensive design considers 8 feature selection methods and 4 classifiers to develop Support Vector Machine-based classifiers that are tailored for each of the seven classes. Tests on 5 twilight-zone and 1 high-similarity benchmark datasets and comparison with over two dozen modern competing predictors show that MODAS provides the best overall accuracy, which ranges between 80% and 96.7% (83.5% for the twilight-zone datasets), depending on the dataset. This translates into 19% and 8% error rate reduction when compared against the best performing competing method on the two largest datasets. The proposed predictor provides accurate predictions at 58% accuracy for the membrane proteins class, which is not considered by the majority of existing methods, even though this class accounts for only 2% of the data. Our predictive model is analyzed to demonstrate how and why the input features are associated with the corresponding classes. 
Conclusions The improved predictions stem from the novel features that express collocation of the secondary structure segments in the protein sequence and that combine evolutionary and secondary structure information. Our work demonstrates that conservation and arrangement of the secondary structure segments predicted along the protein chain can successfully predict structural classes which are defined based on the spatial arrangement of the secondary structures. A web server is available at http://biomine.ece.ualberta.ca/MODAS/.

Kurgan Lukasz

2009-12-01

86

Designing polymorphic ISSR primers in order to study gene sequences x and y types glutenin subunits in 1D locus controlling favourable baking quality in elite mutant lines of bread wheat  

International Nuclear Information System (INIS)

Baking quality is one of the important traits in the qualitative improvement of bread wheat. Gluten prolamins determine wheat flour quality for different technological processes such as bread making. Among gluten proteins, the High Molecular Weight (HMW) glutenin group, and especially the d allele at the 1D locus with its x-type and y-type subunits, is very valuable for baking quality. In this study, amino acid sequences of x-type subunits (2.1, 2.2, 2.2*, 5) and y-type subunits (10, 12) related to the 1D locus were searched, found and compared using Genedoc software. After alignment of the amino acid sequences of the y-type and x-type subunits, it was found that deletions, insertions (duplications) and point mutations in these subunits are involved in the biological function of the proteins. The most important insertion and deletion mutations were a 185-amino-acid sequence insertion in the 2.2* subunit and a 102-amino-acid sequence insertion in the x2.2 subunit at position 486 of the amino acid sequence, and a six-amino-acid sequence deletion (IGQGQQ) at position 203 of the y10 subunit. Important point mutations include the conversion of serine to cysteine at position 118 of the x5 subunit and the substitution of glutamine by histidine at position 626 of the x5 subunit. Finally, polymorphic ISSR primers in repetitive domains were designed based on similarities and differences between the x-type and y-type subunits. These primers show good banding polymorphisms in elite mutant lines, standard commercial cultivars and F2 populations from crosses. (author)

87

Identifying the mechanisms underpinning recognition of structured sequences of action.  

Science.gov (United States)

We present three experiments to identify the specific information sources that skilled participants use to make recognition judgements when presented with dynamic, structured stimuli. A group of less skilled participants acted as controls. In all experiments, participants were presented with filmed stimuli containing structured action sequences. In a subsequent recognition phase, participants were presented with new and previously seen stimuli and were required to make judgements as to whether or not each sequence had been presented earlier (or were edited versions of earlier sequences). In experiment 1, skilled participants demonstrated superior sensitivity in recognition when viewing dynamic clips compared with static images and clips where the frames were presented in a nonsequential, randomized manner, implicating the importance of motion information when identifying familiar or unfamiliar sequences. In experiment 2, we presented normal and mirror-reversed sequences in order to distort access to absolute motion information. Skilled participants demonstrated superior recognition sensitivity, but no significant differences were observed across viewing conditions, leading to the suggestion that skilled participants are more likely to extract relative rather than absolute motion when making such judgements. In experiment 3, we manipulated relative motion information by occluding several display features for the duration of each film sequence. A significant decrement in performance was reported when centrally located features were occluded compared to those located in more peripheral positions. Findings indicate that skilled participants are particularly sensitive to relative motion information when attempting to identify familiarity in dynamic, visual displays involving interaction between numerous features. PMID:22554230

Williams, A Mark; North, Jamie S; Hope, Edward R

2012-01-01

88

Massively parallel sequencing approaches for characterization of structural variation.  

Science.gov (United States)

The emergence of next-generation sequencing (NGS) technologies offers an incredible opportunity to comprehensively study DNA sequence variation in human genomes. Commercially available platforms from Roche (454), Illumina (Genome Analyzer and Hiseq 2000), and Applied Biosystems (SOLiD) have the capability to completely sequence individual genomes to high levels of coverage. NGS data is particularly advantageous for the study of structural variation (SV) because it offers the sensitivity to detect variants of various sizes and types, as well as the precision to characterize their breakpoints at base pair resolution. In this chapter, we present methods and software algorithms that have been developed to detect SVs and copy number changes using massively parallel sequencing data. We describe visualization and de novo assembly strategies for characterizing SV breakpoints and removing false positives. PMID:22228022

Koboldt, Daniel C; Larson, David E; Chen, Ken; Ding, Li; Wilson, Richard K

2012-01-01

89

SBSPKS: structure based sequence analysis of polyketide synthases  

OpenAIRE

Polyketide synthases (PKSs) catalyze biosynthesis of a diverse family of pharmaceutically important secondary metabolites. Bioinformatics analysis of sequence and structural features of PKS proteins plays a crucial role in discovery of new natural products by genome mining, as well as in design of novel secondary metabolites by biosynthetic engineering. The availability of the crystal structures of various PKS catalytic and docking domains, and mammalian fatty acid synthase module prompted us...

Anand, Swadha; Prasad, M. V. R.; Yadav, Gitanjali; Kumar, Narendra; Shehara, Jyoti; Ansari, Md Zeeshan; Mohanty, Debasisa

2010-01-01

90

Sequence, Structure, and Network Evolution of Protein Phosphorylation  

Science.gov (United States)

With the increasing amount of information about the phosphoproteomes of diverse organisms, it is now possible to begin to evaluate this information in the context of evolution. Work described at the inaugural Keystone Symposium on “The Evolution of Protein Phosphorylation” covered a wide range of eukaryotic and prokaryotic organisms, revealing insights into the evolution of protein phosphorylation at the sequence, network, and structural levels.

Chris Soon Heng Tan (Mount Sinai Hospital; Samuel Lunenfeld Research Institute REV)

2011-07-19

91

3DCoffee@igs: a web server for combining sequences and structures into a multiple sequence alignment  

OpenAIRE

This paper presents 3DCoffee@igs, a web-based tool dedicated to the computation of high-quality multiple sequence alignments (MSAs). 3D-Coffee makes it possible to mix protein sequences and structures in order to increase the accuracy of the alignments. Structures can be either provided as PDB identifiers or directly uploaded into the server. Given a set of sequences and structures, pairs of structures are aligned with SAP while sequence–structure pairs are aligned with Fugue. The resulting...

Poirot, Olivier; Suhre, Karsten; Abergel, Chantal; O Toole, Eamonn; Notredame, Cedric

2004-01-01

92

Coalescence phenomena in 1D silver nanostructures  

International Nuclear Information System (INIS)

Different coalescence processes on 1D silver nanostructures synthesized by a PVP assisted reaction in ethylene glycol at 160 °C were studied experimentally and theoretically. Analysis by TEM and HRTEM shows different defects found on the body of these materials, suggesting that they were induced by previous coalescence processes in the synthesis stage. TEM observations showed that irradiation with the electron beam eliminates the boundaries formed near the edges of the structures, suggesting that this process can be carried out by the application of other means of energy (i.e. thermal). These results were also confirmed by theoretical calculations by Monte Carlo simulations using a Sutton-Chen potential. A theoretical study by molecular dynamics simulation of the different coalescence processes on 1D silver nanostructures is presented, showing a surface energy driven sequence followed to form the final coalesced structure. Calculations were made at 1000-1300 K, which is near the melting temperature of silver (1234 K). Based on these results, it is proposed that 1D nanostructures can grow through a secondary mechanism based on coalescence, without losing their dimensionality.

93

Quantifying sequence and structural features of protein–RNA interactions  

Science.gov (United States)

Increasing awareness of the importance of protein–RNA interactions has motivated many approaches to predict residue-level RNA binding sites in proteins based on sequence or structural characteristics. Sequence-based predictors are usually high in sensitivity but low in specificity; conversely structure-based predictors tend to have high specificity, but lower sensitivity. Here we quantified the contribution of both sequence- and structure-based features as indicators of RNA-binding propensity using a machine-learning approach. In order to capture structural information for proteins without a known structure, we used homology modeling to extract the relevant structural features. Several novel and modified features enhanced the accuracy of residue-level RNA-binding propensity beyond what has been reported previously, including by meta-prediction servers. These features include: hidden Markov model-based evolutionary conservation, surface deformations based on the Laplacian norm formalism, and relative solvent accessibility partitioned into backbone and side chain contributions. We constructed a web server called aaRNA that implements the proposed method and demonstrate its use in identifying putative RNA binding sites. PMID:25063293

Li, Songling; Yamashita, Kazuo; Amada, Karlou Mar; Standley, Daron M.

2014-01-01

94

Expanding the 2,2'-bipyrimidine bridged 1D homonuclear coordination polymers family: [M(II)(bpym)Cl2] (M = Fe, Co) magnetic and structural characterization.  

Science.gov (United States)

One pot reaction of hydrated chloride salts of Fe(II) and Co(II) with stoichiometric amounts of 2,2'-bipyrimidine (bpym) in a methanol-acetonitrile mixture afforded the corresponding 1D homonuclear coordination polymers, [μ-(bpym)MCl2]n. Crystal structures of both complexes are isomorphous in the highly symmetric orthorhombic space group Fddd. The 1D coordination polymers are composed of almost orthogonal alternating bipyrimidine bridges linking the {MCl2} units. The magnetic behaviour of the Fe(II) compound can be well understood as a uniform S = 2 chain with an antiferromagnetic exchange interaction between metal ion sites. In the case of the Co(II) ion, also an antiferromagnetic interaction is operative along the uniform chain, while at low temperatures a long range-ordering is observed due to spin canting originating from the anisotropic behaviour of the Co(II) lowest energy Kramers doublets. PMID:23676951

Alborés, Pablo; Rentschler, Eva

2013-07-14

95

PredyFlexy: flexibility and local structure prediction from sequence.  

Science.gov (United States)

Protein structures are necessary for understanding protein function at a molecular level. Dynamics and flexibility of protein structures are also key elements of protein function. So, we have proposed to look at protein flexibility using novel methods: (i) using a structural alphabet and (ii) combining classical X-ray B-factor data and molecular dynamics simulations. First, we established a library composed of structural prototypes (LSPs) to describe protein structure by a limited set of recurring local structures. We developed a prediction method that proposes structural candidates in terms of LSPs and predicts protein flexibility along a given sequence. Second, we examine flexibility according to two different descriptors: X-ray B-factors, considered as good indicators of flexibility, and the root mean square fluctuations, based on molecular dynamics simulations. We then define three flexibility classes and propose a method based on the LSP prediction method for predicting flexibility along the sequence. This method does not resort to sophisticated learning of flexibility but predicts flexibility from the average flexibility of predicted local structures. The method is implemented in the PredyFlexy web server. Results are similar to those obtained with the most recent, cutting-edge methods based on direct learning of flexibility data conducted with sophisticated algorithms. PredyFlexy can be accessed at http://www.dsimb.inserm.fr/dsimb_tools/predyflexy/. PMID:22689641

de Brevern, Alexandre G; Bornot, Aurélie; Craveur, Pierrick; Etchebest, Catherine; Gelly, Jean-Christophe

2012-07-01

96

High-Throughput Sequencing Based Methods of RNA Structure Investigation  

DEFF Research Database (Denmark)

In this thesis we describe the development of four related methods for RNA structure probing that utilize massive parallel sequencing. Using them, we were able to gather structural data for multiple, long molecules simultaneously. First, we have established an easy to follow experimental and computational protocol for detecting the reverse transcription termination sites (RTTS-Seq). This protocol was subsequently applied to hydroxyl radical footprinting of three dimensional RNA structures to give a probing signal that correlates well with the RNA backbone solvent accessibility. Moreover, we applied RTTS-Seq to detect antisense oligonucleotide binding sites within a transcriptome. In this case, we applied an enrichment strategy to greatly reduce the background. Finally, we have modified the RTTS-Seq to study the secondary structure of 3’ untranslated regions. In the course of this thesis we describe several computational methods. One that alleviates PCR bias by estimating number of unique molecules existing before the amplification, and two methods for data normalization: one applicable when the paired end sequencing is performed, and the other that works with the single read sequencing with known priming sites.

Kielpinski, Lukasz Jan

2014-01-01

97

Biophysical and structural considerations for protein sequence evolution  

Directory of Open Access Journals (Sweden)

Full Text Available Abstract Background Protein sequence evolution is constrained by the biophysics of folding and function, causing interdependence between interacting sites in the sequence. However, current site-independent models of sequence evolution do not take this into account. Recent attempts to integrate the influence of structure and biophysics into phylogenetic models via statistical/informational approaches have not resulted in expected improvements in model performance. This suggests that further innovations are needed for progress in this field. Results Here we develop a coarse-grained physics-based model of protein folding and binding function, and compare it to a popular informational model. We find that both models violate the assumption of the native sequence being close to a thermodynamic optimum, causing directional selection away from the native state. Sampling and simulation show that the physics-based model is more specific for fold-defining interactions that vary less among residue type. The informational model diffuses further in sequence space with fewer barriers and tends to provide less support for an invariant sites model, although amino acid substitutions are generally conservative. Both approaches produce sequences with natural features like dN/dS Conclusions Simple coarse-grained models of protein folding can describe some natural features of evolving proteins but are currently not accurate enough to use in evolutionary inference. This is partly due to improper packing of the hydrophobic core. We suggest possible improvements on the representation of structure, folding energy, and binding function, as regards both native and non-native conformations, and describe a large number of possible applications for such a model.

Grahnen Johan A

2011-12-01

98

Unusual structures of the tandem repetitive DNA sequences located at human centromeres.  

Science.gov (United States)

The presence of the highly conserved repetitive DNA sequence d(AATGG)n.d(CCATT)n in human centromeres argues for a special role for this sequence in recognition, most probably through the formation of an unusual structure during mitosis. Quantitative one- and two-dimensional nuclear magnetic resonance (1D/2D NMR) spectroscopic studies reveal that the Watson-Crick duplex d(AATGG)n.d(CCATT)n adopts the usual B-DNA conformation as illustrated by taking d(AATGG)3.d(CCATT)3 as an example, whereas the d(CCATT)n strand is essentially a random coil. In contrast, the d(AATGG)n strand adopts an unusual stem-loop motif for repeat lengths n = 2, 3, 4, and 6. In addition to normal Watson-Crick A.T pairs, the stem-loop structures are stabilized by mismatched A.G and G.G pairs in the stem and G-G-A stacking in the loop. Stem-loop structures of d(AATGG)n are independently verified by gel electrophoresis and nuclease digestion studies and were also previously shown to be as stable as the corresponding Watson-Crick duplex d(AATGG)n.d(CCATT)n [Grady et al. (1992) Proc. Natl. Acad. Sci. U.S.A. 89, 1695-1699]. Therefore, the sequence d(AATGG)n can, indeed, nucleate a stem-loop structure at little free energy cost, and if, during mitosis, it is located on the chromosome surface, it can provide specific recognition sites for kinetochore function. PMID:8142384

Catasti, P; Gupta, G; Garcia, A E; Ratliff, R; Hong, L; Yau, P; Moyzis, R K; Bradbury, E M

1994-04-01

99

Crystal Structure of Human Liver Δ4-3-Ketosteroid 5β-Reductase (AKR1D1) and Implications for Substrate Binding and Catalysis  

Energy Technology Data Exchange (ETDEWEB)

AKR1D1 (steroid 5β-reductase) reduces all Δ4-3-ketosteroids to form 5β-dihydrosteroids, a first step in the clearance of steroid hormones and an essential step in the synthesis of all bile acids. The reduction of the carbon-carbon double bond in an α,β-unsaturated ketone by 5β-reductase is a unique reaction in steroid enzymology because hydride transfer from NADPH to the β-face of a Δ4-3-ketosteroid yields a cis-A/B-ring configuration with an ≈90° bend in steroid structure. Here, we report the first x-ray crystal structure of a mammalian steroid hormone carbon-carbon double bond reductase, human Δ4-3-ketosteroid 5β-reductase (AKR1D1), and its complexes with intact substrates. We have determined the structures of AKR1D1 complexes with NADP+ at 1.79 and 1.35 Å resolution (HEPES bound in the active site), NADP+ and cortisone at 1.90 Å resolution, NADP+ and progesterone at 2.03 Å resolution, and NADP+ and testosterone at 1.62 Å resolution. Complexes with cortisone and progesterone reveal productive substrate binding orientations based on the proximity of each steroid carbon-carbon double bond to the re-face of the nicotinamide ring of NADP+. This orientation would permit 4-pro-(R)-hydride transfer from NADPH. Each steroid carbonyl accepts hydrogen bonds from catalytic residues Tyr58 and Glu120. The Y58F and E120A mutants are devoid of activity, supporting a role for this dyad in the catalytic mechanism. Intriguingly, testosterone binds nonproductively, thereby rationalizing the substrate inhibition observed with this particular steroid. The locations of disease-linked mutations thought to be responsible for bile acid deficiency are also revealed.

Di Costanzo,L.; Drury, J.; Penning, T.; Christianson, D.

2008-01-01

100

Real-time defect detection in transparent multilayer polymer films using structured illumination and 1D filtering  

Science.gov (United States)

Today, typical polymer films consist of several functional layers, like printable surface or barrier layers. They are produced in coextrusion processes, in which the different materials are extruded through a single die and formed to a blown- or cast film with haul-off speeds up to 500 m/min. In the production of transparent multilayer films certain defects, called "interfacial instabilities", can occur. They emerge from shear stress and turbulences in the material flow during the process and result in a reduction of the mechanical properties and the optical quality of the product. Interfacial instabilities cannot be detected by conventional film inspection systems available on the market because the optical distortions they produce do not change the brightness of a pixel. In this paper, an approach for solving this problem is presented. The film is illuminated with a patterned line-light source in a backlight setting and a CCD line scan camera is used for recording the image lines. The defects can be detected using a 1D filter tuned to the spatial-frequency of the pattern. The distortion caused by the defects leads to a local extremum in the feature image generated by the filter, which can be easily detected by threshold segmentation. The system has been tested in an industrial setting and proved to be fast enough for inline-inspection. Further applications could be in the fast deflectometric inspection of high-gloss surfaces.

Michaeli, Walter; Berdel, Klaus; Osterbrink, Oliver

2009-06-01

101

Structure of Rutile TiO2 (110)-(1x2): Formation of Ti2O3 Quasi-1D Metallic Chains  

International Nuclear Information System (INIS)

Combining STM, LEED, and density functional theory, we determine the atomic surface structure of rutile TiO2 (110)-(1x2): nonstoichiometric Ti2O3 stripes along the [001] direction. LEED patterns are sharp and free of streaks, while STM images show monatomic steps, wide terraces, and no cross-links. At room temperature, atoms in the Ti2O3 group have large amplitudes of vibration. The long quasi-1D chains display metallic character, show no interaction between them, and cannot couple to bulk or surface states in the gap region, forming good atomic wires.

102

A benchmark of multiple sequence alignment programs upon structural RNAs  

OpenAIRE

To date, few attempts have been made to benchmark the alignment algorithms upon nucleic acid sequences. Frequently, sophisticated PAM or BLOSUM like models are used to align proteins, yet equivalents are not considered for nucleic acids; instead, rather ad hoc models are generally favoured. Here, we systematically test the performance of existing alignment algorithms on structural RNAs. This work was aimed at achieving the following goals: (i) to determine conditions where it is appropriate t...

Gardner, Paul P.; Wilm, Andreas; Washietl, Stefan

2005-01-01

103

Exploring the sequence–structure relationship for amyloid peptides  

OpenAIRE

Amyloid fibril formation is associated with misfolding diseases, as well as fulfilling a functional role. The cross-β molecular architecture has been reported in increasing numbers of amyloid-like fibrillar systems. The Waltz algorithm is able to predict ordered self-assembly of amyloidogenic peptides by taking into account the residue type and position. This algorithm has expanded the amyloid sequence space, and in the present study we characterize the structures of amyloid-like fibrils fo...

Morris, Kyle L; Rodger, Alison; Hicks, Matthew R; Debulpaep, Maya; Schymkowitz, Joost; Rousseau, Frederic; Serpell, Louise C

2013-01-01

104

Sequence-structure analysis of FAD-containing proteins  

OpenAIRE

We have analyzed structure-sequence relationships in 32 families of flavin adenine dinucleotide (FAD)-binding proteins, to prepare for genomic-scale analyses of this family. Four different FAD-family folds were identified, each containing at least two or more protein families. Three of these families, exemplified by glutathione reductase (GR), ferredoxin reductase (FR), and p-cresol methylhydroxylase (PCMH) were previously defined, and a family represented by pyruvate oxidase (PO) is newly de...

Dym, Orly; Eisenberg, David

2001-01-01

105

Application of 1D- and 2D-NMR techniques for the structural studies of glycoprotein-derived carbohydrates  

International Nuclear Information System (INIS)

The first part of this thesis (Chapters 1 to 4) describe the determination of the primary structure for a large number of oligosaccharide-alditols obtained from bronchial sputum of cystic fibrosis patients suffering from chronic bronchitis. The second part (Chapters 5 to 8) is devoted to the application of two-dimensional NMR methods for the structural analysis of oligosaccharides. (H.W.). 163 refs.; 50 figs.; 25 tabs

106

Exploring the sequence-structure relationship for amyloid peptides.  

Science.gov (United States)

Amyloid fibril formation is associated with misfolding diseases, as well as fulfilling a functional role. The cross-β molecular architecture has been reported in increasing numbers of amyloid-like fibrillar systems. The Waltz algorithm is able to predict ordered self-assembly of amyloidogenic peptides by taking into account the residue type and position. This algorithm has expanded the amyloid sequence space, and in the present study we characterize the structures of amyloid-like fibrils formed by three peptides identified by Waltz that form fibrils but not crystals. The structural challenge is met by combining electron microscopy, linear dichroism, CD and X-ray fibre diffraction. We propose structures that reveal a cross-β conformation with 'steric-zipper' features, giving insights into the role of side chains in peptide packing and stability within fibrils. The amenability of these peptides to structural characterization makes them compelling model systems to use for understanding the relationship between sequence, self-assembly, stability and structure of amyloid fibrils. PMID:23252554

Morris, Kyle L; Rodger, Alison; Hicks, Matthew R; Debulpaep, Maya; Schymkowitz, Joost; Rousseau, Frederic; Serpell, Louise C

2013-03-01

107

Recognition of remotely related structural homologues using sequence profiles of aligned homologous protein structures.  

Science.gov (United States)

In order to bridge the gap between proteins with three-dimensional (3-D) structural information and those without 3-D structures, extensive experimental and computational efforts for structure recognition are being invested. One of the rapid and simple computational approaches for structure recognition makes use of sequence profiles with sensitive profile matching procedures to identify remotely related homologous families. While adopting this approach we used profiles that are generated from structure-based sequence alignment of homologous protein domains of known structures integrated with sequence homologues. We present an assessment of this fast and simple approach. About one year ago, using this approach, we had identified structural homologues for 315 sequence families, which were not known to have any 3-D structural information. The subsequent experimental structure determination for at least one of the members in 110 of 315 sequence families allowed a retrospective assessment of the correctness of structure recognition. We demonstrate that correct folds are detected with an accuracy of 96.4% (106/110). Most (81/106) of the associations are made correctly to the specific structural family. For 23/106, the structure associations are valid at the superfamily level. Thus, profiles of protein families of known structure when used with sensitive profile-based search procedure result in structure association of high confidence. Further assignment at the level of superfamily or family would provide clues to probable functions of new proteins. Importantly, the public availability of these profiles from us could enable one to perform genome wide structure assignment in a local machine in a fast and accurate manner. PMID:15506994
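The counts reported in this abstract can be turned into rates with a few lines of arithmetic. The sketch below uses only the numbers stated above; the dictionary keys and the helper name are illustrative, and the abstract does not say how the two correct fold assignments outside the family and superfamily levels were classified.

```python
# Counts taken directly from the abstract above.
counts = {
    "families_predicted": 315,
    "families_with_new_structure": 110,
    "correct_fold": 106,
    "correct_family": 81,
    "correct_superfamily": 23,
}


def percent(numerator, denominator):
    """Percentage rounded to one decimal place."""
    return round(100.0 * numerator / denominator, 1)


fold_accuracy = percent(counts["correct_fold"], counts["families_with_new_structure"])
family_share = percent(counts["correct_family"], counts["correct_fold"])
superfamily_share = percent(counts["correct_superfamily"], counts["correct_fold"])
unspecified = counts["correct_fold"] - counts["correct_family"] - counts["correct_superfamily"]

print(fold_accuracy, family_share, superfamily_share, unspecified)
```

The fold-level figure reproduces the 96.4% quoted in the abstract; the family and superfamily shares are derived values, not figures stated by the authors.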

Namboori, Seema; Srinivasan, Narayanaswamy; Pandit, Shashi B

2004-01-01

108

WildSpan: mining structured motifs from protein sequences  

Directory of Open Access Journals (Sweden)

Abstract Background Automatic extraction of motifs from biological sequences is an important research problem in the study of molecular biology. For proteins, it is desired to discover sequence motifs containing a large number of wildcard symbols, as the residues associated with functional sites are usually largely separated in sequences. Discovering such patterns is time-consuming because abundant combinations exist when long gaps (a gap consists of one or more successive wildcards) are considered. Mining algorithms often employ constraints to narrow down the search space in order to increase efficiency. However, improper constraint models might degrade the sensitivity and specificity of the motifs discovered by computational methods. We previously proposed a new constraint model to handle large wildcard regions for discovering functional motifs of proteins. The patterns that satisfy the proposed constraint model are called W-patterns. A W-pattern is a structured motif that groups motif symbols into pattern blocks interleaved with large irregular gaps. Considering large gaps reflects the fact that functional residues are not always from a single region of protein sequences, and restricting motif symbols into clusters corresponds to the observation that short motifs are frequently present within protein families. To efficiently discover W-patterns for large-scale sequence annotation and function prediction, this paper first formally introduces the problem to solve and proposes an algorithm named WildSpan (sequential pattern mining across large wildcard regions) that incorporates several pruning strategies to largely reduce the mining cost. Results WildSpan is shown to efficiently find W-patterns containing conserved residues that are far separated in sequences. We conducted experiments with two mining strategies, protein-based and family-based mining, to evaluate the usefulness of W-patterns and performance of WildSpan. 
The protein-based mining mode of WildSpan is developed for discovering functional regions of a single protein by referring to a set of related sequences (e.g. its homologues). The discovered W-patterns are used to characterize the protein sequence and the results are compared with the conserved positions identified by multiple sequence alignment (MSA). The family-based mining mode of WildSpan is developed for extracting sequence signatures for a group of related proteins (e.g. a protein family) for protein function classification. In this situation, the discovered W-patterns are compared with PROSITE patterns as well as the patterns generated by three existing methods performing a similar task. Finally, analysis of the execution time of WildSpan reveals that the proposed pruning strategy is effective in improving the scalability of the proposed algorithm. Conclusions The mining experiments conducted in this study reveal that WildSpan is efficient and effective in discovering functional signatures of proteins directly from sequences. The proposed pruning strategy is effective in improving the scalability of WildSpan. It is demonstrated in this study that the W-patterns discovered by WildSpan provide useful information in characterizing protein sequences. The WildSpan executable and open source codes are available on the web (http://biominer.csie.cyu.edu.tw/wildspan).
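To make the W-pattern idea concrete, the following sketch shows one possible way to represent a pattern made of short blocks separated by bounded gaps and to match it against a sequence. It illustrates the data structure only, not the WildSpan mining algorithm; the block notation (lowercase `x` as an intra-block wildcard), the gap bounds, the example sequences, and the function name are all assumptions.

```python
import re


def w_pattern_to_regex(blocks, gaps):
    """Compile a block/gap pattern into a regular expression.

    blocks: list of strings; 'x' stands for any single residue.
    gaps:   list of (min_len, max_len) tuples, one between each pair of
            consecutive blocks.
    """
    if len(gaps) != len(blocks) - 1:
        raise ValueError("need exactly one gap between consecutive blocks")
    parts = []
    for index, block in enumerate(blocks):
        parts.append("".join("." if ch == "x" else re.escape(ch) for ch in block))
        if index < len(gaps):
            low, high = gaps[index]
            parts.append(".{%d,%d}" % (low, high))
    return re.compile("".join(parts))


# Hypothetical pattern: block "CxxC", then a gap of 5 to 20 residues, then "HxH".
pattern = w_pattern_to_regex(["CxxC", "HxH"], [(5, 20)])

hit = pattern.search("AAACGGCAAAAAAAAHYHAA")
miss = pattern.search("CGGCAAHYH")  # gap of 2 residues is too short
print(hit.span() if hit else None, miss)
```

Keeping blocks and gaps as separate lists mirrors the description of pattern blocks interleaved with irregular gaps, and makes it easy to tighten or relax the gap constraints independently of the conserved residues.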

Chen Chien-Yu

2011-03-01

109

Natural-abundance measurement of spin-spin couplings the nitrogen-15 in 1D and 2D NMR spectra by HEED pulse sequences  

Science.gov (United States)

Standard pulse sequences frequently employed in NMR studies, such as INEPT, DEPT, HETCOR, phase-sensitive HETCOR, and HETCOR with nongeminal proton decoupling in the F1 dimension, have been extended by Hahn spin echoes. This enables measurement of 15N-X and long-range 15N-1H couplings (together with the comparison of their relative signs) at the natural-abundance level of isotopes. The sequences were optimized and verified for X = 13C, 29Si, 31P, 119Sn, 207Pb, using a wide variety of nitrogen compounds (e.g., pyrroles, nitro compounds, cyanides, cyanates, isocyanates, isothiocyanates, carbodiimides, silylamines, P-N, Si-N, Sn-N, and Pb-N compounds). Both positive and negative 1J(15N13C) couplings were observed. The trends were reproduced by SCF INDO FPT calculations. Reduced coupling constants 1K(119Sn15N) and 1K(207Pb15N) were all negative. Two- and four-bond 15N-1H couplings were of either sign, whereas vicinal 3J(15N1H) couplings were always negative, showing a crude linear relationship with V 1J(15N13C) in the compounds studied. Since the intensity of the residual signal in the HEED experiments is readily adjustable, the measurement of 15/14N isotope shifts, 1Δ15/14N(X), is straightforward. The 1Δ15/14N(13C) values determined for rather different bonding situations show a complex behavior and there is no simple relation between 1Δ values and bond order, s character, or hybridization. A previously proposed classification of 15/14N(13C) isotope shifts is inadequate in the light of the present data. The 1Δ15/14N values for 31P(III) chemical shifts are much larger than those for P(V) derivatives. The 15/14N isotope effects on 13C and 29Si chemical shifts are similar in magnitude. Unexpectedly, several Pb-N compounds showed 1Δ15/14N(207Pb) values close to zero.

Kupče, Ērik; Wrackmeyer, Bernd

110

Data Structures: Sequence Problems, Range Queries, and Fault Tolerance  

DEFF Research Database (Denmark)

The focus of this dissertation is on algorithms, in particular data structures that give provably efficient solutions for sequence analysis problems, range queries, and fault tolerant computing. The work presented in this dissertation is divided into three parts. In Part I we consider algorithms for a range of sequence analysis problems that have arisen from applications in pattern matching, bioinformatics, and data mining. On a high level, each problem is defined by a function and some constraints and the job at hand is to locate subsequences that score high with this function and are not invalidated by the constraints. Many variants and similar problems have been proposed leading to several different approaches and algorithms. We consider problems where the function is the sum of the elements in the sequence and the constraints only bound the length of the subsequences considered. We give optimal algorithms for several variants of the problem based on a simple idea and classic algorithms and data structures. In Part II we consider range query data structures. This is a category of problems where the task is to preprocess an input sequence using as little time and space as possible such that one can efficiently compute a certain function on the elements in a given query subsequence. There are many types of functions that have been considered in connection with input from different sources. The input could be ip-data sorted by ip-address, real estate prices sorted by zip code, advertising cost sorted by time etc. We consider data structures for two classic statistics functions, namely median and mode. Finally, Part III investigates fault tolerant algorithms and data structures. This deals with the trend of avoiding elaborate error checking and correction circuitry that would impose non-negligible costs in terms of hardware performance and money in the design of today's high speed memory technologies. 
Hardware, power failures, and environmental conditions such as cosmic rays and alpha particles can all alter the memory in unpredictable ways. In applications where large memory capacities are needed at low cost, it makes sense to assume that the algorithms themselves are in charge of dealing with memory faults. We investigate searching, sorting and counting algorithms and data structures that provably return sensible information in spite of memory corruptions.
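The Part I problem described above (maximize the sum of a contiguous subsequence subject to bounds on its length) has a well-known linear-time solution using prefix sums and a sliding-window minimum. The sketch below shows that textbook technique; it is not taken from the dissertation, and the function name and return convention are assumptions.

```python
from collections import deque
from itertools import accumulate


def max_sum_segment(values, min_len, max_len):
    """Best (sum, start, end) over segments values[start:end] whose length
    lies between min_len and max_len, found in O(n) time."""
    n = len(values)
    if not 1 <= min_len <= max_len or min_len > n:
        raise ValueError("length bounds do not fit the input")
    prefix = [0] + list(accumulate(values))
    window = deque()  # candidate start indices, prefix sums strictly increasing
    best = None
    for end in range(min_len, n + 1):
        newest = end - min_len
        # Keep only starts that could still give the minimum prefix sum.
        while window and prefix[window[-1]] >= prefix[newest]:
            window.pop()
        window.append(newest)
        # Drop starts that would make the segment longer than max_len.
        while window[0] < end - max_len:
            window.popleft()
        total = prefix[end] - prefix[window[0]]
        if best is None or total > best[0]:
            best = (total, window[0], end)
    return best


print(max_sum_segment([2, -5, 4, 3, -1, 6, -8, 2], min_len=2, max_len=3))
```

For the sample input the best segment is `[3, -1, 6]` with sum 8. Each index enters and leaves the deque at most once, which is what gives the linear running time.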

Jørgensen, Allan Grønlund

2010-01-01

111

The sequence, structure and evolutionary features of HOTAIR in mammals  

Directory of Open Access Journals (Sweden)

Abstract Background An increasing number of long noncoding RNAs (lncRNAs) have been identified recently. Different from all the others that function in cis to regulate local gene expression, the newly identified HOTAIR is located between HoxC11 and HoxC12 in the human genome and regulates HoxD expression in multiple tissues. Like the well-characterised lncRNA Xist, HOTAIR binds to polycomb proteins to methylate histones at multiple HoxD loci, but unlike Xist, many details of its structure and function, as well as the trans regulation, remain unclear. Moreover, HOTAIR is involved in the aberrant regulation of gene expression in cancer. Results To identify conserved domains in HOTAIR and study the phylogenetic distribution of this lncRNA, we searched the genomes of 10 mammalian and 3 non-mammalian vertebrates for matches to its 6 exons and the two conserved domains within the 1800 bp exon6 using Infernal. There was just one high-scoring hit for each mammal, but many low-scoring hits were found in both mammals and non-mammalian vertebrates. These hits and their flanking genes in four placental mammals and platypus were examined to determine whether HOTAIR contained elements shared by other lncRNAs. Several of the hits were within unknown transcripts or ncRNAs, many were within introns of, or antisense to, protein-coding genes, and conservation of the flanking genes was observed only between human and chimpanzee. Phylogenetic analysis revealed discrete evolutionary dynamics for orthologous sequences of HOTAIR exons. Exon1 at the 5' end and a domain in exon6 near the 3' end, which contain domains that bind to multiple proteins, have evolved faster in primates than in other mammals. Structures were predicted for exon1, two domains of exon6 and the full HOTAIR sequence. 
The sequence and structure of two fragments, in exon1 and the domain B of exon6 respectively, were identified to robustly occur in predicted structures of exon1, domain B of exon6 and the full HOTAIR in mammals. Conclusions HOTAIR exists in mammals, has poorly conserved sequences and considerably conserved structures, and has evolved faster than nearby HoxC genes. Exons of HOTAIR show distinct evolutionary features, and a 239 bp domain in the 1804 bp exon6 is especially conserved. These features, together with the absence of some exons and sequences in mouse, rat and kangaroo, suggest ab initio generation of HOTAIR in marsupials. Structure prediction identifies two fragments in the 5' end exon1 and the 3' end domain B of exon6, with sequence and structure invariably occurring in various predicted structures of exon1, the domain B of exon6 and the full HOTAIR.

Zhu Hao

2011-04-01

112

Structural Approaches to Sequence Evolution Molecules, Networks, Populations  

CERN Document Server

Structural requirements constrain the evolution of biological entities at all levels, from macromolecules to their networks, right up to populations of biological organisms. Classical models of molecular evolution, however, are focused at the level of the symbols - the biological sequence - rather than that of their resulting structure. Now recent advances in understanding the thermodynamics of macromolecules, the topological properties of gene networks, the organization and mutation capabilities of genomes, and the structure of populations make it possible to incorporate these key elements into a broader and deeply interdisciplinary view of molecular evolution. This book gives an account of such a new approach, through clear tutorial contributions by leading scientists specializing in the different fields involved.

Bastolla, Ugo; Roman, H. Eduardo; Vendruscolo, Michele

2007-01-01

113

Topological characterization of crystalline ice structures from coordination sequences  

CERN Document Server

Topological properties of crystalline ice structures are studied by considering ring statistics, coordination sequences, and topological density of different ice phases. The coordination sequences (number of sites at topological distance k from a reference site) have been obtained by direct enumeration until at least 40 coordination spheres for different ice polymorphs. This allows us to study the asymptotic behavior of the mean number of sites in the k-th shell, M_k, for high values of k: M_k ~ a k^2, a being a structure-dependent parameter. Small departures from a strict parabolic dependence have been studied by considering first and second differences of the series {M_k} for each structure. The parameter a ranges from 2.00 for ice VI to 4.27 for ice XII, and is used to define a topological density for these solid phases of water. Correlations between such topological density and the actual volume of ice phases are discussed. Ices Ih and Ic are found to depart from the general trend in this correlation due ...
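A coordination sequence of the kind described here can be obtained by breadth-first search on the bond network. The sketch below is an illustration under stated assumptions, not the authors' code: it uses the diamond net (the topology of the oxygen network in cubic ice Ic) on integer coordinates, and the coordinate scheme and function names are choices made for this example.

```python
from collections import deque

# Bond vectors from a site on sublattice A (even coordinates) to its four
# neighbours on sublattice B (odd coordinates); B sites use the negatives.
A_BONDS = [(1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1)]


def neighbours(site):
    """Four bonded neighbours of a site in the diamond net."""
    x, y, z = site
    sign = 1 if x % 2 == 0 else -1
    return [(x + sign * dx, y + sign * dy, z + sign * dz) for dx, dy, dz in A_BONDS]


def coordination_sequence(shells):
    """Number of sites at topological distance 0..shells from the origin."""
    origin = (0, 0, 0)
    distance = {origin: 0}
    counts = [0] * (shells + 1)
    counts[0] = 1
    queue = deque([origin])
    while queue:
        site = queue.popleft()
        depth = distance[site]
        if depth == shells:
            continue  # do not expand beyond the requested shell
        for nb in neighbours(site):
            if nb not in distance:
                distance[nb] = depth + 1
                counts[depth + 1] += 1
                queue.append(nb)
    return counts


sequence = coordination_sequence(10)
print(sequence)
print([round(m / k**2, 3) for k, m in enumerate(sequence) if k > 0])
```

The second printed list shows the ratio of shell size to k squared, which is the quantity whose large-k limit the abstract calls the structure-dependent parameter a. Other ice phases would need their own neighbour tables.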

Herrero, Carlos P

2013-01-01

114

Modulation of DNA radiolysis by sequence, structure and ligands  

International Nuclear Information System (INIS)

DNA structure, topology and interactions with ligands are continuously changing during the cell cycle, through phenomena such as replication or transcription. Until recently, it was considered that ionizing radiation breaks DNA strands with the same probability at all nucleotide sites. Using restriction fragments and synthetic oligonucleotides, we have shown that DNA is heterogeneously radiosensitive. The breakage probability at a given nucleotide site depends on nucleotide type and sequence, presence of ligands (metal ions, proteins, polyamines) and structural parameters (strandedness, handedness, minor groove depth, topological stress). We have observed that in 'naked' (without ligands) DNA, the bent 5'-AATT regions that present a narrow minor groove are less sensitive than 'random DNA'. In a (GC)n sequence, all G sites are more radiosensitive, and all C sites are more radioresistant, in the left-handed Z-DNA induced by negative supercoiling than in the right-handed B-DNA. Some G's located in particular regions of single-stranded DNA are more radiosensitive than in double-stranded DNA. We have also shown that several natural ligands, such as Cu2+, polyamines or DNA-binding proteins, modify DNA radiosensitivity directly or via the structural modifications that they induce in DNA. (authors)

115

Syntheses, crystal structures, and characterization of three 1D, 2D and 3D complexes based on mixed multidentate N- and O-donor ligands  

Science.gov (United States)

Three new 1D to 3D complexes, namely, {[Ni(btec)(Himb)2(H2O)2]·6H2O}n (1), {[Cd(btec)0.5(imb)(H2O)]·1.5H2O}n (2), and {[Zn(btec)0.5(imb)]·H2O}n (3) (H4btec=1,2,4,5-benzenetetracarboxylic acid, imb=2-(1H-imidazol-1-methyl)-1H-benzimidazole) have been synthesized by adjusting the central metal ions. Single-crystal X-ray diffraction analyses reveal that complex 1 possesses a 1D chain structure which is further extended into the 3D supramolecular architecture via hydrogen bonds. Complex 2 features a 2D network with Schläfli symbol (5^3·6^2·7)(5^2·6^4). Complex 3 presents a 3D framework with a point symbol of (4·6^4·8)(4^2·6^2·8^2). Moreover, their IR spectra, PXRD patterns, thermogravimetric curves, and luminescent emissions were studied at room temperature.

Yang, Huai-Xia; Liang, Zhen; Hao, Bao-Lian; Meng, Xiang-Ru

2014-10-01

116

DNA Sequence-Directed Organization of Chromatin: Structure-Based Computational Analysis of Nucleosome-Binding Sequences  

Science.gov (United States)

The folding of DNA on the nucleosome core particle governs many fundamental issues in eukaryotic molecular biology. In this study, an updated set of sequence-dependent empirical “energy” functions, derived from the structures of other protein-bound DNA molecules, is used to investigate the extent to which the architecture of nucleosomal DNA is dictated by its underlying sequence. The potentials are used to estimate the cost of deforming a collection of sequences known to bind or resist uptake in nucleosomes along various left-handed superhelical pathways and to deduce the features of sequence contributing to a particular structural form. The deformation scores reflect the choice of template, the deviations of structural parameters at each step of the nucleosome-bound DNA from their intrinsic values, and the sequence-dependent “deformability” of a given dimer. The correspondence between the computed scores and binding propensities points to a subtle interplay between DNA sequence and nucleosomal folding, e.g., sequences with periodically spaced pyrimidine-purine steps deform at low cost along a kinked template whereas sequences that resist deformation prefer a smoother spatial pathway. Successful prediction of the known settings of some of the best-resolved nucleosome-positioning sequences, however, requires a template with “kink-and-slide” steps like those found in high-resolution nucleosome structures. PMID:19289051
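The scoring idea in this abstract (penalizing deviations of each dimer step from its intrinsic values, weighted by a sequence-dependent deformability) can be sketched with a harmonic cost summed over steps. The abstract does not give the functional form, so the quadratic expression below is an assumption, and the two-parameter description (roll, slide), the stiffness values, and the rest states are hypothetical toy numbers rather than published parameters.

```python
def step_energy(observed, rest, stiffness):
    """Harmonic cost 0.5 * d^T F d for one dimer step, where d is the
    deviation of the observed step parameters from their rest values."""
    delta = [o - r for o, r in zip(observed, rest)]
    size = len(delta)
    return 0.5 * sum(
        stiffness[i][j] * delta[i] * delta[j]
        for i in range(size)
        for j in range(size)
    )


def deformation_score(sequence, template, params):
    """Total cost of threading a sequence onto a template that lists one
    (roll, slide) conformation per dimer step."""
    dimers = [sequence[i:i + 2] for i in range(len(sequence) - 1)]
    if len(dimers) != len(template):
        raise ValueError("template needs one conformation per dimer step")
    return sum(
        step_energy(conf, params[d]["rest"], params[d]["stiffness"])
        for d, conf in zip(dimers, template)
    )


# Hypothetical toy parameters: (roll in degrees, slide in angstroms).
TOY_PARAMS = {
    "CA": {"rest": (5.0, 0.0), "stiffness": ((0.02, 0.0), (0.0, 1.0))},  # flexible step
    "AA": {"rest": (1.0, 0.0), "stiffness": ((0.08, 0.0), (0.0, 4.0))},  # stiffer step
}

kinked_template = [(15.0, 1.0), (1.0, 0.0)]  # first step kinked and slid
print(deformation_score("CAA", kinked_template, TOY_PARAMS))
print(deformation_score("AAA", kinked_template, TOY_PARAMS))
```

With these toy numbers the sequence that places a flexible pyrimidine-purine step at the kinked position scores lower than the one that places a stiff step there, which is the qualitative behaviour the abstract describes.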

Balasubramanian, Sreekala; Xu, Fei; Olson, Wilma K.

2009-01-01

117

Crystal Structure of Human Liver Δ4-3-Ketosteroid 5β-Reductase (AKR1D1) and Implications for Substrate Binding and Catalysis  

Energy Technology Data Exchange (ETDEWEB)

AKR1D1 (steroid 5β-reductase) reduces all Δ4-3-ketosteroids to form 5β-dihydrosteroids, a first step in the clearance of steroid hormones and an essential step in the synthesis of all bile acids. The reduction of the carbon-carbon double bond in an α,β-unsaturated ketone by 5β-reductase is a unique reaction in steroid enzymology because hydride transfer from NADPH to the β-face of a Δ4-3-ketosteroid yields a cis-A/B-ring configuration with an approximately 90° bend in steroid structure. Here, we report the first x-ray crystal structure of a mammalian steroid hormone carbon-carbon double bond reductase, human Δ4-3-ketosteroid 5β-reductase (AKR1D1), and its complexes with intact substrates. We have determined the structures of AKR1D1 complexes with NADP+ at 1.79- and 1.35-Å resolution (HEPES bound in the active site), NADP+ and cortisone at 1.90-Å resolution, NADP+ and progesterone at 2.03-Å resolution, and NADP+ and testosterone at 1.62-Å resolution. Complexes with cortisone and progesterone reveal productive substrate binding orientations based on the proximity of each steroid carbon-carbon double bond to the re-face of the nicotinamide ring of NADP+. This orientation would permit 4-pro-(R)-hydride transfer from NADPH. Each steroid carbonyl accepts hydrogen bonds from catalytic residues Tyr58 and Glu120. The Y58F and E120A mutants are devoid of activity, supporting a role for this dyad in the catalytic mechanism. Intriguingly, testosterone binds nonproductively, thereby rationalizing the substrate inhibition observed with this particular steroid. The locations of disease-linked mutations thought to be responsible for bile acid deficiency are also revealed.

Di Costanzo, Luigi; Drury, Jason E.; Penning, Trevor M.; Christianson, David W. (UPENN); (UPENN-MED)

2008-07-15

118

Moments of the spin structure functions g1p and g1d for 0.0522  

International Nuclear Information System (INIS)

The spin structure functions g1 for the proton and the deuteron have been measured over a wide kinematic range in x and Q2 using 1.6 and 5.7 GeV longitudinally polarized electrons incident upon polarized NH3 and ND3 targets at Jefferson Lab. Scattered electrons were detected in the CEBAF Large Acceptance Spectrometer, for 0.0522 and W1 for the proton and deuteron are presented - both have a negative slope at low Q2, as predicted by the extended Gerasimov-Drell-Hearn sum rule. The first extraction of the generalized forward spin polarizability of the proton γ0p is also reported. This quantity shows strong Q2 dependence at low Q2. Our analysis of the Q2 evolution of the first moment of g1 shows agreement in leading order with Heavy Baryon Chiral Perturbation Theory. However, a significant discrepancy is observed between the γ0p data and Chiral Perturbation calculations for γ0p, even at the lowest Q2.

119

Measurements of Spin Structure Function G1(P) and G1(D) for Proton and Deuteron at SLAC E143  

International Nuclear Information System (INIS)

E143 was a high precision measurement of the proton and deuteron spin structure functions g1 and g2 in SLAC's End Station A facility, with longitudinally and transversely polarized NH3 and ND3 targets, and a longitudinally polarized electron beam. The experiment was done at beam energies of 29, 16 and 9.7 GeV. The deeply inelastic scattered electrons were detected by two independent spectrometers at 4.5° and 7° relative to the incident electron beam. At a beam energy of 29 GeV, the measurements covered the Bjorken x range from 0.03 to 0.8, and the Q² range from 1.2 (GeV/c)² to 9.8 (GeV/c)². It was found that the integral ∫₀¹ g1p(x, Q²) dx is more than two standard deviations away from the Ellis-Jaffe sum rule, and the corresponding deuteron integral is more than three standard deviations away from the Ellis-Jaffe sum rule, but the Bjorken sum rule is consistent with the experimental data. Tests of the sum rules at different values of Q², and the implications of these results for the quark-parton model, have also been carried out.

120

Reversible switching of electronic ground state in a pentacoordinated Cu(II) 1D cationic polymer and structural diversity.  

Science.gov (United States)

Two copper(II) polymeric complexes {[Cu(HPymat)(MeOH)](NO3)}n (1) and {[Cu4(Pymab)4(H2O)4](NO3)4} (2) were synthesized with the carboxylate-containing Schiff-base ligands HPymat(-) and Pymab(-) [H2Pymat = (E)-2-(1-(pyridin-2-yl)methyleneamino)terephthalic acid, HPymab = (E)-2-((pyridine-2-yl)methyleneamino)benzoic acid]. Complex 1 is a one-dimensional Cu(II) cationic polymeric complex containing free protonated carboxylic groups and nitrate anions as counterions. Complex 2 is a zero-dimensional tetranuclear cationic Cu(II) complex containing nitrate anions as counterions. Complex 1 shows rhombic electron paramagnetic resonance (EPR) spectra in the solid state at room temperature (RT) and 77 K and tetragonal EPR spectra in dimethyl sulfoxide (DMSO) and dimethylformamide (DMF) and "inverse" EPR spectrum in CH3CN. Complex 2 shows rhombic EPR spectra in the solid state at RT and 77 K. But complex 2 shows tetragonal spectra in DMSO, DMF, and CH3CN. Thermogravimetric analysis was also performed for both complexes 1 and 2. Mean-square displacement amplitude analysis was carried out to detect librational disorder along the metal-ligand bonds in crystal structures. PMID:24911032

Sasmal, Ashok; Garribba, Eugenio; Rizzoli, Corrado; Mitra, Samiran

2014-07-01

121

DNA Sequence-Directed Organization of Chromatin: Structure-Based Computational Analysis of Nucleosome-Binding Sequences  

OpenAIRE

The folding of DNA on the nucleosome core particle governs many fundamental issues in eukaryotic molecular biology. In this study, an updated set of sequence-dependent empirical “energy” functions, derived from the structures of other protein-bound DNA molecules, is used to investigate the extent to which the architecture of nucleosomal DNA is dictated by its underlying sequence. The potentials are used to estimate the cost of deforming a collection of sequences known to bind or resist up...

Balasubramanian, Sreekala; Xu, Fei; Olson, Wilma K.

2009-01-01

122

Lanthanide complexes of the monovacant Dawson polyoxotungstate [α2-As2W17O61]10- with 1D chain: Synthesis, structures, and photoluminescence properties  

International Nuclear Information System (INIS)

Six new lanthanide complexes, (H3O)[Ln3(H2O)17(α2-As2W17O61)].nH2O ((1) Ln=CeIII and n≈13; (2) Ln=PrIII and n≈9; (3) Ln=NdIII and n≈14; (4) Ln=SmIII and n≈8; (5) Ln=EuIII and n≈4; (6) Ln=GdIII and n≈7), have been isolated by a conventional solution method and characterized by elemental analysis, IR spectroscopy and single crystal X-ray diffraction. All the complexes are isomorphic and crystallize in the triclinic space group P-1. These complexes are 1D chain-like structures constructed by lanthanide cations and monovacant Dawson-type [α2-As2W17O61]10- polyoxoanions. The striking feature of the structures is that there are three kinds of coordination environments for lanthanide cations, which are responsible for the formation of polymeric structures. Photoluminescence measurements reveal that 4 and 5 exhibit orange and red fluorescent emission at room temperature, respectively. - Graphical abstract: Six new lanthanide complexes based on monovacant Dawson-type tungstoarsenates have been synthesized. These complexes are one-dimensional chain-like structures constructed by lanthanide cations and [α2-As2W17O61]10- anions. There are three kinds of coordination environment for lanthanide cations. Photoluminescence measurement reveals that 4 and 5 exhibit orange and red fluorescent emission at room temperature, respectively.

123

Isolation of a metastable intermediate in a heterometallic Cu(II)-Hg(II) 1D polymeric chain: synthesis, crystal structure, and photophysical properties.  

Science.gov (United States)

A metastable heterometallic intermediate, [Cu2(bpy)2(DIPSA)2Hg2(OAc)4(DIPSA)2]n (1, where OAc = CH3COO(-), bpy = bipyridine, and DIPSA = diisopropylsalicylic acid), has been isolated and characterized during the synthesis of 1D polymer [Cu2(bpy)2(DIPSA)2(CH3CN)2Hg2(OAc)2(DIPSA)4]n (2) at ambient temperature in acetonitrile. Moreover, recrystallization of 2 in methanol results in monomeric [Cu(DIPSA)(bpy)(CH3OH)]·CH3OH (3). Complexes 1-3 have been characterized by elemental analysis, Fourier transform infrared, and UV-vis spectroscopy as well as by their single-crystal X-ray structures. The photophysical study suggests the quenching of fluorescence of DIPSA upon complexation. PMID:25615821

Mobin, Shaikh M; Mishra, Veenu; Chaudhary, Archana

2015-02-16

124

Age-structured Trait Substitution Sequence Process and Canonical Equation  

CERN Document Server

We are interested in a stochastic model of trait and age-structured population undergoing mutation and selection. We start with a continuous time, discrete individual-centered population process. Taking the large population and rare mutations limits under a well-chosen time-scale separation condition, we obtain a jump process that generalizes the Trait Substitution Sequence process describing Adaptive Dynamics for populations without age structure. Under the additional assumption of small mutations, we derive an age-dependent ordinary differential equation that extends the Canonical Equation. To our knowledge, these evolutionary approximations have never been introduced before. They are based on ecological phenomena represented by PDEs that generalize the Gurtin-MacCamy equation in Demography. Another particularity is that they involve a fitness function, describing the probability of invasion of the resident population by the mutant one, that cannot always be computed explicitly. Examples illustrate how adding an ag...

Méléard, Sylvie

2007-01-01

125

Combining multiple structure and sequence alignments to improve sequence detection and alignment: Application to the SH2 domains of Janus kinases  

OpenAIRE

In this paper, an approach is described that combines multiple structure alignments and multiple sequence alignments to generate sequence profiles for protein families. First, multiple sequence alignments are generated from sequences that are closely related to each sequence of known three-dimensional structure. These alignments then are merged through a multiple structure alignment of family members of known structure. The merged alignment is used to generate a Hi...

Al-lazikani, Bissan; Sheinerman, Felix B.; Honig, Barry

2001-01-01

126

Sequence and structural selectivity of nucleic acid binding ligands.  

Science.gov (United States)

The sequence and structural selectivity of 15 different DNA binding agents was explored using a novel, thermodynamically rigorous, competition dialysis procedure. In the competition dialysis method, 13 different nucleic acid structures were dialyzed against a common ligand solution. More ligand accumulated in the dialysis tube containing the structural form with the highest ligand binding affinity. DNA structural forms included in the assay ranged from single-stranded forms, through a variety of duplex forms, to multistranded triplex and tetraplex forms. Left-handed Z-DNA, RNA, and a DNA-RNA hybrid were also represented. Standard intercalators (ethidium, daunorubicin, and actinomycin D) served as control compounds and were found to show structural binding preferences fully consistent with their previously published behavior. Standard groove binding agents (DAPI, distamycin, and netropsin) showed a strong preference for AT-rich duplex DNA forms, along with apparently strong binding to the poly(dA)-[poly(dT)](2) triplex. Thermal denaturation studies revealed the apparent triplex binding to be complex, and perhaps to result from displacement of the third strand. Putative triplex (BePI, coralyne, and berberine) and tetraplex [H(2)TmPyP, 5,10,15, 20-tetrakis[4-(trimethylammonio)phenyl]-21H,23H-porphine, and N-methyl mesoporphyrin IX] selective agents showed in many cases less dramatic binding selectivity than anticipated from published reports that compared their binding to only a few structural forms. Coralyne was found to bind strongly to single-stranded poly(dA), a novel and previously unreported interaction. Finally, three compounds (berenil, chromomycin A, and pyrenemethylamine) whose structural preferences are largely unknown were examined. Pyrenemethylamine exhibited an unexpected and unprecedented preference for duplex poly(dAdT). PMID:10587429

Ren, J; Chaires, J B

1999-12-01

127

Two novel 1-D helical chains Zn(II)/Cd(II) polymers based on tetrazolate-1-acetic acid: Crystal structures, solid state fluorescence and thermal behaviors  

Science.gov (United States)

Two new d10 metal complexes with tetrazolate-1-acetic acid, [Zn(1-tza)(Cl)(H2O)] (1) and [Cd(1-tza)(phen)(NO3)] (2) (1-Htza = tetrazole-1-acetic acid, phen = 1,10-phenanthroline), have been prepared, and their structures have been characterized by single-crystal X-ray diffraction. The flexibility of the 1-tza ligands results in 1-D helical chain structures for both complexes, in which the 1-tza ligands adopt different coordination modes: 1 with μ2-kO1:kN4 and 2 with μ2-kO1,O2:kN3. Compound 1 exhibits a nonracemic enantiopure topology, while compound 2 proves to be a mesomeric structure. The crystal packing in 1 and 2 is controlled mainly by hydrogen bonds and face-to-face π-π stacking interactions, respectively. Photoluminescence studies show that 1 and 2 exhibit strong luminescence. Moreover, compound 1 exhibits a second-order nonlinear optical coefficient equal to that of potassium dihydrogen phosphate (KDP). The thermal stability of the two complexes has also been investigated.

Lu, Ying-Bing; Jin, Shuang; Jian, Fang-Mei; Xie, Yong-Rong; Luo, Guo-Tian

2014-03-01

128

Mononuclear, dinuclear and 1-D polymeric complexes of Cd(II) of a pyridyl pyrazole ligand: Syntheses, crystal structures and photoluminescence studies  

Science.gov (United States)

The syntheses, crystal structures and photoluminescence properties of four new Cd(II) complexes are reported, prepared using the strongly coordinating ligand 3,5-dimethyl-1-(2'-pyridyl) pyrazole (L) in the presence of anionic ancillary bridging ligands such as nitrite, chloride and dicyanamide. Two of the complexes (1 and 2) are monomeric, 3 is a μ2-chloro bridged dimer and the last one (4) is a mixed alternate chloro and end-to-end (EE) dicyanamide bridged 1D polymer. All four complexes have been characterized by X-ray crystallography. The ligand L behaves as a potent bidentate neutral N, N donor. The geometrical diversity of Cd(II) complexes arises because crystal field stability is neither lost nor gained as the geometry varies. Consequently the stability of a structure depends on steric requirements. The ligand L shows considerable fluorescence and all four complexes in methanol exhibit interesting photoluminescence properties with different emission intensities. The band maxima and fluorescence efficiency (in methanol) are found to be dependent on the coordination chromophore and structural rigidity induced by the incorporated Cd(II) ion. Among the synthesized complexes 1 exhibits the highest fluorescence intensity in methanol.

Das, Kinsuk; Konar, Saugata; Jana, Atanu; Barik, Anil Kumar; Roy, Sangita; Kar, Susanta Kumar

2013-03-01

129

Combined sequence and sequence-structure-based methods for analyzing RAAS gene SNPs: a computational approach.  

Science.gov (United States)

The renin-angiotensin-aldosterone system (RAAS) plays a key role in the regulation of blood pressure (BP). Mutations in the genes that encode components of the RAAS play a significant role in genetic susceptibility to hypertension and have been intensively scrutinized. The identification of such probably causal mutations not only provides insight into the RAAS but may also yield antihypertensive therapeutic targets and diagnostic markers. Analyzing a huge dataset containing both functional and neutral SNPs is challenging when every SNP must be tested experimentally to determine its biological significance. To explore the functional significance of genetic mutations (SNPs), we adopted a combined sequence and sequence-structure-based SNP analysis algorithm. Out of 3864 SNPs reported in dbSNP, we found 108 missense SNPs in the coding region and the remainder in the non-coding region. In this study, we report a coding-region SNP as deleterious only when three or more tools predict it to be deleterious and it shows a high RMSD from the native structure. Based on these analyses, we identified as deleterious two SNPs of the REN gene, eight SNPs of the AGT gene, three SNPs of the ACE gene, two SNPs of the AT1R gene, three SNPs of the CYP11B2 gene and three SNPs of the CMA1 gene in the coding region. This type of study will help reduce the cost and time of identifying potential SNPs and help select candidate SNPs from the SNP pool for experimental study. PMID:24878201
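The selection rule in this abstract (at least three tools agree, and the mutant model deviates strongly from the native structure) can be sketched as a small filter. This is only an illustration: the abstract does not name the tools or give an RMSD threshold, so the tool labels, the `rmsd_cutoff` default of 2.0 Å and the `SnpCall` record are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class SnpCall:
    """One missense SNP with per-tool predictions (hypothetical record layout)."""
    snp_id: str
    gene: str
    tool_calls: dict   # tool name -> True if that tool predicts "deleterious"
    rmsd: float        # RMSD of the mutant model from the native structure, in angstroms


def is_deleterious(call, min_tools=3, rmsd_cutoff=2.0):
    """Consensus rule: enough tools agree AND the structural deviation is high.

    rmsd_cutoff is an assumed placeholder; the abstract only says "high RMSD".
    """
    votes = sum(1 for flagged in call.tool_calls.values() if flagged)
    return votes >= min_tools and call.rmsd >= rmsd_cutoff


def count_by_gene(calls, min_tools=3, rmsd_cutoff=2.0):
    """Count the SNPs passing the consensus rule, grouped by gene."""
    counts = {}
    for call in calls:
        if is_deleterious(call, min_tools, rmsd_cutoff):
            counts[call.gene] = counts.get(call.gene, 0) + 1
    return counts
```

Requiring both conditions keeps a SNP out of the deleterious set when the sequence-based tools agree but the modelled structure barely changes, and vice versa.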

Singh, Kh Dhanachandra; Karthikeyan, Muthusamy

2014-12-01

130

Isolation, sequencing, and structure-activity relationships of cyclotides.  

Science.gov (United States)

Cyclotides are a topologically fascinating family of miniproteins discovered over the past decade that have expanded the diversity of plant-derived natural products. They are approximately 30 amino acids in size and occur in plants of the Violaceae, Rubiaceae, and Cucurbitaceae families. Despite their proteinaceous composition, cyclotides behave in much the same way as many nonpeptidic natural products in that they are resistant to degradation by enzymes or heat and can be extracted from plants using methanol. Their stability arises, in large part, due to their characteristic cyclic cystine knot (CCK) structural motif. Cystine knots are present in a variety of proteins of insect, plant, and animal origin, comprising a ring formed by two disulfide bonds and their connecting backbone segments that is threaded by a third disulfide bond. In cyclotides, the cystine knot is uniquely embedded within a head-to-tail cyclized peptide backbone, leading to the ultrastable CCK structural motif. Apart from the six absolutely conserved cysteine residues, the majority of amino acids in the six backbone loops of cyclotides are tolerant to variation. It has been predicted that the family might include up to 50,000 members; although, so far, sequences for only 140 have been reported. Cyclotides exhibit a variety of biological activities, including insecticidal, nematocidal, molluscicidal, antimicrobial, antibarnacle, anti-HIV, and antitumor activities. Due to their diverse activities and common structural core from which variable loops protrude, cyclotides can be thought of as combinatorial peptide templates capable of displaying a variety of amino acid sequences. They have thus attracted interest in drug design as well as in crop protection applications. PMID:20718473

Ireland, David C; Clark, Richard J; Daly, Norelle L; Craik, David J

2010-09-24

131

Structure and sequence variation of the canine perforin gene.  

Science.gov (United States)

Lymphocyte-mediated cytotoxicity is essential to control viral infections, limit lymphocyte expansion and activation, and survey for malignant cells. Humans with defects in lymphocyte cytotoxicity have reduced perforin function resulting in uncontrolled lymphocyte expansion, leading to excessive histiocyte activation and a hemophagocytic disorder. Dog breeds such as Bernese mountain dogs (BMD) have a high incidence of reactive and malignant diseases affecting histiocytes. This study addressed the hypothesis that changes in the perforin gene contribute to the development of hemophagocytic histiocytic sarcoma (HHS) in BMD. Canine perforin DNA was amplified and sequenced through multiple PCR assays from healthy and diseased dogs, and the gene structure determined by rapid amplification of cDNA ends. The coding component of the gene consists of 1679bp, with two exons of 536bp and 1143bp separated by an intron of 865bp. Gene configuration and location differ from that in other species although the coding sequence is highly conserved. Three silent single nucleotide polymorphisms (SNP) were identified. Analysis of their distribution indicated a consistent genotype among 6 middle-aged to older BMD without histiocytic diseases. Among samples from 10 dogs with HHS and 10 without histiocytic diseases SNP occurred with variable frequency. It was concluded that changes in the amino acid sequence of perforin were not associated with HHS but that a constellation of SNP may characterize BMD without histiocytic disease. Investigation of more dogs is required to confirm a specific genotype. Future studies should focus on the potential contribution of reduced perforin expression and/or function to HHS in dogs. PMID:19740553

Neta, M; Wen, X; Moore, P F; Bienzle, D

2010-02-15

132

Rate of steroid double-bond reduction catalysed by the human steroid 5β-reductase (AKR1D1) is sensitive to steroid structure: implications for steroid metabolism and bile acid synthesis.

Science.gov (United States)

Human AKR1D1 (steroid 5β-reductase/aldo-keto reductase 1D1) catalyses the stereospecific reduction of double bonds in Δ4-3-oxosteroids, a unique reaction that introduces a 90° bend at the A/B ring fusion to yield 5β-dihydrosteroids. AKR1D1 is the only enzyme capable of steroid 5β-reduction in humans and plays critical physiological roles. In steroid hormone metabolism, AKR1D1 serves mainly to inactivate the major classes of steroid hormones. AKR1D1 also catalyses key steps of the biosynthetic pathway of bile acids, which regulate lipid emulsification and cholesterol homoeostasis. Interestingly, AKR1D1 displayed a 20-fold variation in the kcat values, with steroid hormone substrates (e.g. aldosterone, testosterone and cortisone) having significantly higher kcat values than steroids with longer side chains (e.g. 7α-hydroxycholestenone, a bile acid precursor). Transient kinetic analysis revealed striking variations up to two orders of magnitude in the rate of the chemistry step (kchem), which resulted in different rate determining steps for the fast and slow substrates. By contrast, similar Kd values were observed for representative fast and slow substrates, suggesting similar rates of release for different steroid products. The release of NADP+ was shown to control the overall turnover for fast substrates, but not for slow substrates. Despite having high kchem values with steroid hormones, the kinetic control of AKR1D1 is consistent with the enzyme catalysing the slowest step in the catabolic sequence of steroid hormone transformation in the liver. The inherent slowness of the conversion of the bile acid precursor by AKR1D1 is also indicative of a regulatory role in bile acid synthesis. PMID:24894951

Jin, Yi; Chen, Mo; Penning, Trevor M

2014-08-15

133

Chaining sequence/structure seeds for computing RNA similarity.  

Science.gov (United States)

We describe a new method to compare a query RNA with a static set of target RNAs. Our method is based on (i) a static indexing of the sequence/structure seeds of the target RNAs; (ii) searching the target RNAs by detecting seeds of the query present in the target, chaining these seeds in promising candidate homologs; and then (iii) completing the alignment using an anchor-based exact alignment algorithm. We apply our method to the benchmark Bralibase2.1 and compare its accuracy and efficiency with the exact method LocARNA and its recent seeds-based speed-up ExpLoc-P. Our pipeline RNA-unchained greatly improves on the computation time of LocARNA and is comparable to that of ExpLoc-P, while improving the overall accuracy of the final alignments. PMID:25768236
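Steps (i) and (ii) of this abstract follow the general index, seed and chain pattern, which can be illustrated with a toy sketch. It is not the RNA-unchained implementation: it uses plain sequence k-mers where the paper uses sequence/structure seeds, a simple quadratic chaining step, and hypothetical function names throughout.

```python
from collections import defaultdict


def build_index(targets, k):
    """Step (i): index every k-mer of every target as (target name, position)."""
    index = defaultdict(list)
    for name, seq in targets.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].append((name, i))
    return index


def find_seeds(query, index, k):
    """Step (ii), part 1: collect (query_pos, target_pos) seed hits per target."""
    hits = defaultdict(list)
    for q in range(len(query) - k + 1):
        for name, t in index.get(query[q:q + k], ()):
            hits[name].append((q, t))
    return hits


def best_chain(seeds, k):
    """Step (ii), part 2: longest chain of non-overlapping, collinear seeds.

    Quadratic dynamic programming; a seed may follow another only if it
    starts at least k positions later in both the query and the target.
    """
    seeds = sorted(seeds)
    if not seeds:
        return []
    score = [1] * len(seeds)
    prev = [-1] * len(seeds)
    for j, (qj, tj) in enumerate(seeds):
        for i in range(j):
            qi, ti = seeds[i]
            if qi + k <= qj and ti + k <= tj and score[i] + 1 > score[j]:
                score[j] = score[i] + 1
                prev[j] = i
    j = max(range(len(seeds)), key=score.__getitem__)
    chain = []
    while j != -1:
        chain.append(seeds[j])
        j = prev[j]
    return chain[::-1]
```

The chained seeds would then serve as anchors for step (iii), restricting the exact alignment to the regions between consecutive anchors.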

Bourgeade, Laetitia; Chauve, Cédric; Allali, Julien

2015-03-01

134

Identification of similar regions of protein structures using integrated sequence and structure analysis tools  

Directory of Open Access Journals (Sweden)

Full Text Available Abstract Background Understanding protein function from its structure is a challenging problem. Sequence based approaches for finding homology have broad use for annotation of both structure and function. 3D structural information on protein domains and their interactions provides a view of structure-function relationships that is complementary to sequence information. We have developed a web site http://www.sblest.org/ and an API of web services that enables users to submit protein structures and identify statistically significant neighbors and the underlying structural environments that produce that match, using a suite of sequence and structure analysis tools. To do this, we have integrated S-BLEST, PSI-BLAST and HMMer based superfamily predictions to give a unique integrated view of the prediction of SCOP superfamilies, EC number, and GO term, as well as identification of the protein structural environments that are associated with that prediction. Additionally, we have extended UCSF Chimera and PyMOL to support our web services, so that users can characterize their own proteins of interest. Results Users are able to submit their own queries or use a structure already in the PDB. Currently the databases that a user can query include the popular structural datasets ASTRAL 40 v1.69, ASTRAL 95 v1.69, CLUSTER50, CLUSTER70 and CLUSTER90 and PDBSELECT25. The results can be downloaded directly from the site and include function prediction, analysis of the most conserved environments and automated annotation of query proteins. These results reflect the hits found with PSI-BLAST, HMMer and S-BLEST. We have evaluated how well annotation transfer can be performed on SCOP IDs, Gene Ontology (GO) IDs and EC Numbers. The method is very efficient and totally automated, generally taking around fifteen minutes for a 400 residue protein.
Conclusion With structural genomics initiatives determining structures with little, if any, functional characterization, development of protein structure and function analysis tools is a necessary endeavor. We have developed a useful application towards a solution to this problem using common structural and sequence based analysis tools. These approaches are able to find statistically significant environments in a database of protein structure, and the method is able to quantify how closely associated each environment is to a predicted functional annotation.

Heiland Randy

2006-03-01

135

High quality protein sequence alignment by combining structural profile prediction and profile alignment using SABERTOOTH  

OpenAIRE

Abstract Background Protein alignments are an essential tool for many bioinformatics analyses. While sequence alignments are accurate for proteins of high sequence similarity, they become unreliable as they approach the so-called 'twilight zone' where sequence similarity becomes indistinguishable from random. For such distant pairs, structure alignment is of much better quality. Nevertheless, sequence alignment is the only choice in the majority of cases where structural data is not available. T...

Bastolla Ugo; Minning Jonas; Teichert Florian; Porto Markus

2010-01-01

136

Structural analysis of DNA sequence: evidence for lateral gene transfer in Thermotoga maritima  

OpenAIRE

The recently published complete DNA sequence of the bacterium Thermotoga maritima provides evidence, based on protein sequence conservation, for lateral gene transfer between Archaea and Bacteria. We introduce a new method of periodicity analysis of DNA sequences, based on structural parameters, which brings independent evidence for the lateral gene transfer in the genome of T. maritima. The structural analysis relates the Archaea-like DNA sequences to the genome of Pyrococcus horikoshii. Anal...

Worning, Peder; Jensen, Lars J.; Nelson, Karen E.; Brunak, Søren; Ussery, David W.

2000-01-01

137

Sequence, structural, functional, and phylogenetic analyses of three glycosidase families.  

Science.gov (United States)

Glycosidases, which cleave the glycosidic bond between a carbohydrate and another moiety, have been classified into over 63 families. Here, a variety of computational techniques have been employed to examine three families important in normal and abnormal pathology with the aim of developing a framework for future homology modeling, experimental and other studies. Family 1 includes bacterial and archaeal enzymes as well as lactase phlorizin-hydrolase and klotho, glycosidases implicated in disaccharide intolerance II and aging respectively. A statistical model, a hidden Markov model (HMM), for the family 1 glycosidase domain was trained and used as the basis for comparative examination of the conserved and variable sequence and structural features as well as the phylogenetic relationships between family members. Although the structures of four family 1 glycosidases have been determined, this is the first comparative examination of all these enzymes. Aspects that are unique to specific members or subfamilies (substrate binding loops) as well as those common to all members (a (beta/alpha)8 barrel fold) have been defined. Active site residues in some domains in klotho and lactase-phlorizin hydrolases differ from other members and in one instance may bind but not cleave substrate. The four invariant and most highly conserved residues are not residues implicated in catalysis and/or substrate binding. Of these, a histidine may be involved in transition state stabilization. Glucosylceramidase (family 30) and galactosylceramidase (family 59) are mutated in the lysosomal storage disorders Gaucher disease and Krabbe disease, respectively. HMM-based analysis, structure prediction studies and examination of disease mutations reveal a glycosidase domain common to these two families that also occurs in some bacterial glycosidases.
Similarities in the reactions catalyzed by families 30 and 59 are reflected in the presence of a structurally and functionally related (beta/alpha)8 barrel fold related to that in family 1. PMID:9779294

Mian, I S

1998-06-01

138

Structural properties of replication origins in yeast DNA sequences  

International Nuclear Information System (INIS)

Sequence-dependent DNA flexibility is an important structural property originating from the DNA 3D structure. In this paper, we investigate the DNA flexibility of the budding yeast (S. cerevisiae) replication origins on a genome-wide scale using flexibility parameters from two different models, the trinucleotide and the tetranucleotide models. Based on analyzing average flexibility profiles of 270 replication origins, we find that yeast replication origins are significantly rigid compared with their surrounding genomic regions. To further understand the highly distinctive property of replication origins, we compare the flexibility patterns between yeast replication origins and promoters, and find that they both contain significantly rigid DNAs. Our results suggest that DNA flexibility is an important factor that helps proteins recognize and bind the target sites in order to initiate DNA replication. Inspired by the role of the rigid region in promoters, we speculate that the rigid replication origins may facilitate binding of proteins, including the origin recognition complex (ORC), Cdc6, Cdt1 and the MCM2-7 complex.
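An average flexibility profile of the kind this abstract describes can be sketched as a per-position lookup followed by a column-wise mean over sequences. The sketch assumes the trinucleotide model and expects the caller to supply the parameter table; the abstract lists no parameter values, so none are reproduced here, and the function names are hypothetical.

```python
def flexibility_profile(seq, params, default=0.0):
    """Score each overlapping trinucleotide of seq with a caller-supplied table.

    params maps a trinucleotide string to its flexibility value; any
    trinucleotide missing from the table receives `default`.
    """
    return [params.get(seq[i:i + 3], default) for i in range(len(seq) - 2)]


def average_profile(seqs, params):
    """Column-wise mean over sequences assumed to be aligned at the same start.

    Profiles are truncated to the shortest one so every column is complete.
    """
    profiles = [flexibility_profile(s, params) for s in seqs]
    width = min(len(p) for p in profiles)
    return [sum(p[i] for p in profiles) / len(profiles) for i in range(width)]
```

Comparing such an averaged profile across origins with the same profile computed on flanking regions is one way to see whether origins stand out as rigid.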

139

Computation of 1-D shock structure in a gas in rotational non-equilibrium using a new set of simplified Burnett equations  

Science.gov (United States)

This paper describes the computation of hypersonic shock wave structure in a gas in rotational non-equilibrium using a newly developed simplified set of Burnett equations designated as Simplified Conventional Burnett (SCB) equations. Since the original formulation by Burnett, a number of variations to the original Burnett equations have been proposed and the differences among these variants and their merits/shortcomings have been described in the literature. A new variant is created based on the conventional Burnett equations for hypersonic flows by neglecting terms that are inversely proportional to the Mach number. This simplified set of conventional Burnett equations is linearly stable for small disturbances in contrast to the conventional Burnett equations which suffer from Bobylev instability. To simulate the rotational non-equilibrium effect in a diatomic gas, both the Navier-Stokes (NS) and the SCB equations are modified by including a rotational non-equilibrium relaxation model. The flow variables (density, translational and rotational temperature) in a typical 1-D shock at different Mach numbers (1.2, 5, and 10) in nitrogen are computed using the SCB and NS equations and are compared with the DSMC results. SCB calculations are in close agreement with the DSMC results at high Mach numbers.

Zhao, Wenwen; Chen, Weifang; Liu, Hualin; Agarwal, Ramesh K.

2014-12-01

140

3-D structure of the Rio Grande Rift from 1-D constrained joint inversion of receiver functions and surface wave dispersion  

Science.gov (United States)

The southern terminus of the Rio Grande Rift region has been poorly defined in the geologic record, with few seismic studies that provide information on the deeper Rift structure. In consequence, important questions related to tectonic and lithospheric activity of the Rio Grande Rift remain unresolved. To address some of these geological questions, we collect and analyze seismic data from 147 EarthScope Transportable Array (USArray) and other seismic stations in the region, to develop a 3-D crust and upper mantle velocity model. We apply a constrained optimization approach for joint inversion of surface wave and receiver functions using seismic S wave velocities as a model parameter. In particular, we compute receiver function stacks based on ray parameter, and invert them jointly with collected surface wave group velocity dispersion observations. The inversions estimate 1-D seismic S-wave velocity profiles to 300 km depth, which are then interpolated to a 3-D velocity model using a Bayesian kriging scheme. Our 3-D models show a thin lower velocity crust anomaly along the southeastern Rio Grande Rift, a persistent low velocity anomaly underneath the Colorado Plateau and Basin and Range province, and another one at depth beneath the Jemez lineament and the southern RGR.

Sosa, Anibal; Thompson, Lennox; Velasco, Aaron A.; Romero, Rodrigo; Herrmann, Robert B.

2014-09-01

141

MUSTER: Improving protein sequence profile–profile alignments by using multiple sources of structure information  

OpenAIRE

We develop a new threading algorithm MUSTER by extending the previous sequence profile–profile alignment method, PPA. It combines various sequence and structure information into single-body terms which can be conveniently used in dynamic programming search: (1) sequence profiles; (2) secondary structures; (3) structure fragment profiles; (4) solvent accessibility; (5) dihedral torsion angles; (6) hydrophobic scoring matrix. The balance of the weighting parameters is optimized by a grading s...

Wu, Sitao; Zhang, Yang

2008-01-01

142

Ab Initio MR-RCI Calculations of ((n - 1)d + ns)^N Atomic Bound States: Application to Hyperfine Structure and Electron Affinity Studies.

Science.gov (United States)

Systematic inclusion of many-body effects in open d and f subshell atoms has long been known as a formidable challenge in atomic structure theory. Due to the presence of competing relativistic effects in such systems, an appropriate theoretical approach needs to incorporate electron correlation within the framework of the Special Theory of Relativity. To this aim, the Relativistic Configuration Interaction methodology as developed by Beck and others has been extended and applied to multi-reference situations in ((n - 1)d + ns)^N type valence configurations. Specific focus has been on the hyperfine structure and electron affinity studies of the transition metal ions and the rare earths respectively. Energies and magnetic dipole and electric quadrupole hyperfine structure constants of all the fifteen Zr II (4d + 5s)^3 J = 0.5, 1.5 levels and the twenty one Nb II (4d + 5s)^4 J = 2 levels have been determined with unprecedented accuracies. The average errors in energy are 0.087 eV and 0.050 eV for Zr II J = 3/2 & 1/2 respectively while that for the ten bottom levels of Nb II J = 2 is 0.055 eV. For the levels known experimentally, the corresponding errors in magnetic dipole hyperfine structure constants are 9.2%, 31.8% and 3.8%. Quite a few of the many-body hyperfine constant values exhibit striking improvements over the Multi-Configurational Dirac Fock values. A new value of nuclear quadrupole moment has also been predicted for Zr II. In all cases certain previous level assignments have been corrected and five previously unknown levels have been identified in Nb II. The rigorous systematics of the many-body effects important for the energy level and hyperfine structure of these systems has been presented including core-valence and core-core effects.
Contrary to the conventional wisdom and theoretical predictions of the last decade, the attachment of an f electron has been discarded as the most likely mechanism for the formation of Lanthanide and Actinide negative ions. Neutral Th has been predicted to possess multiple opposite parity negative ion states below its ground state, which also suggests alternative attachment processes involving a p or a d electron. For all Th^{-} bound states electron configurations and electron affinity estimates have been provided. In conclusion, this work answers the major concerns of the many-body problem in open d systems and lays the foundation for relativistic ab initio studies of open f systems.

Datta, Debasis

143

Human serotonin 1D receptor is encoded by a subfamily of two distinct genes: 5-HT1D alpha and 5-HT1D beta.  

OpenAIRE

The serotonin 1D (5-HT1D) receptor is a pharmacologically defined binding site and functional receptor site. Observed variations in the properties of 5-HT1D receptors in different tissues have led to the speculation that multiple receptor proteins with slightly different properties may exist. We report here the cloning, deduced amino acid sequences, pharmacological properties, and second-messenger coupling of a pair of human 5-HT1D receptor genes, which we have designated 5-HT1D alpha and 5-H...

Weinshank, R. L.; Zgombick, J. M.; Macchi, M. J.; Branchek, T. A.; Hartig, P. R.

1992-01-01

144

Synthesis, crystal structures and magnetic properties of mer-cyanideiron(iii)-based 1D heterobimetallic cyanide-bridged chiral coordination polymers.  

Science.gov (United States)

Two pairs of cyanide-bridged Fe(iii)-Mn(iii)/Cu(ii) chiral enantiomer coordination polymers {[Mn(S,S/R,R-Salcy)(CH3OH)2]{[Mn(S,S/R,R-Salcy)][Fe(bbp)(CN)3]}}2n (,) (bbp = bis(2-benzimidazolyl)pyridine dianion) and {[Cu(S,S/R,R-Chxn)2]2[Fe2(tbbp)(CN)6]}n (,) (tbbp = tetra(3-benzimidazolyl)-4,4'-bipyridine tetraanion) have been successfully prepared by employing mer-tricyanometallate [PPh4]2[Fe(bbp)(CN)3] or the new bimetallic mer-cyanideiron(iii) precursor K4[Fe2(tbbp)(CN)6] as building blocks and with chiral manganese(iii)/copper(ii) compounds as assembling segments. The four complexes have been characterized by elemental analysis, IR spectroscopy, circular dichroism (CD) and magnetic circular dichroism (MCD) spectra. Single-crystal X-ray diffraction reveals that complexes and possess a single anionic chain structure consisting of the asymmetric chiral {[Mn(S,S/R,R-Salcy)][Fe(bbp)(CN)3]}2(2-) unit with free [Mn(S,S/R,R-Salcy)](+) as balancing cations. The cyanide-bridged Fe(iii)-Cu(ii) complexes and can be structurally characterized as neutral ladder-like double chains composed of the alternating cyanide-bridged Fe-Cu units. Our investigation of magnetic susceptibilities reveals antiferromagnetic coupling between the cyanide-bridged Fe(iii) and Mn(iii)/Cu(ii) ions for complexes . These results have been further confirmed by theoretical simulation through numerical matrix diagonalization techniques using a Fortran program or a uniform chain model, leading to the coupling constants J = -7.36 cm(-1), D = -1.52 cm(-1) () and J = -4.35 cm(-1) (), respectively. PMID:25661782

Zhang, Daopeng; Zhuo, Shuping; Zhang, Hongyan; Wang, Ping; Jiang, Jianzhuang

2015-02-24

145

Simultaneous Bayesian estimation of alignment and phylogeny under a joint model of protein sequence and structure.  

OpenAIRE

For sequences that are highly divergent, there is often insufficient information to infer accurate alignments, and phylogenetic uncertainty may be high. One way to address this issue is to make use of protein structural information, since structures generally diverge more slowly than sequences. In this work, we extend a recently developed stochastic model of pairwise structural evolution to multiple structures on a tree, analytically integrating over ancestral structures to permit efficient l...

Herman, JL; Challis, CJ; Novák, Á; Hein, J; Schmidler, SC

2014-01-01

146

Structure and neural expression of a zebrafish homeobox sequence.  

Science.gov (United States)

A genomic library of zebrafish was constructed and screened with homeobox-containing probes. One of the positive clones contains a transcribed region which shares extensive sequence homology with the murine Hox-1.4 and Hox-2.6 genes and the human HHO.c13 gene. Characterization of this zebrafish homologue (ZF-13) with respect to expression demonstrated that it is transcribed during embryogenesis where a major RNA species of 2.5 kb and a minor transcript of 4.6 kb are detected. The highest concentration of both transcripts was found in embryos at the stage of somite formation. By in situ hybridization the spatial localization of expression was analysed in hatching embryos. Hybridization signals were mainly detected throughout the neural tube and in the brain. A small amount of RNA derived from ZF-13 was localized in differentiated muscle cells. Our results suggest that homeobox genes of distantly related vertebrate species are very similar with respect to structure and function. PMID:2468579

Njølstad, P R; Molven, A; Eiken, H G; Fjose, A

1988-12-15

147

Simultaneous Structural Variation Discovery in Multiple Paired-End Sequenced Genomes  

Science.gov (United States)

Next generation sequencing technologies have been decreasing the costs and increasing the world-wide capacity for sequence production at an unprecedented rate, enabling the initiation of large-scale projects that aim to sequence almost 2000 genomes [1]. Structural variation detection promises to be one of the key diagnostic tools for cancer and other diseases with genomic origin. In this paper, we study the problem of detecting structural variation events in two or more sequenced genomes through high-throughput sequencing. We propose to move from the current model of (1) detecting genomic variations in single next generation sequenced (NGS) donor genomes independently, and (2) checking whether two or more donor genomes indeed agree or disagree on the variations (in this paper we name this framework Independent Structural Variation Discovery and Merging - ISV&M), to a new model in which we detect structural variation events among multiple genomes simultaneously.

Hormozdiari, Fereydoun; Hajirasouliha, Iman; McPherson, Andrew; Eichler, Evan E.; Sahinalp, S. Cenk

148

4SALE – A tool for synchronous RNA sequence and secondary structure alignment and editing  

Directory of Open Access Journals (Sweden)

Background: In sequence analysis, the multiple alignment forms the foundation of all subsequent analyses. Errors in an alignment can strongly influence all succeeding analyses and can therefore lead to wrong predictions. Hand-crafted and hand-improved alignments are necessary and have meanwhile become good common practice. For RNA sequences, the primary sequence as well as a secondary structure consensus is often well known, e.g., the cloverleaf structure of the tRNA. Recently, some alignment editors have been proposed that are able to include and model both kinds of information. However, with the advent of a large amount of reliable RNA sequences together with their solved secondary structures (available from, e.g., the ITS2 Database), we are faced with the problem of handling sequences and their associated secondary structures synchronously. Results: 4SALE fills this gap. The application allows fast sequence and synchronous secondary structure alignment for large data sets and, for the first time, synchronous manual editing of aligned sequences and their secondary structures. This study describes an algorithm for the synchronous alignment of sequences and their associated secondary structures as well as the main features of 4SALE used for further analyses and editing. 4SALE provides an optimal and unique starting point for every RNA sequence and structure analysis. Conclusion: 4SALE, which provides a user-friendly and intuitive interface, is a comprehensive toolbox for RNA analysis based on sequence and secondary structure information. The program connects sequence and structure databases like the ITS2 Database to phylogeny programs such as the CBCAnalyzer. 4SALE is written in Java and is therefore platform independent. The software is freely available and distributed from the website at http://4sale.bioapps.biozentrum.uni-wuerzburg.de

Schultz Jörg

2006-11-01

149

Crystal Structures of Mouse CD1d-iGb3 Complex and its Cognate Vα14 T Cell Receptor Suggest a Model for Dual Recognition of Foreign and Self Glycolipids  

OpenAIRE

The semi-invariant Vα14Jα18 T cell receptor (TCR) is expressed by regulatory NKT cells and has the unique ability to recognize chemically diverse ligands presented by CD1d. The crystal structure of CD1d complexed to a natural, endogenous ligand, isoglobotrihexosylceramide (iGb3), illustrates the extent of this diversity when compared to the binding of potent, exogenous ligands, such as α-galactosylceramide (α-GalCer). A single mode of recognition for these two classes of ligands would the...

Zajonc, Dirk M.; Savage, Paul B.; Bendelac, Albert; Wilson, Ian A.; Teyton, Luc

2008-01-01

150

The Amino Acid Alphabet and the Architecture of the Protein Sequence-Structure Map. I. Binary Alphabets  

OpenAIRE

The correspondence between protein sequences and structures, or sequence-structure map, relates to fundamental aspects of structural, evolutionary and synthetic biology. The specifics of the mapping, such as the fraction of accessible sequences and structures, or the sequences' ability to fold fast, are dictated by the type of interactions between the monomers that compose the sequences. The set of possible interactions between monomers is encapsulated by the potential energy function. In thi...

Ferrada, Evandro

2014-01-01

151

Crystal structure of Vδ1 T cell receptor in complex with CD1d-sulfatide shows MHC-like recognition of a self-lipid by human γδ T cells.  

Science.gov (United States)

The nature of the antigens recognized by γδ T cells and their potential recognition of major histocompatibility complex (MHC)-like molecules has remained unclear. Members of the CD1 family of lipid-presenting molecules are suggested ligands for Vδ1 TCR-expressing γδ T cells, the major γδ lymphocyte population in epithelial tissues. We crystallized a Vδ1 TCR in complex with CD1d and the self-lipid sulfatide, revealing the unusual recognition of CD1d by germline Vδ1 residues spanning all complementarity-determining region (CDR) loops, as well as sulfatide recognition separately encoded by nongermline CDR3δ residues. Binding and functional analysis showed that CD1d presenting self-lipids, including sulfatide, was widely recognized by gut Vδ1+ γδ T cells. These findings provide structural demonstration of MHC-like recognition of a self-lipid by γδ T cells and reveal the prevalence of lipid recognition by innate-like T cell populations. PMID:24239091

Luoma, Adrienne M; Castro, Caitlin D; Mayassi, Toufic; Bembinster, Leslie A; Bai, Li; Picard, Damien; Anderson, Brian; Scharf, Louise; Kung, Jennifer E; Sibener, Leah V; Savage, Paul B; Jabri, Bana; Bendelac, Albert; Adams, Erin J

2013-12-12

152

Differences in CD1d protein structure determine species-selective antigenicity of isoglobotrihexosylceramide (iGb3) to invariant natural killer T (iNKT) cells  

OpenAIRE

Isoglobotrihexosylceramide (iGb3) has been identified as a potent CD1d-presented self-antigen for mouse iNKT cells. The role of iGb3 in humans remains unresolved, however, as there have been conflicting reports about iGb3-dependent human iNKT-cell activation, and humans lack iGb3 synthase, a key enzyme for iGb3 synthesis. Given the importance of human immune responses, we conducted a human-mouse cross-species analysis of iNKT-cell activation by iGb3-CD1d. Here we show that human and mouse iNK...

Sanderson, Joseph P.; Brennan, Patrick J.; Mansour, Salah; Matulis, Gediminas; Patel, Onisha; Lissin, Nikolai; Godfrey, Dale I.; Kawahara, Kazuyoshi; Zähringer, Ulrich; Rossjohn, Jamie; Brenner, Michael B.; Gadola, Stephan D.

2013-01-01

153

Structural (and sequence-based) analysis of transcriptional regulation  

OpenAIRE

Most computational approaches to transcriptional regulation use sequence-based methodologies that aim to discover regulatory motifs in genomic segments. Here we argue that the current content of the Protein Data Bank (PDB) can provide invaluable data that drive the prediction of regulatory interactions within genomes. First, we dissect protein-DNA interfaces and find atomic interactions that contribute to sequence-specific recognition, mainly hydrogen bonds and van der Waals contacts. Thes...

Contreras-Moreira, Bruno; Lozada-Chávez, Irma; Espinosa Angarica, Vladimir

2008-01-01

154

SEQUENCING  

Science.gov (United States)

DESK Standard: Summarize important ideas/events; summarize supporting details in sequence. DATES: You can begin this activity on January 22. You should complete it by January 26. OBJECTIVE: It is important to remember the events of a story in the order they happen. You wouldn't want to know how a good story ends before reading all of the ...

Mr. Hughes

2006-02-24

155

Structural gene and complete amino acid sequence of Vibrio alginolyticus collagenase.  

Science.gov (United States)

The DNA encoding the collagenase of Vibrio alginolyticus was cloned, and its complete nucleotide sequence was determined. When the cloned gene was ligated to pUC18, the Escherichia coli expression vector, bacteria carrying the gene exhibited both collagenase antigen and collagenase activity. The open reading frame from the ATG initiation codon was 2442 bp in length for the collagenase structural gene. The amino acid sequence, deduced from the nucleotide sequence, revealed that the mature collagenase consists of 739 amino acids with an Mr of 81875. The amino acid sequences of 20 polypeptide fragments were completely identical with the deduced amino acid sequences of the collagenase gene. The amino acid composition predicted from the DNA sequence was similar to the chemically determined composition of purified collagenase reported previously. The analyses of both the DNA and amino acid sequences of the collagenase gene were rigorously performed, but we could not detect any significant sequence similarity to other collagenases. PMID:1311172

Takeuchi, H; Shibano, Y; Morihara, K; Fukushima, J; Inami, S; Keil, B; Gilles, A M; Kawamoto, S; Okuda, K

1992-01-01

156

Improving protein structure similarity searches using domain boundaries based on conserved sequence information  

Directory of Open Access Journals (Sweden)

Background: The identification of protein domains plays an important role in protein structure comparison. Domain query size and composition are critical to structure similarity search algorithms such as the Vector Alignment Search Tool (VAST), the method employed for computing related protein structures in the NCBI Entrez system. Currently, domains identified on the basis of structural compactness are used for VAST computations. In this study, we have investigated how alternative definitions of domains derived from conserved sequence alignments in the Conserved Domain Database (CDD) would affect the domain comparisons and structure similarity search performance of VAST. Results: Alternative domains, which have significantly different secondary structure composition from those based on structurally compact units, were identified based on the alignment footprints of curated protein sequence domain families. Our analysis indicates that domain boundaries disagree on roughly 8% of protein chains in the medium redundancy subset of the Molecular Modeling Database (MMDB). These conflicting sequence-based domain boundaries perform slightly better than structure domains in structure similarity searches, and there are interesting cases in which structure similarity search performance is markedly improved. Conclusion: Structure similarity searches using domain boundaries based on conserved sequence information can provide an additional method for investigators to identify interesting similarities between proteins with known structures. Because of the improvement in performance of structure similarity searches using sequence domain boundaries, we are in the process of implementing their inclusion into the VAST search and MMDB resources in the NCBI Entrez system.

Madej Tom

2009-05-01

157

Heterobridged dinuclear, tetranuclear, dinuclear-based 1-d, and heptanuclear-based 1-D complexes of copper(II) derived from a dinucleating ligand: syntheses, structures, magnetochemistry, spectroscopy, and catecholase activity.  

Science.gov (United States)

The work in this paper presents syntheses, characterization, crystal structures, variable-temperature/field magnetic properties, catecholase activity, and electrospray ionization mass spectroscopic (ESI-MS positive) study of five copper(II) complexes of composition [Cu(II)(2)L(μ(1,1)-NO(3))(H(2)O)(NO(3))](NO(3)) (1), [{Cu(II)(2)L(μ-OH)(H(2)O)}(μ-ClO(4))](n)(ClO(4))(n) (2), [{Cu(II)(2)L(NCS)(2)}(μ(1,3)-NCS)](n) (3), [{Cu(II)(2)L(μ(1,1)-N(3))(ClO(4))}(2)(μ(1,3)-N(3))(2)] (4), and [{Cu(II)(2)L(μ-OH)}{Cu(II)(2)L(μ(1,1)-N(3))}{Cu(II)(μ(1,1)-N(3))(4)(dmf)}{Cu(II)(2)(μ(1,1)-N(3))(2)(N(3))(4)}](n)·ndmf (5), derived from a new compartmental ligand 2,6-bis[N-(2-pyridylethyl)formidoyl]-4-ethylphenol, which is the 1:2 condensation product of 4-ethyl-2,6-diformylphenol and 2-(2-aminoethyl)pyridine. The title compounds have the following nuclearities/topologies: dinuclear (1), dinuclear-based one-dimensional (2 and 3), tetranuclear (4), and heptanuclear-based one-dimensional (5). The bridging moieties in 1-5 are as follows: μ-phenoxo-μ(1,1)-nitrate (1), μ-phenoxo-μ-hydroxo and μ-perchlorate (2), μ-phenoxo and μ(1,3)-thiocyanate (3), μ-phenoxo-μ(1,1)-azide and μ(1,3)-azide (4), μ-phenoxo-μ-hydroxo, μ-phenoxo-μ(1,1)-azide, and μ(1,1)-azide (5). All five compounds exhibit an overall antiferromagnetic interaction. The J values in 1-4 have been determined (-135 cm(-1) for 1, -298 cm(-1) for 2, -105 cm(-1) for 3, -119.5 cm(-1) for 4). The pairwise interactions in 5 have been evaluated qualitatively to result in an S(T) = 3/2 spin ground state, which has been verified by a magnetization experiment. Utilizing 3,5-di-tert-butyl catechol (3,5-DTBCH(2)) as the substrate, the catecholase activity of all five complexes has been checked. While 1 and 3 are inactive, complexes 2, 4, and 5 show catecholase activity with turnover numbers of 39 h(-1) (for 2), 40 h(-1) (for 4), and 48 h(-1) (for 5) in dmf and 167 h(-1) (for 2) and 215 h(-1) (for 4) in acetonitrile. 
Conductance of the dmf solution of the complexes has been measured, revealing that bridging moieties and nuclearity have been almost retained in solution. Electrospray ionization mass (ESI-MS positive) spectra of complexes 1, 2, and 4 have been recorded in acetonitrile solutions and the positive ions have been well characterized. The ESI-MS positive spectrum of complex 2 in the presence of 3,5-DTBCH(2) has also been recorded and, interestingly, a positive ion [Cu(II)(2)L(μ-3,5-DTBC(2-))(3,5-DTBCH(-))Na(I)](+) has been identified. PMID:21776948

Majumder, Samit; Sarkar, Sohini; Sasmal, Sujit; Sañudo, E Carolina; Mohanta, Sasankasekhar

2011-08-15

158

Structure of a functional amyloid protein subunit computed using sequence variation.  

Science.gov (United States)

Functional amyloid fibers, called curli, play a critical role in adhesion and invasion of many bacteria. Unlike pathological amyloids, curli structures are formed by polypeptide sequences whose amyloid structure has been selected for during evolution. This important distinction provides us with an opportunity to obtain structural insights from an unexpected source: the covariation of amino acids in sequences of different curli proteins. We used recently developed methods to extract amino acid contacts from a multiple sequence alignment of homologues of the curli subunit protein, CsgA. Together with an efficient force field, these contacts allow us to determine structural models of CsgA. We find that CsgA forms a β-helical structure, where each turn corresponds to previously identified repeat sequences in CsgA. The proposed structure is validated by previously measured solid-state NMR, electron microscopy, and X-ray diffraction data and agrees with an earlier proposed model derived by complementary means. PMID:25415595

Tian, Pengfei; Boomsma, Wouter; Wang, Yong; Otzen, Daniel E; Jensen, Mogens H; Lindorff-Larsen, Kresten

2015-01-14
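
For the curli entry above, the idea of reading contacts from amino acid covariation in a multiple sequence alignment can be illustrated with a deliberately simple stand-in. The abstract does not name the "recently developed methods" it used, so the sketch below uses plain mutual information between two alignment columns, which is an assumption made here for illustration and not the authors' method. The function name `column_mutual_information` and the toy alignment are hypothetical.

```python
from collections import Counter
from math import log2

def column_mutual_information(msa, i, j):
    """Mutual information (in bits) between alignment columns i and j.

    `msa` is a list of equal-length aligned sequences. High values mean the
    residues in the two columns vary together across the homologues.
    """
    n = len(msa)
    count_i = Counter(seq[i] for seq in msa)            # residue counts, column i
    count_j = Counter(seq[j] for seq in msa)            # residue counts, column j
    count_ij = Counter((seq[i], seq[j]) for seq in msa)  # joint counts
    return sum(
        (c / n) * log2((c / n) / ((count_i[a] / n) * (count_j[b] / n)))
        for (a, b), c in count_ij.items()
    )

# Toy alignment (hypothetical): columns 0 and 1 covary, column 2 is invariant.
toy_msa = ["AKL", "AKL", "GEL", "GEL"]
mi_covarying = column_mutual_information(toy_msa, 0, 1)
mi_invariant = column_mutual_information(toy_msa, 0, 2)
```

A fully conserved column carries no covariation signal, which is why conservation and covariation are usually scored separately.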

159

A novel 1D-AF hybrid organic-inorganic chromium(II) methyl phosphonate dihydrate: synthesis, X-ray crystal and molecular structure, and magnetic properties.  

Science.gov (United States)

Light-blue crystals of chromium(II) methyl phosphonate dihydrate, [Cr(CH(3)PO(3))(H(2)O)]·H(2)O, were obtained in water by mixing filtered solutions of methylphosphonic acid and chromium(II) chloride in the presence of urea in an inert atmosphere. The compound was characterized by elemental analysis, TGA-DSC, X-ray crystallography, magnetic measurements, and UV-visible and FT-IR spectroscopies. The crystal and molecular structures (orthorhombic Pnma (no. 62): a = 4.4714(5) Å, b = 6.8762(7) Å, c = 19.180(2) Å, Z = 4) have been solved using single-crystal X-ray diffraction. The chromium(II) ion is six-coordinated by oxygens (4 + 2) to form an elongated octahedron, with the four equatorial oxygen atoms belonging to [-PO(3)](2-) phosphonate groups. This stereochemistry of the Cr(II) ion (high-spin d(4) electronic configuration) is ascribed to the Jahn-Teller effect. The [CrO(6)] chromophore, the [CH(3)PO(3)](2-) anions, and the water molecules build a novel one-dimensional (1D) metal(II) oxide chain, anchored to each other within the ab plane by two oxygens of the phosphonate ligand. Within the chain, each Cr(2+) ion is connected through double oxygen bridges to its two neighbors, forming edge-sharing octahedra running along the b axis. The chains are further connected with the adjacent chains by phosphonate [-PO(3)](2-) groups of the ligand, forming an inorganic layer that alternates along the c axis of the unit cell with bilayers consisting of methyl groups and water of crystallization. The thermal variation of the magnetic susceptibility follows the Curie-Weiss law, with a large negative Weiss constant, theta = -60 K, indicating the presence of antiferromagnetic (AF) exchange interactions between neighboring Cr(II) ions. The magnetic behavior and the magnetic dimensionality have been analyzed in terms of Fisher's classical limiting form of the Heisenberg chain theory, and a value of J = -9.3 cm(-1) was found. 
The negative value of the intra-chain exchange coupling constant J confirms the presence of an AF coupling. No sign of long-range magnetic ordering down to 2 K (the lowest measured temperature) is observed, in agreement with the predominant one-dimensional character of the exchange interactions. PMID:20690756

Bauer, Elvira M; Bellitto, Carlo; Imperatori, Patrizia; Righini, Guido; Colapietro, Marcello; Portalone, Gustavo; Gómez-García, Carlos J

2010-08-16
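
For the chromium(II) methyl phosphonate entry above, the two magnetic models named in the abstract can be written out as a short worked sketch. The Curie constant C is left symbolic because the abstract does not report it. The Hamiltonian convention used for the chain expression, H = -J Σ S_i·S_{i+1}, is an assumption made here; the abstract does not state which convention the authors used, and a -2J convention would change the definition of K by a factor of 2.

```latex
% Curie-Weiss law with the reported Weiss constant \theta = -60 K
\chi(T) = \frac{C}{T - \theta}
\quad\xrightarrow{\ \theta = -60\ \mathrm{K}\ }\quad
\chi(T) = \frac{C}{T + 60\ \mathrm{K}}

% Fisher's classical-spin limit for a Heisenberg chain
% (assumed convention: H = -J \sum_i \mathbf{S}_i \cdot \mathbf{S}_{i+1})
\chi_{\mathrm{chain}}(T) = \frac{N g^{2} \mu_{B}^{2}\, S(S+1)}{3 k_{B} T}\,
\frac{1 + u}{1 - u},
\qquad
u = \coth K - \frac{1}{K},
\qquad
K = \frac{J\, S(S+1)}{k_{B} T}

% For J < 0 (as reported, J = -9.3 cm^{-1}) one has K < 0 and u < 0,
% so (1+u)/(1-u) < 1 and \chi_{\mathrm{chain}} lies below the Curie value.
```

Under either model a negative θ or a negative J lowers the susceptibility relative to uncoupled spins, which is the antiferromagnetic signature the abstract describes.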

160

SUPERFAMILY: HMMs representing all proteins of known structure. SCOP sequence searches, alignments and genome assignments  

OpenAIRE

The SUPERFAMILY database contains a library of hidden Markov models representing all proteins of known structure. The database is based on the SCOP ‘superfamily’ level of protein domain classification which groups together the most distantly related proteins which have a common evolutionary ancestor. There is a public server at http://supfam.org which provides three services: sequence searching, multiple alignments to sequences of known structure, and structural assignments to all complet...

Gough, Julian; Chothia, Cyrus

2002-01-01

161

ConSurf 2010: calculating evolutionary conservation in sequence and structure of proteins and nucleic acids  

OpenAIRE

It is informative to detect highly conserved positions in proteins and nucleic acid sequence/structure since they are often indicative of structural and/or functional importance. ConSurf (http://consurf.tau.ac.il) and ConSeq (http://conseq.tau.ac.il) are two well-established web servers for calculating the evolutionary conservation of amino acid positions in proteins using an empirical Bayesian inference, starting from protein structure and sequence, respectively. Here, we present the new ver...

Ashkenazy, Haim; Erez, Elana; Martz, Eric; Pupko, Tal; Ben-Tal, Nir

2010-01-01

162

Quantifying the relationship between sequence and three-dimensional structure conservation in RNA  

Directory of Open Access Journals (Sweden)

Background: In recent years, the number of available RNA structures has rapidly grown, reflecting the increased interest in RNA biology. Similarly to the studies carried out two decades ago for proteins, which gave the fundamental grounds for developing comparative protein structure prediction methods, we are now able to quantify the relationship between sequence and structure conservation in RNA. Results: Here we introduce an all-against-all sequence- and three-dimensional (3D) structure-based comparison of a representative set of RNA structures, which has allowed us to quantitatively confirm that: (i) there is a measurable relationship between sequence and structure conservation that weakens for alignments resulting in below 60% sequence identity, (ii) evolution tends to conserve more RNA structure than sequence, and (iii) there is a twilight zone for RNA homology detection. Discussion: The computational analysis presented here quantitatively describes the relationship between sequence and structure for RNA molecules and defines a twilight zone region for detecting RNA homology. Our work could represent the theoretical basis and limitations for future developments in comparative RNA 3D structure prediction.

Capriotti Emidio

2010-06-01
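
The 60% sequence identity threshold mentioned in the RNA conservation entry above depends on how identity is computed from an alignment. The sketch below shows one common definition: matches divided by the number of columns in which neither sequence has a gap. This denominator is an assumption chosen for illustration; the abstract does not say which definition the authors used, and the function name `percent_identity` is hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two rows of a pairwise alignment.

    Columns where either row holds a gap are ignored, so the denominator is
    the number of aligned (ungapped) positions.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if x != gap and y != gap]
    if not columns:
        return 0.0
    matches = sum(x == y for x, y in columns)
    return 100.0 * matches / len(columns)

# Toy RNA alignment (hypothetical): 6 matches over 7 ungapped columns.
identity = percent_identity("GGCA-UCG", "GGCAAUGG")
below_threshold = identity < 60.0   # the 60% figure is taken from the abstract
```

Other definitions divide by the full alignment length or by the shorter sequence, and they can move a pair across the threshold.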

163

PredyFlexy: flexibility and local structure prediction from sequence  

OpenAIRE

Protein structures are necessary for understanding protein function at a molecular level. Dynamics and flexibility of protein structures are also key elements of protein function. So, we have proposed to look at protein flexibility using novel methods: (i) using a structural alphabet and (ii) combining classical X-ray B-factor data and molecular dynamics simulations. First, we established a library composed of structural prototypes (LSPs) to describe protein structure by a limited set of recu...

Brevern, Alexandre; Bornot, Aurélie; Craveur, Pierrick; Etchebest, Catherine; Gelly, Jean-Christophe

2012-01-01

164

Structator: fast index-based search for RNA sequence-structure patterns  

Directory of Open Access Journals (Sweden)

Background: The secondary structure of RNA molecules is intimately related to their function and often more conserved than the sequence. Hence, the important task of searching databases for RNAs requires matching sequence-structure patterns. Unfortunately, current tools for this task have, in the best case, a running time that is only linear in the size of sequence databases. Furthermore, established index data structures for fast sequence matching, like suffix trees or arrays, cannot benefit from the complementarity constraints introduced by the secondary structure of RNAs. Results: We present a novel method and readily applicable software for time-efficient matching of RNA sequence-structure patterns in sequence databases. Our approach is based on affix arrays, a recently introduced index data structure, preprocessed from the target database. Affix arrays support bidirectional pattern search, which is required for efficiently handling the structural constraints of the pattern. Structural patterns like stem-loops can be matched inside out, such that the loop region is matched first and then the pairing bases on the boundaries are matched consecutively. This allows base pairing information to be exploited for search space reduction and leads to an expected running time that is sublinear in the size of the sequence database. The incorporation of a new chaining approach in the search of RNA sequence-structure patterns enables the description of molecules folding into complex secondary structures with multiple ordered patterns. The chaining approach removes spurious matches from the set of intermediate results, in particular those of patterns with little specificity. In benchmark experiments on the Rfam database, our method runs up to two orders of magnitude faster than previous methods. Conclusions: The presented method's sublinear expected running time makes it well suited for RNA sequence-structure pattern matching in large sequence databases. 
RNA molecules containing several stem-loop substructures can be described by multiple sequence-structure patterns and their matches are efficiently handled by a novel chaining method. Beyond our algorithmic contributions, we provide with Structator a complete and robust open-source software solution for index-based search of RNA sequence-structure patterns. The Structator software is available at http://www.zbh.uni-hamburg.de/Structator.

Will Sebastian

2011-05-01
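
The inside-out matching order described in the Structator entry above (loop first, then the pairing bases outwards) can be sketched without any index structure. The code below is a naive linear scan over a plain string and is not the affix-array method of the paper; it only illustrates the matching order. The function name `find_stem_loops`, the exact-string loop pattern, and the inclusion of G-U wobble pairs are assumptions of this sketch.

```python
# Base pairs accepted in a stem: Watson-Crick plus G-U wobble (an assumption).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}

def find_stem_loops(seq, loop, stem_len):
    """Return half-open (start, end) spans of hairpins in `seq`.

    A hit is an exact occurrence of `loop` closed by `stem_len` consecutive
    complementary base pairs, checked from the loop outwards.
    """
    hits = []
    pos = seq.find(loop)
    while pos != -1:
        left, right = pos - 1, pos + len(loop)   # bases flanking the loop
        paired = 0
        # Extend one base pair at a time; give up at the first mismatch.
        while (paired < stem_len and left >= 0 and right < len(seq)
               and (seq[left], seq[right]) in PAIRS):
            left -= 1
            right += 1
            paired += 1
        if paired == stem_len:
            hits.append((left + 1, right))
        pos = seq.find(loop, pos + 1)
    return hits

# Toy sequence (hypothetical): stem GGC / GCC around a GAAA loop.
hairpins = find_stem_loops("AAGGCGAAAGCCUU", "GAAA", 3)
```

Matching the loop first means a candidate is abandoned as soon as one flanking pair fails to be complementary, which is the pruning that a bidirectional index can exploit at scale.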

165

Fraïssé sequences: category-theoretic approach to universal homogeneous structures.  

Czech Academy of Sciences Publication Activity Database

Vol. 165, No. 11 (2014), pp. 1755-1811. ISSN 0168-0072 R&D Projects: GA ČR(CZ) GAP201/12/0290 Institutional support: RVO:67985840 Keywords: universal homogeneous object * Fraïssé sequence * amalgamation Subject RIV: BA - General Mathematics Impact factor: 0.451, year: 2013 http://www.sciencedirect.com/science/article/pii/S0168007214000773

Kubiś, Wiesław

2014-01-01

166

Impact of chromatin structure on sequence variability in the human genome  

OpenAIRE

DNA sequence variations in individual genomes within the same species give rise to different phenotypes. One mechanism in this process is alteration of chromatin structure due to sequence variation that impacts gene regulation downstream. In this study, we compose a high-confidence collection of human indels and SNPs based on the analysis of a large set of publicly available sequencing data and investigate whether the DNA loci associated with stable nucleosome positions are protected against ...

Tolstorukov, Michael Y.; Volfovsky, Natalia; Stephens, Robert M.; Park, Peter J.

2011-01-01

167

Simultaneous Bayesian estimation of alignment and phylogeny under a joint model of protein sequence and structure.  

Science.gov (United States)

For sequences that are highly divergent, there is often insufficient information to infer accurate alignments, and phylogenetic uncertainty may be high. One way to address this issue is to make use of protein structural information, since structures generally diverge more slowly than sequences. In this work, we extend a recently developed stochastic model of pairwise structural evolution to multiple structures on a tree, analytically integrating over ancestral structures to permit efficient likelihood computations under the resulting joint sequence-structure model. We observe that the inclusion of structural information significantly reduces alignment and topology uncertainty, and reduces the number of topology and alignment errors in cases where the true trees and alignments are known. In some cases, the inclusion of structure results in changes to the consensus topology, indicating that structure may contain additional information beyond that which can be obtained from sequences. We use the model to investigate the order of divergence of cytoglobins, myoglobins, and hemoglobins and observe a stabilization of phylogenetic inference: although a sequence-based inference assigns significant posterior probability to several different topologies, the structural model strongly favors one of these over the others and is more robust to the choice of data set. PMID:24899668

Herman, Joseph L; Challis, Christopher J; Novák, Ádám; Hein, Jotun; Schmidler, Scott C

2014-09-01

168

Combinatorial algorithms for structural variation detection in high-throughput sequenced genomes.  

Science.gov (United States)

Recent studies show that along with single nucleotide polymorphisms and small indels, larger structural variants among human individuals are common. The Human Genome Structural Variation Project aims to identify and classify deletions, insertions, and inversions (>5 Kbp) in a small number of normal individuals with a fosmid-based paired-end sequencing approach using traditional sequencing technologies. The realization of new ultra-high-throughput sequencing platforms now makes it feasible to detect the full spectrum of genomic variation among many individual genomes, including cancer patients and others suffering from diseases of genomic origin. Unfortunately, existing algorithms for identifying structural variation (SV) among individuals have not been designed to handle the short read lengths and the errors implied by the "next-gen" sequencing (NGS) technologies. In this paper, we give combinatorial formulations for the SV detection between a reference genome sequence and a next-gen-based, paired-end, whole genome shotgun-sequenced individual. We describe efficient algorithms for each of the formulations we give, which all turn out to be fast and quite reliable; they are also applicable to all next-gen sequencing methods (Illumina, 454 Life Sciences [Roche], ABI SOLiD, etc.) and traditional capillary sequencing technology. We apply our algorithms to identify SV among individual genomes very recently sequenced by Illumina technology. PMID:19447966

Hormozdiari, Fereydoun; Alkan, Can; Eichler, Evan E; Sahinalp, S Cenk

2009-07-01
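
For the paired-end structural variation entry above, the basic signal that such algorithms start from can be sketched as a per-pair classification. This is a simplified illustration and not the combinatorial formulation of the paper: it assumes each read pair maps uniquely, that a concordant pair maps with +/- orientation, and that the library's expected span range is known. The `ReadPair` type, `classify_pair`, and the numeric bounds are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class ReadPair:
    left_pos: int      # leftmost mapped coordinate on the reference
    right_pos: int     # rightmost mapped coordinate on the reference
    left_strand: str   # "+" or "-"
    right_strand: str  # "+" or "-"

def classify_pair(pair, min_span, max_span):
    """Label a mapped read pair by the kind of variant it would support."""
    if pair.left_strand == pair.right_strand:
        return "inversion"        # both ends on the same strand
    span = pair.right_pos - pair.left_pos
    if span > max_span:
        return "deletion"         # donor lacks sequence present in the reference
    if span < min_span:
        return "insertion"        # donor carries sequence absent from the reference
    return "concordant"

# Hypothetical library with an expected mapped span of 150 to 250 bp.
labels = [
    classify_pair(ReadPair(1000, 1200, "+", "-"), 150, 250),
    classify_pair(ReadPair(1000, 5000, "+", "-"), 150, 250),
    classify_pair(ReadPair(1000, 1050, "+", "-"), 150, 250),
    classify_pair(ReadPair(1000, 1200, "+", "+"), 150, 250),
]
```

A real caller would cluster many discordant pairs that support the same event before reporting it, since a single pair can be discordant through mapping error alone.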

169

Analysis of the rotational structure in the high-resolution infrared spectra of cis,cis- and trans,trans-1,4-difluorobutadiene-1-d1 and trans,trans-1,4-difluorobutadiene-1,4-d2  

Energy Technology Data Exchange (ETDEWEB)

Samples of cis,cis- and trans,trans-1,4-difluorobutadiene-1-d1 and of trans,trans-1,4-difluorobutadiene-1,4-d2 have been synthesized, and high-resolution (?0.0018 cm-1) infrared spectra of these substances have been recorded in the gas phase. Analysis of the rotational structure, mostly in C-type bands, has yielded ground state rotational constants. For the two 1-d1 species more than one band has been analyzed. For the 1,4-d2 species only one band was available for analysis. However, good agreement between the experimental centrifugal distortion constants and those predicted with a B3LYP/cc-pVTZ model gives strong support to the analysis of the very dense spectrum. The ground state rotational constants are a contribution to finding semiexperimental equilibrium structures of the two nonpolar isomers of 1,4-difluorobutadiene.

Craig, Norman C.; Chen, Yihui; Lu, Yuhua; Neese, Christopher F.; Nemchick, Deacon J.; Blake, Thomas A.

2013-06-01

170

Molecular and Supramolecular Structural Studies on Human Tropoelastin Sequences  

OpenAIRE

One of the unusual properties of elastin is its ability to coacervate, which has been proposed to play an important role in the alignment of monomeric elastin for cross-linking into the polymeric elastin matrix. The temperature at which this transition takes place depends on several factors including protein concentration, ionic strength, and pH. Previously, polypeptide sequences encoded by different exons of the human tropoelastin gene have been analyzed for their ability to coacervate and t...

Ostuni, Angela; Bochicchio, Brigida; Armentano, Maria F.; Bisaccia, Faustino; Tamburro, Antonio M.

2007-01-01

171

RNAG: a new Gibbs sampler for predicting RNA secondary structure for unaligned sequences  

OpenAIRE

Motivation: RNA secondary structure plays an important role in the function of many RNAs, and structural features are often key to their interaction with other cellular components. Thus, there has been considerable interest in the prediction of secondary structures for RNA families. In this article, we present a new global structural alignment algorithm, RNAG, to predict consensus secondary structures for unaligned sequences. It uses a blocked Gibbs sampling algorithm, which has a theoretical...

Wei, Donglai; Alpert, Lauren V.; Lawrence, Charles E.

2011-01-01

172

RNAstrand: reading direction of structured RNAs in multiple sequence alignments  

OpenAIRE

Abstract Motivation Genome-wide screens for structured ncRNA genes in mammals, urochordates, and nematodes have predicted thousands of putative ncRNA genes and other structured RNA motifs. A prerequisite for their functional annotation is to determine the reading direction with high precision. Results While folding energies of an RNA and its reverse complement are similar, the differences are sufficient at least in conjunction with substitution patterns to discriminate between structured RNAs...

Stadler Peter F; Reiche Kristin

2007-01-01

173

Designing deep sequencing experiments: detecting structural variation and estimating transcript abundance  

Directory of Open Access Journals (Sweden)

Full Text Available Abstract Background Massively parallel DNA sequencing technologies have enabled the sequencing of several individual human genomes. These technologies are also being used in novel ways for mRNA expression profiling, genome-wide discovery of transcription-factor binding sites, small RNA discovery, etc. The multitude of sequencing platforms, each with its unique characteristics, poses a number of design challenges regarding the technology to be used and the depth of sequencing required for a particular sequencing application. Here we describe a number of analytical and empirical results to address design questions for two applications: detection of structural variations from paired-end sequencing and estimating mRNA transcript abundance. Results For structural variation, our results provide explicit trade-offs between the detection and resolution of rearrangement breakpoints, and the optimal mix of paired-read insert lengths. Specifically, we prove that optimal detection and resolution of breakpoints is achieved using a mix of exactly two insert library lengths. Furthermore, we derive explicit formulae to determine these insert length combinations, enabling a 15% improvement in breakpoint detection at the same experimental cost. On empirical short read data, these predictions show good concordance with Illumina 200 bp and 2 Kbp insert length libraries. For transcriptome sequencing, we determine the sequencing depth needed to detect rare transcripts from a small pilot study. With only 1 million reads, we derive corrections that enable almost perfect prediction of the underlying expression probability distribution, and use this to predict the sequencing depth required to detect low expressed genes with greater than 95% probability. Conclusions Together, our results form a generic framework for many design considerations related to high-throughput sequencing.
We provide software tools (http://bix.ucsd.edu/projects/NGS-DesignTools) to derive platform-independent guidelines for designing sequencing experiments (amount of sequencing, choice of insert length, mix of libraries) for novel applications of next-generation sequencing.

Bansal Vikas

2010-06-01
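
The "greater than 95% probability" depth question in the abstract above can be made concrete with a small worked sketch. It rests on a simplifying assumption that is not taken from the paper: each read is drawn independently and hits the transcript of interest with a fixed probability p. The paper's own corrections to the expression distribution are not reproduced here, and the function names are hypothetical.

```python
import math


def detection_probability(p, n_reads):
    """Probability that at least one of n_reads independent reads hits a
    transcript whose per-read sampling probability is p (0 < p < 1)."""
    # 1 - (1 - p)^n, written with log1p/expm1 for accuracy at very small p.
    return -math.expm1(n_reads * math.log1p(-p))


def required_reads(p, target=0.95):
    """Smallest read count n with detection_probability(p, n) >= target,
    from solving 1 - (1 - p)^n >= target for n."""
    return math.ceil(math.log(1.0 - target) / math.log1p(-p))


if __name__ == "__main__":
    p = 1e-6  # illustrative value: one read in a million comes from the transcript
    n = required_reads(p)
    print(n, detection_probability(p, n))
```

Under this simple model the required depth scales roughly as 3/p for a 95% target, since ln(0.05) is about -3.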

174

A Human Genome Structural Variation Sequencing Resource Reveals Insights into Mutational Mechanisms  

OpenAIRE

Understanding the prevailing mutational mechanisms responsible for human genome structural variation requires uniformity in the discovery of allelic variants and precision in terms of breakpoint delineation. We develop a resource based on capillary end-sequencing of 13.8 million fosmid clones from 17 human genomes and characterize the complete sequence of 1,054 large structural variants corresponding to 589 deletions, 384 insertions, and 81 inversions. We analyze the 2,081 breakpoint junction...

Kidd, Jeffrey M.; Graves, Tina; Newman, Tera; Fulton, Robert; Hayden, Hillary S.; Malig, Maika; Kallicki, Joelle; Kaul, Rajinder; Wilson, Richard K.; Eichler, Evan E.

2010-01-01

175

RNAProfile: an algorithm for finding conserved secondary structure motifs in unaligned RNA sequences  

OpenAIRE

The recent interest sparked by the discovery of a variety of functions for non-coding RNA molecules has highlighted the need for suitable tools for the analysis and the comparison of RNA sequences. Many trans-acting non-coding RNA genes and cis-acting RNA regulatory elements present motifs, conserved both in structure and sequence, that can hardly be detected by primary sequence analysis alone. We present an algorithm that takes as input a set of unaligned RNA sequences expected to share ...

Pavesi, Giulio; Mauri, Giancarlo; Stefani, Marco; Pesole, Graziano

2004-01-01

176

Using deep RNA sequencing for the structural annotation of the Laccaria bicolor mycorrhizal transcriptome.  

Energy Technology Data Exchange (ETDEWEB)

Accurate structural annotation is important for prediction of function and required for in vitro approaches to characterize or validate the gene expression products. Despite significant efforts in the field, determination of the gene structure from genomic data alone is a challenging and inaccurate process. The ease of acquisition of transcriptomic sequence provides a direct route to identify expressed sequences and determine the correct gene structure. We developed methods to utilize RNA-seq data to correct errors in the structural annotation and extend the boundaries of current gene models using assembly approaches. The methods were validated with a transcriptomic data set derived from the fungus Laccaria bicolor, which develops a mycorrhizal symbiotic association with the roots of many tree species. Our analysis focused on the subset of 1501 gene models that are differentially expressed in the free living vs. mycorrhizal transcriptome and are expected to be important elements related to carbon metabolism, membrane permeability and transport, and intracellular signaling. Of the set of 1501 gene models, 1439 (96%) successfully generated modified gene models in which all error flags were successfully resolved and the sequences aligned to the genomic sequence. The remaining 4% (62 gene models) either had deviations from transcriptomic data that could not be spanned or generated sequence that did not align to genomic sequence. The outcome of this process is a set of high confidence gene models that can be reliably used for experimental characterization of protein function. 69% of expressed mycorrhizal JGI 'best' gene models deviated from the transcript sequence derived by this method. The transcriptomic sequence enabled correction of a majority of the structural inconsistencies and resulted in a set of validated models for 96% of the mycorrhizal genes. 
The method described here can be applied to improve gene structural annotation in other species, provided that there is a sequenced genome and a set of gene models.

Larsen, P. E.; Trivedi, G.; Sreedasyam, A.; Lu, V.; Podila, G. K.; Collart, F. R.; Biosciences Division; Univ. of Alabama

2010-07-06

177

TurboFold: Iterative probabilistic estimation of secondary structures for multiple RNA sequences  

Directory of Open Access Journals (Sweden)

Full Text Available Abstract Background The prediction of secondary structure, i.e. the set of canonical base pairs between nucleotides, is a first step in developing an understanding of the function of an RNA sequence. The most accurate computational methods predict conserved structures for a set of homologous RNA sequences. These methods usually suffer from high computational complexity. In this paper, TurboFold, a novel and efficient method for secondary structure prediction for multiple RNA sequences, is presented. Results TurboFold takes, as input, a set of homologous RNA sequences and outputs estimates of the base pairing probabilities for each sequence. The base pairing probabilities for a sequence are estimated by combining intrinsic information, derived from the sequence itself via the nearest neighbor thermodynamic model, with extrinsic information, derived from the other sequences in the input set. For a given sequence, the extrinsic information is computed by using pairwise-sequence-alignment-based probabilities for co-incidence with each of the other sequences, along with estimated base pairing probabilities, from the previous iteration, for the other sequences. The extrinsic information is introduced as free energy modifications for base pairing in a partition function computation based on the nearest neighbor thermodynamic model. This process yields updated estimates of base pairing probability. The updated base pairing probabilities in turn are used to recompute extrinsic information, resulting in the overall iterative estimation procedure that defines TurboFold. TurboFold is benchmarked on a number of ncRNA datasets and compared against alternative secondary structure prediction methods. The iterative procedure in TurboFold is shown to improve estimates of base pairing probability with each iteration, though only small gains are obtained beyond three iterations. 
Secondary structures composed of base pairs with estimated probabilities higher than a significance threshold are shown to be more accurate for TurboFold than for alternative methods that estimate base pairing probabilities. TurboFold-MEA, which uses base pairing probabilities from TurboFold in a maximum expected accuracy algorithm for secondary structure prediction, has accuracy comparable to the best performing secondary structure prediction methods. The computational and memory requirements for TurboFold are modest and, in terms of sequence length and number of sequences, scale much more favorably than joint alignment and folding algorithms. Conclusions TurboFold is an iterative probabilistic method for predicting secondary structures for multiple RNA sequences that efficiently and accurately combines the information from the comparative analysis between sequences with the thermodynamic folding model. Unlike most other multi-sequence structure prediction methods, TurboFold does not enforce strict commonality of structures and is therefore useful for predicting structures for homologous sequences that have diverged significantly. TurboFold can be downloaded as part of the RNAstructure package at http://rna.urmc.rochester.edu.

Sharma Gaurav

2011-04-01

178

New 1-D and 3-D thiocyanatocadmates modified by various amine molecules and Cl(-)/CH3COO(-) ions: synthesis, structural characterization, thermal behavior and photoluminescence properties.  

Science.gov (United States)

Under ambient conditions, reactions of CdCl2/Cd(CH3COO)2, SCN(-) and various organic amine molecules in strongly acidic solutions afforded the five new thiocyanatocadmates [H2(abpy)][CdCl2(SCN)2] (abpy = azobispyridine) , [H(apy)][Cd(SCN)3] (apy = 4-aminopyridine) , [H(ba)]2[CdCl2(SCN)2] (ba = tert-butylamine) , [H2(tmen)][Cd3Cl6(SCN)2] (tmen = N,N,N',N'-tetramethylethylenediamine) , and [H(dba)]2[Cd2(CH3COO)2(SCN)4] (dba = dibutylamine) . In compound only, the CH3COO(-) ions in Cd(CH3COO)2 were completely displaced by SCN(-), producing a chained thiocyanatocadmate [Cd(SCN)3](-). In the other four compounds, the Cl(-) or CH3COO(-) ions appeared in the final inorganic anion frameworks. In compound , the Cl(-) ions doubly bridge the Cd(2+) centers, forming a one-dimensional (1-D) infinite chain, and the SCN(-) group exists in a terminal form, whereas in compound , the reverse situation is observed. Due to a trans-mode arrangement for two terminal Cl(-) or SCN(-) ions around each Cd(2+) center, the inorganic anion chains in compounds and both show a linear shape. In compound , Cd(2+) and Cl(-) first aggregate to form a 1-D endless chain with a composition of Cd3Cl6, which can be described as a linear arrangement of the open double cubanes. SCN(-) serves as the second connector, propagating the Cd3Cl6 chain into a three-dimensional (3-D) network with the occluded H2(tmen)(2+) cations. In compound , the SCN(-) groups doubly bridge the Cd(2+) centers, forming a 1-D zigzag-shape chain. The formation of the zigzag chain likely derives from chelation of the CH3COO(-) group to the Cd(2+) center. The thermal behavior and the photoluminescence properties of the title compounds were also investigated. PMID:25669175

Guo, Bing; Zhang, Xiao; Wang, Yan-Ning; Huang, Jing-Jing; Yu, Jie-Hui; Xu, Ji-Qing

2015-03-01

179

Detection of structural DNA variation from next generation sequencing data: a review of informatic approaches.  

Science.gov (United States)

Next generation sequencing (NGS), or massively parallel sequencing, refers to a collective group of methods in which numerous sequencing reactions take place simultaneously, resulting in enormous amounts of sequencing data for a small fraction of the cost of Sanger sequencing. Typically short (50-250 bp), NGS reads are first mapped to a reference genome, and then variants are called from the mapped data. While most NGS applications focus on the detection of single nucleotide variants (SNVs) or small insertions/deletions (indels), structural variation, including translocations, larger indels, and copy number variation (CNV), can be identified from the same data. Structural variation detection can be performed from whole genome NGS data or "targeted" data including exomes or gene panels. However, while targeted sequencing greatly increases sequencing coverage or depth of particular genes, it may introduce biases in the data that require specialized informatic analyses. In the past several years, there have been considerable advances in methods used to detect structural variation, and a full range of variants from SNVs to balanced translocations to CNV can now be detected with reasonable sensitivity from either whole genome or targeted NGS data. Such methods are being rapidly applied to clinical testing where they can supplement or in some cases replace conventional fluorescence in situ hybridization or array-based testing. Here we review some of the informatics approaches used to detect structural variation from NGS data. PMID:24405614

Abel, Haley J; Duncavage, Eric J

2013-12-01

180

Four basic symmetry types in the universal 7-cluster structure of 143 complete bacterial genomic sequences  

CERN Document Server

Coding information is the main source of heterogeneity (non-randomness) in the sequences of bacterial genomes. This information can be naturally modeled by analysing cluster structures in the "in-phase" triplet distributions of relatively short genomic fragments (200-400 bp). We found a universal 7-cluster structure in bacterial genomic sequences and explained its properties. We show that codon usage of bacterial genomes is a multi-linear function of their genomic G+C-content with high accuracy. Based on the analysis of 143 completely sequenced bacterial genomes available in GenBank in August 2004, we show that there are four "pure" types of the 7-cluster structure observed. Animated 3D scatter plots of the clusters for all 143 genomes are collected in a database that is available on our website: http://www.ihes.fr/~zinovyev/7clusters. The finding can be readily introduced into any software for gene prediction, sequence alignment or bacterial genome classification.

Gorban, A N; Zinovyev, A Yu

2011-01-01

181

Comprehensive characterization of complex structural variations in cancer by directly comparing genome sequence reads.  

Science.gov (United States)

The development of high-throughput sequencing technologies has advanced our understanding of cancer. However, characterizing somatic structural variants in tumor genomes is still challenging because current strategies depend on the initial alignment of reads to a reference genome. Here, we describe SMUFIN (somatic mutation finder), a single program that directly compares sequence reads from normal and tumor genomes to accurately identify and characterize a range of somatic sequence variation, from single-nucleotide variants (SNV) to large structural variants at base pair resolution. Performance tests on modeled tumor genomes showed average sensitivity of 92% and 74% for SNVs and structural variants, with specificities of 95% and 91%, respectively. Analyses of aggressive forms of solid and hematological tumors revealed that SMUFIN identifies breakpoints associated with chromothripsis and chromoplexy with high specificity. SMUFIN provides an integrated solution for the accurate, fast and comprehensive characterization of somatic sequence variation in cancer. PMID:25344728

Moncunill, Valentí; Gonzalez, Santi; Beà, Sílvia; Andrieux, Lise O; Salaverria, Itziar; Royo, Cristina; Martinez, Laura; Puiggròs, Montserrat; Segura-Wang, Maia; Stütz, Adrian M; Navarro, Alba; Royo, Romina; Gelpí, Josep L; Gut, Ivo G; López-Otín, Carlos; Orozco, Modesto; Korbel, Jan O; Campo, Elias; Puente, Xose S; Torrents, David

2014-11-01

182

Structure and sequence variation of mink interleukin-6 gene  

International Nuclear Information System (INIS)

Aleutian disease (AD) is the number one disease threat to the survival and future of the mink industry in Nova Scotia and the world. Several ranchers have gone out of business in recent years in Nova Scotia as a direct result of AD. Currently, the control measure for AD consists of testing and slaughtering of infected mink. This practice has not been effective in controlling the disease. Finding a means of controlling AD is the number one priority for the mink industry in Nova Scotia. An effective control measure will have a long-term positive effect on the rural economy by improving production potential of mink and reducing production cost. It has been shown that antiviral antibodies produced by activated immune system cells sometimes combine with interleukin-6 (IL-6) to form immune complexes that cause AD in mink. There is evidence of a significant relationship between nucleotide variations in the IL-6 gene and the onset of certain diseases in humans, which bear symptoms similar to AD. Furthermore, pathological symptoms of AD resemble those of other conditions, such as systemic lupus erythematosus (SLE) and Castleman disease in humans, where overproduction of IL-6 coincides with the severity of the disease. These findings suggest that IL-6 could be a candidate gene and warrant investigation vis-a-vis differences among mink genotypes in resistance or tolerance to ADV infection. The IL-6 gene in mink was sequenced, and identification of polymorphisms was used to evaluate the potential role of this gene in the immune system response to infections. The 4678 bp promoter region, five exons and four introns of the interleukin-6 (IL-6) gene were bi-directionally sequenced in four unrelated mink from each of the wild, black, brown, pastel and sapphire colour types (GenBank accession number EF620932). The 344 bp promoter region of the gene contained several transcription binding sites.
One exonic and seven intronic single nucleotide polymorphisms (SNP) were detected by sequencing of the 20 mink and genotyping of an additional 82 animals from the five colour types. Only two intronic SNP were segregating at high frequencies, indicating that the level of polymorphisms in the mink IL-6 gene was low. A bi-allelic tetranucleotide repeat was detected in the promoter region, with the frequency of 0.0, 0.17, 0.25, 0.25 and 0.40 in the wild, black, pastel, brown and sapphire mink, respectively, suggesting that this locus may influence immune response to infection. A polymorphic (CA)16 with 10 alleles was also detected in intron 2. (author)

183

1D transfer matrices  

International Nuclear Information System (INIS)

Many problems of physical interest - for instance, in statistical mechanics - are described by linear ordinary second-order differential systems for which different types of transfer matrices can be introduced and used. Focusing on heterostructures where matching at interfaces is involved, this paper discusses two of them with emphasis on one, here denoted T, which involves the linear differential form expressing the physical quantities matched at the interfaces. The mathematical background is summarized in a simple way and then T is used to study two types of heterostructures involving a large number of interfaces. Firstly, the regular periodic superlattices are studied and the role of different boundary conditions (BCs) at the end of one period is discussed. Only periodic BCs are suitable to study simple regular superlattices, but the discussion provides the background to study different approximants when the period is a largish generation of a quasi-regular heterostructure, like, for instance, a Fibonacci sequence. (author)
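
A minimal numerical sketch of the setting described in the abstract above, under assumptions that are not taken from the paper: each layer obeys u'' + k^2 u = 0 with a real k, the quantities matched at interfaces are simply u and u', and the period is a Fibonacci word built from two layer types A and B. All function names and layer parameters are hypothetical.

```python
import math


def layer_matrix(k, d):
    """Transfer matrix carrying (u, u') across a layer of thickness d
    in which u'' + k^2 u = 0 (assumed real, non-zero k)."""
    c, s = math.cos(k * d), math.sin(k * d)
    return [[c, s / k], [-k * s, c]]


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][m] * b[m][j] for m in range(2)) for j in range(2)]
            for i in range(2)]


def fibonacci_word(generation):
    """Layer sequence from the substitution A -> AB, B -> A, starting at 'A'."""
    word = "A"
    for _ in range(generation):
        word = "".join("AB" if ch == "A" else "A" for ch in word)
    return word


def period_matrix(word, layers):
    """Transfer matrix of one period; layers maps 'A'/'B' to (k, d)."""
    total = [[1.0, 0.0], [0.0, 1.0]]
    for ch in word:
        k, d = layers[ch]
        # Later layers act on the result of earlier ones, so multiply on the left.
        total = matmul(layer_matrix(k, d), total)
    return total


def is_allowed(total):
    """Periodic (Bloch) condition for a unimodular period matrix:
    propagating solutions exist when |trace| <= 2."""
    return abs(total[0][0] + total[1][1]) <= 2.0


if __name__ == "__main__":
    layers = {"A": (1.0, 0.7), "B": (1.6, 0.4)}  # illustrative (k, d) values
    word = fibonacci_word(4)
    print(word, is_allowed(period_matrix(word, layers)))
```

Treating a higher Fibonacci generation as the repeating period gives successively better periodic approximants of the quasi-regular structure, which is the use the abstract points to.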

184

Detecting the temporal structure of sound sequences in newborn infants.  

Science.gov (United States)

Most high-level auditory functions require one to detect the onset and offset of sound sequences as well as registering the rate at which sounds are presented within the sound trains. By recording event-related brain potentials to onsets and offsets of tone trains as well as to changes in the presentation rate, we tested whether these fundamental auditory capabilities are functional at birth. Each of these events elicited significant event-related potential components in sleeping healthy neonates. The data thus demonstrate that the newborn brain is sensitive to these acoustic features suggesting that infants are geared towards the temporal aspects of segregating sound sources, speech and music perception already at birth. PMID:25722025

Háden, Gábor P; Honing, Henkjan; Török, Miklós; Winkler, István

2015-04-01

185

Sequence- and structure-based prediction of eukaryotic protein phosphorylation sites  

DEFF Research Database (Denmark)

Protein phosphorylation at serine, threonine or tyrosine residues affects a multitude of cellular signaling processes. How is specificity in substrate recognition and phosphorylation by protein kinases achieved? Here, we present an artificial neural network method that predicts phosphorylation sites in independent sequences with a sensitivity in the range from 69% to 96%. As an example, we predict novel phosphorylation sites in the p300/CBP protein that may regulate interaction with transcription factors and histone acetyltransferase activity. In addition, serine and threonine residues in p300/CBP that can be modified by O-linked glycosylation with N-acetylglucosamine are identified. Glycosylation may prevent phosphorylation at these sites, a mechanism named yin-yang regulation. The prediction server is available on the Internet at http://www.cbs.dtu.dk/services/NetPhos/ or via e-mail to NetPhos@cbs.dtu.dk. Copyright 1999 Academic Press.

Blom, Nikolaj; Gammeltoft, Steen

1999-01-01

186

CATH: comprehensive structural and functional annotations for genome sequences.  

Science.gov (United States)

The latest version of the CATH-Gene3D protein structure classification database (4.0, http://www.cathdb.info) provides annotations for over 235,000 protein domain structures and includes 25 million domain predictions. This article provides an update on the major developments in the 2 years since the last publication in this journal including: significant improvements to the predictive power of our functional families (FunFams); the release of our 'current' putative domain assignments (CATH-B); a new, strictly non-redundant data set of CATH domains suitable for homology benchmarking experiments (CATH-40) and a number of improvements to the web pages. PMID:25348408

Sillitoe, Ian; Lewis, Tony E; Cuff, Alison; Das, Sayoni; Ashford, Paul; Dawson, Natalie L; Furnham, Nicholas; Laskowski, Roman A; Lee, David; Lees, Jonathan G; Lehtinen, Sonja; Studer, Romain A; Thornton, Janet; Orengo, Christine A

2015-01-01

187

The solution structure of a chimeric LEKTI domain reveals a chameleon sequence.  

Science.gov (United States)

The conversion of an alpha-helical to a beta-strand conformation and the presence of chameleon sequences are fascinating from the perspective that such structural features are implicated in the induction of amyloid-related fatal diseases. In this study, we have determined the solution structure of a chimeric domain (Dom1PI) from the multidomain Kazal-type serine proteinase inhibitor LEKTI using multidimensional NMR spectroscopy. This chimeric protein was constructed to investigate the reasons for differences in the folds of the homologous LEKTI domains 1 and 6 [Lauber, T., et al. (2003) J. Mol. Biol. 328, 205-219]. In Dom1PI, two adjacent phenylalanine residues (F28 and F29) of domain 1 were substituted with proline and isoleucine, respectively, as found in the corresponding P4' and P5' positions of domain 6. The three-dimensional structure of Dom1PI is significantly different from the structure of domain 1 and closely resembles the structure of domain 6, despite the sequence being identical to that of domain 1 except for the two substituted phenylalanine residues and being only 31% identical to the sequence of domain 6. The mutation converted a short 3(10)-helix into an extended loop conformation and parts of the long COOH-terminal alpha-helix of domain 1 into a beta-hairpin structure. The latter conformational change occurs in a sequence stretch distinct from the region containing the substituted residues. Therefore, this switch from an alpha-helical structure to a beta-hairpin structure indicates a chameleon sequence of seven residues. We conclude that the secondary structure of Dom1PI is determined not only by the local protein sequence but also by nonlocal interactions. PMID:15366933

Tidow, Henning; Lauber, Thomas; Vitzithum, Klaus; Sommerhoff, Christian P; Rösch, Paul; Marx, Ute C

2004-09-01

188

Elemental mapping of multilayered structures: A method to reconstruct 2D chemical maps from a set of 1D line scans  

International Nuclear Information System (INIS)

We introduce a method to characterize the chemical distribution in nanostructures using STEM and affiliated spectroscopy techniques. The method is applicable to any nanostructure where the continuous layers of arbitrary geometry and dimensions can be identified. The key feature of the suggested approach is digital warping of the original STEM image into the quasi-1D image. The chemical profiles of high resolution and high signal-to-noise ratio can be extracted from the minimal set of the STEM spectroscopy data while minimizing material damage during acquisitions. Finally, the 2D chemical maps of the area of interest are reconstructed. -- Highlights: Layered nanostructure is mapped while applying a minimal electron dose. Spectroscopic data are aligned and accumulated using digital warping of the STEM image. Chemical maps are reconstructed by reverse warping.

189

Sequence Analysis of the Protein Structure Homology Modeling of Growth Hormone Gene from Salmo trutta caspius  

Directory of Open Access Journals (Sweden)

Full Text Available The growth hormone protein from Salmo trutta caspius was investigated and characterized. The full-length growth hormone gene in Salmo trutta caspius has six exons, with a molecular weight (kDa) of 64.98 for ssDNA and 129.6 for dsDNA. There are also 210 amino acid residues. The assembled full-length DNA contains the open reading frame of the growth hormone gene, which contains 15 sequences in the full length. The average GC content is 47% and the AT content is 53%. Multiple protein alignment showed that this peptide is 100% identical to the corresponding homologous growth hormone proteins, including those of Salmo salar (accession number: AAA49558.1) and rainbow trout (Salmo trutta) (accession number: AAA49555.1). The protein sequence has been deposited in GenBank under accession number AEK70940. We also analyzed the secondary and tertiary structures of the sequences reported in GenBank. The results show homology in secondary structure among the three sequences: Salmo trutta caspius, Salmo salar and rainbow trout. Regarding tertiary structure, Salmo trutta caspius and Salmo salar are of the same type, whereas rainbow trout differs from both. However, three parallel α-helices were observed in the sequences, and the secondary structures contained almost the same percentage of β-sheet.

Abolhasan Rezaei

2012-03-01

190

Coronal structure geometries on pre-main sequence stars  

CERN Document Server

We have re-analyzed using a hydrodynamic model large flaring events on three different categories of pre-main sequence (PMS) stars: the young stellar object (YSO) YLW 15, the classical T Tauri star (CTTS) LkHalpha92, the weak-line T Tauri star (WTTS) V773 Tau, and the WTTS HD 283572 (the first three objects were observed by ASCA, the last by ROSAT; all have been previously reported in the literature). The first three flares were previously analyzed on the basis of the quasi-static model mostly used up to now, consistently yielding large loops (L >= R*) and no evidence of sustained heating. Our hydrodynamic modeling approach, however, shows that the size of the flaring regions must be much smaller (L <=R*) and moreover this method shows in all cases evidence of vigorous sustained heating during the flare decay, so that the decay of the observed light curve actually reflects the temporal profile of the heating rather than that of the free decay of the heated loop(s). The events on the protostar YLW 15 have d...

Favata, F; Reale, F

2001-01-01

191

Thousands of corresponding human and mouse genomic regions unalignable in primary sequence contain common RNA structure  

DEFF Research Database (Denmark)

Human and mouse genome sequences contain roughly 100,000 regions that are unalignable in primary sequence and neighbor corresponding alignable regions between both organisms. These pairs are generally assumed to be nonconserved, although the level of structural conservation between these has never been investigated. Owing to the limitations in computational methods, comparative genomics has been lacking the ability to compare such nonconserved sequence regions for conserved structural RNA elements. We have investigated the presence of structural RNA elements by conducting a local structural alignment, using FOLDALIGN, on a subset of these 100,000 corresponding regions and estimate that 1800 contain common RNA structures. Comparing our results with the recent mapping of transcribed fragments (transfrags) in human, we find that high-scoring candidates are twice as likely to be found in regions overlapped by transfrags than regions that are not overlapped by transfrags. To verify the coexpression between predicted candidates in human and mouse, we conducted expression studies by RT-PCR and Northern blotting on mouse candidates, which overlap with transfrags on human chromosome 20. RT-PCR results confirmed expression of 32 out of 36 candidates, whereas Northern blots confirmed four out of 12 candidates. Furthermore, many RT-PCR results indicate differential expression in different tissues. Hence, our findings suggest that there are corresponding regions between human and mouse, which contain expressed non-coding RNA sequences not alignable in primary sequence.

Torarinsson, Elfar; Sawera, Milena

2006-01-01

192

Structural domains of phytochrome deduced from homologies in amino acid sequences.  

Science.gov (United States)

A method of semiempirical identification of structural domains is proposed. The procedure is based on the comparison of amino acid sequences in groups of homologous proteins. This approach was tested using 32 known protein sequences from different cytochrome b5, cytochrome c, lysozyme, hemoglobin, and myoglobin proteins. The method presented was able to identify all structural domains of these reference proteins. A consensus secondary structure provided information on the structural content of these domains, correctly predicting 21 of 23 (91%) alpha-helices. We applied this method to six homologous phytochrome sequences from Avena, Arabidopsis, Cucurbita, Maize, Oryza, and Pisum. Some of the identified domains can be assigned to the known tertiary structure categories. For example, an alpha/beta domain is localized in the region known to stabilize the phytochrome chromophore in the red light absorbing form (Pr). One alpha-helical domain and one alpha/beta domain are localized in regions important for chromophore stabilization in the far-red absorbing form (Pfr). From an analysis of noncovalent interaction patterns in another domain, it is proposed that a phytochrome dimer contact involves two segments localized between residues 730 and 821 (using numbering of aligned sequences). A possible antiparallel beta-sheet structure of this region has also been suggested. According to this model, the long axis of the interacting structures is perpendicular to a twofold symmetry axis of the phytochrome dimer. PMID:1326984

Romanowski, M; Song, P S

1992-04-01

193

Structural evolution and magnetic properties of Co(II) coordination polymers varied from 1D to 3D constructed by 1,4-bis(1,2,4-triazol-1-ylmethyl)benzene.  

Science.gov (United States)

Seven Co(II) coordination polymers, [Co(btx)(3)(H(2)O)(2)](ClO(4))(2)·(btx)·2H(2)O (1), [Co(btx)(3)(H(2)O)(2)](BF(4))(2)·(btx)·2H(2)O (2), [Co(btx)(2)(H(2)O)(2)](NO(3))(2)·2H(2)O (3), [Co(btx)(2)Cl(2)] (4), [Co(btx)(BA)(2)(H(2)O)(2)]·2HBA (5), [Co(btx)(IPA)] (6) and [Co(3)(btx)(3)(BTA)(2)(H(2)O)(2)] (7) (btx = (1,4-bis(1,2,4-triazol-1-ylmethyl)benzene), HBA = benzoic acid, H(2)IPA = isophthalic acid, H(3)BTA = benzene-1,3,5-tricarboxylic acid), have been hydrothermally synthesized and characterized. 1 and 2 are isostructural and show a 1D Co-μ(2)-btx-Co chain structure, in which btx acts as both a bridging and terminal ligand. 3 is also a 1D chain structure but different from 1 and 2. The Co(II) ions are bridged by double μ(2)-btx to form Co(2)-btx(2) rings, which were further connected into 1D chains by sharing the Co(II) ions of the rings. 4 exists as a 2D grid with (4,4) topology structure. When aromatic acid was introduced to the synthetic system, three other coordination polymers 5-7 were obtained. In 5, the 1D chain is as that of 1, except that the terminal ligand was replaced by BA(-). 6 shows a two-fold parallel interpenetration framework featuring a 6-c uninodal net with (3(3),4(6),5(5),6) Schläfli topological symbol. 7 is an interesting 3D framework, which contains a 2-nodal net motif with the unprecedented (3(6),4(2),5(6),6)(3(9),4(9),5(3))(2) topology structure. The influence of the varieties of the structures and magnetic properties are studied and discussed in detail. PMID:21720639

Zhang, Shi-Yuan; Zhang, Zhen-Jie; Shi, Wei; Zhao, Bin; Cheng, Peng; Liao, Dai-Zheng; Yan, Shi-Ping

2011-08-21

194

MSACompro: protein multiple sequence alignment using predicted secondary structure, solvent accessibility, and residue-residue contacts  

OpenAIRE

Abstract Background Multiple Sequence Alignment (MSA) is a basic tool for bioinformatics research and analysis. It has been used essentially in almost all bioinformatics tasks such as protein structure modeling, gene and protein function prediction, DNA motif recognition, and phylogenetic analysis. Therefore, improving the accuracy of multiple sequence alignment is important for advancing many bioinformatics fields. Results We designed and developed a new method, MSACompro, to synergistically...

Deng Xin; Cheng Jianlin

2011-01-01

195

Genome sequence, comparative analysis and haplotype structure of the domestic dog.  

OpenAIRE

Here we report a high-quality draft genome sequence of the domestic dog (Canis familiaris), together with a dense map of single nucleotide polymorphisms (SNPs) across breeds. The dog is of particular interest because it provides important evolutionary information and because existing breeds show great phenotypic diversity for morphological, physiological and behavioural traits. We use sequence comparison with the primate and rodent lineages to shed light on the structure and evolution of geno...

Lindblad-toh, K.; Wade, Cm; Mikkelsen, Ts; Karlsson, Ek; Jaffe, Db; Kamal, M.; Clamp, M.; Chang, Jl; Kulbokas, Ej; Zody, Mc; Mauceli, E.; Xie, X.; Breen, M.; Wayne, Rk; Ostrander, Ea

2005-01-01

196

Human serotonin 1D receptor is encoded by a subfamily of two distinct genes: 5-HT1D alpha and 5-HT1D beta.  

Science.gov (United States)

The serotonin 1D (5-HT1D) receptor is a pharmacologically defined binding site and functional receptor site. Observed variations in the properties of 5-HT1D receptors in different tissues have led to the speculation that multiple receptor proteins with slightly different properties may exist. We report here the cloning, deduced amino acid sequences, pharmacological properties, and second-messenger coupling of a pair of human 5-HT1D receptor genes, which we have designated 5-HT1D alpha and 5-HT1D beta due to their strong similarities in sequence, pharmacological properties, and second-messenger coupling. Both genes are free of introns in their coding regions, are expressed in the human cerebral cortex, and can couple to inhibition of adenylate cyclase activity. The pharmacological binding properties of these two human receptors are very similar, and match closely the pharmacological properties of human, bovine, and guinea pig 5-HT1D sites. Both receptors exhibit high-affinity binding of sumatriptan, a new anti-migraine medication, and thus are candidates for the pharmacological site of action of this drug. PMID:1565658

Weinshank, R L; Zgombick, J M; Macchi, M J; Branchek, T A; Hartig, P R

1992-01-01

197

Implicit Sequence Learning: Effects of Level of Structure, Adult Age, and Extended Practice  

OpenAIRE

The influence of structure and age on sequence learning was investigated by testing 24 young and 24 older participants for 10 sessions in an alternating serial response time task in which pattern trials alternated with random trials. Individuals encountered lag-2 or lag-3 structure, and learning was measured by the difference (in response time and accuracy) between pattern and random trials. Both ages learned lag-2 structure, but the young learned more than the older participants. Only the yo...

Howard, Darlene V.; Howard, James H.; Japikse, Karin; Diyanni, Cara; Thompson, Amanda; Somberg, Rachel

2004-01-01

198

FragSeq: transcriptome-wide RNA structure probing using high-throughput sequencing  

OpenAIRE

Previous efforts to determine structures of non-coding RNA (ncRNA) probed only one RNA at a time with enzymes and chemicals, using gel electrophoresis to identify reactive positions. To accelerate RNA structure inference, we have developed FragSeq, a high-throughput RNA structure probing method that uses high-throughput RNA sequencing on fragments generated by nuclease P1, which specifically cleaves single stranded nucleic acids. In experiments probing the entire mouse nuclear transcriptome, ...

Underwood, Jason G.; Uzilov, Andrew V.; Katzman, Sol; Onodera, Courtney S.; Mainzer, Jacob E.; Mathews, David H.; Lowe, Todd M.; Salama, Sofie R.; Haussler, David

2010-01-01

199

Sequence-structure alignment using a statistical analysis of core models and dynamic programming  

OpenAIRE

The expanding availability of protein data enforces the application of empirical methods necessary to recognize protein structures. In this paper a sequence-structure alignment method is described and applied to various Ubiquitin-like folded Ras-binding domains. On the basis of two probability functions that evaluate similarities between the occurrence of amino-acids in the primary and secondary protein structure, different versions of simple scoring functions are proposed. The application of...

Brunnert, Marcus; Fischer, Paul; Urfer, Wolfgang

2002-01-01

200

HorA web server to infer homology between proteins using sequence and structural similarity  

OpenAIRE

The biological properties of proteins are often gleaned through comparative analysis of evolutionary relatives. Although protein structure similarity search methods detect more distant homologs than purely sequence-based methods, structural resemblance can result from either homology (common ancestry) or analogy (similarity without common ancestry). While many existing web servers detect structural neighbors, they do not explicitly address the question of homology versus analogy. Here, we pre...

Kim, Bong-hyun; Cheng, Hua; Grishin, Nick V.

2009-01-01

201

Comparative analysis of MR sequences to detect structural brain lesions in tuberous sclerosis  

International Nuclear Information System (INIS)

Tuberous sclerosis (TS) is a neurocutaneous genetically inherited disease with variable penetrance characterized by dysplasias and hamartomas affecting multiple organs. MR is the imaging method of choice to demonstrate structural brain lesions in TS. The aim was to compare MR sequences and determine which is most useful for the demonstration of each type of brain lesion in TS patients. We reviewed MR scans of 18 TS patients for the presence of cortical tubers, white matter lesions (radial bands), subependymal nodules, and subependymal giant cell astrocytoma (SGCA) on the following sequences: (1) T1-weighted spin-echo (T1 SE) images before and after gadolinium (Gd) injection; (2) nonenhanced T1 SE sequence with an additional magnetization transfer contrast medium pulse on resonance (T1 SE/MTC); and (3) fluid-attenuated inversion recovery (FLAIR) sequence. Cortical tubers were found in significantly (P<0.05) larger numbers and more conspicuously in FLAIR and T1 SE/MTC sequences. The T1 SE/MTC sequence was far superior to other methods in detecting white matter lesions (P<0.01). There was no significant difference between the T1 SE/MTC and T1 SE (before and after Gd injection) sequences in the detection of subependymal nodules; the FLAIR sequence showed less sensitivity than the others in identifying the nodules. T1 SE sequences after Gd injection better demonstrated the limits of the SGCA. We demonstrated the importance of appropriate MRI sequences for diagnosis of the most frequent brain lesions in TS. Our study reinforces the fact that each sequence has a particular application according to the type of TS lesion. Gd injection might be useful in detecting SGCA; however, the parameters of size and location are also important for a presumptive diagnosis of these tumors. (orig.)

202

Dimension reduction for extracting geometrical structure of multidimensional phase space: Application to fast energy exchange in the reaction O(1D)+N2O→NO+NO  

International Nuclear Information System (INIS)

One of the most fundamental problems in studying general Hamiltonian systems with many degrees of freedom is to extract a low-dimensional subsystem including the essential dynamics. In this paper, a new partial normal form (PNF) method is developed to reduce the number of coupling terms in the Hamiltonian and to simplify the dynamics analyses. The PNF method allows one to decouple many unimportant bath modes as well as the reactive mode from the system by assessing the significance of the coupling terms. The method is applied to the chemical reaction O(1D)+N2O→NO+NO, which was found to exhibit efficient energy exchange between the two NO stretching modes despite the short lifetime of the reaction intermediate [S. Kawai et al., J. Chem. Phys. 124, 184315 (2006)]. Through the analysis of the two-dimensional PNF Hamiltonian subsystem, it is found that the motion of the subsystem preserves the 'normal mode picture' of the symmetric and antisymmetric NO stretching modes despite its high energy. Then the vibrational energy, initially localized in the newly formed NO bond, is transferred to the reactants' NO bond through the beating between the symmetric and antisymmetric stretching modes. The preservation of the normal mode picture and the short period of the beating explain the fast energy exchange between the two NO bonds. This successful application proves that the PNF method can extract the essential small subspace from many-degrees-of-freedom Hamiltonian systems.

203

Comparative genomics beyond sequence-based alignments : RNA structures in the ENCODE regions  

DEFF Research Database (Denmark)

Recent computational scans for non-coding RNAs (ncRNAs) in multiple organisms have relied on existing multiple sequence alignments. However, as sequence similarity drops, a key signal of RNA structure--frequent compensating base changes--is increasingly likely to cause sequence-based alignment methods to misalign, or even refuse to align, homologous ncRNAs, consequently obscuring that structural signal. We have used CMfinder, a structure-oriented local alignment tool, to search the ENCODE regions of vertebrate multiple alignments. In agreement with other studies, we find a large number of potential RNA structures in the ENCODE regions. We report 6587 candidate regions with an estimated false-positive rate of 50%. More intriguingly, many of these candidates may be better represented by alignments taking the RNA secondary structure into account than those based on primary sequence alone, often quite dramatically. For example, approximately one-quarter of our predicted motifs show revisions in >50% of their aligned positions. Furthermore, our results are strongly complementary to those discovered by sequence-alignment-based approaches--84% of our candidates are not covered by Washietl et al., increasing the number of ncRNA candidates in the ENCODE region by 32%. In a group of 11 ncRNA candidates that were tested by RT-PCR, 10 were confirmed to be present as RNA transcripts in human tissue, and most show evidence of significant differential expression across tissues. Our results broadly suggest caution in any analysis relying on multiple sequence alignments in less well-conserved regions, clearly support growing appreciation for the biological significance of ncRNAs, and strongly support the argument for considering RNA structure directly in any searches for these elements.

Torarinsson, Elfar; Yao, Zizhen

2008-01-01

204

Online homology modelling as a means of bridging the sequence-structure gap.  

Science.gov (United States)

For even the best-studied species, there is a large gap in their representation in the protein databank (PDB) compared to within sequence databases. Typically, less than 2% of sequences are represented in the PDB. This is partly due to the considerable experimental challenge and manual inputs required to solve three-dimensional structures by methods such as X-ray diffraction and multi-dimensional nuclear magnetic resonance (NMR) spectroscopy in comparison to high-throughput sequencing. This gap is made even wider by the high level of redundancy within the PDB and under-representation of some protein categories such as membrane-associated proteins, which comprise approximately 25% of proteins encoded in genomes. A traditional route to closing the sequence-structure gap is offered by homology modelling, whereby the sequence of a target protein is modelled on a template represented in the PDB using in silico energy minimisation approaches. More recently, online homology servers have become available which automatically generate models from proffered sequences. However, many online servers give little indication of the structural plausibility of the generated model. In this paper, the online homology server Geno3D will be described. This server uses similar software to that used in modelling structures during structure determination and thus generates data allowing determination of the structural plausibility of models. For illustration, modelling of a chemotaxis protein (CheY) from Pseudomonas entomophila L48 (accession YP_609298) on a template (PDB id. 1mvo), the phosphorylation domain of an outer membrane protein PhoP from Bacillus subtilis, will be described. PMID:22064508

Sheehan, David; O'Sullivan, Siobhán

2011-01-01

205

SeqHound: biological sequence and structure database as a platform for bioinformatics research  

Directory of Open Access Journals (Sweden)

Abstract Background SeqHound has been developed as an integrated biological sequence, taxonomy, annotation and 3-D structure database system. It provides a high-performance server platform for bioinformatics research in a locally-hosted environment. Results SeqHound is based on the National Center for Biotechnology Information data model and programming tools. It offers daily updated contents of all Entrez sequence databases in addition to 3-D structural data and information about sequence redundancies, sequence neighbours, taxonomy, complete genomes, functional annotation including Gene Ontology terms and literature links to PubMed. SeqHound is accessible via a web server through a Perl, C or C++ remote API or an optimized local API. It provides functionality necessary to retrieve specialized subsets of sequences, structures and structural domains. Sequences may be retrieved in FASTA, GenBank, ASN.1 and XML formats. Structures are available in ASN.1, XML and PDB formats. Emphasis has been placed on complete genomes, taxonomy, domain and functional annotation as well as 3-D structural functionality in the API, while fielded text indexing functionality remains under development. SeqHound also offers a streamlined WWW interface for simple web-user queries. Conclusions The system has proven useful in several published bioinformatics projects such as the BIND database and offers a cost-effective infrastructure for research. SeqHound will continue to develop and be provided as a service of the Blueprint Initiative at the Samuel Lunenfeld Research Institute. The source code and examples are available under the terms of the GNU public license at the Sourceforge site http://sourceforge.net/projects/slritools/ in the SLRI Toolkit.

Dumontier Michel

2002-10-01

206

Integrating sequencing technologies in personal genomics: optimal low cost reconstruction of structural variants.  

Science.gov (United States)

The goal of human genome re-sequencing is obtaining an accurate assembly of an individual's genome. Recently, there has been great excitement in the development of many technologies for this (e.g. medium and short read sequencing from companies such as 454 and SOLiD, and high-density oligo-arrays from Affymetrix and NimbleGen), with even more expected to appear. The costs and sensitivities of these technologies differ considerably from each other. As an important goal of personal genomics is to reduce the cost of re-sequencing to an affordable point, it is worthwhile to consider optimally integrating technologies. Here, we build a simulation toolbox that will help us optimally combine different technologies for genome re-sequencing, especially in reconstructing large structural variants (SVs). SV reconstruction is considered the most challenging step in human genome re-sequencing. (It is sometimes even harder than de novo assembly of small genomes because of the duplications and repetitive sequences in the human genome.) To this end, we formulate canonical problems that are representative of issues in reconstruction and are of small enough scale to be computationally tractable and simulatable. Using semi-realistic simulations, we show how we can combine different technologies to optimally solve the assembly at low cost. With mappability maps, our simulations efficiently handle the inhomogeneous repeat-containing structure of the human genome and the computational complexity of practical assembly algorithms. They quantitatively show how combining different read lengths is more cost-effective than using one length, how an optimal mixed sequencing strategy for reconstructing large novel SVs usually also gives accurate detection of SNPs/indels, how paired-end reads can improve reconstruction efficiency, and how adding in arrays is more efficient than just sequencing for disentangling some complex SVs. 
Our strategy should facilitate the sequencing of human genomes at maximum accuracy and low cost. PMID:19593373

Du, Jiang; Bjornson, Robert D; Zhang, Zhengdong D; Kong, Yong; Snyder, Michael; Gerstein, Mark B

2009-07-01

207

Structural studies of α-bungarotoxin. 1. Sequence-specific 1H NMR resonance assignments  

International Nuclear Information System (INIS)

The authors report the complete sequence-specific assignment of the backbone resonances and most of the side-chain resonances in the 1H NMR spectrum of α-bungarotoxin by two-dimensional NMR. Problems with resonance overlap were resolved with the assistance of the HRNOESY experiment described in an accompanying paper. Significant differences exist between the solution structure described here and the crystal structure of α-bungarotoxin, on the basis of the proton to proton distances obtained by nuclear Overhauser enhancement spectroscopy (NOESY) and the corresponding distances from the X-ray crystal structure. These differences include a larger β-sheet in solution and a different orientation of the invariant tryptophan, Trp-28, making the solution structure more consistent with the crystal structure of the homologous neurotoxin α-cobratoxin. Four errors in the order of the amino acids in the primary sequence were indicated by the NMR data. These errors were confirmed by chemical means, as described in an accompanying paper.

208

FragSeq: transcriptome-wide RNA structure probing using high-throughput sequencing.  

Science.gov (United States)

Classical approaches to determine structures of noncoding RNA (ncRNA) probed only one RNA at a time with enzymes and chemicals, using gel electrophoresis to identify reactive positions. To accelerate RNA structure inference, we developed fragmentation sequencing (FragSeq), a high-throughput RNA structure probing method that uses high-throughput RNA sequencing of fragments generated by digestion with nuclease P1, which specifically cleaves single-stranded nucleic acids. In experiments probing the entire mouse nuclear transcriptome, we accurately and simultaneously mapped single-stranded RNA regions in multiple ncRNAs with known structure. We probed in two cell types to verify reproducibility. We also identified and experimentally validated structured regions in ncRNAs with, to our knowledge, no previously reported probing data. PMID:21057495

Underwood, Jason G; Uzilov, Andrew V; Katzman, Sol; Onodera, Courtney S; Mainzer, Jacob E; Mathews, David H; Lowe, Todd M; Salama, Sofie R; Haussler, David

2010-12-01

209

Evidence for intramolecularly folded i-DNA structures in biologically relevant CCC-repeat sequences.  

OpenAIRE

The structural behaviour of repetitive cytosine DNA is examined in the oligodeoxynucleotide sequences of (CCCTAA)3CCCT (HTC4), GC(TCCC)3TCCT(TCCC)3 (KRC6) and the methylated (CCCT)3TCCT(CCCT)3C (KRM6) by circular dichroism (CD), gel electrophoresis (PAGE), and ultra violet (UV) absorbance studies. All the three sequences exhibit a pH-induced cooperative structural transition as monitored by CD. An intense positive CD band around 285 nm develops on lowering the pH from 8 to slightly acidic con...

Manzini, G.; Yathindra, N.; Xodo, L. E.

1994-01-01

210

Consequences of domain insertion on sequence-structure divergence in a superfold.  

Science.gov (United States)

Although the universe of protein structures is vast, these innumerable structures can be categorized into a finite number of folds. New functions commonly evolve by elaboration of existing scaffolds, for example, via domain insertions. Thus, understanding structural diversity of a protein fold evolving via domain insertions is a fundamental challenge. The haloalkanoic dehalogenase superfamily serves as an excellent model system wherein a variable cap domain accessorizes the ubiquitous Rossmann-fold core domain. Here, we determine the impact of the cap-domain insertion on the sequence and structure divergence of the core domain. Through quantitative analysis on a unique dataset of 154 core-domain-only and cap-domain-only structures, basic principles of their evolution have been uncovered. The relationship between sequence and structure divergence of the core domain is shown to be monotonic and independent of the corresponding type of domain insert, reflecting the robustness of the Rossmann fold to mutation. However, core domains with the same cap type share greater similarity at the sequence and structure levels, suggesting interplay between the cap and core domains. Notably, results reveal that the variance in structure maps to α-helices flanking the central β-sheet and not to the domain-domain interface. Collectively, these results hint at intramolecular coevolution where the fold diverges differentially in the context of an accessory domain, a feature that might also apply to other multidomain superfamilies. PMID:23959887

Pandya, Chetanya; Brown, Shoshana; Pieper, Ursula; Sali, Andrej; Dunaway-Mariano, Debra; Babbitt, Patricia C; Xia, Yu; Allen, Karen N

2013-09-01

211

A multilocus sequence typing scheme implies population structure and reveals several putative novel Achromobacter species.  

Science.gov (United States)

The genus Achromobacter currently is comprised of seven species, including Achromobacter xylosoxidans, an opportunistic and nosocomial pathogen that displays broad-spectrum antimicrobial resistance and is recognized as causing chronic respiratory tract infection in persons with cystic fibrosis (CF). To enable strain typing for global epidemiologic investigations, to clarify the taxonomy of "Achromobacter-like" strains, and to elucidate the population structure of this genus, we developed a genus-level multilocus sequence typing (MLST) scheme. We employed in silico analyses of whole-genome sequences of several phylogenetically related genera, including Bordetella, Burkholderia, Cupriavidus, Herminiimonas, Janthinobacterium, Methylibium, and Ralstonia, for selecting loci and designing PCR primers. Using this MLST scheme, we analyzed 107 genetically diverse Achromobacter isolates cultured from biologic specimens from CF and non-CF patients, 1 isolate recovered from sludge, and an additional 39 strains obtained from culture collections. Sequence data from these 147 strains, plus three recently genome-sequenced Achromobacter strains, were assigned to 129 sequence types based on seven loci. Calculation of the nucleotide divergence of concatenated locus sequences within and between MLST clusters confirmed the seven previously named Achromobacter species and revealed 14 additional genogroups. Indices of association showed significant linkage disequilibrium in all of the species/genogroups able to be tested, indicating that each group has a clonal population structure. No clear segregation of species/genogroups between CF and non-CF sources was found. PMID:22785192

Spilker, Theodore; Vandamme, Peter; Lipuma, John J

2012-09-01
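The typing step described in the Achromobacter abstract above, where strains are assigned to sequence types based on seven loci, can be illustrated with a minimal sketch. It assumes each strain has already been reduced to a tuple of seven allele numbers; the strain names, allele numbers, and the function `assign_sequence_types` are hypothetical and are not taken from the study or from any MLST database.

```python
from typing import Dict, Tuple


def assign_sequence_types(profiles: Dict[str, Tuple[int, ...]]) -> Dict[str, int]:
    """Number each distinct allele profile as a sequence type (ST).

    STs are numbered from 1 in order of first appearance. Real MLST schemes
    use curated ST numbers; this numbering is only illustrative.
    """
    st_of_profile: Dict[Tuple[int, ...], int] = {}
    st_of_strain: Dict[str, int] = {}
    for strain, profile in profiles.items():
        if profile not in st_of_profile:
            st_of_profile[profile] = len(st_of_profile) + 1
        st_of_strain[strain] = st_of_profile[profile]
    return st_of_strain


# Hypothetical allele profiles over seven loci.
profiles = {
    "strain_A": (1, 1, 2, 1, 3, 1, 1),
    "strain_B": (1, 1, 2, 1, 3, 1, 1),  # identical to strain_A, so same ST
    "strain_C": (2, 1, 2, 1, 3, 1, 1),  # differs at the first locus, so new ST
}
sts = assign_sequence_types(profiles)
```

Two strains share a sequence type only if all seven allele numbers match, which is why a set of strains can yield fewer sequence types than strains.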

212

The influence of the local sequence environment on RNA loop structures.  

Science.gov (United States)

RNA folding is assumed to be a hierarchical process. The secondary structure of an RNA molecule, signified by base-pairing and stacking interactions between the paired bases, is formed first. Subsequently, the RNA molecule adopts an energetically favorable three-dimensional conformation in the structural space determined mainly by the rotational degrees of freedom associated with the backbone of regions of unpaired nucleotides (loops). To what extent the backbone conformation of RNA loops also results from interactions within the local sequence context or rather follows global optimization constraints alone has not been addressed yet. Because the majority of base stacking interactions are exerted locally, a critical influence of local sequence on local structure appears plausible. Thus, local loop structure ought to be predictable, at least in part, from the local sequence context alone. To test this hypothesis, we used Random Forests on a nonredundant data set of unpaired nucleotides extracted from 97 X-ray structures from the Protein Data Bank (PDB) to predict discrete backbone angle conformations given by the discretized η/θ-pseudo-torsional space. Predictions on balanced sets with four to six conformational classes using local sequence information yielded average accuracies of up to 55%, thus significantly better than expected by chance (17%-25%). Bases close to the central nucleotide appear to be most tightly linked to its conformation. Our results suggest that RNA loop structure does not only depend on long-range base-pairing interactions; instead, it appears that local sequence context exerts a significant influence on the formation of the local loop structure. PMID:21628431

Schudoma, Christian; Larhlimi, Abdelhalim; Walther, Dirk

2011-07-01
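The chance range of 17%-25% quoted in the RNA loop abstract above follows from the class balance: on a balanced set with k classes, random guessing is right 1/k of the time. A short sketch of that arithmetic, with a hypothetical helper name:

```python
def chance_accuracy(n_classes: int) -> float:
    """Expected accuracy of uniform random guessing on a balanced set."""
    if n_classes < 1:
        raise ValueError("n_classes must be positive")
    return 1.0 / n_classes


# Four to six conformational classes, as in the abstract.
chance_percent = {k: round(100 * chance_accuracy(k)) for k in (4, 5, 6)}
```

Six classes give about 17% and four classes give 25%, the two ends of the range against which the reported accuracies of up to 55% are compared.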

213

Structural characterization of HDPE/LLDPE blend-based nanocomposites obtained by different blending sequence  

International Nuclear Information System (INIS)

The blending sequence affects the morphology formation of the nanocomposites. In this work, blending sequences were explored to determine their influence on the rheological behavior of HDPE/LLDPE/OMMT nanocomposites. The nanocomposites were obtained by melt intercalation in a torque rheometer at 180 °C, using a mixture of LLDPE-g-MA and HDPE-g-MA as the compatibilizer system, and five blending sequences were studied. The structures of the materials were characterized by wide-angle X-ray diffraction (WAXD) and by rheological properties. Adding the nanoclay increased the shear viscosity at low shear rates, changing the behavior of the HDPE/LLDPE matrix to that of a Bingham model with an apparent yield stress. Intense interactions were obtained for the blending sequence in which LLDPE and/or LLDPE-g-MA were first reinforced with organoclay, since the intercalation process occurs preferentially in the amorphous phase. (author)

214

Alignment editing and identification of consensus secondary structures for nucleic acid sequences: interactive use of dot matrix representations.  

OpenAIRE

We present a computer-aided approach for identifying and aligning consensus secondary structure within a set of functionally related oligonucleotide sequences aligned by sequence. The method relies on visualization of secondary structure using a generalization of the dot matrix representation appropriate for consensus sequence data sets. An interactive computer program implementing such a visualization of consensus structure has been developed. The program allows for alignment editing, data a...

Davis, J. P.; Janjić, N.; Pribnow, D.; Zichi, D. A.

1995-01-01
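The dot matrix representation mentioned in the abstract above can be shown in its simplest single-sequence form: cell (i, j) is marked when bases i and j could pair. The sketch below assumes RNA with Watson-Crick and G-U wobble pairs; it does not reproduce the consensus generalization or the interactive program the authors describe, and `PAIRS` and `dot_matrix` are hypothetical names.

```python
# Base pairs treated as complementary (an assumption of this sketch):
# Watson-Crick pairs plus the G-U wobble pair.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def dot_matrix(seq: str) -> list:
    """Return a square 0/1 matrix; entry [i][j] is 1 if seq[i] can pair with seq[j]."""
    return [[1 if (a, b) in PAIRS else 0 for b in seq] for a in seq]


matrix = dot_matrix("GCAU")
for row in matrix:
    print("".join("*" if cell else "." for cell in row))
```

In such a matrix a helix shows up as a run of marks along an anti-diagonal, which is what makes the representation useful for spotting candidate secondary structure by eye.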

215

fRMSDPred: predicting local RMSD between structural fragments using sequence information.  

Science.gov (United States)

The effectiveness of comparative modeling approaches for protein structure prediction can be substantially improved by incorporating predicted structural information in the initial sequence-structure alignment. Motivated by the approaches used to align protein structures, this article focuses on developing machine learning approaches for estimating the RMSD value of a pair of protein fragments. These estimated fragment-level RMSD values can be used to construct the alignment, assess the quality of an alignment, and identify high-quality alignment segments. We present algorithms to solve this fragment-level RMSD prediction problem using a supervised learning framework based on support vector regression and classification that incorporates protein profiles, predicted secondary structure, effective information encoding schemes, and novel second-order pairwise exponential kernel functions. Our comprehensive empirical study shows superior results compared with the profile-to-profile scoring schemes. We also show that for protein pairs with low sequence similarity (less than 12% sequence identity) these new local structural features alone or in conjunction with profile-based information lead to alignments that are considerably more accurate than those obtained by schemes that use only profile and/or predicted secondary structure information. PMID:18300251

Rangwala, Huzefa; Karypis, George

2008-08-15

216

Reading the three-dimensional structure of a protein from its amino acid sequence  

CERN Document Server

While all the information required for the folding of a protein is contained in its amino acid sequence, one has not yet learnt how to extract this information so as to predict the detailed, biologically active, three-dimensional structure of a protein whose sequence is known. This situation is not particularly satisfactory, given that while linear sequencing of the amino acids specifying a protein is relatively simple to carry out, the determination of the folded native conformation can only be done by an elaborate X-ray diffraction analysis performed on crystals of the protein or, if the protein is very small, by nuclear magnetic resonance techniques. Using insight obtained from lattice model simulations of the folding of small proteins (fewer than 100 residues), in particular of the fact that this phenomenon is essentially controlled by conserved contacts among strongly interacting amino acids, which also stabilize local elementary structures formed early in the folding process and leading...

Broglia, R A

2000-01-01

217

Sequence and secondary structure of the mitochondrial 16S ribosomal RNA gene of Ixodes scapularis.  

Science.gov (United States)

The complete DNA sequences and secondary structure of the mitochondrial (mt) 16S ribosomal (r) RNA gene were determined for six Ixodes scapularis adults. There were 44 variable nucleotide positions in the 1252 bp sequence alignment. Most (95%) nucleotide alterations did not affect the integrity of the secondary structure of the gene because they either occurred at unpaired positions or represented compensatory changes that maintained the base pairing in helices. A large proportion (75%) of the intraspecific variation in DNA sequence occurred within Domains I, II and VI of the 16S gene. Therefore, several regions within this gene may be highly informative for studies of the population genetics and phylogeography of I. scapularis, a major vector of pathogens of humans and domestic animals in North America. PMID:25444935
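The figure of 44 variable positions in a 1252 bp alignment can be illustrated with a minimal sketch of how variable columns are counted. The function name `variable_positions`, the toy sequences, and the choice to ignore gap characters are assumptions made for illustration; they are not taken from the study.

```python
def variable_positions(alignment):
    """Return 0-based indices of alignment columns with more than one distinct base."""
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")
    variable = []
    for i in range(length):
        column = {seq[i].upper() for seq in alignment}
        column.discard("-")  # assumption: gaps do not count as a variant
        if len(column) > 1:
            variable.append(i)
    return variable


# Toy alignment (made-up sequences, not the I. scapularis data).
toy = ["ACGTACGT", "ACGTACGA", "ACCTACGT"]
print(variable_positions(toy))      # [2, 7]

# Share of variable sites reported in the abstract: 44 of 1252 positions.
print(round(100 * 44 / 1252, 1))    # 3.5 (percent)
```

On these numbers, roughly 3.5% of the aligned positions vary among the six individuals.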

Krakowetz, Chantel N; Chilton, Neil B

2015-02-01

218

Structural and Sequence Stratigraphic Analysis of the Onshore Nile Delta, Egypt.  

Science.gov (United States)

The Nile Delta is considered the earliest known delta in the world. It was already described by Herodotus in the 5th century BC. Today, the Nile Delta is an emerging giant gas province in the Middle East, with proven gas reserves that have more than doubled in size in recent years. The Nile Delta basin contains a thick sedimentary sequence inferred to extend from Jurassic to Recent time. Structural styles and depositional environments varied during this period. Facies architecture and sequence stratigraphy of the Nile Delta are resolved using seismic stratigraphy based on 2D seismic lines, including synthetic seismograms and ties to well log data. Synthetic seismograms were constructed using sonic and density logs. Structural interpretation was combined with sequence stratigraphy to resolve the development of the basin. Seven chronostratigraphic boundaries have been identified and correlated on seismic and well log data. Several unconformity boundaries, ranging from angular unconformities to disconformities, were also identified on seismic lines. Furthermore, time structure maps, velocity maps, depth structure maps and isopach maps were constructed using seismic lines and log data. Several structural features were identified: normal faults, growth faults, listric faults, secondary antithetic faults and large rotated fault blocks of mainly Miocene age. In some cases minor rollover structures could be identified. Sedimentary features such as paleo-channels were distinctly recognized. Typical sequence stratigraphic features such as incised valleys, clinoforms, topsets, offlaps and onlaps are identified and traced on the seismic lines, allowing good insight into the sequence stratigraphic history of the Nile Delta, especially in the Miocene to Pliocene clastic sedimentary succession.

Barakat, Moataz; Dominik, Wilhelm

2010-05-01

219

Accurate prediction of protein secondary structure and solvent accessibility by consensus combiners of sequence and structure information  

Directory of Open Access Journals (Sweden)

Full Text Available Abstract Background Structural properties of proteins such as secondary structure and solvent accessibility contribute to three-dimensional structure prediction, not only in the ab initio case but also when homology information to known structures is available. Structural properties are also routinely used in protein analysis even when homology is available, largely because homology modelling is lower throughput than, say, secondary structure prediction. Nonetheless, predictors of secondary structure and solvent accessibility are virtually always ab initio. Results Here we develop high-throughput machine learning systems for the prediction of protein secondary structure and solvent accessibility that exploit homology to proteins of known structure, where available, in the form of simple structural frequency profiles extracted from sets of PDB templates. We compare these systems to their state-of-the-art ab initio counterparts, and with a number of baselines in which secondary structures and solvent accessibilities are extracted directly from the templates. We show that structural information from templates greatly improves secondary structure and solvent accessibility prediction quality, and that, on average, the systems significantly enrich the information contained in the templates. For sequence similarity exceeding 30%, secondary structure prediction quality is approximately 90%, close to its theoretical maximum, and 2-class solvent accessibility roughly 85%. Gains are robust with respect to template selection noise, and significant for marginal sequence similarity and for short alignments, supporting the claim that these improved predictions may prove beneficial beyond the case in which clear homology is available. Conclusion The predictive systems are publicly available at http://distill.ucd.ie.

Vullo Alessandro

2007-06-01

220

The amino acid alphabet and the architecture of the protein sequence-structure map. I. Binary alphabets.  

Science.gov (United States)

The correspondence between protein sequences and structures, or sequence-structure map, relates to fundamental aspects of structural, evolutionary and synthetic biology. The specifics of the mapping, such as the fraction of accessible sequences and structures, or the sequences' ability to fold fast, are dictated by the type of interactions between the monomers that compose the sequences. The set of possible interactions between monomers is encapsulated by the potential energy function. In this study, I explore the impact of the relative forces of the potential on the architecture of the sequence-structure map. My observations rely on simple exact models of proteins and random samples of the space of potential energy functions of binary alphabets. I adopt a graph perspective and study the distribution of viable sequences and the structures they produce, as networks of sequences connected by point mutations. I observe that the relative proportion of attractive, neutral and repulsive forces defines types of potentials, that induce sequence-structure maps of vastly different architectures. I characterize the properties underlying these differences and relate them to the structure of the potential. Among these properties are the expected number and relative distribution of sequences associated to specific structures and the diversity of structures as a function of sequence divergence. I study the types of binary potentials observed in natural amino acids and show that there is a strong bias towards only some types of potentials, a bias that seems to characterize the folding code of natural proteins. I discuss implications of these observations for the architecture of the sequence-structure map of natural proteins, the construction of random libraries of peptides, and the early evolution of the natural amino acid alphabet. PMID:25473967

Ferrada, Evandro

2014-12-01

221

Bioinformatic Analysis of the Contribution of Primer Sequences to Aptamer Structures  

OpenAIRE

Aptamers are nucleic acid molecules selected in vitro to bind a particular ligand. While numerous experimental studies have examined the sequences, structures, and functions of individual aptamers, considerably fewer studies have applied bioinformatics approaches to try to infer more general principles from these individual studies. We have used a large Aptamer Database to parse the contributions of both random and constant regions to the secondary structures of more than 2000 aptamers. We fi...

Cowperthwaite, Matthew C.; Ellington, Andrew D.

2008-01-01

222

Genotyping by sequencing resolves shallow population structure to inform conservation of Chinook salmon (Oncorhynchus tshawytscha)  

OpenAIRE

Recent advances in population genomics have made it possible to detect previously unidentified structure, obtain more accurate estimates of demographic parameters, and explore adaptive divergence, potentially revolutionizing the way genetic data are used to manage wild populations. Here, we identified 10 944 single-nucleotide polymorphisms using restriction-site-associated DNA (RAD) sequencing to explore population structure, demography, and adaptive divergence in five populations of Chinook...

Larson, Wesley A.; Seeb, Lisa W.; Everett, Meredith V.; Waples, Ryan K.; Templin, William D.; Seeb, James E.

2014-01-01

223

Sequence and structure alignment of paramyxovirus hemagglutinin-neuraminidase with influenza virus neuraminidase.  

OpenAIRE

A model is proposed for the three-dimensional structure of the paramyxovirus hemagglutinin-neuraminidase (HN) protein. The model is broadly similar to the structure of the influenza virus neuraminidase and is based on the identification of invariant amino acids among HN sequences which have counterparts in the enzyme-active center of influenza virus neuraminidase. The influenza virus enzyme-active site is constructed from strain-invariant functional and framework residues, but in this model o...

Colman, P. M.; Hoyne, P. A.; Lawrence, M. C.

1993-01-01

224

Structural studies of an arabinan from the stems of Ephedra sinica by methylation analysis and 1D and 2D NMR spectroscopy.  

Science.gov (United States)

Plant arabinan has important biological activity. In this study, a water-soluble arabinan (Mw ~6.15 kDa) isolated from the stems of Ephedra sinica was found to consist of (1→5)-Araf, (1→3,5)-Araf, T-Araf, (1→3)-Araf and (1→2,5)-Araf residues at proportions of 10:2:3:2:1. A tentative structure was proposed by methylation analysis, nuclear magnetic resonance (NMR) spectroscopy (¹H NMR, ¹³C NMR, DEPT-135, ¹H-¹H COSY, HSQC, HMBC and ROESY) and literature. The structure proposed includes a branched (1→5)-α-Araf backbone where branching occurs at the O-2 and O-3 positions of the residues with 7.7% and 15.4% of the 1,5-linked α-Araf substituted at the O-2 and O-3 positions. The presence of a branched structure was further observed by atomic force microscopy. This polymer was characterized as having a much longer linear (1→5)-α-Araf backbone as a repeating unit. In particular, the presence of α-Araf→3)-α-Araf-(1→3)-α-Araf-(1→ attached at the O-2 is a new finding. This study may facilitate a deeper understanding of structure-activity relationships of biological polysaccharides from the stems of E. sinica. PMID:25659720

Xia, Yong-Gang; Liang, Jun; Yang, Bing-You; Wang, Qiu-Hong; Kuang, Hai-Xue

2015-05-01

225

Multiple Sequence Alignments as Tools for Protein Structure and Function Prediction  

Directory of Open Access Journals (Sweden)

Full Text Available Multiple sequence alignments have much to offer to the understanding of protein structure, evolution and function. We are developing approaches to use this information in predicting protein-binding specificity, intra-protein and protein-protein interactions, and in reconstructing protein interaction networks.

Alfonso Valencia

2006-04-01

226

Multiple Sequence Alignments as Tools for Protein Structure and Function Prediction  

OpenAIRE

Multiple sequence alignments have much to offer to the understanding of protein structure, evolution and function. We are developing approaches to use this information in predicting protein-binding specificity, intra-protein and protein-protein interactions, and in reconstructing protein interaction networks.

Alfonso Valencia

2006-01-01

227

WebScipio: An online tool for the determination of gene structures using protein sequences  

Directory of Open Access Journals (Sweden)

Full Text Available Abstract Background Obtaining the gene structure for a given protein-encoding gene is an important step in many analyses. Software suited for this task should be readily accessible, accurate, easy to handle and should provide the user with a coherent representation of the most probable gene structure. It should be rigorous enough to optimise features on the level of single bases and at the same time flexible enough to allow for cross-species searches. Results WebScipio, a web interface to the Scipio software, allows a user to obtain the coding sequence structure corresponding to a given query protein sequence that belongs to an already assembled eukaryotic genome. The resulting gene structure is presented in various human-readable formats, such as a schematic representation and a detailed alignment of the query and the target sequence highlighting any discrepancies. WebScipio can also be used to identify and characterise the gene structures of homologs in related organisms. In addition, it offers a web service for integration with other programs. Conclusion WebScipio is a tool that allows users to get a high-quality gene structure prediction from a protein query. It offers more than 250 eukaryotic genomes that can be searched and produces predictions that are close to what can be achieved by manual annotation, for in-species and cross-species searches alike. WebScipio is freely accessible at http://www.webscipio.org.

Waack Stephan

2008-09-01

228

Resolution-optimized NMR measurement of {sup 1}D{sub CH}, {sup 1}D{sub CC} and {sup 2}D{sub CH} residual dipolar couplings in nucleic acid bases  

Energy Technology Data Exchange (ETDEWEB)

New methods are described for accurate measurement of multiple residual dipolar couplings in nucleic acid bases. The methods use TROSY-type pulse sequences for optimizing resolution and sensitivity, and rely on the E.COSY principle to measure the relatively small two-bond {sup 2}D{sub CH} couplings at high precision. Measurements are demonstrated for a 24-nt stem-loop RNA sequence, uniformly enriched in {sup 13}C, and aligned in Pf1. The recently described pseudo-3D method is used to provide homonuclear {sup 1}H-{sup 1}H decoupling, which minimizes cross-correlation effects and optimizes resolution. Up to seven {sup 1}H-{sup 13}C and {sup 13}C-{sup 13}C couplings are measured for pyrimidines (U and C), including {sup 1}D{sub C5H5}, {sup 1}D{sub C6H6}, {sup 2}D{sub C5H6}, {sup 2}D{sub C6H5}, {sup 1}D{sub C5C4}, {sup 1}D{sub C5C6}, and {sup 2}D{sub C4H5}. For adenine, four base couplings ({sup 1}D{sub C2H2}, {sup 1}D{sub C8H8}, {sup 1}D{sub C4C5}, and {sup 1}D{sub C5C6}) are readily measured whereas for guanine only three couplings are accessible at high relative accuracy ({sup 1}D{sub C8H8}, {sup 1}D{sub C4C5}, and {sup 1}D{sub C5C6}). Only three dipolar couplings are linearly independent in planar structures such as nucleic acid bases, permitting cross validation of the data and evaluation of their accuracies. For the vast majority of dipolar couplings, the error is found to be less than ±3% of their possible range, indicating that the measurement accuracy is not limiting when using these couplings as restraints in structure calculations. Reported isotropic values of the one- and two-bond J couplings cluster very tightly for each type of nucleotide.

Boisbouvier, Jerome; Bryce, David L.; O'Neil-Cabello, Erin [Laboratory of Chemical Physics, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health (United States)]; Nikonowicz, Edward P. [Rice University, Department of Biochemistry and Cell Biology (United States)]; Bax, Ad [Laboratory of Chemical Physics, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health (United States)], E-mail: bax@nih.gov

2004-11-15

229

Rapid search for tertiary fragments reveals protein sequence-structure relationships.  

Science.gov (United States)

Finding backbone substructures from the Protein Data Bank that match an arbitrary query structural motif, composed of multiple disjoint segments, is a problem of growing relevance in structure prediction and protein design. Although numerous protein structure search approaches have been proposed, methods that address this specific task without additional restrictions and on practical time scales are generally lacking. Here, we propose a solution, dubbed MASTER, that is both rapid, enabling searches over the Protein Data Bank in a matter of seconds, and provably correct, finding all matches below a user-specified root-mean-square deviation cutoff. We show that despite the potentially exponential time complexity of the problem, running times in practice are modest even for queries with many segments. The ability to explore naturally plausible structural and sequence variations around a given motif has the potential to synthesize its design principles in an automated manner; so we go on to illustrate the utility of MASTER to protein structural biology. We demonstrate its capacity to rapidly establish structure-sequence relationships, uncover the native designability landscapes of tertiary structural motifs, identify structural signatures of binding, and automatically rewire protein topologies. Given the broad utility of protein tertiary fragment searches, we hope that providing MASTER in an open-source format will enable novel advances in understanding, predicting, and designing protein structure. PMID:25420575

Zhou, Jianfu; Grigoryan, Gevorg

2015-04-01

230

Can Clustal-style progressive pairwise alignment of multiple sequences be used in RNA secondary structure prediction?  

Directory of Open Access Journals (Sweden)

Full Text Available Abstract Background In ribonucleic acid (RNA) molecules whose function depends on their final, folded three-dimensional shape (such as those in ribosomes or spliceosome complexes), the secondary structure, defined by the set of internal basepair interactions, is more consistently conserved than the primary structure, defined by the sequence of nucleotides. Results The research presented here investigates the possibility of applying a progressive, pairwise approach to the alignment of multiple RNA sequences by simultaneously predicting an energy-optimized consensus secondary structure. We take an existing algorithm for finding the secondary structure common to two RNA sequences, Dynalign, and alter it to align profiles of multiple sequences. We then explore the relative successes of different approaches to designing the tree that will guide progressive alignments of sequence profiles to create a multiple alignment and prediction of conserved structure. Conclusion We have found that applying a progressive, pairwise approach to the alignment of multiple ribonucleic acid sequences produces highly reliable predictions of conserved basepairs, and we have shown how these predictions can be used as constraints to improve the results of a single-sequence structure prediction algorithm. However, we have also discovered that the amount of detail included in a consensus structure prediction is highly dependent on the order in which sequences are added to the alignment (the guide tree), and that if a consensus structure does not have sufficient detail, it is less likely to provide useful constraints for the single-sequence method.

Turcotte Marcel

2007-06-01

231

Evidence for intramolecularly folded i-DNA structures in biologically relevant CCC-repeat sequences.  

Science.gov (United States)

The structural behaviour of repetitive cytosine DNA is examined in the oligodeoxynucleotide sequences (CCCTAA)3CCCT (HTC4), GC(TCCC)3TCCT(TCCC)3 (KRC6) and the methylated (CCCT)3TCCT(CCCT)3C (KRM6) by circular dichroism (CD), gel electrophoresis (PAGE), and ultraviolet (UV) absorbance studies. All three sequences exhibit a pH-induced cooperative structural transition as monitored by CD. An intense positive CD band around 285 nm develops on lowering the pH from 8 to slightly acidic conditions, indicative of the formation of base pairs between protonated cytosines. The oligomers are found to melt in a fully reversible and cooperative fashion, with a melting temperature (Tm) of around 50 degrees C at pH 5.5. The melting temperatures are independent of DNA concentration, indicating that an intramolecular process is involved in the structural formation. PAGE experiments performed with 32P-labeled samples as well as with normal staining procedures show a predominantly single band migration for all three oligomers, suggestive of a unimolecular structure. From pH titrations, the number of protons required for generating the structures formed by HTC4, KRC6 and KRM6 is found to be around six. These findings strongly suggest that all three sequences adopt an intramolecular i-motif structure. The demonstration of i-motif structure for KRC6, a critical functional stretch of the c-ki-ras promoter proto-oncogene, besides the human telomeric sequence HTC4, may be suggestive of larger significance in the functioning of DNA. PMID:7984411

Manzini, G; Yathindra, N; Xodo, L E

1994-11-11

232

A sequence-based survey of the complex structural organization of tumor genomes  

Energy Technology Data Exchange (ETDEWEB)

The genomes of many epithelial tumors exhibit extensive chromosomal rearrangements. All classes of genome rearrangements can be identified using End Sequencing Profiling (ESP), which relies on paired-end sequencing of cloned tumor genomes. In this study, brain, breast, ovary and prostate tumors along with three breast cancer cell lines were surveyed with ESP yielding the largest available collection of sequence-ready tumor genome breakpoints and providing evidence that some rearrangements may be recurrent. Sequencing and fluorescence in situ hybridization (FISH) confirmed translocations and complex tumor genome structures that include coamplification and packaging of disparate genomic loci with associated molecular heterogeneity. Comparison of the tumor genomes suggests recurrent rearrangements. Some are likely to be novel structural polymorphisms, whereas others may be bona fide somatic rearrangements. A recurrent fusion transcript in breast tumors and a constitutional fusion transcript resulting from a segmental duplication were identified. Analysis of end sequences for single nucleotide polymorphisms (SNPs) revealed candidate somatic mutations and an elevated rate of novel SNPs in an ovarian tumor. These results suggest that the genomes of many epithelial tumors may be far more dynamic and complex than previously appreciated and that genomic fusions including fusion transcripts and proteins may be common, possibly yielding tumor-specific biomarkers and therapeutic targets.

Collins, Colin; Raphael, Benjamin J.; Volik, Stanislav; Yu, Peng; Wu, Chunxiao; Huang, Guiqing; Linardopoulou, Elena V.; Trask, Barbara J.; Waldman, Frederic; Costello, Joseph; Pienta, Kenneth J.; Mills, Gordon B.; Bajsarowicz, Krystyna; Kobayashi, Yasuko; Sridharan, Shivaranjani; Paris, Pamela; Tao, Quanzhou; Aerni, Sarah J.; Brown, Raymond P.; Bashir, Ali; Gray, Joe W.; Cheng, Jan-Fang; de Jong, Pieter; Nefedov, Mikhail; Ried, Thomas; Padilla-Nash, Hesed M.; Collins, Colin C.

2008-04-03

233

Influence of sequence context and length on the structure and stability of triplet repeat DNA oligomers.  

Science.gov (United States)

Genetic expansion diseases have been linked to the properties of triplet repeat DNA sequences during replication. The most common triplet repeats associated with such diseases are CAG, CCG, CGG, and CTG. It has been suggested that gene expansion occurs as a result of hairpin formation of long stretches of these sequences on the leading daughter strand synthesized during DNA replication [Gellibolian, R., Bacolla, A., and Wells, R. D. (1997) J. Biol. Chem. 272, 16793-7]. To test the biophysical basis for this model, oligonucleotides of general sequence (CNG)(n), where N = A, C, G, or T and n = 4, 5, 10, 15, or 25, were synthesized and characterized by circular dichroism (CD) spectropolarimetry, optical melting studies, and differential scanning calorimetry (DSC). The goal of these studies was to evaluate the influence of sequence context and oligomer length on their secondary structures and stabilities. The results indicate that all single oligomers, even those as short as 12 nucleotides, form stable hairpin structures at 25 degrees C. Such hairpins are characterized by the presence of N:N mismatched base pairs sandwiched between G:C base pairs in the stems and loops of three to four unpaired bases. Thermodynamic analysis of these structures reveals that their stabilities are influenced by both the sequence of the particular oligomer and its length. Specifically, the stability order of CGG > CTG > CAG > CCG was observed. In addition, longer oligomers were found to be more stable than shorter oligomers of the same sequence. However, a stability plateau above 45 nucleotides suggests that the length dependence reaches a maximum value where the stability of the G:C base pairs can no longer compensate the instability of the N:N mismatches in the stems of the hairpins. The results are discussed in terms of the above model proposed for gene expansion. PMID:15518572
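The lengths quoted in this abstract follow directly from the (CNG)n repeat formula, which a short sketch makes explicit. The helper name `cng_oligomer` is hypothetical and serves only to reproduce the arithmetic; it models nothing about hairpin formation or stability.

```python
def cng_oligomer(n_base, repeats):
    """Build the oligomer (CNG)n for a chosen central base N and repeat count n."""
    if len(n_base) != 1 or n_base not in "ACGT":
        raise ValueError("N must be one of A, C, G or T")
    if repeats < 1:
        raise ValueError("repeat count must be positive")
    return ("C" + n_base + "G") * repeats


# Repeat counts listed in the abstract: n = 4, 5, 10, 15, 25.
for n in (4, 5, 10, 15, 25):
    seq = cng_oligomer("A", n)
    print(n, len(seq))
```

With three nucleotides per repeat, n = 4 gives the 12-nucleotide oligomers described as the shortest stable hairpins, and n = 15 gives the 45 nucleotides above which stability is reported to plateau.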

Paiva, Anthony M; Sheardy, Richard D

2004-11-01

234

Electronic structure of S = 1/2 1D Heisenberg antiferromagnetic systems: (Sr, Ba)2Cu(PO4)2  

International Nuclear Information System (INIS)

We have studied the electronic structure of the quasi-one-dimensional (spin-chain) compounds Sr2Cu(PO4)2 and Ba2Cu(PO4)2 using the self-consistent tight-binding linearized muffin-tin-orbital (TB-LMTO) method. We have calculated the interchain as well as intrachain hopping parameters for both compounds and found that the intrachain hoppings are dominant, suggesting that the compounds are indeed one-dimensional. Our estimate of the exchange interaction (J) compares well with experiment. (author)

235

Bioinformatical approaches to RNA structure prediction & Sequencing of an ancient human genome  

DEFF Research Database (Denmark)

Stinus Lindgreen has been working in two different fields during his Ph.D. The first part has been focused on computational approaches to predict the structure of non-coding RNA molecules at the base pairing level. This has resulted in the analysis of various measures of the base pairing potential in families of related RNA sequences. Also, the program MASTR was developed to perform simultaneous alignment of multiple RNA sequences and prediction of a common secondary structure. The webserver WAR was developed to make it easy for researchers who are not computer-savvy to use the many RNA structure prediction tools that exist. The second part has been focused on the mapping and genotyping of ancient genomic DNA. The development of next generation sequencing technologies combined with the use of ancient DNA material presents researchers with some special challenges in the analyses. This work resulted in the publication of the first genome of an ancient human individual, where close to the theoretical maximum of the genome sequence was recovered with high confidence. Part of the project was the development of the program SNPest for genotyping and SNP calling that models various sources of error and predicts genotypes with the highest posterior probability.

Lindgreen, Stinus

2010-01-01

236

Structural, electronic, and magnetic properties of quasi-1D quantum magnets [Ni(HF2)(pyz)2]X (pyz = pyrazine; X = PF6(-), SbF6(-)) exhibiting Ni-FHF-Ni and Ni-pyz-Ni spin interactions.  

Science.gov (United States)

[Ni(HF(2))(pyz)(2)]X {pyz = pyrazine; X = PF(6)(-) (1), SbF(6)(-) (2)} were structurally characterized by synchrotron X-ray powder diffraction and found to possess axially compressed NiN(4)F(2) octahedra. At 298 K, 1 is monoclinic (C2/c) with unit cell parameters, a = 9.9481(3), b = 9.9421(3), c = 12.5953(4) Å, and β = 81.610(3)° while 2 is tetragonal (P4/nmm) with a = b = 9.9359(3) and c = 6.4471(2) Å and is isomorphic with the Cu-analogue. Infinite one-dimensional (1D) Ni-FHF-Ni chains propagate along the c-axis which are linked via μ-pyz bridges in the ab-plane to afford three-dimensional polymeric frameworks with PF(6)(-) and SbF(6)(-) counterions occupying the interior sites. A major difference between 1 and 2 is that the Ni-F-H bonds are bent (~157°) in 1 but are linear in 2. Ligand field calculations (LFT) based on an angular overlap model (AOM), with comparison to the electronic absorption spectra, indicate greater π-donation of the HF(2)(-) ligand in 1 owing to the bent Ni-F-H bonds. Magnetic susceptibility data for 1 and 2 exhibit broad maxima at 7.4 and 15 K, respectively, and λ-like peaks in dχT/dT at 6.2 and 12.2 K that are ascribed to transitions to long-range antiferromagnetic order (T(N)). Muon-spin relaxation and specific heat studies confirm these T(N)'s. A comparative analysis of χ vs T to various 1D Heisenberg/Ising models suggests moderate antiferromagnetic interactions, with the primary interaction strength determined to be 3.05/3.42 K (1) and 5.65/6.37 K (2). However, high critical fields of 19 and 37.4 T obtained from low temperature pulsed-field magnetization data indicate that a single exchange constant (J(1D)) alone is insufficient to explain the data and that residual terms in the spin Hamiltonian, which could include interchain magnetic couplings (J(⊥)), as mediated by Ni-pyz-Ni, and single-ion anisotropy (D), must be considered. While it is difficult to draw absolute conclusions regarding the magnitude (and sign) of J(⊥) and D based solely on powder data, further support offered by related Ni(II)-pyz compounds and our LFT and density-functional theory (DFT) results lead us to a consistent quasi-1D magnetic description for 1 and 2. PMID:21598910

Manson, Jamie L; Lapidus, Saul H; Stephens, Peter W; Peterson, Peter K; Carreiro, Kimberly E; Southerland, Heather I; Lancaster, Tom; Blundell, Stephen J; Steele, Andrew J; Goddard, Paul A; Pratt, Francis L; Singleton, John; Kohama, Yoshimitsu; McDonald, Ross D; Del Sesto, Rico E; Smith, Nickolaus A; Bendix, Jesper; Zvyagin, Sergei A; Kang, Jinhee; Lee, Changhoon; Whangbo, Myung-Hwan; Zapf, Vivien S; Plonczak, Alex

2011-07-01

237

Stem-loop structures of the repetitive DNA sequences located at human centromeres  

Energy Technology Data Exchange (ETDEWEB)

The presence of the highly conserved repetitive DNA sequences in the human centromeres argues for a special role of these sequences in their biological functions, most likely achieved by the formation of unusual structures. This prompted us to carry out quantitative one- and two-dimensional nuclear magnetic resonance (1D/2D NMR) spectroscopy to determine the structural properties of the human centromeric repeats, d(AATGG){sub n}·d(CCATT){sub n}. The studies on centromeric DNAs reveal that the complementary sequence, d(AATGG){sub n}·d(CCATT){sub n}, adopts the usual Watson-Crick B-DNA duplex and the pyrimidine-rich d(CCATT){sub n} strand is essentially a random coil. However, the purine-rich d(AATGG){sub n} strand is shown to adopt unusual stem-loop structures for repeat lengths, n=2,3,4, and 6. In addition to normal Watson-Crick A·T pairs, the stem-loop structures are stabilized by mismatch A·G and G·G pairs in the stem and G-G-A stacking in the loop. Stem-loop structures of d(AATGG){sub n} are independently verified by gel electrophoresis and nuclease digestion studies. Thermal melting studies show that the DNA repeats, d(AATGG){sub n}, are as stable as the corresponding Watson-Crick duplex d(AATGG){sub n}·d(CCATT){sub n}. Therefore, the sequence d(AATGG){sub n} can, indeed, nucleate a stem-loop structure at little free-energy cost and if, during mitosis, they are located on the chromosome surface they can provide specific recognition sites for kinetochore function.

Gupta, G.; Garcia, A.E.; Ratliff, R.; Moyzis, R.K. [Los Alamos National Lab., NM (United States); Catasti, P.; Hong, Lin; Yau, P. [California Univ., Davis, CA (United States). Dept. of Biological Chemistry; Bradbury, E.M. [Los Alamos National Lab., NM (United States)]|[California Univ., Davis, CA (United States). Dept. of Biological Chemistry

1993-09-01

238

Identification of microRNA precursors with new sequence-structure features  

OpenAIRE

MicroRNAs are an important subclass of non-coding RNAs (ncRNA), and serve as main players in RNA interference (RNAi). Mature microRNAs are derived from a stem-loop structure called the precursor. Identification of precursor microRNA (pre-miRNA) is an essential step in targeting microRNAs across the whole genome. The present work proposes 25 novel local features for identifying the stem-loop structure of pre-miRNAs, which capture characteristics of both the sequence and the structure. Firstly, we pulled the stem of hairpin...

Ying-Jie Zhao; Qing-Shan Ni; Zheng-Zhi Wang

2009-01-01

239

Linking experimental results, biological networks and sequence analysis methods using Ontologies and Generalised Data Structures.  

Science.gov (United States)

The structure of a closely integrated data warehouse is described that is designed to link different types and varying numbers of biological networks, sequence analysis methods and experimental results such as those coming from microarrays. The data schema is inspired by a combination of graph-based methods and generalised data structures and makes use of ontologies and meta-data. The core idea is to consider and store biological networks as graphs, and to use generalised data structures (GDS) for the storage of further relevant information. This is possible because many biological networks can be stored as graphs: protein interactions, signal transduction networks, metabolic pathways, gene regulatory networks etc. Nodes in biological graphs represent entities such as promoters, proteins, genes and transcripts whereas the edges of such graphs specify how the nodes are related. The semantics of the nodes and edges are defined using ontologies of node and relation types. Besides generic attributes that most biological entities possess (name, attribute description), further information is stored using generalised data structures. By directly linking to underlying sequences (exons, introns, promoters, amino acid sequences) in a systematic way, close interoperability with sequence analysis methods can be achieved. This approach allows us to store, query and update a wide variety of biological information in a way that is semantically compact without requiring changes at the database schema level when new kinds of biological information are added. We describe how this data warehouse is being implemented by extending the text-mining framework ONDEX to link, support and complement different bioinformatics applications and research activities such as microarray analysis, sequence analysis and modelling/simulation of biological systems. The system is developed under the GPL license and can be downloaded from http://sourceforge.net/projects/ondex/ PMID:15972003
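The graph-plus-GDS idea described in this abstract can be illustrated with a small data model. The sketch below is an assumption-laden illustration only: the class names `Node`, `Edge` and `Graph`, the `gds` field, and the example entities are all hypothetical and do not reproduce the ONDEX schema or API.

```python
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class Node:
    name: str
    node_type: str  # term from a node-type ontology, e.g. "gene" or "protein"
    # Generalised data structure: open-ended attributes, so no schema change
    # is needed when a new kind of information is attached.
    gds: dict[str, Any] = field(default_factory=dict)


@dataclass
class Edge:
    source: str
    target: str
    relation_type: str  # term from a relation-type ontology
    gds: dict[str, Any] = field(default_factory=dict)


@dataclass
class Graph:
    nodes: dict[str, Node] = field(default_factory=dict)
    edges: list[Edge] = field(default_factory=list)

    def add_node(self, node: Node) -> None:
        self.nodes[node.name] = node

    def add_edge(self, edge: Edge) -> None:
        # Refuse edges whose endpoints have not been registered as nodes.
        if edge.source not in self.nodes or edge.target not in self.nodes:
            raise KeyError("both endpoints must be added as nodes first")
        self.edges.append(edge)

    def neighbours(self, name: str, relation_type: Optional[str] = None) -> list[str]:
        """Targets reachable from `name`, optionally filtered by relation type."""
        return [
            e.target
            for e in self.edges
            if e.source == name
            and (relation_type is None or e.relation_type == relation_type)
        ]


# Hypothetical example entities, for illustration only.
g = Graph()
g.add_node(Node("geneA", "gene", gds={"sequence": "ATGGCC"}))
g.add_node(Node("protA", "protein"))
g.add_node(Node("protB", "protein"))
g.add_edge(Edge("geneA", "protA", "encodes"))
g.add_edge(Edge("protA", "protB", "interacts_with", gds={"evidence": "example"}))
print(g.neighbours("protA"))
```

Keeping typed nodes and edges separate from the free-form `gds` dictionary mirrors the abstract's point that new kinds of information can be added without changing the schema.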

Koehler, Jacob; Rawlings, Chris; Verrier, Paul; Mitchell, Rowan; Skusa, Andre; Ruegg, Alexander; Philippi, Stephan

2005-01-01

240

Structure and Active Site Residues of PglD, an N-Acetyltransferase from the Bacillosamine Synthetic Pathway Required for N-Glycan Synthesis in Campylobacter jejuni  

Energy Technology Data Exchange (ETDEWEB)

Campylobacter jejuni is highly unusual among bacteria in forming N-linked glycoproteins. The heptasaccharide produced by its pgl system is attached to protein Asn through its terminal 2,4-diacetamido-2,4,6-trideoxy-D-Glc (QuiNAc4NAc or N,N'-diacetylbacillosamine) moiety. The crucial, last part of this sugar's synthesis is the acetylation of UDP-2-acetamido-4-amino-2,4,6-trideoxy-D-Glc by the enzyme PglD, with acetyl-CoA as a cosubstrate. We have determined the crystal structures of PglD in CoA-bound and unbound forms, refined to 1.8 and 1.75 Å resolution, respectively. PglD is a trimer of subunits each composed of two domains, an N-terminal α/β-domain and a C-terminal left-handed β-helix. Few structural differences accompany CoA binding, except in the C-terminal region following the β-helix (residues 189-195), which adopts an extended structure in the unbound form and folds to extend the β-helix upon binding CoA. Computational molecular docking suggests a different mode of nucleotide-sugar binding with respect to the acetyl-CoA donor, with the molecules arranged in an 'L-shape', compared with the 'in-line' orientation in related enzymes. Modeling indicates that the oxyanion intermediate would be stabilized by the NH group of Gly143', with His125' the most likely residue to function as a general base, removing H+ from the amino group prior to nucleophilic attack at the carbonyl carbon of acetyl-CoA. Site-specific mutations of active site residues confirmed the importance of His125', Glu124', and Asn118. We conclude that Asn118 exerts its function by stabilizing the intricate hydrogen bonding network within the active site and that Glu124' may function to increase the pKa of the putative general base, His125'.

Rangarajan, E.; Ruane, K.; Sulea, T.; Watson, D.; Proteau, A.; Leclerc, S.; Cygler, M.; Matte, A.; Young, N.

2008-01-01

241

Structural determination of 3beta-stearyloxy-urs-12-ene from Maytenus salicifolia by 1D and 2D NMR and quantitative 13C NMR spectroscopy.  

Science.gov (United States)

Six pentacyclic triterpenoids, 3beta-stearyloxy-urs-12-ene (1), friedelin (2), 3beta-friedelinol (3), alpha-amyrin (4), beta-amyrin (5), and lupeol (6), have been isolated from the hexane extract of Maytenus salicifolia Reissek (Celastraceae) leaves. The molecular and structural formula as well as the stereochemistry of a new pentacyclic triterpene (1) were determined using data obtained from 1H and 13C NMR spectra, DEPT135 and by 2D HSQC, HMBC, COSY and NOESY experiments. The molecular formula C48H84O2 was established using quantitative 13C NMR, and the molecular weight (692 Da) was confirmed by elemental analysis and mass spectrometry (GC-MS). PMID:16358293
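The molecular weight quoted in this abstract can be cross-checked against the stated formula. The following is a minimal sketch, assuming integer (nominal) masses of the most abundant isotopes; the function name `nominal_mass` and the small mass table are illustrative choices, not part of the original work.

```python
import re

# Nominal (integer) masses of the most abundant isotopes of the elements
# that occur in the formula from the abstract.
NOMINAL_MASS = {"C": 12, "H": 1, "O": 16}


def nominal_mass(formula: str) -> int:
    """Sum nominal isotope masses for a simple formula such as 'C48H84O2'."""
    total = 0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += NOMINAL_MASS[element] * (int(count) if count else 1)
    return total


print(nominal_mass("C48H84O2"))  # 48*12 + 84*1 + 2*16 = 692
```

The result agrees with the 692 Da reported for compound 1. The parser handles only flat formulas without parentheses or charges, which is sufficient for this check.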

Miranda, R R S; Silva, G D F; Duarte, L P; Fortes, I C P; Filho, S A Vieira

2006-02-01

242

Moments of the spin structure functions g1p and g1d for 0.05 < Q2 < 3.0 GeV2  

Science.gov (United States)

The spin structure functions g1 for the proton and the deuteron have been measured over a wide kinematic range in x and Q2 using 1.6 and 5.7 GeV longitudinally polarized electrons incident upon polarized NH3 and ND3 targets at Jefferson Lab. Scattered electrons were detected in the CEBAF Large Acceptance Spectrometer, for 0.05 < Q2 < 5 GeV2 ... Gerasimov-Drell-Hearn sum rule. The first extraction of the generalized forward spin polarizability of the proton γ0p is also reported. This quantity shows strong Q2 dependence at low Q2. Our analysis of the Q2 evolution of the first moment of g1 shows agreement in leading order with Heavy Baryon Chiral Perturbation Theory. However, a significant discrepancy is observed between the γ0p data and Chiral Perturbation calculations for γ0p, even at the lowest Q2.

Prok, Y.; Bosted, P.; Burkert, V. D.; Deur, A.; Dharmawardane, K. V.; Dodge, G. E.; Griffioen, K. A.; Kuhn, S. E.; Minehart, R.; Adams, G.; Amaryan, M. J.; Anghinolfi, M.; Asryan, G.; Audit, G.; Avakian, H.; Bagdasaryan, H.; Baillie, N.; Ball, J. P.; Baltzell, N. A.; Barrow, S.; Battaglieri, M.; Beard, K.; Bedlinskiy, I.; Bektasoglu, M.; Bellis, M.; Benmouna, N.; Berman, B. L.; Biselli, A. S.; Blaszczyk, L.; Boiarinov, S.; Bonner, B. E.; Bouchigny, S.; Bradford, R.; Branford, D.; Briscoe, W. J.; Brooks, W. K.; Bültmann, S.; Butuceanu, C.; Calarco, J. R.; Careccia, S. L.; Carman, D. S.; Casey, L.; Cazes, A.; Chen, S.; Cheng, L.; Cole, P. L.; Collins, P.; Coltharp, P.; Cords, D.; Corvisiero, P.; Crabb, D.; Crede, V.; Cummings, J. P.; Dale, D.; Dashyan, N.; De Masi, R.; De Vita, R.; De Sanctis, E.; Degtyarenko, P. V.; Denizli, H.; Dennis, L.; Dhuga, K. S.; Dickson, R.; Djalali, C.; Doughty, D.; Dugger, M.; Dytman, S.; Dzyubak, O. P.; Egiyan, H.; Egiyan, K. S.; El Fassi, L.; Elouadrhiri, L.; Eugenio, P.; Fatemi, R.; Fedotov, G.; Feldman, G.; Fersh, R. G.; Feuerbach, R. J.; Forest, T. A.; Fradi, A.; Funsten, H.; Garçon, M.; Gavalian, G.; Gevorgyan, N.; Gilfoyle, G. P.; Giovanetti, K. L.; Girod, F. X.; Goetz, J. T.; Golovatch, E.; Gothe, R. W.; Guidal, M.; Guillo, M.; Guler, N.; Guo, L.; Gyurjyan, V.; Hadjidakis, C.; Hafidi, K.; Hakobyan, H.; Hanretty, C.; Hardie, J.; Hassall, N.; Heddle, D.; Hersman, F. W.; Hicks, K.; Hleiqawi, I.; Holtrop, M.; Huertas, M.; Hyde-Wright, C. E.; Ilieva, Y.; Ireland, D. G.; Ishkhanov, B. S.; Isupov, E. L.; Ito, M. M.; Jenkins, D.; Jo, H. S.; Johnstone, J. R.; Joo, K.; Juengst, H. G.; Kalantarians, N.; Keith, C. D.; Kellie, J. D.; Khandaker, M.; Kim, K. Y.; Kim, K.; Kim, W.; Klein, A.; Klein, F. J.; Klusman, M.; Kossov, M.; Krahn, Z.; Kramer, L. H.; Kubarovsky, V.; Kuhn, J.; Kuleshov, S. V.; Kuznetsov, V.; Lachniet, J.; Laget, J. M.; Langheinrich, J.; Lawrence, D.; Li, Ji; Lima, A. C. S.; Livingston, K.; Lu, H. 
Y.; Lukashin, K.; MacCormick, M.; Marchand, C.; Markov, N.; Mattione, P.; McAleer, S.; McKinnon, B.; McNabb, J. W. C.; Mecking, B. A.; Mestayer, M. D.; Meyer, C. A.; Mibe, T.; Mikhailov, K.; Mirazita, M.; Miskimen, R.; Mokeev, V.; Morand, L.; Moreno, B.; Moriya, K.; Morrow, S. A.; Moteabbed, M.; Mueller, J.; Munevar, E.; Mutchler, G. S.; Nadel-Turonski, P.; Nasseripour, R.; Niccolai, S.; Niculescu, G.; Niculescu, I.; Niczyporuk, B. B.; Niroula, M. R.; Niyazov, R. A.; Nozar, M.; O'Rielly, G. V.; Osipenko, M.; Ostrovidov, A. I.; Park, K.; Pasyuk, E.; Paterson, C.; Pereira, S. Anefalos; Philips, S. A.; Pierce, J.; Pivnyuk, N.; Pocanic, D.; Pogorelko, O.; Popa, I.; Pozdniakov, S.; Preedom, B. M.; Price, J. W.; Procureur, S.; Protopopescu, D.; Qin, L. M.; Raue, B. A.; Riccardi, G.; Ricco, G.; Ripani, M.; Ritchie, B. G.; Rosner, G.; Rossi, P.; Rowntree, D.; Rubin, P. D.; Sabatié, F.; Salamanca, J.; Salgado, C.; Santoro, J. P.; Sapunenko, V.; Schumacher, R. A.; Seely, M. L.; Serov, V. S.; Sharabian, Y. G.; Sharov, D.; Shaw, J.; Shvedunov, N. V.; Skabelin, A. V.; Smith, E. S.; Smith, L. C.; Sober, D. I.; Sokhan, D.; Stavinsky, A.; Stepanyan, S. S.; Stepanyan, S.; Stokes, B. E.; Stoler, P.; Strakovsky, I. I.; Strauch, S.; Suleiman, R.; Taiuti, M.; Tedeschi, D. J.; Tkabladze, A.; Tkachenko, S.; Todor, L.; Ungaro, M.; Vineyard, M. F.; Vlassov, A. V.; Watts, D. P.; Weinstein, L. B.; Weygand, D. P.; Williams, M.; Wolin, E.; Wood, M. H.; Yegneswaran, A.; Yun, J.; Zana, L.; Zhang, J.; Zhao, B.; Zhao, Z. W.; CLAS Collaboration

2009-02-01

243

A comprehensive update of the sequence and structure classification of kinases  

Directory of Open Access Journals (Sweden)

Full Text Available Abstract Background A comprehensive update of the classification of all available kinases was carried out. This survey presents a complete global picture of this large functional class of proteins and confirms the soundness of our initial kinase classification scheme. Results The new survey found the total number of kinase sequences in the protein database has increased more than three-fold (from 17,310 to 59,402), and the number of determined kinase structures increased two-fold (from 359 to 702) in the past three years. However, the framework of the original two-tier classification scheme (in families and fold groups) remains sufficient to describe all available kinases. Overall, the kinase sequences were classified into 25 families of homologous proteins, wherein 22 families (~98.8% of all sequences), for which three-dimensional structures are known, fall into 10 fold groups. These fold groups not only include some of the most widely spread protein folds, such as the Rossmann-like fold, ferredoxin-like fold, TIM-barrel fold, and antiparallel β-barrel fold, but also all major classes (all α, all β, α+β, α/β) of protein structures. Fold predictions are made for remaining kinase families without a close homolog with solved structure. We also highlight two novel kinase structural folds, riboflavin kinase and dihydroxyacetone kinase, which have recently been characterized. Two protein families previously annotated as kinases are removed from the classification based on new experimental data. Conclusion Structural annotations of all kinase families are now revealed, including fold descriptions for all globular kinases, making this the first large functional class of proteins with a comprehensive structural annotation. Potential uses for this classification include deduction of protein function, structural fold, or enzymatic mechanism of poorly studied or newly discovered kinases based on proteins in the same family.
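The growth figures quoted in this abstract can be made concrete with a short calculation. This is only an arithmetic illustration: the four counts come from the abstract, while the helper name `fold_change` is a hypothetical label.

```python
def fold_change(before: int, after: int) -> float:
    """Ratio of the later count to the earlier count."""
    return after / before


# Counts quoted in the abstract.
sequence_fold = fold_change(17_310, 59_402)
structure_fold = fold_change(359, 702)

print(f"kinase sequences:  {sequence_fold:.2f}-fold")   # 3.43-fold
print(f"kinase structures: {structure_fold:.2f}-fold")  # 1.96-fold
```

So "more than three-fold" corresponds to roughly 3.4-fold for sequences, and "two-fold" to just under 2-fold for structures.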

Zhang Hong

2005-03-01

244

Multi-scale coding of genomic information: From DNA sequence to genome structure and function  

International Nuclear Information System (INIS)

Understanding how chromatin is spatially and dynamically organized in the nucleus of eukaryotic cells and how this affects genome functions is one of the main challenges of cell biology. Since the different orders of packaging in the hierarchical organization of DNA condition the accessibility of DNA sequence elements to trans-acting factors that control the transcription and replication processes, there is actually a wealth of structural and dynamical information to learn in the primary DNA sequence. In this review, we show that when using concepts, methodologies, numerical and experimental techniques coming from statistical mechanics and nonlinear physics combined with wavelet-based multi-scale signal processing, we are able to decipher the multi-scale sequence encoding of chromatin condensation-decondensation mechanisms that play a fundamental role in regulating many molecular processes involved in nuclear functions.

245

Multi-scale coding of genomic information: From DNA sequence to genome structure and function  

Energy Technology Data Exchange (ETDEWEB)

Understanding how chromatin is spatially and dynamically organized in the nucleus of eukaryotic cells and how this affects genome functions is one of the main challenges of cell biology. Since the different orders of packaging in the hierarchical organization of DNA condition the accessibility of DNA sequence elements to trans-acting factors that control the transcription and replication processes, there is actually a wealth of structural and dynamical information to learn in the primary DNA sequence. In this review, we show that when using concepts, methodologies, numerical and experimental techniques coming from statistical mechanics and nonlinear physics combined with wavelet-based multi-scale signal processing, we are able to decipher the multi-scale sequence encoding of chromatin condensation-decondensation mechanisms that play a fundamental role in regulating many molecular processes involved in nuclear functions.

Arneodo, Alain, E-mail: alain.arneodo@ens-lyon.f [Universite de Lyon, F-69000 Lyon (France); Laboratoire Joliot-Curie and Laboratoire de Physique, CNRS, Ecole Normale Superieure de Lyon, F-69007 Lyon (France); Vaillant, Cedric, E-mail: cedric.vaillant@ens-lyon.f [Universite de Lyon, F-69000 Lyon (France); Laboratoire Joliot-Curie and Laboratoire de Physique, CNRS, Ecole Normale Superieure de Lyon, F-69007 Lyon (France); Audit, Benjamin, E-mail: benjamin.audit@ens-lyon.f [Universite de Lyon, F-69000 Lyon (France); Laboratoire Joliot-Curie and Laboratoire de Physique, CNRS, Ecole Normale Superieure de Lyon, F-69007 Lyon (France); Argoul, Francoise, E-mail: francoise.argoul@ens-lyon.f [Universite de Lyon, F-69000 Lyon (France); Laboratoire Joliot-Curie and Laboratoire de Physique, CNRS, Ecole Normale Superieure de Lyon, F-69007 Lyon (France); D' Aubenton-Carafa, Yves, E-mail: daubenton@cgm.cnrs-gif.f [Centre de Genetique Moleculaire, CNRS, Allee de la Terrasse, 91198 Gif-sur-Yvette (France); Thermes, Claude, E-mail: claude.thermes@cgm.cnrs-gif.f [Centre de Genetique Moleculaire, CNRS, Allee de la Terrasse, 91198 Gif-sur-Yvette (France)

2011-02-15

246

Genome sequence, comparative analysis and haplotype structure of the domestic dog.  

Science.gov (United States)

Here we report a high-quality draft genome sequence of the domestic dog (Canis familiaris), together with a dense map of single nucleotide polymorphisms (SNPs) across breeds. The dog is of particular interest because it provides important evolutionary information and because existing breeds show great phenotypic diversity for morphological, physiological and behavioural traits. We use sequence comparison with the primate and rodent lineages to shed light on the structure and evolution of genomes and genes. Notably, the majority of the most highly conserved non-coding sequences in mammalian genomes are clustered near a small subset of genes with important roles in development. Analysis of SNPs reveals long-range haplotypes across the entire dog genome, and defines the nature of genetic diversity within and across breeds. The current SNP map now makes it possible for genome-wide association studies to identify genes responsible for diseases and traits, with important consequences for human and companion animal health. PMID:16341006

Lindblad-Toh, Kerstin; Wade, Claire M; Mikkelsen, Tarjei S; Karlsson, Elinor K; Jaffe, David B; Kamal, Michael; Clamp, Michele; Chang, Jean L; Kulbokas, Edward J; Zody, Michael C; Mauceli, Evan; Xie, Xiaohui; Breen, Matthew; Wayne, Robert K; Ostrander, Elaine A; Ponting, Chris P; Galibert, Francis; Smith, Douglas R; DeJong, Pieter J; Kirkness, Ewen; Alvarez, Pablo; Biagi, Tara; Brockman, William; Butler, Jonathan; Chin, Chee-Wye; Cook, April; Cuff, James; Daly, Mark J; DeCaprio, David; Gnerre, Sante; Grabherr, Manfred; Kellis, Manolis; Kleber, Michael; Bardeleben, Carolyne; Goodstadt, Leo; Heger, Andreas; Hitte, Christophe; Kim, Lisa; Koepfli, Klaus-Peter; Parker, Heidi G; Pollinger, John P; Searle, Stephen M J; Sutter, Nathan B; Thomas, Rachael; Webber, Caleb; Baldwin, Jennifer; Abebe, Adal; Abouelleil, Amr; Aftuck, Lynne; Ait-Zahra, Mostafa; Aldredge, Tyler; Allen, Nicole; An, Peter; Anderson, Scott; Antoine, Claudel; Arachchi, Harindra; Aslam, Ali; Ayotte, Laura; Bachantsang, Pasang; Barry, Andrew; Bayul, Tashi; Benamara, Mostafa; Berlin, Aaron; Bessette, Daniel; Blitshteyn, Berta; Bloom, Toby; Blye, Jason; Boguslavskiy, Leonid; Bonnet, Claude; Boukhgalter, Boris; Brown, Adam; Cahill, Patrick; Calixte, Nadia; Camarata, Jody; Cheshatsang, Yama; Chu, Jeffrey; Citroen, Mieke; Collymore, Alville; Cooke, Patrick; Dawoe, Tenzin; Daza, Riza; Decktor, Karin; DeGray, Stuart; Dhargay, Norbu; Dooley, Kimberly; Dooley, Kathleen; Dorje, Passang; Dorjee, Kunsang; Dorris, Lester; Duffey, Noah; Dupes, Alan; Egbiremolen, Osebhajajeme; Elong, Richard; Falk, Jill; Farina, Abderrahim; Faro, Susan; Ferguson, Diallo; Ferreira, Patricia; Fisher, Sheila; FitzGerald, Mike; Foley, Karen; Foley, Chelsea; Franke, Alicia; Friedrich, Dennis; Gage, Diane; Garber, Manuel; Gearin, Gary; Giannoukos, Georgia; Goode, Tina; Goyette, Audra; Graham, Joseph; Grandbois, Edward; Gyaltsen, Kunsang; Hafez, Nabil; Hagopian, Daniel; Hagos, Birhane; Hall, Jennifer; Healy, Claire; Hegarty, Ryan; Honan, 
Tracey; Horn, Andrea; Houde, Nathan; Hughes, Leanne; Hunnicutt, Leigh; Husby, M; Jester, Benjamin; Jones, Charlien; Kamat, Asha; Kanga, Ben; Kells, Cristyn; Khazanovich, Dmitry; Kieu, Alix Chinh; Kisner, Peter; Kumar, Mayank; Lance, Krista; Landers, Thomas; Lara, Marcia; Lee, William; Leger, Jean-Pierre; Lennon, Niall; Leuper, Lisa; LeVine, Sarah; Liu, Jinlei; Liu, Xiaohong; Lokyitsang, Yeshi; Lokyitsang, Tashi; Lui, Annie; Macdonald, Jan; Major, John; Marabella, Richard; Maru, Kebede; Matthews, Charles; McDonough, Susan; Mehta, Teena; Meldrim, James; Melnikov, Alexandre; Meneus, Louis; Mihalev, Atanas; Mihova, Tanya; Miller, Karen; Mittelman, Rachel; Mlenga, Valentine; Mulrain, Leonidas; Munson, Glen; Navidi, Adam; Naylor, Jerome; Nguyen, Tuyen; Nguyen, Nga; Nguyen, Cindy; Nguyen, Thu; Nicol, Robert; Norbu, Nyima; Norbu, Choe; Novod, Nathaniel; Nyima, Tenchoe; Olandt, Peter; O'Neill, Barry; O'Neill, Keith; Osman, Sahal; Oyono, Lucien; Patti, Christopher; Perrin, Danielle; Phunkhang, Pema; Pierre, Fritz; Priest, Margaret; Rachupka, Anthony; Raghuraman, Sujaa; Rameau, Rayale; Ray, Verneda; Raymond, Christina; Rege, Filip; Rise, Cecil; Rogers, Julie; Rogov, Peter; Sahalie, Julie; Settipalli, Sampath; Sharpe, Theodore; Shea, Terrance; Sheehan, Mechele; Sherpa, Ngawang; Shi, Jianying; Shih, Diana; Sloan, Jessie; Smith, Cherylyn; Sparrow, Todd; Stalker, John; Stange-Thomann, Nicole; Stavropoulos, Sharon; Stone, Catherine; Stone, Sabrina; Sykes, Sean; Tchuinga, Pierre; Tenzing, Pema; Tesfaye, Senait; Thoulutsang, Dawa; Thoulutsang, Yama; Topham, Kerri; Topping, Ira; Tsamla, Tsamla; Vassiliev, Helen; Venkataraman, Vijay; Vo, Andy; Wangchuk, Tsering; Wangdi, Tsering; Weiand, Michael; Wilkinson, Jane; Wilson, Adam; Yadav, Shailendra; Yang, Shuli; Yang, Xiaoping; Young, Geneva; Yu, Qing; Zainoun, Joanne; Zembek, Lisa; Zimmer, Andrew; Lander, Eric S

2005-12-01

247

Structure and patterns of sequence variation in the mitochondrial DNA control region of the great cats.  

Science.gov (United States)

Mitochondrial DNA control region structure and variation were determined in the five species of the genus Panthera. Comparative analyses revealed two hypervariable segments, a central conserved region, and the occurrence of size and sequence heteroplasmy. As observed in the domestic cat, but not commonly seen in other animals, two repetitive sequence arrays (RS-2 with an 80-bp motif and RS-3 with a 6-10-bp motif) were identified. The 3' ends of RS-2 and RS-3 were highly conserved among species, suggesting that these motifs have different functional constraints. Control region sequences provided improved phylogenetic resolution grouping the sister taxa lion (Panthera leo) and leopard (Panthera pardus), with the jaguar (Panthera onca). PMID:16120284

Jae-Heup, K; Eizirik, E; O'Brien, S J; Johnson, W E

2001-10-01

248

A structural study for the optimisation of functional motifs encoded in protein sequences  

Directory of Open Access Journals (Sweden)

Full Text Available Abstract Background A large number of PROSITE patterns select false positives and/or miss known true positives. It is possible that – at least in some cases – the weak specificity and/or sensitivity of a pattern is due to the fact that one, or maybe more, functional and/or structural key residues are not represented in the pattern. Multiple sequence alignments are commonly used to build functional sequence patterns. If residues structurally conserved in proteins sharing a function cannot be aligned in a multiple sequence alignment, they are likely to be missed in a standard pattern construction procedure. Results Here we present a new procedure aimed at improving the sensitivity and/or specificity of poorly performing patterns. The procedure can be summarised as follows: (1) residues structurally conserved in different proteins that are true positives for a pattern are identified by means of a computational technique and by visual inspection; (2) the sequence positions of the structurally conserved residues falling outside the pattern are used to build extended sequence patterns; (3) the extended patterns are optimised on the SWISS-PROT database for their sensitivity and specificity. The method was applied to eight PROSITE patterns. Whenever structurally conserved residues are found in the surface region close to the pattern (seven out of eight cases), the addition of information inferred from structural analysis is shown to improve pattern selectivity and in some cases selectivity and sensitivity as well. In some of the cases considered the procedure allowed the identification of functionally interesting residues, whose biological role is also discussed. Conclusion Our method can be applied to any type of functional motif or pattern (not only PROSITE ones) that is not able to select all and only the true positive hits and for which at least two true positive structures are available. The computational technique for the identification of structurally conserved residues is already available on request and will soon be accessible on our web server. The procedure is intended for the use of pattern database curators and of scientists interested in a specific protein family for which no specific or selective patterns are yet available.
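To make the notion of a sequence pattern concrete, the sketch below translates a PROSITE-style pattern into a Python regular expression and scans a sequence with it. It is not the authors' procedure: the function name `prosite_to_regex` is hypothetical, only a common subset of the PROSITE syntax is handled (`x`, `[..]`, `{..}`, repeat counts, and the `<`/`>` terminal anchors), and the two test peptides are made up for illustration. The example pattern N-{P}-[ST]-{P} is the well-known N-glycosylation site pattern.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE pattern (common subset) into a Python regex string."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        anchored_start = element.startswith("<")
        anchored_end = element.endswith(">")
        element = element.strip("<>")
        # Split an optional repeat count such as x(3) or x(2,4) off the core.
        core, low, high = re.fullmatch(
            r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element
        ).groups()
        if core == "x":
            regex = "."                        # any residue
        elif core.startswith("{"):
            regex = "[^" + core[1:-1] + "]"    # forbidden residues
        else:
            regex = core                       # single residue or [ABC] set
        if low:
            regex += "{" + low + ("," + high if high else "") + "}"
        parts.append(("^" if anchored_start else "") + regex + ("$" if anchored_end else ""))
    return "".join(parts)


regex = prosite_to_regex("N-{P}-[ST]-{P}")
print(regex)
# Made-up peptides: the first contains N-G-S-A, the second has P after N.
print(bool(re.search(regex, "AANGSA")), bool(re.search(regex, "AANPSA")))
```

An extended pattern in the sense of the abstract would add further elements outside the original span, for example a spacer such as `x(2,4)` followed by an extra conserved residue; which residues to add is exactly what the structural analysis is meant to supply.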

Helmer-Citterich Manuela

2004-04-01

249

Moments of the Spin Structure Functions g_1^p and g_1^d for 0.05 < Q^2 < 3.0 GeV^2  

CERN Document Server

The spin structure functions g_1 for the proton and the deuteron have been measured over a wide kinematic range in x and Q^2 using 1.6 and 5.7 GeV longitudinally polarized electrons incident upon polarized NH_3 and ND_3 targets at Jefferson Lab. Scattered electrons were detected in the CEBAF Large Acceptance Spectrometer, for 0.05 < Q^2 < 5 GeV^2 and W < 3 GeV. The first moments of g_1 for the proton and deuteron are presented -- both have a negative slope at low Q^2, as predicted by the extended Gerasimov-Drell-Hearn sum rule. The first result for the generalized forward spin polarizability of the proton gamma_0^p is also reported, and shows evidence of scaling above Q^2 = 1.5 GeV^2. Although the first moments of g_1 are consistent with Chiral Perturbation Theory (ChPT) calculations up to approximately Q^2 = 0.06 GeV^2, a significant discrepancy is observed between the gamma_0^p data and ChPT for gamma_0^p, even at the lowest Q^2.

Prok, Y; Burkert, V D; Deur, A; Dharmawardane, K V; Dodge, G E; Griffioen, K A; Kuhn, S E; Minehart, R; Adams, G; Amaryan, M J; Anghinolfi, M; Asryan, G; Audit, G; Avakian, H; Bagdasaryan, H; Baillie, N; Ball, J P; Baltzell, N A; Barrow, S; Battaglieri, M; Beard, K; Bedlinskiy, I; Bektasoglu, M; Bellis, M; Benmouna, N; Berman, B L; Biselli, A S; Blaszczyk, L; Boiarinov, S; Bonner, B E; Bouchigny, S; Bradford, R; Branford, D; Briscoe, W J; Brooks, W K; Bültmann, S; Butuceanu, C; Calarco, J R; Careccia, S L; Carman, D S; Casey, L; Cazes, A; Chen, S; Cheng, L; Cole, P L; Collins, P; Coltharp, P; Cords, D; Corvisiero, P; Crabb, D; Credé, V; Cummings, J P; Dale, D; Dashyan, N; De Masi, R; De Vita, R; De Sanctis, E; Degtyarenko, P V; Denizli, H; Dennis, L; Dhuga, K S; Dickson, R; Djalali, C; Doughty, D; Dugger, M; Dytman, S; Dzyubak, O P; Egiyan, H; Egiyan, K S; El Fassi, L; Elouadrhiri, L; Eugenio, P; Fatemi, R; Fedotov, G; Feldman, G; Fersh, R G; Feuerbach, R J; Forest, T A; Fradi, A; Funsten, H; Garçon, M; Gavalian, G; Gevorgyan, N; Gilfoyle, G P; Giovanetti, K L; Girod, F X; Goetz, J T; Golovatch, E; Gothe, R W; Guidal, M; Guillo, M; Guler, N; Guo, L; Gyurjyan, V; Hadjidakis, C; Hafidi, K; Hakobyan, H; Hanretty, C; Hardie, J; Hassall, N; Heddle, D; Hersman, F W; Hicks, K; Hleiqawi, I; Holtrop, M; Huertas, M; Hyde-Wright, C E; Ilieva, Y; Ireland, D G; Ishkhanov, B S; Isupov, E L; Ito, M M; Jenkins, D; Jo, H S; Johnstone, J R; Joo, K; Jüngst, H G; Kalantarians, N; Keith, C D; Kellie, J D; Khandaker, M; Kim, K Y; Kim, K; Kim, W; Klein, A; Klein, F J; Klusman, M; Kossov, M; Krahn, Z; Kramer, L H; Kubarovski, V; Kühn, J; Kuleshov, S V; Kuznetsov, V; Lachniet, J; Laget, J M; Langheinrich, J; Lawrence, D; Ji Li; Lima, A C S; Livingston, K; Lu, H Y; Lukashin, K; MacCormick, M; Marchand, C; Markov, N; Mattione, P; McAleer, S; McKinnon, B; McNabb, J W C; Mecking, B A; Mestayer, M D; Meyer, C A; Mibe, T; Mikhailov, K; Mirazita, M; Miskimen, R; Mokeev, V; Morand, L; Moreno, B; 
Moriya, K; Morrow, S A; Moteabbed, M; Müller, J; Munevar, E; Mutchler, G S; Nadel-Turonski, P; Nasseripour, R; Niccolai, S; Niculescu, G; Niculescu, I; Niczyporuk, B B; Niroula, M R; Niyazov, R A; Nozar, M; O'Rielly, G V; Osipenko, M; Ostrovidov, A I; Park, K; Pasyuk, E; Paterson, C; Anefalos Pereira, S; Philips, S A; Pierce, J; Pivnyuk, N; Pocanic, D; Pogorelko, O; Popa, I; Pozdniakov, S; Preedom, B M; Price, J W; Procureur, S; Protopopescu, D; Qin, L M; Raue, B A; Riccardi, G; Ricco, G; Ripani, M; Ritchie, B G; Rosner, G; Rossi, P; Rowntree, D; Rubin, P D; Sabati, F; Salamanca, J; Salgado, C; Santoro, e J P; Sapunenko, V; Schumacher, R A; Seely, M L; Serov, V S; Sharabyan, Yu G; Sharov, D; Shaw, J; Shvedunov, N V; Skabelin, A V; Smith, E S; Smith, L C; Sober, D I; Sokhan, D; Stavinsky, A; Stepanyan, S S; Stepanyan, S; Stokes, B E; Stoler, P; Strakovsky, I I; Strauch, S; Suleiman, R; Taiuti, M; Tedeschi, D J; Tkabladze, A; Tkachenko, S; Todor, L; Ungaro, M; Vineyard, M F; Vlassov, A V; Watts, D P; Weinstein, L B; Weygand, D P; Williams, M; Wolin, E; Wood, M H; Yegneswaran, A; Yun, J; Zana, L; Zhang, J; Zhao, B; Zhao, Z W

2008-01-01

250

Moments of the Spin Structure Functions g1p and g1d for 0.05 < Q2 < 3.0 GeV2  

Energy Technology Data Exchange (ETDEWEB)

The spin structure functions $g_1$ for the proton and the deuteron have been measured over a wide kinematic range in $x$ and $Q^2$ using 1.6 and 5.7 GeV longitudinally polarized electrons incident upon polarized NH$_3$ and ND$_3$ targets at Jefferson Lab. Scattered electrons were detected in the CEBAF Large Acceptance Spectrometer, for $0.05 < Q^2 < 5$ GeV$^2$ and $W < 3$ GeV. The first moments of $g_1$ for the proton and deuteron are presented -- both have a negative slope at low $Q^2$, as predicted by the extended Gerasimov-Drell-Hearn sum rule. The first result for the generalized forward spin polarizability of the proton $\gamma_0^p$ is also reported, and shows evidence of scaling above $Q^2$ = 1.5 GeV$^2$. Although the first moments of $g_1$ are consistent with Chiral Perturbation Theory (ChPT) calculations up to approximately $Q^2 = 0.06$ GeV$^2$, a significant discrepancy is observed between the $\gamma_0^p$ data and ChPT for $\gamma_0^p$, even at the lowest $Q^2$.
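For orientation, the quantities named in this abstract can be written out using their standard textbook definitions. The abstract itself gives no formulas, so the expressions below are an assumption about the conventions in use: $\kappa$ denotes the anomalous magnetic moment and $M$ the mass of the nucleon, and the low-$Q^2$ slope is the one usually quoted for the inelastic part of the first moment.

```latex
% First moment of the spin structure function g_1 (proton or deuteron)
\Gamma_1^{p,d}(Q^2) = \int_0^1 g_1^{p,d}(x, Q^2)\, dx

% Slope at the photon point implied by the Gerasimov-Drell-Hearn sum rule
\left. \frac{d\Gamma_1}{dQ^2} \right|_{Q^2 \to 0} = -\frac{\kappa^2}{8 M^2}
```

The minus sign in the second expression corresponds to the negative slope at low $Q^2$ mentioned in the abstract.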

Prok, Yelena; Bosted, Peter; Burkert, Volker; Deur, Alexandre; Dharmawardane, Kahanawita; Dodge, Gail; Griffioen, Keith; Kuhn, Sebastian; Minehart, Ralph; Adams, Gary; Amaryan, Moscov; Amaryan, Moskov; Anghinolfi, Marco; Asryan, G.; Audit, Gerard; Avagyan, Harutyun; Baghdasaryan, Hovhannes; Baillie, Nathan; Ball, J.P.; Ball, Jacques; Baltzell, Nathan; Barrow, Steve; Battaglieri, Marco; Beard, Kevin; Bedlinskiy, Ivan; Bektasoglu, Mehmet; Bellis, Matthew; Benmouna, Nawal; Berman, Barry; Biselli, Angela; Blaszczyk, Lukasz; Boyarinov, Sergey; Bonner, Billy; Bouchigny, Sylvain; Bradford, Robert; Branford, Derek; Briscoe, William; Brooks, William; Bultmann, S.; Bueltmann, Stephen; Butuceanu, Cornel; Calarco, John; Careccia, Sharon; Carman, Daniel; Casey, Liam; Cazes, Antoine; Chen, Shifeng; Cheng, Lu; Cole, Philip; Collins, Patrick; Coltharp, Philip; Cords, Dieter; Corvisiero, Pietro; Crabb, Donald; Crede, Volker; Cummings, John; Dale, Daniel; Dashyan, Natalya; De Masi, Rita; De Vita, Raffaella; De Sanctis, Enzo; Degtiarenko, Pavel; Denizli, Haluk; Dennis, Lawrence; Dhuga, Kalvir; Dickson, Richard; Djalali, Chaden; Doughty, David; Dugger, Michael; Dytman, Steven; Dzyubak, Oleksandr; Egiyan, Hovanes; Egiyan, Kim; Elfassi, Lamiaa; Elouadrhiri, Latifa; Eugenio, Paul; Fatemi, Renee; Fedotov, Gleb; Feldman, Gerald; Fersch, Robert; Feuerbach, Robert; Forest, Tony; Fradi, Ahmed; Funsten, Herbert; Garcon, Michel; Gavalian, Gagik; Gevorgyan, Nerses; Gilfoyle, Gerard; Giovanetti, Kevin; Girod, Francois-Xavier; Goetz, John; Golovach, Evgeny; Gothe, Ralf; Guidal, Michel; Guillo, Matthieu; Guler, Nevzat; Guo, Lei; Gyurjyan, Vardan; Hadjidakis, Cynthia; Hafidi, Kawtar; Hakobyan, Hayk; Hanretty, Charles; Hardie, John; Hassall, Neil; Heddle, David; Hersman, F.; Hicks, Kenneth; Hleiqawi, Ishaq; Holtrop, Maurik; Huertas, Marco; Hyde, Charles; Ilieva, Yordanka; Ireland, David; Ishkhanov, Boris; Isupov, Evgeny; Ito, Mark; Jenkins, David; Jo, Hyon-Suk; Johnstone, John; Joo, Kyungseon; 
Juengst, Henry; Kalantarians, Narbe; Keith, Christopher; Kellie, James; Khandaker, Mahbubul; Kim, Kui; Kim, Kyungmo; Kim, Wooyoung; Klein, Andreas; Klein, Franz; Klusman, Mike; Kossov, Mikhail; Krahn, Zebulun; Kramer, Laird; Kubarovsky, Valery; Kuhn, Joachim; Kuleshov, Sergey; Kuznetsov, Viacheslav; Lachniet, Jeff; Laget, Jean; Langheinrich, Jorn; Lawrence, Dave; Lima, Ana; Livingston, Kenneth; Lu, Haiyun; Lukashin, K.; MacCormick, Marion; Marchand, Claude; Markov, Nikolai; Mattione, Paul; McAleer, Simeon; McKinnon, Bryan; McNabb, John; Mecking, Bernhard; Mestayer, Mac; Meyer, Curtis; Mibe, Tsutomu; Mikhaylov, Konstantin; Mirazita, Marco; Miskimen, Rory; Mokeev, Viktor; Morand, Ludyvine; Moreno, Brahim; Moriya, Kei; Morrow, Steven; Moteabbed, Maryam; Mueller, James; Munevar Espitia, Edwin; Mutchler, Gordon; Nadel-Turonski, Pawel; Nasseripour, Rakhsha; Niccolai, Silvia; Niculescu, Gabriel; Niculescu, Maria-Ioana; Niczyporuk, Bogdan; Niroula, Megh; Niyazov, Rustam; Nozar, Mina; O' Rielly, Grant; Osipenko, Mikhail; Ostrovidov, Alexander; Park, Kijun; Pasyuk, Evgueni; Paterson, Craig; Anefalos Pereira, S.; Philips, Sasha; Pierce, J.; Pivnyuk, Nikolay; Pocanic, Dinko; Pogorelko, Oleg; Popa, Iulian; Pozdnyakov, Sergey; Preedom, Barry; Price, John; Procureur, Sebastien; Protopopescu, Dan; Qin, Liming; Raue, Brian; Riccardi, Gregory; Ricco, Giovanni; Ripani, Marco; Ritchie, Barry; Rosner, Guenther; Rossi, Patrizia; Rowntree, David; Rubin, Philip; Sabatie, Franck; Salamanca, Julian; Salgado, Carlos; Santoro, Joseph; Sapunenko, Vladimir; Schumacher, Reinhard; Seely, Mikell; Serov, Vladimir; Sharabian, Youri; Sharov, Dmitri; Shaw, Jeffrey; Shvedunov, Nikolay; Skabelin, Alexander; Smith, Elton; Smith, Lee; Sober, Daniel; Sokhan, Daria; Stavinskiy, Aleksey; Stepanyan, Samuel; Stepanyan, Stepan; Stokes, Burnham; Stoler, Paul; Strakovski, Igor; Strauch, Steffen; Suleiman, Riad; Taiuti, Mauro; Tedeschi, David; Tkabladze, Avtandil; Tkachenko, Svyatoslav; Todor, Luminita; Ungaro, 
Maurizio; V

2009-02-01

251

Synthesis, structure and luminescence of novel 1D chain coordination polymers [Ln(isophth)(Hisophth)(H2O)4·4H2O]n (Ln = Sm, Dy)

Science.gov (United States)

In this paper, we focus on the structure and luminescent properties of the novel coordination polymers [Ln(isophth)(Hisophth)(H2O)4·4H2O]n (Ln = Sm, Dy; H2isophth = isophthalic acid), which were characterized by elemental analysis, IR and UV spectroscopy, and especially single-crystal X-ray diffraction. The two complexes are isostructural and crystallize in the monoclinic system with space group P21/c. The samarium compound, which is isomorphous with the dysprosium one, adopts a one-dimensional (1D) chain-like configuration in which one of the two isophthalic acid ligands binds in a bidentate chelating pattern, while the other offers only a single carboxylic acid group for bonding and is abbreviated as 'Hisophth'. Fluorescence excitation and emission spectra show that isophthalic acid is suitable for sensitizing the luminescence of both Sm(III) and Dy(III).

Yan, Bing; Bai, Yingying; Chen, Zhenxia

2005-05-01

252

T-Coffee: a web server for the multiple sequence alignment of protein and RNA sequences using structural information and homology extension  

OpenAIRE

This article introduces a new interface for T-Coffee, a consistency-based multiple sequence alignment program. This interface provides easy and intuitive access to the most popular functionality of the package. This functionality includes the default T-Coffee mode for protein and nucleic acid sequences, the M-Coffee mode that allows combining the output of any other aligners, and template-based modes of T-Coffee that deliver high accuracy alignments while using structural or homology derived templates. ...

Di Tommaso, Paolo; Moretti, Sebastien; Xenarios, Ioannis; Orobitg Cortada, Miquel; Montañola Lacort, Alberto

2011-01-01

253

BreaKmer: detection of structural variation in targeted massively parallel sequencing data using kmers.  

Science.gov (United States)

Genomic structural variation (SV), a common hallmark of cancer, has important predictive and therapeutic implications. However, accurately detecting SV using high-throughput sequencing data remains challenging, especially for 'targeted' resequencing efforts. This is critically important in the clinical setting where targeted resequencing is frequently being applied to rapidly assess clinically actionable mutations in tumor biopsies in a cost-effective manner. We present BreaKmer, a novel approach that uses a 'kmer' strategy to assemble misaligned sequence reads for predicting insertions, deletions, inversions, tandem duplications and translocations at base-pair resolution in targeted resequencing data. Variants are predicted by realigning an assembled consensus sequence created from sequence reads that were abnormally aligned to the reference genome. Using targeted resequencing data from tumor specimens with orthogonally validated SV, non-tumor samples and whole-genome sequencing data, BreaKmer had a 97.4% overall sensitivity for known events and predicted 17 positively validated, novel variants. Relative to four publicly available algorithms, BreaKmer detected SV with increased sensitivity and limited calls in non-tumor samples, key features for variant analysis of tumor specimens in both the clinical and research settings. PMID:25428359

Abo, Ryan P; Ducar, Matthew; Garcia, Elizabeth P; Thorner, Aaron R; Rojas-Rudilla, Vanesa; Lin, Ling; Sholl, Lynette M; Hahn, William C; Meyerson, Matthew; Lindeman, Neal I; Van Hummelen, Paul; MacConaill, Laura E

2015-02-18
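
The 'kmer' strategy named in the BreaKmer abstract above can be illustrated with a toy sketch. This is not the BreaKmer implementation: the function names, the value of k and the sequences are hypothetical, and the sketch only shows the general idea that k-mers present in a read but absent from the reference tend to span a breakpoint.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def novel_kmers(read, reference, k):
    """K-mers of the read that occur nowhere in the reference."""
    return kmers(read, k) - kmers(reference, k)


# Hypothetical toy data: the read lacks the "GGG" found in the reference,
# so it looks like a 3-base deletion.
reference = "ACGTACGTTGGGCATCAGTC"
read = "ACGTTCATCAG"

# Only the k-mers that cross the deletion junction are new.
junction_kmers = novel_kmers(read, reference, 5)
```

With k = 5, exactly four k-mers (k - 1 of them) cross the junction in this toy case; a real tool would then assemble the reads carrying such k-mers and realign the resulting contig.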

254

High-resolution NMR structure of an AT-rich DNA sequence  

International Nuclear Information System (INIS)

We have determined, by proton NMR and complete relaxation matrix methods, the high-resolution structure of a DNA oligonucleotide in solution with nine contiguous AT base pairs. The stretch of AT pairs, TAATTATAA.TTATAATTA, is embedded in a 27-nucleotide stem-and-loop construct, which is stabilized by terminal GC base pairs and an extraordinarily stable DNA loop GAA (Hirao et al., 1994, Nucleic Acids Res. 22, 576-582). The AT-rich sequence has three repeated TAA.TTA motifs, one in the reverse orientation. Comparison of the local conformations of the three motifs shows that the sequence context has a minor effect here: atomic RMSD between the three TAA.TTA fragments is 0.4-0.5 Å, while each fragment is defined within the RMSD of 0.3-0.4 Å. The AT-rich stem also contains a consensus sequence for the Pribnow box, TATAAT. The TpA, ApT, and TpT.ApA steps have characteristic local conformations, a combination of which determines a unique sequence-dependent pattern of minor groove width variation. All three TpA steps are locally bent in the direction compressing the major groove of DNA. These bends, however, compensate each other, because of their relative position in the sequence, so that the overall helical axis is essentially straight.
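
The sequence statements in the abstract above can be checked with a few lines of code. The helper names are hypothetical; the sketch simply confirms that the two strands quoted are reverse complements, locates the TAA and TTA motifs on the first strand, and finds the Pribnow consensus TATAAT on the second.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def find_all(seq, motif):
    """0-based start indices of every (possibly overlapping) motif occurrence."""
    return [i for i in range(len(seq) - len(motif) + 1) if seq.startswith(motif, i)]


top = "TAATTATAA"                  # first strand as quoted in the abstract
bottom = reverse_complement(top)   # second strand, 5' to 3'

taa_sites = find_all(top, "TAA")         # two copies in the forward orientation
tta_sites = find_all(top, "TTA")         # one copy in the reverse orientation
pribnow_sites = find_all(bottom, "TATAAT")
```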

255

Genomic structure and nucleotide sequence of the p55 gene of the puffer fish Fugu rubripes  

Energy Technology Data Exchange (ETDEWEB)

The p55 gene, which codes for a 55-kDa erythrocyte membrane protein, has been cloned and sequenced from the genome of the Japanese puffer fish Fugu rubripes (Fugu). This organism has the smallest recorded vertebrate genome and therefore provides an efficient way to sequence genes at the genomic level. The gene encoding p55 covers 5.5 kb from the beginning to the end of the coding sequence, four to six times smaller than the estimated size of the human gene, and is encoded by 12 exons. The structure of this gene has not been previously elucidated, but from this and other data we would predict a similar or identical structure in mammals. The predicted amino acid sequence of this gene in Fugu, coding for a polypeptide of 467 amino acids, is very similar to that of the human gene with the exception of the first two exons, which differ considerably. The predicted Fugu protein has a molecular weight (52.6 kDa compared with 52.3 kDa) and an isoelectric point very similar to those of human p55. In human, the p55 gene lies in the gene-dense Xq28 region, just 30 kb 3′ to the Factor VIII gene, and is estimated to cover 20-30 kb. Its 5′ end is associated with a CpG island, although there is no evidence that this is the case in Fugu. The small size of genes in Fugu and the high coding homology that they share with their mammalian equivalents, both in structure and sequence, make this compact vertebrate genome an ideal model for genomic studies. 23 refs., 3 figs.

Elgar, G.; Rattray, F.; Greystrong, J.; Brenner, S. [Univ. of Cambridge (United Kingdom)]

1995-06-10

256

Timing of developmental sequences in different brain structures: physiological and pathological implications.  

Science.gov (United States)

The developing brain is not a small adult brain. Voltage- and transmitter-gated currents, like network-driven patterns, follow a developmental sequence. Studies initially performed in cortical structures and subsequently in subcortical structures have unravelled a developmental sequence of events in which intrinsic voltage-gated calcium currents are followed by nonsynaptic calcium plateaux and synapse-driven giant depolarising potentials, orchestrated by depolarizing actions of GABA and long-lasting NMDA receptor-mediated currents. The function of these early patterns is to enable heterogeneous neurons to fire and wire together rather than to code specific modalities. However, at some stage, behaviourally relevant activities must replace these immature patterns, implying the presence of programmed stop signals. Here, we show that the developing striatum follows a developmental sequence in which immature patterns are silenced precisely when the pup starts locomotion. This is mediated by a loss of the long-lasting NMDA-NR2C/D receptor-mediated current and the expression of a voltage-gated K(+) current. At the same time, the descending inputs to the spinal cord become fully functional, accompanying a GABA/glycine polarity shift and ending the expression of developmental patterns. Therefore, although the timetable of development differs in different brain structures, the general sequence is quite similar, relying first on nonsynaptic events and then on synaptic oscillations that entrain large neuronal populations. In keeping with the 'neuroarcheology' theory, genetic mutations or environmental insults that perturb these developmental sequences constitute early signatures of developmental disorders. Birth dating developmental disorders thus provides important indicators of the event that triggers the pathological cascade leading ultimately to disease. PMID:22708595

Dehorter, N; Vinay, L; Hammond, C; Ben-Ari, Y

2012-06-01

257

Describing sequencing results of structural chromosome rearrangements with a suggested next-generation cytogenetic nomenclature.  

Science.gov (United States)

With recent rapid advances in genomic technologies, precise delineation of structural chromosome rearrangements at the nucleotide level is becoming increasingly feasible. In this era of "next-generation cytogenetics" (i.e., an integration of traditional cytogenetic techniques and next-generation sequencing), a consensus nomenclature is essential for accurate communication and data sharing. Currently, nomenclature for describing the sequencing data of these aberrations is lacking. Herein, we present a system called Next-Gen Cytogenetic Nomenclature, which is concordant with the International System for Human Cytogenetic Nomenclature (2013). This system starts with the alignment of rearrangement sequences by BLAT or BLAST (alignment tools) and arrives at a concise and detailed description of chromosomal changes. To facilitate usage and implementation of this nomenclature, we are developing a program designated BLA(S)T Output Sequence Tool of Nomenclature (BOSToN), a demonstrative version of which is accessible online. A standardized characterization of structural chromosomal rearrangements is essential both for research analyses and for application in the clinical setting. PMID:24746958

Ordulu, Zehra; Wong, Kristen E; Currall, Benjamin B; Ivanov, Andrew R; Pereira, Shahrin; Althari, Sara; Gusella, James F; Talkowski, Michael E; Morton, Cynthia C

2014-05-01

258

How Structural and Physicochemical Determinants Shape Sequence Constraints in a Functional Enzyme  

Science.gov (United States)

The need for interfacing structural biology and biophysics to molecular evolution is being increasingly recognized. One part of the big problem is to understand how physics and chemistry shape the sequence space available to functional proteins, while satisfying the needs of biology. Here we present a quantitative, structure-based analysis of a high-resolution map describing the tolerance to all substitutions in all positions of a functional enzyme, namely a TEM lactamase previously studied through deep sequencing of mutants growing in competition experiments with selection against ampicillin. Substitutions are rarely observed within 7 Å of the active site, a stringency that is relaxed slowly and extends up to 15–20 Å, with buried residues being especially sensitive. Substitution patterns in over one third of the residues can be quantitatively modeled by monotonic dependencies on amino acid descriptors and predictions of changes in folding stability. Amino acid volume and steric hindrance shape constraints on the protein core; hydrophobicity and solubility shape constraints on hydrophobic clusters underneath the surface, and on salt bridges and polar networks at the protein surface together with charge and hydrogen bonding capacity. Amino acid solubility, flexibility and conformational descriptors also provide additional constraints at many locations. These findings provide fundamental insights into the chemistry underlying protein evolution and design, by quantitating links between sequence and different protein traits, illuminating subtle and unexpected sequence-trait relationships and pinpointing what traits are sacrificed upon gain-of-function mutation. PMID:25706742

Abriata, Luciano A.; Palzkill, Timothy; Dal Peraro, Matteo

2015-01-01

259

Homoleptic 1-D iron selenolate complexes-synthesis, structure, magnetic and thermal behaviour of (1)(∞)[Fe(SeR)2] (R=Ph, Mes).

Science.gov (United States)

The first examples of polymeric homoleptic iron chalcogenolato complexes (1)(∞)[Fe(SePh)(2)] and (1)(∞)[Fe(SeMes)(2)] (Ph = phenyl = C(6)H(5), Mes = mesityl = C(6)H(2)-2,4,6-(CH(3))(3)) have been both prepared by reaction of [Fe(N(SiMe(3))(2))(2)] with two equivalents of HSeR (R = Ph, Mes) while (1)(∞)[Fe(SePh)(2)] was found to be also easily accessible through reactions of either FeCl(2), Fe(OOCCH(3))(2) or FeCl(3) with PhSeSiMe(3) in THF. In the crystal, the two compounds form one-dimensional chains with bridging selenolate ligands comprising distinctly different Fe-Se-Fe bridging angles, namely 71.15-72.57° in (1)(∞)[Fe(SePh)(2)] and 91.80° in (1)(∞)[Fe(SeMes)(2)]. Magnetic measurements supported by DFT calculations reveal that this geometrical change has a pronounced influence on the antiferromagnetic exchange interactions of the unpaired electrons along the chains in the two different compounds with a calculated magnetic exchange coupling constant of J = -137 cm(-1) in (1)(∞)[Fe(SePh)(2)] and J = -20 cm(-1) in (1)(∞)[Fe(SeMes)(2)]. In addition we were able to show that the ring molecule [Fe(SePh)(2)](12) which is a structural isomer of (1)(∞)[Fe(SePh)(2)] behaves magnetically similar to the latter one. Investigations by powder XRD reveal that the ring molecule is only a metastable intermediate which converts in THF completely to form (1)(∞)[Fe(SePh)(2)]. Thermal gravimetric analysis of (1)(∞)[Fe(SePh)(2)] under vacuum conditions shows that the compound is thermally labile and already starts to decompose above 30 °C in a two step process under cleavage of SePh(2) to finally form at 250 °C tetragonal PbO-type FeSe. The reaction of (1)(∞)[Fe(SePh)(2)] with the Lewis base 1,10-phenanthroline yielded, depending on the conditions, the octahedral monomeric complexes [Fe(SePh)(2)(1,10-phen)(2)] and [Fe(1,10-phen)(3)][Fe(SePh)(4)]. PMID:21637874

Eichhöfer, Andreas; Buth, Gernot; Dolci, Francesco; Fink, Karin; Mole, Richard A; Wood, Paul T

2011-07-14

260

Main: 1D6R [RPSD[Archive  

Lifescience Database Archive (English)

Full Text Available 1D6R ?? Soybean Glycine max (L.) Merrill Bowman-Birk Type Proteinase Inhibitor Precursor Glyci ... Warkentin, G.Wenzl, P.Flecker Crystal Structure Of Cancer ... Chemopreventive Bowman-Birk Inhibitor In Ternary C ...

261

Prediction of protein structural features from sequence data based on shannon entropy and kolmogorov complexity.  

Science.gov (United States)

While the genome for a given organism stores the information necessary for the organism to function and flourish, it is the proteins that are encoded by the genome that perhaps more than anything else characterize the phenotype for that organism. It is therefore not surprising that one of the many approaches to understanding and predicting protein folding and properties has come from genomics and more specifically from multiple sequence alignments. In this work I explore ways in which data derived from sequence alignments can be used to investigate in a predictive way three different aspects of protein structure: secondary structures, inter-residue contacts and the dynamics of switching between different states of the protein. In particular the use of Kolmogorov complexity has identified a novel pathway towards achieving these goals. PMID:25856073

Bywater, Robert Paul

2015-01-01
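
The two quantities named in the title of the record above can be sketched as follows. The abstract does not say how they are computed, so everything here is an assumption: Shannon entropy is taken per alignment column, and Kolmogorov complexity, which is not computable, is replaced by a compression ratio as a rough upper-bound proxy. The alignment and the function names are hypothetical.

```python
import math
import zlib
from collections import Counter


def column_entropy(column):
    """Shannon entropy in bits of the residues in one alignment column."""
    total = len(column)
    return sum((n / total) * math.log2(total / n) for n in Counter(column).values())


def compression_ratio(seq):
    """Compressed size divided by raw size: a crude stand-in for Kolmogorov complexity."""
    raw = seq.encode("ascii")
    return len(zlib.compress(raw, 9)) / len(raw)


# Hypothetical toy alignment of four sequences.
alignment = ["MKVLA", "MKILA", "MRVLG", "MKVLS"]
columns = ["".join(col) for col in zip(*alignment)]
entropies = [column_entropy(col) for col in columns]

# A homopolymer compresses far better than a varied sequence of equal length.
repetitive = "A" * 40
varied = "MKVLAAGIVGLLLAQPSFATDRWECNHYMKTWQEGHIPLS"
```

Fully conserved columns give an entropy of zero, and more variable columns give higher values; the compression ratio behaves analogously at the level of whole sequences.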

262

De novo prediction of structured RNAs from genomic sequences  

DEFF Research Database (Denmark)

Growing recognition of the numerous, diverse and important roles played by non-coding RNA in all organisms motivates better elucidation of these cellular components. Comparative genomics is a powerful tool for this task and is arguably preferable to any high-throughput experimental technology currently available, because evolutionary conservation highlights functionally important regions. Conserved secondary structure, rather than primary sequence, is the hallmark of many functionally important RNAs, because compensatory substitutions in base-paired regions preserve structure. Unfortunately, such substitutions also obscure sequence identity and confound alignment algorithms, which complicates analysis greatly. This paper surveys recent computational advances in this difficult arena, which have enabled genome-scale prediction of cross-species conserved RNA elements. These predictions suggest that a wealth of these elements indeed exist.

Gorodkin, Jan; Hofacker, Ivo L.

2010-01-01
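
The point made in the abstract above, that compensatory substitutions preserve base pairing while eroding sequence identity, can be seen in a small hypothetical example. The alignment, the pairing positions and the function names are all assumptions made for illustration; G-U wobble pairs are counted as pairs.

```python
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def pairing_fraction(alignment, i, j):
    """Fraction of sequences whose residues in columns i and j can base-pair."""
    return sum((seq[i], seq[j]) in PAIRS for seq in alignment) / len(alignment)


def column_identity(alignment, i):
    """Fraction of sequences that share the most common residue in column i."""
    column = [seq[i] for seq in alignment]
    return max(column.count(base) for base in set(column)) / len(column)


# Hypothetical hairpins: column 0 pairs with column 7, column 1 with column 6.
alignment = ["GCAUUUGC", "ACAUUUGU", "CCAUUUGG", "UCAUUUGA"]

outer_pairing = pairing_fraction(alignment, 0, 7)   # pairing is fully preserved
outer_identity = column_identity(alignment, 0)      # yet no residue is conserved
inner_identity = column_identity(alignment, 1)      # an unchanged column for contrast
```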

263

A molecular phylogeny of Hypnales (Bryophyta) inferred from ITS2 sequence-structure data

Directory of Open Access Journals (Sweden)

Abstract Background Hypnales comprise over 50% of all pleurocarpous mosses. They provide a young radiation complicating phylogenetic analyses. To resolve the hypnalean phylogeny, it is necessary to use a phylogenetic marker providing highly variable features to resolve species on the one hand and conserved features enabling a backbone analysis on the other. Therefore we used highly variable internal transcribed spacer 2 (ITS2) sequences and conserved secondary structures, as deposited with the ITS2 Database, simultaneously. Findings We built an accurate and in parts robustly resolved large scale phylogeny for 1,634 currently available hypnalean ITS2 sequence-structure pairs. Conclusions Profile Neighbor-Joining revealed a possible hypnalean backbone, indicating that most of the hypnalean taxa classified as different moss families are polyphyletic assemblages awaiting taxonomic changes.

Wolf Matthias

2010-11-01

264

Slipped structures in DNA triplet repeat sequences: entropic contributions to genetic instabilities.  

Science.gov (United States)

Slipped DNA structures can occur in sequences with direct repeats. DNA triplet repeats, particularly (CTG)n, (CGC)n, and (GAA)n, are known to be associated with several neurological diseases. Slippage is probably the cause of expansion of the number of repeats, a process called dynamic mutation, which is known to be the cause of the diseased state. Here it is shown that the conformational entropy associated with slippage is more destabilizing for long direct repeats (300-1000 base pairs) than shorter runs (10-30 base pairs), by about 2 kcal/mol. This contributes to the greater instability of longer sequences. Entropic considerations also favor the formation of simple bulges, rather than hairpin structures. A model is presented for dynamic mutations, and experimentally testable predictions are made that will allow the model to be tested. PMID:9115978

Harvey, S C

1997-03-18
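
To put the figure of about 2 kcal/mol quoted in the abstract above in perspective, a free-energy difference can be converted into a ratio of relative populations with the Boltzmann factor. The temperature of 298 K is an assumption made here for illustration; the abstract does not state one.

```latex
\[
  RT = \left(1.987\times10^{-3}\ \mathrm{kcal\,mol^{-1}\,K^{-1}}\right)\times 298\ \mathrm{K}
     \approx 0.592\ \mathrm{kcal\,mol^{-1}},
  \qquad
  \exp\!\left(\frac{\Delta\Delta G}{RT}\right)
     = \exp\!\left(\frac{2}{0.592}\right) \approx 29 .
\]
```

Under that assumption, a difference of 2 kcal/mol corresponds to a roughly 29-fold shift in relative population.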

265

Prediction of Protein Structural Features from Sequence Data Based on Shannon Entropy and Kolmogorov Complexity  

Science.gov (United States)

While the genome for a given organism stores the information necessary for the organism to function and flourish, it is the proteins that are encoded by the genome that perhaps more than anything else characterize the phenotype for that organism. It is therefore not surprising that one of the many approaches to understanding and predicting protein folding and properties has come from genomics and more specifically from multiple sequence alignments. In this work I explore ways in which data derived from sequence alignments can be used to investigate in a predictive way three different aspects of protein structure: secondary structures, inter-residue contacts and the dynamics of switching between different states of the protein. In particular the use of Kolmogorov complexity has identified a novel pathway towards achieving these goals. PMID:25856073

Bywater, Robert Paul

2015-01-01

266

Nearly Identical Bacteriophage Structural Gene Sequences Are Widely Distributed in both Marine and Freshwater Environments  

OpenAIRE

Primers were designed to amplify a 592-bp region within a conserved structural gene (g20) found in some cyanophages. The goal was to use this gene as a proxy to infer genetic richness in natural cyanophage communities and to determine if sequences were more similar in similar environments. Gene products were amplified from samples from the Gulf of Mexico, the Arctic, Southern, and Northeast and Southeast Pacific Oceans, an Arctic cyanobacterial mat, a catfish production pond, lakes in Canada ...

Short, Cindy M.; Suttle, Curtis A.

2005-01-01

267

Retinoblastoma susceptibility genes contain 5' sequences with a high propensity to form guanine-tetrad structures.  

OpenAIRE

Retinoblastoma susceptibility genes contain significant runs of oligoguanine at their 5' ends. Oligonucleotides having these sequences underwent complex formation in the presence of sodium ions, in which there was association of four strands. Formation of this structure was completely prevented if guanine was replaced by 7-deazaguanine, indicating the importance of guanine N7 in the formation of the complex. Complex formation led to protection of guanine N7 against methylation by dimethyl su...

Murchie, A. I.; Lilley, D. M.

1992-01-01

268

Signature of the oligomeric behaviour of nuclear receptors at the sequence and structural level  

OpenAIRE

Nuclear receptors (NRs) are ligand-dependent transcription factors that control a large number of physiological events through the regulation of gene transcription. NRs function either as homodimers or as heterodimers with retinoid X receptor/ultraspiracle protein (RXR/USP). A structure-based sequence analysis aimed at discovering the molecular mechanism that controls the dimeric association of the ligand-binding domain reveals two sets of differentially conserved residues, which partition th...

Brelivet, Yann; Kammerer, Sabrina; Rochel, Natacha; Poch, Olivier; Moras, Dino

2004-01-01

269

Statistical aspects of discerning indel-type structural variation via DNA sequence alignment  

OpenAIRE

Abstract Background Structural variations in the form of DNA insertions and deletions are an important aspect of human genetics and especially relevant to medical disorders. Investigations have shown that such events can be detected via tell-tale discrepancies in the aligned lengths of paired-end DNA sequencing reads. Quantitative aspects underlying this method remain poorly understood, despite its importance and conceptual simplicity. We report the statistical theory characterizing the lengt...

Wilson Richard K; Wendl Michael C

2009-01-01
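
The detection principle described in the abstract above, discrepancies in the aligned lengths of paired-end reads, can be sketched with a simple per-pair z-score. This is not the statistical theory the authors report (the abstract is truncated before describing it); the function name, the cutoff and the library parameters are hypothetical.

```python
def classify_pairs(spans, library_mean, library_sd, z_cutoff=3.0):
    """Label each mapped read-pair span by its deviation from the expected insert size.

    A span much longer than expected suggests a deletion in the sample,
    because the reference holds extra sequence between the two mates.
    A span much shorter than expected suggests an insertion in the sample.
    """
    calls = []
    for span in spans:
        z = (span - library_mean) / library_sd
        if z > z_cutoff:
            calls.append("deletion")
        elif z < -z_cutoff:
            calls.append("insertion")
        else:
            calls.append("concordant")
    return calls


# Hypothetical library: mean insert size 400 bp, standard deviation 30 bp.
calls = classify_pairs([410, 385, 900, 150], library_mean=400, library_sd=30)
```

A single discordant pair is weak evidence; practical callers usually require several independent pairs supporting the same event before reporting it.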

270

A comprehensive analysis of structural and sequence conservation in the TetR family transcriptional regulators.  

Science.gov (United States)

The tetracycline repressor family transcriptional regulators (TFRs) are homodimeric DNA-binding proteins that generally act as transcriptional repressors. Their DNA-binding activity is allosterically inactivated by the binding of small-molecule ligands. TFRs constitute the third most frequently occurring transcriptional regulator family found in bacteria with more than 10,000 representatives in the nonredundant protein database. In addition, more than 100 unique TFR structures have been solved by X-ray crystallography. In this study, we have used computational and experimental approaches to reveal the variations and conservation present within TFRs. Although TFR structures are very diverse, we were able to identify a conserved central triangle in their ligand-binding domains that forms the foundation of the structure and the framework for the ligand-binding cavity. While the sequences of DNA-binding domains of TFRs are highly conserved across the whole family, the sequences of their ligand-binding domains are so diverse that pairwise sequence similarity is often undetectable. Nevertheless, by analyzing subfamilies of TFRs, we were able to identify distinct regions of conservation in ligand-binding domains that may be important for allostery. To aid in large-scale analyses of TFR function, we have developed a simple and reliable computational approach to predict TFR operator sequences, a temperature melt-based assay to measure DNA binding, and a generic ligand-binding assay that will likely be applicable to most TFRs. Finally, our analysis of TFR structures highlights their flexibility and provides insight into a conserved allosteric mechanism for this family. PMID:20595046

Yu, Zhou; Reichheld, Sean E; Savchenko, Alexei; Parkinson, John; Davidson, Alan R

2010-07-23

271

Cold dark matter, the structure of galactic haloes and the origin of the Hubble sequence  

International Nuclear Information System (INIS)

The authors describe a simulation of a flat CDM (cold, dark matter) universe which can resolve structures of comparable scale to the luminous parts of galaxies. It is found that such a universe produces objects with the abundance and characteristic properties inferred for galaxy haloes. The results imply that merging plays an important part in galaxy formation and suggest a possible explanation for the Hubble sequence. (author)

272

A comprehensive update of the sequence and structure classification of kinases  

OpenAIRE

Abstract Background A comprehensive update of the classification of all available kinases was carried out. This survey presents a complete global picture of this large functional class of proteins and confirms the soundness of our initial kinase classification scheme. Results The new survey found the total number of kinase sequences in the protein database has increased more than three-fold (from 17,310 to 59,402), and the number of determined kinase structures increased two-fold (from 359 to...

Zhang Hong; Ginalski Krzysztof; Cheek Sara; Grishin Nick V

2005-01-01

273

A molecular phylogeny of Hypnales (Bryophyta) inferred from ITS2 sequence-structure data  

OpenAIRE

Abstract Background Hypnales comprise over 50% of all pleurocarpous mosses. They provide a young radiation complicating phylogenetic analyses. To resolve the hypnalean phylogeny, it is necessary to use a phylogenetic marker providing highly variable features to resolve species on the one hand and conserved features enabling a backbone analysis on the other. Therefore we used highly variable internal transcribed spacer 2 (ITS2) sequences and conserved secondary structures, as deposited with th...

Wolf Matthias; Merget Benjamin

2010-01-01

274

Capturing "attrition intensifying" structural traits from didactic interaction sequences of MOOC learners  

OpenAIRE

This work is an attempt to discover hidden structural configurations in learning activity sequences of students in Massive Open Online Courses (MOOCs). Leveraging combined representations of video clickstream interactions and forum activities, we seek to fundamentally understand traits that are predictive of decreasing engagement over time. Grounded in the interdisciplinary field of network science, we follow a graph based approach to successfully extract indicators of activ...

Sinha, Tanmay; Li, Nan; Jermann, Patrick; Dillenbourg, Pierre

2014-01-01

275

PETcofold: predicting conserved interactions and structures of two multiple alignments of RNA sequences

DEFF Research Database (Denmark)

MOTIVATION: Predicting RNA-RNA interactions is essential for determining the function of putative non-coding RNAs. Existing methods for the prediction of interactions are all based on single sequences. Since comparative methods have already been useful in RNA structure determination, we assume that conserved RNA-RNA interactions also imply conserved function. Of these, we further assume that a non-negligible amount of the existing RNA-RNA interactions have also acquired compensating base changes throughout evolution. We implement a method, PETcofold, that can take covariance information in intra-molecular and inter-molecular base pairs into account to predict interactions and secondary structures of two multiple alignments of RNA sequences. RESULTS: PETcofold's ability to predict RNA-RNA interactions was evaluated on a carefully curated dataset of 32 bacterial small RNAs and their targets, which was manually extracted from the literature. For evaluation of both RNA-RNA interaction and structure prediction, we were able to extract only a few high-quality examples: one vertebrate small nucleolar RNA and four bacterial small RNAs. For these we show that the prediction can be improved by our comparative approach. Furthermore, PETcofold was evaluated on controlled data with phylogenetically simulated sequences enriched for covariance patterns at the interaction sites. We observed increased performance with increased amounts of covariance. AVAILABILITY: The program PETcofold is available as source code and can be downloaded from http://rth.dk/resources/petcofold. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Seemann, Ernst Stefan; Richter, Andreas S.

2011-01-01
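
One generic way to quantify the covariance information mentioned in the PETcofold abstract above is the mutual information between a column of one alignment and a column of the other. PETcofold's actual scoring is not described in the abstract, so this is only an illustrative sketch; the function name and both toy alignments are hypothetical, and row k of each alignment is assumed to come from the same organism.

```python
import math
from collections import Counter


def mutual_information(aln_a, i, aln_b, j):
    """Mutual information in bits between column i of aln_a and column j of aln_b."""
    n = len(aln_a)
    freq_a = Counter(seq[i] for seq in aln_a)
    freq_b = Counter(seq[j] for seq in aln_b)
    freq_ab = Counter((a[i], b[j]) for a, b in zip(aln_a, aln_b))
    return sum(
        (count / n) * math.log2((count / n) / ((freq_a[x] / n) * (freq_b[y] / n)))
        for (x, y), count in freq_ab.items()
    )


# Hypothetical small RNA and target alignments; column 2 of the first
# covaries with column 1 of the second (G-C, A-U, C-G, U-A).
srna = ["AGGU", "AGAU", "AGCU", "AGUU"]
target = ["GCAA", "GUAA", "GGAA", "GAAA"]

covarying = mutual_information(srna, 2, target, 1)
invariant = mutual_information(srna, 0, target, 0)
```

Perfectly covarying columns with four equally frequent pairs give 2 bits, while invariant columns give 0 bits, which is why fully conserved positions carry no covariance signal.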

276

RPI-Pred: predicting ncRNA-protein interaction using sequence and structural information  

Science.gov (United States)

RNA-protein complexes are essential in mediating important fundamental cellular processes, such as transport and localization. In particular, ncRNA-protein interactions play an important role in post-transcriptional gene regulation like mRNA localization, mRNA stabilization, poly-adenylation, splicing and translation. The experimental methods to solve the RNA-protein interaction prediction problem remain expensive and time-consuming. Here, we present RPI-Pred (RNA-protein interaction predictor), a new support-vector machine-based method, to predict protein-RNA interaction pairs, based on both the sequences and structures. The results show that RPI-Pred can correctly predict RNA-protein interaction pairs with ~94% prediction accuracy when using sequence and experimentally determined protein and RNA structures, and with ~83% when using sequences and predicted protein and RNA structures. Further, our proposed method RPI-Pred was superior to other existing ones by predicting more experimentally validated ncRNA-protein interaction pairs from different organisms. Motivated by the improved performance of RPI-Pred, we further applied our method for reliable construction of ncRNA-protein interaction networks. RPI-Pred is publicly available at: http://ctsb.is.wfubmc.edu/projects/rpi-pred. PMID:25609700

Suresh, V.; Liu, Liang; Adjeroh, Donald; Zhou, Xiaobo

2015-01-01

277

RPI-Pred: predicting ncRNA-protein interaction using sequence and structural information.  

Science.gov (United States)

RNA-protein complexes are essential in mediating important fundamental cellular processes, such as transport and localization. In particular, ncRNA-protein interactions play an important role in post-transcriptional gene regulation like mRNA localization, mRNA stabilization, poly-adenylation, splicing and translation. The experimental methods to solve the RNA-protein interaction prediction problem remain expensive and time-consuming. Here, we present RPI-Pred (RNA-protein interaction predictor), a new support-vector machine-based method, to predict protein-RNA interaction pairs, based on both the sequences and structures. The results show that RPI-Pred can correctly predict RNA-protein interaction pairs with ~94% prediction accuracy when using sequence and experimentally determined protein and RNA structures, and with ~83% when using sequences and predicted protein and RNA structures. Further, our proposed method RPI-Pred was superior to other existing ones by predicting more experimentally validated ncRNA-protein interaction pairs from different organisms. Motivated by the improved performance of RPI-Pred, we further applied our method for reliable construction of ncRNA-protein interaction networks. RPI-Pred is publicly available at: http://ctsb.is.wfubmc.edu/projects/rpi-pred. PMID:25609700

Suresh, V; Liu, Liang; Adjeroh, Donald; Zhou, Xiaobo

2015-02-18

278

Fast computational methods for predicting protein structure from primary amino acid sequence  

Science.gov (United States)

The present invention provides a method utilizing primary amino acid sequence of a protein, energy minimization, molecular dynamics and protein vibrational modes to predict three-dimensional structure of a protein. The present invention also determines possible intermediates in the protein folding pathway. The present invention has important applications to the design of novel drugs as well as protein engineering. The present invention predicts the three-dimensional structure of a protein independent of size of the protein, overcoming a significant limitation in the prior art.

Agarwal, Pratul Kumar (Knoxville, TN)

2011-07-19

279

Population genetic structure and historical demography of Oratosquilla oratoria revealed by mitochondrial DNA sequences.  

Science.gov (United States)

Genetic diversity, population genetic structure and the molecular phylogeographic pattern of the mantis shrimp Oratosquilla oratoria in the Bohai Sea and South China Sea were analyzed using mitochondrial DNA sequences. Nucleotide and haplotype diversities were 0.00409-0.00669 and 0.894-0.953, respectively. The neighbor-joining phylogenetic tree clustered two distinct lineages. Both the phylogenetic tree and the median-joining network showed a consistent genetic structure corresponding to geographical distribution. Mismatch distributions, negative neutrality tests and a "star-like" network supported a sudden population expansion event. The expansion time was estimated at about 44000 and 50000 years ago. PMID:23516902
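The haplotype diversity quoted above is a simple summary statistic, and a small sketch may make it concrete. The abstract does not say which estimator was used; the code below assumes Nei's unbiased estimator, h = n/(n-1) * (1 - sum of squared haplotype frequencies), and the haplotype labels are made up for illustration.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Nei's unbiased haplotype diversity for a sample of haplotype labels.

    Assumption: this estimator is used only for illustration; the abstract
    does not state how its values (0.894-0.953) were computed.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two sampled sequences")
    counts = Counter(haplotypes)
    # Sum of squared sample frequencies of each distinct haplotype.
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


# Hypothetical sample: four individuals carrying three distinct haplotypes.
sample = ["H1", "H1", "H2", "H3"]
print(round(haplotype_diversity(sample), 4))
```

Values close to 1, such as those reported in the abstract, indicate that most sampled individuals carry different haplotypes.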

Zhang, D; Ding, Ge; Ge, B; Zhang, H; Tang, B

2012-12-01

280

The ITS2 Database III—sequences and structures for phylogeny  

OpenAIRE

The internal transcribed spacer 2 (ITS2) is a widely used phylogenetic marker. In the past, it has mainly been used for species level classifications. Nowadays, a wider applicability becomes apparent. Here, the conserved structure of the RNA molecule plays a vital role. We have developed the ITS2 Database (http://its2.bioapps.biozentrum.uni-wuerzburg.de) which holds information about sequence, structure and taxonomic classification of all ITS2 in GenBank. In the new version, we use Hidden Mar...

Koetschan, Christian; Förster, Frank; Keller, Alexander; Schleicher, Tina; Ruderisch, Benjamin; Schwarz, Roland; Müller, Tobias; Wolf, Matthias; Schultz, Jörg

2009-01-01

281

Nucleotide sequence and structural features of the group III citrus viroids.  

Science.gov (United States)

The nucleotide sequence and secondary structure of two representative variants from the Group III citrus viroids, CVd-IIIa (297 bases) and CVd-IIIb (294 bases), were determined. The variants are related to the apple scar skin viroid (ASSVd) family. Although they are smaller than any of the ASSVd-related viroids, the central conserved region as well as most of the terminal conserved region of ASSVd is retained. The rod-like structural configuration (characteristic of ASSVd) of the variants as predicted by minimum free energy analysis is presented. PMID:7996150

Rakowski, A G; Szychowski, J A; Avena, Z S; Semancik, J S

1994-12-01

282

Pattern matching through Chaos Game Representation: bridging numerical and discrete data structures for biological sequence analysis  

Directory of Open Access Journals (Sweden)

Full Text Available Abstract Background Chaos Game Representation (CGR) is an iterated function that bijectively maps discrete sequences into a continuous domain. As a result, discrete sequences can be the object of statistical and topological analyses otherwise reserved for numerical systems. Characteristically, CGR coordinates of substrings sharing an L-long suffix will be located within a distance of 2^(-L) of each other. In the two decades since its original proposal, CGR has been generalized beyond its original focus on genomic sequences and has been successfully applied to a wide range of problems in bioinformatics. This report explores the possibility that it can be further extended to approach algorithms that rely on discrete, graph-based representations. Results The exploratory analysis described here consisted of selecting foundational string problems and refactoring them using CGR-based algorithms. We found that CGR can take the role of suffix trees and emulate sophisticated string algorithms, efficiently solving exact and approximate string matching problems such as finding all palindromes and tandem repeats, and matching with mismatches. The common feature of these problems is that they use longest common extension (LCE) queries as subtasks of their procedures, which we show to have a constant time solution with CGR. Additionally, we show that CGR can be used as a rolling hash function within the Rabin-Karp algorithm. Conclusions The analysis of biological sequences relies on algorithmic foundations facing mounting challenges, both logistic (performance) and analytical (lack of a unifying mathematical framework). CGR is found to provide the latter and to promise the former: graph-based data structures for sequence analysis operations are entailed by numerical-based data structures produced by CGR maps, providing a unifying analytical framework for a diversity of pattern matching problems.
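The suffix property stated in this abstract can be illustrated with a minimal sketch of the classic two-dimensional CGR for DNA. The corner assignment, the starting point (0.5, 0.5), the use of the max-norm as the distance, and the example sequences are all assumptions made for this illustration and are not taken from the paper.

```python
# Assumed corner layout of the unit square for the four nucleotides.
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr(seq, start=(0.5, 0.5)):
    """Return the CGR coordinate reached after reading the whole sequence.

    Each symbol moves the current point halfway towards that symbol's corner.
    """
    x, y = start
    for base in seq:
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
    return x, y


def max_norm_distance(p, q):
    """Largest per-coordinate difference between two points."""
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))


suffix = "GATTACA"  # shared suffix of length L = 7
a = cgr("ACGTTGCA" + suffix)
b = cgr("TTTT" + suffix)
# Every shared symbol halves the gap, so the points lie within 2^(-L).
print(max_norm_distance(a, b) <= 2 ** -len(suffix))
```

Because each step halves the gap between two trajectories, the final symbols dominate the coordinate. That is the intuition behind using CGR coordinates for suffix comparisons.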

Vinga Susana

2012-05-01

283

Rat mammary-gland transferrin: nucleotide sequence, phylogenetic analysis and glycan structure.  

Science.gov (United States)

The complete cDNA for rat mammary-gland transferrin (Tf) has been sequenced and also the native protein isolated from milk in order to analyse the structure of the main glycan variants present. A lactating-rat mammary-gland cDNA library in lambda gt10 was screened with a partial cDNA copy of rat liver Tf and subsequently rescreened with 5' fragments of the longest clones. This produced a 2275 bp insert coding for an open reading frame of 695 amino acid residues. This includes a 19-amino acid signal sequence and the mature protein containing 676 amino acids and one N-glycosylation site in the C-terminal domain at residue 490. Phylogenetic analysis was carried out using 14 translated Tf nucleotide sequences, and the derived evolutionary tree shows that at least three gene duplication events have occurred during Tf evolution, one of which generated the N- and C-terminal domains and occurred before separation of arthropods and chordates. The two halves of human melanotransferrin are more similar to each other than to any other sequence, which contrasts with the pattern shown by the remaining sequences. Native rat milk Tf is separated into four bands on native PAGE that differ only in their sialic acid content: one biantennary glycan is present containing either no sialic acid residues or up to three. The complete structures of the two major variants were determined by methylation, m.s. and 400 MHz 1H-n.m.r. spectroscopy. They contain either one or two neuraminic acid residues (alpha 2-->6)-linked to galactose in conventional biantennary N-acetyl-lactosamine-type glycans. Most contain fucose (alpha 1-->6)-linked to the terminal non-reducing N-acetylglucosamine. PMID:7717992

Escrivá, H; Pierce, A; Coddeville, B; González, F; Benaissa, M; Léger, D; Wieruszeski, J M; Spik, G; Pamblanco, M

1995-04-01

284

Structural studies of α-bungarotoxin. 1. Sequence-specific 1H NMR resonance assignments  

Energy Technology Data Exchange (ETDEWEB)

The authors report the complete sequence-specific assignment of the backbone resonances and most of the side-chain resonances in the 1H NMR spectrum of α-bungarotoxin by two-dimensional NMR. Problems with resonance overlap were resolved with the assistance of the HRNOESY experiment described in an accompanying paper. Significant differences exist between the solution structure described here and the crystal structure of α-bungarotoxin, on the basis of the proton to proton distances obtained by nuclear Overhauser enhancement spectroscopy (NOESY) and the corresponding distances from the X-ray crystal structure. These differences include a larger β-sheet in solution and a different orientation of the invariant tryptophan, Trp-28, making the solution structure more consistent with the crystal structure of the homologous neurotoxin α-cobratoxin. Four errors in the order of the amino acids in the primary sequence were indicated by the NMR data. These errors were confirmed by chemical means, as described in an accompanying paper.

Basus, V.J.; Billeter, M.; Love, R.A.; Stroud, R.M.; Kuntz, I.D.

1988-04-19

285

Ribosomal DNA sequence heterogeneity reflects intraspecies phylogenies and predicts genome structure in two contrasting yeast species.  

Science.gov (United States)

The ribosomal RNA encapsulates a wealth of evolutionary information, including genetic variation that can be used to discriminate between organisms at a wide range of taxonomic levels. For example, the prokaryotic 16S rDNA sequence is very widely used both in phylogenetic studies and as a marker in metagenomic surveys and the internal transcribed spacer region, frequently used in plant phylogenetics, is now recognized as a fungal DNA barcode. However, this widespread use does not escape criticism, principally due to issues such as difficulties in classification of paralogous versus orthologous rDNA units and intragenomic variation, both of which may be significant barriers to accurate phylogenetic inference. We recently analyzed data sets from the Saccharomyces Genome Resequencing Project, characterizing rDNA sequence variation within multiple strains of the baker's yeast Saccharomyces cerevisiae and its nearest wild relative Saccharomyces paradoxus in unprecedented detail. Notably, both species possess single locus rDNA systems. Here, we use these new variation datasets to assess whether a more detailed characterization of the rDNA locus can alleviate the second of these phylogenetic issues, sequence heterogeneity, while controlling for the first. We demonstrate that a strong phylogenetic signal exists within both datasets and illustrate how they can be used, with existing methodology, to estimate intraspecies phylogenies of yeast strains consistent with those derived from whole-genome approaches. We also describe the use of partial Single Nucleotide Polymorphisms, a type of sequence variation found only in repetitive genomic regions, in identifying key evolutionary features such as genome hybridization events and show their consistency with whole-genome Structure analyses. 
We conclude that our approach can transform rDNA sequence heterogeneity from a problem to a useful source of evolutionary information, enabling the estimation of highly accurate phylogenies of closely related organisms, and discuss how it could be extended to future studies of multilocus rDNA systems. [concerted evolution; genome hybridisation; phylogenetic analysis; ribosomal DNA; whole genome sequencing; yeast]. PMID:24682414

West, Claire; James, Stephen A; Davey, Robert P; Dicks, Jo; Roberts, Ian N

2014-07-01

286

Structural Constraints on the Covariance Matrix Derived from Multiple Aligned Protein Sequences  

Science.gov (United States)

Residue contact predictions were calculated based on the mutual information observed between pairs of positions in large multiple protein sequence alignments. Where previously only the statistical properties of these data have been considered important, we introduce new measures to impose constraints that make the contact map more consistent with a three dimensional structure. These included global (bulk) properties and local secondary structure properties. The latter allowed the contact constraints to be employed at the level of filtering pairs of secondary structure contacts which led to a more efficient (lower-level) implementation in the PLATO structure prediction server. Where previously the measure of success with this method had been whether the correct fold was predicted in the top 10 ranked models, with the current implementation, our summary statistic is the number of correct folds included in the top 10 models — which is on average over 50 percent. PMID:22194819
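The mutual information between pairs of alignment positions, on which the contact predictions are based, can be sketched in a few lines. This is only an illustration of the basic statistic. The toy alignment is invented, and the sketch includes no sequence weighting, pseudocounts, or the structural constraints the abstract describes.

```python
from collections import Counter
from math import log2


def mutual_information(alignment, i, j):
    """Mutual information (in bits) between columns i and j of an alignment.

    `alignment` is a list of equal-length aligned sequences. Frequencies are
    plain sample frequencies; this is an assumed, simplified estimator.
    """
    n = len(alignment)
    col_i = Counter(seq[i] for seq in alignment)
    col_j = Counter(seq[j] for seq in alignment)
    pairs = Counter((seq[i], seq[j]) for seq in alignment)
    mi = 0.0
    for (a, b), count in pairs.items():
        p_ab = count / n
        mi += p_ab * log2(p_ab / ((col_i[a] / n) * (col_j[b] / n)))
    return mi


# Hypothetical alignment: columns 0 and 1 covary, column 2 is invariant.
toy = ["AKL", "AKL", "GEL", "GEL"]
print(mutual_information(toy, 0, 1))  # covarying pair
print(mutual_information(toy, 0, 2))  # invariant column carries no signal
```

High values flag column pairs that change together across the alignment, which is the raw signal that is then filtered for consistency with a three-dimensional structure.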

Taylor, William R.; Sadowski, Michael I.

2011-01-01

287

RNAG: a new Gibbs sampler for predicting RNA secondary structure for unaligned sequences  

Science.gov (United States)

Motivation: RNA secondary structure plays an important role in the function of many RNAs, and structural features are often key to their interaction with other cellular components. Thus, there has been considerable interest in the prediction of secondary structures for RNA families. In this article, we present a new global structural alignment algorithm, RNAG, to predict consensus secondary structures for unaligned sequences. It uses a blocked Gibbs sampling algorithm, which has a theoretical advantage in convergence time. This algorithm iteratively samples from the conditional probability distributions P(Structure | Alignment) and P(Alignment | Structure). Not surprisingly, there is considerable uncertainty in the high-dimensional space of this difficult problem, which has so far received limited attention in this field. We show how the samples drawn from this algorithm can be used to more fully characterize the posterior space and to assess the uncertainty of predictions. Results: Our analysis of three publicly available datasets showed a substantial improvement in RNA structure prediction by RNAG over extant prediction methods. Additionally, our analysis of 17 RNA families showed that the RNAG sampled structures were generally compact around their ensemble centroids, and at least 11 families had at least two well-separated clusters of predicted structures. In general, the distance between a reference structure and our predicted structure was large relative to the variation among structures within an ensemble. Availability: The Perl implementation of the RNAG algorithm and the data necessary to reproduce the results described in Sections 3.1 and 3.2 are available at http://ccmbweb.ccv.brown.edu/rnag.html Contact: charles_lawrence@brown.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:21788211

Wei, Donglai; Alpert, Lauren V.; Lawrence, Charles E.

2011-01-01

288

Structure and thermodynamic properties of (C5H12N)CuBr3: a new weakly coupled antiferromagnetic spin-1/2 chain complex lying in the 1D-3D dimensional cross-over regime.  

Science.gov (United States)

Single crystals of a metal organic complex (C5H12N)CuBr3 (C5H12N = piperidinium, pipH for short) have been synthesized, and the structure was determined by single-crystal X-ray diffraction. (pipH)CuBr3 crystallizes in the monoclinic group C2/c. Edge-sharing CuBr5 units link to form zigzag chains along the c axis, and the neighboring Cu(II) ions with spin-1/2 are bridged by bibromide ions. Magnetic susceptibility data down to 1.8 K can be well fitted by the Bonner-Fisher formula for the antiferromagnetic spin-1/2 chain, giving the intrachain magnetic coupling constant J ≈ -17 K. At zero field, (pipH)CuBr3 shows three-dimensional (3D) order below TN = 1.68 K. Calculated by the mean-field theory, the interchain coupling constant J' = -0.91 K is obtained and the ordered magnetic moment m0 is about 0.23 μB. This value of m0 makes (pipH)CuBr3 a rare compound suitable to study the 1D-3D dimensional cross-over problem in magnetism, since both 3D order and one-dimensional (1D) quantum fluctuations are prominent. In addition, specific heat measurements reveal two successive magnetic transitions with lowering temperature when external field μ0H ≥ 3 T is applied along the a' axis. The μ0H-T phase diagram of (pipH)CuBr3 is roughly constructed. PMID:24617285

Pan, Bingying; Wang, Yang; Zhang, Lijuan; Li, Shiyan

2014-04-01

289

The zebrafish Fgf-3 gene: cDNA sequence, transcript structure and genomic organization.  

Science.gov (United States)

We report the isolation and characterization of genomic and cDNA clones encoding zebrafish fibroblast growth factor 3 (FGF3). An initial cDNA clone was generated by PCR amplification using degenerate oligo primers corresponding to a conserved region of protein found in the mouse and human homologues. Screening a cDNA library made from 30-33-h-old zebrafish embryos with this PCR product led to the isolation of two cDNA clones. Sequence analysis of the longest cDNA insert (1810 bp) revealed a 256-amino-acid (aa) ORF. The central region, composed of approx. 155 aa, shares 78% identity with the analogous region of Xenopus laevis FGF3 and 72% identity with the product of the more distantly related human gene. However, the N- and C-terminal domains of zebrafish FGF3 are very different from those of other known homologues. The cDNA was used as a probe on genomic DNA to create a physical map of the locus and to isolate a genomic clone encompassing the entire coding region and 5' sequences. DNA sequencing and RNase protection analyses indicate that zebrafish Fgf-3 (ZFgf-3) is structurally analogous to the mouse gene and regulated through two different promoters. The transcription start point of the proximal promoter aligns to that of mouse promoter P3 and lies within a conserved region of sequence. PMID:8654946

Kiefer, P; Strähle, U; Dickson, C

1996-02-12

290

Cloning, sequence analysis and crystal structure determination of a miraculin-like protein from Murraya koenigii.  

Science.gov (United States)

The purification of a 21.4 kDa protein with trypsin inhibitory activity from seeds of Murraya koenigii has been reported earlier. The present study, based on the amino acid sequence deduced from both cDNA and genomic DNA, establishes it to be a miraculin-like protein and provides its crystal structure at 2.9 Å resolution. The mature protein consists of 190 amino acid residues with seven cysteines arranged in three disulfide bridges. In phylogenetic analyses, the amino acid sequence showed maximum homology to, and formed a distinct cluster with, miraculin-like proteins, which are members of the soybean Kunitz superfamily. The major differences in sequence were observed at primary and secondary specificity sites in the reactive loop when compared to classical Kunitz family members. The crystal structure analysis showed that the protein is made of twelve antiparallel beta-strands, loops connecting the beta-strands and two short helices. Despite a similar overall fold, it showed significant differences from classical Kunitz trypsin inhibitors. PMID:19914199

Gahloth, Deepankar; Selvakumar, Purushotham; Shee, Chandan; Kumar, Pravindra; Sharma, Ashwani Kumar

2010-02-01

291

Evolutionary conservation of sequence and secondary structures in CRISPR repeats  

Energy Technology Data Exchange (ETDEWEB)

Clustered Regularly Interspaced Short Palindromic Repeats (CRISPRs) are a novel class of direct repeats, separated by unique spacer sequences of similar length, that are present in ~40% of bacterial and all archaeal genomes analyzed to date. More than 40 gene families, called CRISPR-associated sequences (CAS), appear in conjunction with these repeats and are thought to be involved in the propagation and functioning of CRISPRs. It has been proposed that the CRISPR/CAS system samples, maintains a record of, and inactivates invasive DNA that the cell has encountered, and therefore constitutes a prokaryotic analog of an immune system. Here we analyze CRISPR repeats identified in 195 microbial genomes and show that they can be organized into multiple clusters based on sequence similarity. All individual repeats in any given cluster were inferred to form a characteristic RNA secondary structure, ranging from non-existent to pronounced. Stable secondary structures included G:U base pairs and exhibited multiple compensatory base changes in the stem region, indicating evolutionary conservation and functional importance. We also show that the repeat-based classification corresponds to, and expands upon, a previously reported CAS gene-based classification including specific relationships between CRISPR and CAS subtypes.

Kunin, Victor; Sorek, Rotem; Hugenholtz, Philip

2006-09-01

292

Inferring Aftershock Sequence Properties and Tectonic Structure Using Empirical Signal Detectors  

Science.gov (United States)

Seismotectonic studies of the 2008 Storfjorden aftershock sequence were limited to data acquired by the permanent, but sparse, regional seismic network in the Svalbard archipelago. Storfjorden's remote location and harsh polar environment inhibited deployment of temporary seismometers that would have improved observations of sequence events. The lack of good station coverage prevented the detection and computation of hypocenter locations of many low magnitude events (mb time distribution was not captured. In this study, an autonomous event detection and clustering framework is employed to build a more complete catalog of Storfjorden events using data from the Spitsbergen (SPITS) array. The new catalog allows the spatiotemporal distribution of seismicity within the fjord to be studied in greater detail. Information regarding the location of active event clusters provides a means of inferring the tectonic structure within the fault zone. The distribution of active clusters and moment tensor solutions for the Storfjorden sequence suggests there are at least two different structures within the fjord: a NE-SW trending linear feature with oblique-normal to strike-slip faulting and E-W trending normal faults.

Junek, William N.; Kværna, Tormod; Pirli, Myrto; Schweitzer, Johannes; Harris, David B.; Dodge, Douglas A.; Woods, Mark T.

2015-02-01

293

Comparative Analysis of Structure and Sequences of Oryza sativa Superoxide Dismutase  

Directory of Open Access Journals (Sweden)

Full Text Available One of the major classes of antioxidant enzymes, which protect the cellular and subcellular components against harmful reactive oxygen species (ROS), is superoxide dismutase (SOD). SODs play a pivotal role in scavenging highly reactive free oxygen radicals and protecting cells from toxic effects. In Oryza sativa, three types of SODs are available based on their metal content, viz. Cu-Zn SOD, Mn SOD and Fe SOD. In the present study, attempts were made to critically assess the structure and phylogenetic relationship among Oryza sativa SODs. The sequence similarity search using local BLAST shows that Mn SODs and Fe SODs have a greater degree of similarity to each other than to Cu-Zn SODs. The multiple alignment reveals that seven amino acids are totally conserved. The secondary structure shows that Mn SODs and Fe SODs have similar helices, sheets, turns and coils compared with those of Cu-Zn SODs. The comparative analysis also displayed greater resemblance in the primary, secondary and tertiary structures of Fe SODs and Mn SODs. Comparison between the structure and sequence analyses reveals that Mn SOD and Fe SOD are closely related, whereas Cu-Zn SOD evolved independently.

Aiyar Balasubramanian

2012-09-01

294

Role of sequence and membrane composition in structure of transmembrane domain of Amyloid Precursor Protein  

Science.gov (United States)

Aggregation of proteins of known sequence is linked to a variety of neurodegenerative disorders. The amyloid β (Aβ) protein associated with Alzheimer's Disease (AD) is derived from cleavage of the 99 amino acid C-terminal fragment of Amyloid Precursor Protein (APP-C99) by γ-secretase. Certain familial mutations of APP-C99 have been shown to lead to altered production of Aβ protein and the early onset of AD. We describe simulation studies exploring the structure of APP-C99 in micelle and membrane environments. Our studies explore how changes in sequence and membrane composition influence (1) the structure of monomeric APP-C99 and (2) APP-C99 homodimer structure and stability. Comparison of simulation results with recent NMR studies of APP-C99 monomers and dimers in micelle and bicelle environments provides insight into how critical aspects of APP-C99 structure and dimerization correlate with secretase processing, an essential component of the Aβ protein aggregation pathway and AD.

Straub, John

2013-03-01

295

Combining classifiers for improved classification of proteins from sequence or structure  

Directory of Open Access Journals (Sweden)

Full Text Available Abstract Background Predicting a protein's structural or functional class from its amino acid sequence or structure is a fundamental problem in computational biology. Recently, there has been considerable interest in using discriminative learning algorithms, in particular support vector machines (SVMs), for classification of proteins. However, because sufficiently many positive examples are required to train such classifiers, all SVM-based methods are hampered by limited coverage. Results In this study, we develop a hybrid machine learning approach for classifying proteins, and we apply the method to the problem of assigning proteins to structural categories based on their sequences or their 3D structures. The method combines a full-coverage but lower accuracy nearest neighbor method with higher accuracy but reduced coverage multiclass SVMs to produce a full coverage classifier with overall improved accuracy. The hybrid approach is based on the simple idea of "punting" from one method to another using a learned threshold. Conclusion In cross-validated experiments on the SCOP hierarchy, the hybrid methods consistently outperform the individual component methods at all levels of coverage. Code and data sets are available at http://noble.gs.washington.edu/proj/sabretooth
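The "punting" idea can be sketched as a small decision rule. The abstract does not specify the direction of the hand-off, the score type, or how the threshold is learned. The sketch below assumes the SVM answers first and punts to the nearest neighbor label when its best score falls below a given threshold. The function name, class labels, and scores are hypothetical.

```python
def hybrid_predict(svm_scores, nn_label, threshold):
    """Return (label, method) using a threshold-based punt.

    svm_scores: dict mapping class label -> SVM decision score (assumed to be
                available only for classes with enough training examples).
    nn_label:   label proposed by the full-coverage nearest neighbor method.
    threshold:  score below which the SVM prediction is not trusted; in
                practice this would be tuned on held-out data.
    """
    if svm_scores:
        best = max(svm_scores, key=svm_scores.get)
        if svm_scores[best] >= threshold:
            return best, "svm"
    # Punt: fall back to the lower accuracy but full-coverage method.
    return nn_label, "nearest-neighbor"


# Hypothetical SCOP-like class labels and scores.
print(hybrid_predict({"a.1": 1.2, "b.3": -0.4}, "c.2", threshold=0.5))
print(hybrid_predict({"a.1": 0.1, "b.3": -0.4}, "c.2", threshold=0.5))
```

The design choice is that every query receives a label, while confident SVM predictions take precedence where they exist.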

Leslie Christina S

2008-09-01

296

Indel PDB: A database of structural insertions and deletions derived from sequence alignments of closely related proteins  

Directory of Open Access Journals (Sweden)

Full Text Available Abstract Background Insertions and deletions (indels) represent a common type of sequence variation that is less studied and poses many important biological questions. Recent research has shown that the presence of sizable indels in protein sequences may be indicative of protein essentiality and their role in protein interaction networks. Examples of utilization of indels for structure-based drug design have also been recently demonstrated. Nonetheless, many structural and functional characteristics of indels remain less researched or unknown. Description We have created a web-based resource, Indel PDB, representing a structural database of insertions/deletions identified from the sequence alignments of highly similar proteins found in the Protein Data Bank (PDB). Indel PDB utilized large amounts of available structural information to characterize 1-, 2- and 3-dimensional features of indel sites. Indel PDB contains 117,266 non-redundant indel sites extracted from 11,294 indel-containing proteins. Unlike loop databases, Indel PDB features more indel sequences with secondary structures including alpha-helices and beta-sheets in addition to loops. The insertion fragments have been characterized by their sequences, lengths, locations, secondary structure composition, solvent accessibility, protein domain association and three dimensional structures. Conclusion By utilizing the data available in Indel PDB, we have studied and presented here several sequence and structural features of indels. We anticipate that Indel PDB will not only enable future functional studies of indels, but will also assist protein modeling efforts and identification of indel-directed drug binding sites.

Cherkasov Artem

2008-06-01

297

A memory-efficient dynamic programming algorithm for optimal alignment of a sequence to an RNA secondary structure  

OpenAIRE

Abstract Background Covariance models (CMs) are probabilistic models of RNA secondary structure, analogous to profile hidden Markov models of linear sequence. The dynamic programming algorithm for aligning a CM to an RNA sequence of length N is O(N^3) in memory. This is only practical for small RNAs. Results I describe a divide and conquer variant of the alignment algorithm that is analogous to memory-efficient Myers/Miller dynamic programming algorithms for linear sequence alignment. The new ...

Eddy Sean R

2002-01-01

298

Enzyme-Free Translation of DNA into Sequence-Defined Synthetic Polymers Structurally Unrelated to Nucleic Acids  

OpenAIRE

The translation of DNA sequences into corresponding biopolymers enables the production, function, and evolution of the macromolecules of life. In contrast, methods to generate sequence-defined synthetic polymers with similar levels of control have remained elusive. Here we report the development of a DNA-templated translation system that enables the enzyme-free translation of DNA templates into sequence-defined synthetic polymers that have no necessary structural relationship with nucleic aci...

Niu, Jia; Hili, Ryan; Liu, David R.

2013-01-01

299

1D oxide nanostructures from chemical solutions.  

Science.gov (United States)

Nanotechnology has motivated a tremendous effort in the synthesis approaches to grow free standing or hierarchical nanomaterials such as nanowires and nanorods. Bottom-up approaches based on chemistry are an important approach to produce nanomaterials, and here the concepts of growing oxide 1D nanostructures from chemical solutions are reviewed. The thermodynamic and kinetic aspects of the nucleation and growth of oxide compounds in solutions are presented with emphasis on hydrothermal and molten salt synthesis. The importance of solubility of precursors, the precursor chemistry, role of organic additives as well as the chemical complexity and dimensionality and symmetry of the crystal structure of the compound grown are highlighted. PMID:24129769

Einarsrud, Mari-Ann; Grande, Tor

2014-04-01

300

Sequence composition and environment effects on residue fluctuations in protein structures  

Science.gov (United States)

Structure fluctuations in proteins affect a broad range of cell phenomena, including stability of proteins and their fragments, allosteric transitions, and energy transfer. This study presents a statistical-thermodynamic analysis of relationship between the sequence composition and the distribution of residue fluctuations in protein-protein complexes. A one-node-per-residue elastic network model accounting for the nonhomogeneous protein mass distribution and the interatomic interactions through the renormalized inter-residue potential is developed. Two factors, a protein mass distribution and a residue environment, were found to determine the scale of residue fluctuations. Surface residues undergo larger fluctuations than core residues in agreement with experimental observations. Ranking residues over the normalized scale of fluctuations yields a distinct classification of amino acids into three groups: (i) highly fluctuating-Gly, Ala, Ser, Pro, and Asp, (ii) moderately fluctuating-Thr, Asn, Gln, Lys, Glu, Arg, Val, and Cys, and (iii) weakly fluctuating-Ile, Leu, Met, Phe, Tyr, Trp, and His. The structural instability in proteins possibly relates to the high content of the highly fluctuating residues and a deficiency of the weakly fluctuating residues in irregular secondary structure elements (loops), chameleon sequences, and disordered proteins. Strong correlation between residue fluctuations and the sequence composition of protein loops supports this hypothesis. Comparing fluctuations of binding site residues (interface residues) with other surface residues shows that, on average, the interface is more rigid than the rest of the protein surface and Gly, Ala, Ser, Cys, Leu, and Trp have a propensity to form more stable docking patches on the interface. The findings have broad implications for understanding mechanisms of protein association and stability of protein structures.

Ruvinsky, Anatoly M.; Vakser, Ilya A.

2010-10-01

301

Sequence-specific 1H NMR assignments and secondary structure of eglin c  

International Nuclear Information System (INIS)

Sequence-specific nuclear magnetic resonance assignments were obtained for eglin c, a polypeptide inhibitor of the granulocytic proteinases elastase and cathepsin G and some other proteinases. The protein consists of a single polypeptide chain of 70 residues. All proton resonances were assigned except for some labile protons of arginine side chains. The patterns of nuclear Overhauser enhancements and coupling constants and the observation of slow hydrogen exchange were used to characterize the secondary structure of the protein. The results indicate that the solution structure of the free inhibitor is very similar to the crystal structure reported for the same protein in the complex with subtilisin Carlsberg. However, a part of the binding loop seems to have a significantly different conformation in the free protein

302

Structural growth sequences and electronic properties of gold clusters: Highly symmetric tubelike cages  

International Nuclear Information System (INIS)

The structural growth sequences and electronic properties of Aun (n=14+6m and m=0, 1, 2, 3, 4, 5, 6, and 10) clusters have been investigated using the DMol3 DFT package. The structures of Aun (n=20, 26, 32, 38, 44, 50, and 74) are obtained in turn by directly adding a gold ring of six atoms to the tubelike configuration of the Au14 cluster. These tubelike gold clusters are all highly symmetric cages. Their average atomic coordination numbers, nearest-neighbor distances, and average tube radii are very similar. The average binding energies of Aun clusters increase substantially with size n. Au14, Au26, and Au44 have larger energy gaps and ionization potentials than their neighboring clusters. Among the studied clusters, Au74 has the highest average energy and ionization potential and the lowest electron affinity, which correspond to its highest structural and chemical stability, respectively.
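The size rule quoted in this abstract, n = 14 + 6m, can be checked with a few lines of Python. This is only an illustrative sketch: the function name `tube_cluster_sizes` and its parameters are hypothetical and are not part of the study, which relied on the DMol3 DFT package.

```python
def tube_cluster_sizes(ms=(0, 1, 2, 3, 4, 5, 6, 10), base=14, ring=6):
    """Return cluster sizes n = base + ring * m for each m in ms.

    base=14 is the Au14 starting cage and ring=6 is the six-atom gold
    ring added at each growth step, as stated in the abstract.
    """
    return [base + ring * m for m in ms]


sizes = tube_cluster_sizes()
print(sizes)  # [14, 20, 26, 32, 38, 44, 50, 74]
```

The output reproduces the cluster sizes listed in the abstract (Au14 plus Au20 through Au74).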

303

The PETfold and PETcofold web servers for intra- and intermolecular structures of multiple RNA sequences  

DEFF Research Database (Denmark)

The function of non-coding RNA genes largely depends on their secondary structure and the interaction with other molecules. Thus, an accurate prediction of secondary structure and RNA-RNA interaction is essential for the understanding of biological roles and pathways associated with a specific RNA gene. We present web servers to analyze multiple RNA sequences for common RNA structure and for RNA interaction sites. The web servers are based on the recent PET (Probabilistic Evolutionary and Thermodynamic) models PETfold and PETcofold, but add user friendly features ranging from a graphical layer to interactive usage of the predictors. Additionally, the web servers provide direct access to annotated RNA alignments, such as the Rfam 10.0 database and multiple alignments of 16 vertebrate genomes with human. The web servers are freely available at: http://rth.dk/resources/petfold/

Seemann, Ernst Stefan; Menzel, Karl Peter

2011-01-01

304

Sequence composition and environment effects on residue fluctuations in protein structures  

CERN Document Server

The spectrum and scale of fluctuations in protein structures affect the range of cell phenomena, including stability of protein structures or their fragments, allosteric transitions and energy transfer. The study presents a statistical-thermodynamic analysis of relationship between the sequence composition and the distribution of residue fluctuations in protein-protein complexes. A one-node-per residue elastic network model accounting for the nonhomogeneous protein mass distribution and the inter-atomic interactions through the renormalized inter-residue potential is developed. Two factors, a protein mass distribution and a residue environment, were found to determine the scale of residue fluctuations. Surface residues undergo larger fluctuations than core residues, showing agreement with experimental observations. Ranking residues over the normalized scale of fluctuations yields a distinct classification of amino acids into three groups. The structural instability in proteins possibly relates to the high con...

Ruvinsky, Anatoly M

2009-01-01

305

Large scale identification and categorization of protein sequences using structured logistic regression  

DEFF Research Database (Denmark)

Abstract Background Structured Logistic Regression (SLR) is a newly developed machine learning tool first proposed in the context of text categorization. Current availability of extensive protein sequence databases calls for an automated method to reliably classify sequences and SLR seems well-suited for this task. The classification of P-type ATPases, a large family of ATP-driven membrane pumps transporting essential cations, was selected as a test-case that would generate important biological information as well as provide a proof-of-concept for the application of SLR to a large scale bioinformatics problem. Results Using SLR, we have built classifiers to identify and automatically categorize P-type ATPases into one of 11 pre-defined classes. The SLR-classifiers are compared to a Hidden Markov Model approach and shown to be highly accurate and scalable. Representing the bulk of currently known sequences, we analysed 9.3 million sequences in the UniProtKB and attempted to classify a large number of P-type ATPases. To examine the distribution of pumps on organisms, we also applied SLR to 1,123 complete genomes from the Entrez genome database. Finally, we analysed the predicted membrane topology of the identified P-type ATPases. Conclusions Using the SLR-based classification tool we are able to run a large scale study of P-type ATPases. This study provides proof-of-concept for the application of SLR to a bioinformatics problem and the analysis of P-type ATPases pinpoints new and interesting targets for further biochemical characterization and structural analysis.

Pedersen, Bjørn Panella; Ifrim, Georgiana

2014-01-01

306

TFpredict and SABINE: sequence-based prediction of structural and functional characteristics of transcription factors.  

Science.gov (United States)

One of the key mechanisms of transcriptional control are the specific connections between transcription factors (TF) and cis-regulatory elements in gene promoters. The elucidation of these specific protein-DNA interactions is crucial to gain insights into the complex regulatory mechanisms and networks underlying the adaptation of organisms to dynamically changing environmental conditions. As experimental techniques for determining TF binding sites are expensive and mostly performed for selected TFs only, accurate computational approaches are needed to analyze transcriptional regulation in eukaryotes on a genome-wide level. We implemented a four-step classification workflow which for a given protein sequence (1) discriminates TFs from other proteins, (2) determines the structural superclass of TFs, (3) identifies the DNA-binding domains of TFs and (4) predicts their cis-acting DNA motif. While existing tools were extended and adapted for performing the latter two prediction steps, the first two steps are based on a novel numeric sequence representation which allows for combining existing knowledge from a BLAST scan with robust machine learning-based classification. By evaluation on a set of experimentally confirmed TFs and non-TFs, we demonstrate that our new protein sequence representation facilitates more reliable identification and structural classification of TFs than previously proposed sequence-derived features. The algorithms underlying our proposed methodology are implemented in the two complementary tools TFpredict and SABINE. The online and stand-alone versions of TFpredict and SABINE are freely available to academics at http://www.cogsys.cs.uni-tuebingen.de/software/TFpredict/ and http://www.cogsys.cs.uni-tuebingen.de/software/SABINE/. PMID:24349230

Eichner, Johannes; Topf, Florian; Dräger, Andreas; Wrzodek, Clemens; Wanke, Dierk; Zell, Andreas

2013-01-01

307

Synthesis, structures and magnetic properties of two 3D 3,4-pyridinedicarboxylate bridged manganese(II) coordination polymers incorporating 1D helical Mn(carboxylate)2 chain or Mn3(OH)2 chain  

International Nuclear Information System (INIS)

The hydrothermal reactions of MnCl2·4H2O, 3,4-pyridinedicarboxylic acid (3,4-pydaH2) and triethylamine in aqueous medium yield two 3D metal-organic hybrid materials, [Mn(3,4-pyda)] (1) and [Mn3(OH)2(3,4-pyda)2(H2O)2] (2), respectively. In both complexes, each 3,4-pyda acts as a pentadentate ligand to connect five Mn(II) atoms via the pyridyl group and the two μ2-carboxylate groups (one in syn,anti-mode and one in syn,syn-mode for 1 and both in syn,anti-mode for 2). Complex 1 possesses an interesting 3D coordination polymeric structure incorporating 1D helical Mn(μ2-carboxylate)2 chain units, in which each Mn(II) atom is coordinated in a less common square pyramidal geometry to four carboxylato oxygen atoms and one pyridyl nitrogen atom. Each 3,4-pyda links three helical Mn(μ2-carboxylate)2 chains and each Mn(μ2-carboxylate)2 chain is linked to eight other helical Mn(μ2-carboxylate)2 chains via shared 3,4-pyda bridges. Complex 2 is a 3D coordination network consisting of 1D Mn3(OH)2 chains and 3,4-pyda bridges. The repeating trimeric structural unit in the manganese(II) hydroxide chain consists of two edge-sharing symmetry-related manganese octahedra linked via μ3-OH to a vertex of Mn2 octahedron. Each 3,4-pyda links three Mn3(OH)2 chains and each Mn3(OH)2 chain is linked to six other Mn3(OH)2 chains via 3,4-pyda bridges, resulting in a 3D coordination solid. Magnetic measurements reveal that a weak antiferromagnetic interaction between the Mn(II) ions occurs in complex 1 and a 3D magnetic ordering at about 7.0 K in complex 2.

308

A shared system for learning serial and temporal structure of sensori-motor sequences? Evidence from simulation and human experiments.  

Science.gov (United States)

This research investigates the influences of temporal structure on the representation of serial order. Experiments are performed in a neural network model of sequence learning and in human subjects. In the sequence learning model, a recurrent network of leaky integrator neurons encodes a succession of internal states that become associated, by reinforcement learning, with the correct sequential responses. First, the model is shown to learn a simple temporal discrimination task. The model is then exposed to two novel serial reaction time (SRT) experiments. In the standard SRT task (M.J. Nissen, P. Bullemer, Attentional requirements of learning: evidence from performance measures, Cogn. Psychol. 19 (1987) 1-32 [16]), reaction times for stimuli presented in a repeating sequence are reduced with respect to those for random stimuli, providing a measure of sequence learning. The novelty of the current experiments is that imbedded in the serial order of the sequences, there is a temporal structure of delays. The model is sensitive to both the serial structure and the temporal structure of the sequences. This observation is then confirmed in human subjects. These results demonstrate how a novel recurrent architecture encodes the interaction of temporal and serial structure and provide insight into related aspects of human sensori-motor sequence learning. PMID:9479067

Dominey, P F

1998-01-01

309

Memory Efficient Sequence Analysis Using Compressed Data Structures (Metagenomics Informatics Challenges Workshop: 10K Genomes at a Time)  

Energy Technology Data Exchange (ETDEWEB)

Wellcome Trust Sanger Institute's Jared Simpson on "Memory efficient sequence analysis using compressed data structures" at the Metagenomics Informatics Challenges Workshop held at the DOE JGI on October 12-13, 2011

Simpson, Jared [Wellcome Trust Sanger Institute]

2011-10-13

310

Genome sequences and structures of two biologically distinct strains of Grapevine leafroll-associated virus 2 and sequence analysis.  

Science.gov (United States)

Grapevine leafroll-associated virus 2 (GLRaV-2), a member of the genus Closterovirus within Closteroviridae, is implicated in several important diseases of grapevines including "leafroll", "graft-incompatibility", and "quick decline" worldwide. Several GLRaV-2 isolates have been detected from different grapevine genotypes. However, the genomes of these isolates were not sequenced or only partially sequenced. Consequently, the relationship of these viral isolates at the molecular level has not been determined. Here, we group the various GLRaV-2 isolates into four strains based on their coat protein gene sequences. We show that isolates "PN" (originated from Vitis vinifera cv. "Pinot noir"), "Sem" (from V. vinifera cv. "Semillon") and "94/970" (from V. vinifera cv. "Muscat of Alexandria") belong to the same strain, "93/955" (from hybrid "LN-33") and "H4" (from V. rupestris "St. George") each represents a distinct strain, while Grapevine rootstock stem lesion-associated virus. PMID:15965606

Meng, Baozhong; Li, Caihong; Goszczynski, Dariusz E; Gonsalves, Dennis

2005-08-01

311

In-vineyard population structure of 'Candidatus Phytoplasma solani' using multilocus sequence typing analysis.  

Science.gov (United States)

'Candidatus Phytoplasma solani' is a phytoplasma of the stolbur group (16SrXII subgroup A) that is associated with 'Bois noir' and causes heavy damage to the quality and quantity of grapevine yields in several European countries, and particularly in the Mediterranean area. Analysis of 'Ca. P. solani' genetic diversity was carried out for strains infecting a cv. 'Chardonnay' vineyard, through multilocus sequence typing analysis for the vmp1, stamp and secY genes. Several types per gene were detected: seven out of 20 types for vmp1, six out of 17 for stamp, and four out of 16 for secY. High correlations were seen among the vmp1, stamp and secY typing with the tuf typing. However, no correlations were seen among the tuf and vmp1 types and the Bois noir severity in the surveyed grapevines. Grouping the 'Ca. P. solani' sequences on the basis of their origins (i.e., study vineyard, Italian regions, Euro-Mediterranean countries), dN/dS ratio analysis revealed overall positive selection for stamp (3.99, P=0.019) and vmp1 (2.28, P=0.001). For secY, the dN/dS ratio was 1.02 (P=0.841), showing neutral selection across this gene. Using analysis of the nucleotide sequencing by a Bayesian approach, we determined the population structure of 'Ca. P. solani', which appears to be structured in 3, 5 and 6 subpopulations, according to the secY, stamp and vmp1 genes, respectively. The high genetic diversity of 'Ca. P. solani' from a single vineyard reflects the population structure across wider geographical scales. This information is useful to trace inoculum source and movement of pathogen strains at the local level and over long distances. PMID:25660034
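The dN/dS values reported in this abstract are read against a ratio of 1: above 1 suggests positive selection, below 1 purifying selection, and about 1 neutrality. The sketch below encodes that conventional reading. The function name, the 0.05 significance threshold, and the assumption that each P-value tests departure from neutrality are illustrative choices that the abstract does not state.

```python
def classify_selection(dn_ds, p_value, alpha=0.05):
    """Label a gene by its dN/dS ratio.

    A non-significant result (p_value >= alpha, an assumed threshold)
    is treated as compatible with neutral evolution.
    """
    if p_value >= alpha:
        return "neutral"
    if dn_ds > 1.0:
        return "positive"
    if dn_ds < 1.0:
        return "purifying"
    return "neutral"


# (dN/dS, P) pairs as quoted in the abstract
reported = {"stamp": (3.99, 0.019), "vmp1": (2.28, 0.001), "secY": (1.02, 0.841)}
labels = {gene: classify_selection(*values) for gene, values in reported.items()}
print(labels)
```

Under these assumptions stamp and vmp1 are labeled positive and secY neutral, which matches the interpretation given in the abstract.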

Murolo, Sergio; Romanazzi, Gianfranco

2015-04-01

312

Complete sequence of the genome of the human isolate of Andes virus CHI-7913: comparative sequence and protein structure analysis  

OpenAIRE

We report here the complete genomic sequence of the Chilean human isolate of Andes virus CHI-7913. The S, M, and L genome segment sequences of this isolate are 1,802, 3,641 and 6,466 bases in length, with an overall GC content of 38.7%. These genome segments code for a nucleocapsid protein of 428 amino acids, a glycoprotein precursor protein of 1,138 amino acids and a RNA-dependent RNA polymerase of 2,152 amino acids. In addition, the genome also has other ORFs coding for putative proteins of...
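The figures in this abstract lend themselves to a quick check. The sketch below sums the three reported segment lengths and shows how a GC percentage is conventionally computed; the helper name `gc_content` and the toy input sequence are hypothetical, and no real Andes virus sequence data is used here.

```python
# Segment lengths in bases, as reported in the abstract
SEGMENT_LENGTHS = {"S": 1802, "M": 3641, "L": 6466}


def gc_content(seq):
    """Return the percentage of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    gc = sum(1 for base in seq if base in ("G", "C"))
    return 100.0 * gc / len(seq)


total_length = sum(SEGMENT_LENGTHS.values())
print(total_length)            # 11909 bases across the S, M, and L segments
print(gc_content("GGCCAATT"))  # 50.0 for this toy sequence
```

Applying `gc_content` to the concatenated genome segments would be the way to reproduce the reported overall value of 38.7%.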

Tischler, Nicole D.; JORGE FERNÁNDEZ; ILSE MÜLLER; RODRIGO MARTÍNEZ; HÉCTOR GALENO; ELIECER VILLAGRA; JUDITH MORA; EUGENIO RAMÍREZ; MARIO ROSEMBLATT; Valenzuela, Pablo D. T.

2003-01-01

313

Assessing sequence comparison methods with reliable structurally identified distant evolutionary relationships  

OpenAIRE

Pairwise sequence comparison methods have been assessed using proteins whose relationships are known reliably from their structures and functions, as described in the scop database [Murzin, A. G., Brenner, S. E., Hubbard, T. & Chothia C. (1995) J. Mol. Biol. 247, 536–540]. The evaluation tested the programs blast [Altschul, S. F., Gish, W., Miller, W., Myers, E. W. & Lipman, D. J. (1990). J. Mol. Biol. 215, 403–410], wu-blast2 [Altschul, S. F. & Gish, W. (1996) Methods Enzymol. 266, 460?...

Brenner, Steven E.; Chothia, Cyrus; Hubbard, Tim J. P.

1998-01-01

314

CPHmodels-3.0--remote homology modeling using structure-guided sequence profiles  

OpenAIRE

CPHmodels-3.0 is a web server predicting protein 3D structure by use of single template homology modeling. The server employs a hybrid of the scoring functions of CPHmodels-2.0 and a novel remote homology-modeling algorithm. Modeling of a query sequence is first attempted using the fast CPHmodels-2.0 profile–profile scoring function suitable for close homology modeling. The new, computationally costly remote homology-modeling algorithm is engaged only if no suitable PDB template is iden...

Nielsen, Morten; Lundegaard, Claus; Lund, Ole; Petersen, Thomas Nordahl

2010-01-01

315

Indel PDB: A database of structural insertions and deletions derived from sequence alignments of closely related proteins  

OpenAIRE

Abstract Background Insertions and deletions (indels) represent a common type of sequence variations, which are less studied and pose many important biological questions. Recent research has shown that the presence of sizable indels in protein sequences may be indicative of protein essentiality and their role in protein interaction networks. Examples of utilization of indels for structure-based drug design have also been recently demonstrated. Nonetheless many structural and functional charac...

Cherkasov, Artem; Hsing, Michael

2008-01-01

316

On the structure and the behaviour of Collatz 3n + 1 sequences - Finite subsequences and the role of the Fibonacci sequence  

OpenAIRE

It is shown that every Collatz sequence $C(s)$ consists only of same-structured finite subsequences $C^h(s)$ for $s\equiv 9 \pmod{12}$ or $C^t(s)$ for $s\equiv 3,7 \pmod{12}$. For starting numbers of specific residue classes ($\bmod\ 12\cdot 2^h$) or ($\bmod\ 12\cdot 2^{t+1}$) the finite subsequences have the same length $h,t$. It is conjectured that for each $h,t\geq 2$ the number of all admissible residue classes is given exactly by the Fibonacci sequence. This has been proved fo...
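For readers unfamiliar with the underlying iteration, the sketch below generates a standard Collatz (3n + 1) sequence and reports the residue of the starting number modulo 12, the quantity the abstract uses to sort starting numbers. It does not implement the paper's finite subsequences $C^h(s)$ or $C^t(s)$; the function names are hypothetical.

```python
def collatz(s):
    """Return the Collatz sequence starting at s and ending at 1.

    Termination for every positive integer is the open Collatz
    conjecture, so this loop is only known to stop for tested values.
    """
    if s < 1:
        raise ValueError("start must be a positive integer")
    seq = [s]
    while s != 1:
        s = 3 * s + 1 if s % 2 else s // 2
        seq.append(s)
    return seq


def residue_mod_12(s):
    """Residue class of the starting number modulo 12."""
    return s % 12


print(collatz(9)[:4])      # [9, 28, 14, 7]
print(residue_mod_12(9))   # 9
```

Starting numbers 3, 7, and 9 fall directly into the residue classes named in the abstract.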

Winkler, Mike

2014-01-01

317

Structural organization of glycophorin A and B genes: Glycophorin B gene evolved by homologous recombination at Alu repeat sequences  

International Nuclear Information System (INIS)

Glycophorins A (GPA) and B (GPB) are two major sialoglycoproteins of the human erythrocyte membrane. Here the authors present a comparison of the genomic structures of GPA and GPB developed by analyzing DNA clones isolated from a K562 genomic library. Nucleotide sequences of exon-intron junctions and 5' and 3' flanking sequences revealed that the GPA and GPB genes consist of 7 and 5 exons, respectively, and both genes have >95% identical sequence from the 5' flanking region to the region approximately 1 kilobase downstream from the exon encoding the transmembrane regions. In this homologous part of the genes, GPB lacks one exon due to a point mutation at the 5' splicing site of the third intron, which inactivates the 5' cleavage event of splicing and leads to ligation of the second to the fourth exon. Following these very homologous sequences, the genomic sequences for GPA and GPB diverge significantly and no homology can be detected in their 3' end sequences. The analysis of the Alu sequences and their flanking direct repeat sequences suggests that an ancestral genomic structure has been maintained in the GPA gene, whereas the GPB gene has arisen from the acquisition of 3' sequences different from those of the GPA gene by homologous recombination at the Alu repeats during or after gene duplication.

318

In Silico sequence analysis and molecular modeling of the three-dimensional structure of DAHP synthase from Pseudomonas fragi.  

Science.gov (United States)

The shikimate pathway is involved in production of aromatic amino acids in microorganisms and plants. The enzymes of this biosynthetic pathway are a potential target for the design of antimicrobial compounds and herbicides. 3-deoxy-D-arabinoheptulosonate-7-phosphate synthase (DAHPS) catalyzes the first step of the pathway. The gene encoding DAHPS was cloned and sequenced from Pseudomonas fragi, the bacterium responsible for spoilage of milk, dairy products and meat. The amino acid sequence deduced from the nucleotide sequence revealed that P. fragi DAHPS (Pf-DAHPS) consists of 448 amino acids with a calculated molecular weight of approximately 50 kDa and an isoelectric point of 5.81. Primary sequence analysis of Pf-DAHPS shows that it has more than 84% identity with DAHPS of other Pseudomonas species, 46% identity with Mycobacterium tuberculosis DAHPS (Mt-DAHPS), the type II DAHPS, and less than 11% sequence identity with the type I DAHPS. The three-dimensional structure of Pf-DAHPS was predicted by homology modeling based on the crystal structure of Mt-DAHPS. The Pf-DAHPS model contains a (β/α)8 TIM barrel structure. Sequence alignment, phylogenetic analysis and the 3D structure model classify Pf-DAHPS as a type II DAHPS. Sequence analysis revealed the presence of the DAHPS signature motif DxxHxN in Pf-DAHPS. The highly conserved sequence motifs RxxxxxxKPRT(S/T) and xGxR present in type II DAHPS were also identified in the Pf-DAHPS sequence. High sequence homology of DAHPS within Pseudomonas species points to the option of designing a broad spectrum drug for the genus. The Pf-DAHPS 3D model provides molecular insights that may be beneficial in rational inhibitor design for developing an effective food preservative against P. fragi. PMID:20517625

Tapas, Satya; Kumar Patel, Girijesh; Dhindwal, Sonali; Tomar, Shailly

2011-04-01

319

Regulatory sequences of Arabidopsis drive reporter gene expression in nematode feeding structures.  

Science.gov (United States)

In the quest for plant regulatory sequences capable of driving nematode-triggered effector gene expression in feeding structures, we show that promoter tagging is a valuable tool. A large collection of transgenic Arabidopsis plants was generated. They were transformed with a beta-glucuronidase gene functioning as a promoter tag. Three T-DNA constructs, pGV1047, p delta gusBin19, and pMOG553, were used. Early responses to nematode invasion were of primary interest. Six lines exhibiting beta-glucuronidase activity in syncytia induced by the beet cyst nematode were studied. Reporter gene activation was also identified in galls induced by root knot and ectoparasitic nematodes. Time-course studies revealed that all six tags were differentially activated during the development of the feeding structure. T-DNA-flanking regions responsible for the observed responses after nematode infection were isolated and characterized for promoter activity. PMID:9437858

Barthels, N; van der Lee, F M; Klap, J; Goddijn, O J; Karimi, M; Puzio, P; Grundler, F M; Ohl, S A; Lindsey, K; Robertson, L; Robertson, W M; Van Montagu, M; Gheysen, G; Sijmons, P C

1997-01-01

320

Focused Evolution of HIV-1 Neutralizing Antibodies Revealed by Structures and Deep Sequencing  

Energy Technology Data Exchange (ETDEWEB)

Antibody VRC01 is a human immunoglobulin that neutralizes about 90% of HIV-1 isolates. To understand how such broadly neutralizing antibodies develop, we used x-ray crystallography and 454 pyrosequencing to characterize additional VRC01-like antibodies from HIV-1-infected individuals. Crystal structures revealed a convergent mode of binding for diverse antibodies to the same CD4-binding-site epitope. A functional genomics analysis of expressed heavy and light chains revealed common pathways of antibody-heavy chain maturation, confined to the IGHV1-2*02 lineage, involving dozens of somatic changes, and capable of pairing with different light chains. Broadly neutralizing HIV-1 immunity associated with VRC01-like antibodies thus involves the evolution of antibodies to a highly affinity-matured state required to recognize an invariant viral structure, with lineages defined from thousands of sequences providing a genetic roadmap of their development.

Wu, Xueling; Zhou, Tongqing; Zhu, Jiang; Zhang, Baoshan; Georgiev, Ivelin; Wang, Charlene; Chen, Xuejun; Longo, Nancy S.; Louder, Mark; McKee, Krisha; O'Dell, Sijy; Perfetto, Stephen; Schmidt, Stephen D.; Shi, Wei; Wu, Lan; Yang, Yongping; Yang, Zhi-Yong; Yang, Zhongjia; Zhang, Zhenhai; Bonsignori, Mattia; Crump, John A.; Kapiga, Saidi H.; Sam, Noel E.; Haynes, Barton F.; Simek, Melissa; Burton, Dennis R.; Koff, Wayne C.; Doria-Rose, Nicole A.; Connors, Mark; Mullikin, James C.; Nabel, Gary J.; Roederer, Mario; Shapiro, Lawrence; Kwong, Peter D.; Mascola, John R. (Tumaini); (NIH); (Duke); (Kilimanjaro Repro.); (IAVI)

2013-03-04

321

De novo structure prediction of globular proteins aided by sequence variation-derived contacts.  

Science.gov (United States)

The advent of high accuracy residue-residue intra-protein contact prediction methods enabled a significant boost in the quality of de novo structure predictions. Here, we investigate the potential benefits of combining a well-established fragment-based folding algorithm, FRAGFOLD, with PSICOV, a contact prediction method which uses sparse inverse covariance estimation to identify co-varying sites in multiple sequence alignments. Using a comprehensive set of 150 diverse globular target proteins, up to 266 amino acids in length, we are able to address the effectiveness and some limitations of such approaches to globular proteins in practice. Overall we find that using fragment assembly with both statistical potentials and predicted contacts is significantly better than either statistical potentials or contacts alone. Results show up to nearly 80% of correct predictions (TM-score ≥ 0.5) within the analysed dataset and a mean TM-score of 0.54. Unsuccessful modelling cases emerged either from conformational sampling problems, or insufficient contact prediction accuracy. Nevertheless, a strong dependency of the quality of final models on the fraction of satisfied predicted long-range contacts was observed. This not only highlights the importance of these contacts on determining the protein fold, but also (combined with other ensemble-derived qualities) provides a powerful guide as to the choice of correct models and the global quality of the selected model. A proposed quality assessment scoring function achieves 0.93 precision and 0.77 recall for the discrimination of correct folds on our dataset of decoys. These findings suggest the approach is well-suited for blind predictions on a variety of globular proteins of unknown 3D structure, provided that enough homologous sequences are available to construct a large and accurate multiple sequence alignment for the initial contact prediction step. PMID:24637808
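The precision and recall reported in this abstract can be combined into a single F1 score, their harmonic mean. The abstract does not report an F1 value itself; the sketch below applies the standard definition to the two published numbers, and the function name is hypothetical.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as quoted in the abstract
f1 = f1_score(0.93, 0.77)
print(round(f1, 2))  # 0.84
```

Because the harmonic mean is pulled toward the smaller input, the combined score sits closer to the recall of 0.77 than to the precision of 0.93.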

Kosciolek, Tomasz; Jones, David T

2014-01-01

322

Cloning, Sequencing, Purification, and Crystal Structure of Grenache (Vitis vinifera) Polyphenol Oxidase  

Energy Technology Data Exchange (ETDEWEB)

The full-length cDNA sequence (P93622_VITVI) of polyphenol oxidase (PPO) cDNA from grape Vitis vinifera L., cv Grenache, was found to encode a translated protein of 607 amino acids with an expected molecular weight of ca. 67 kDa and a predicted pI of 6.83. The translated amino acid sequence was 99% identical to that of a white grape berry PPO (1) (5 out of 607 amino acid potential sequence differences). The protein was purified from Grenache grape berries by using traditional methods, and it was crystallized with ammonium acetate by the hanging-drop vapor diffusion method. The crystals were orthorhombic, space group C2221. The structure was obtained at 2.2 Å resolution using synchrotron radiation, with the 39 kDa isozyme of sweet potato PPO (PDB code: 1BT1) as a phase donor. The basic symmetry of the cell parameters (a, b, and c and α, β, and γ), as well as the number of asymmetric units in the unit cell of the crystals of PPO, differed between the two proteins. The structures of the two enzymes are quite similar in overall fold, the location of the helix bundles at the core, and the active site in which three histidines bind each of the two catalytic copper ions, and one of the histidines is engaged in a thioether linkage with a cysteine residue. The possibility that the formation of the Cys-His thioether linkage constitutes the activation step is proposed. No evidence of phosphorylation or glycosylation was found in the electron density map. The mass of the crystallized protein appears to be only 38.4 kDa, and the processing that occurs in the grape berry that leads to this smaller size is discussed.

Virador, V.; Reyes Grajeda, J; Blanco-Labra, A; Mendiola-Olaya, E; Smith, G; Moreno, A; Whitaker, J

2010-01-01

323

Cloning, sequencing, purification, and crystal structure of Grenache (Vitis vinifera) polyphenol oxidase.  

Science.gov (United States)

The full-length cDNA sequence (P93622_VITVI) of polyphenol oxidase (PPO) cDNA from grape Vitis vinifera L., cv Grenache, was found to encode a translated protein of 607 amino acids with an expected molecular weight of ca. 67 kDa and a predicted pI of 6.83. The translated amino acid sequence was 99% identical to that of a white grape berry PPO (1) (5 out of 607 amino acid potential sequence differences). The protein was purified from Grenache grape berries by using traditional methods, and it was crystallized with ammonium acetate by the hanging-drop vapor diffusion method. The crystals were orthorhombic, space group C222(1). The structure was obtained at 2.2 Å resolution using synchrotron radiation, with the 39 kDa isozyme of sweet potato PPO (PDB code: 1BT1) as a phase donor. The basic symmetry of the cell parameters (a, b, and c and alpha, beta, and gamma), as well as the number of asymmetric units in the unit cell of the crystals of PPO, differed between the two proteins. The structures of the two enzymes are quite similar in overall fold, the location of the helix bundles at the core, and the active site in which three histidines bind each of the two catalytic copper ions, and one of the histidines is engaged in a thioether linkage with a cysteine residue. The possibility that the formation of the Cys-His thioether linkage constitutes the activation step is proposed. No evidence of phosphorylation or glycosylation was found in the electron density map. The mass of the crystallized protein appears to be only 38.4 kDa, and the processing that occurs in the grape berry that leads to this smaller size is discussed. PMID:20039636
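The "99% identical" figure follows directly from the counts given in this abstract (5 potential differences in 607 residues). The sketch below shows the arithmetic, assuming an ungapped, equal-length comparison, which the abstract does not state explicitly; the function name is hypothetical.

```python
def percent_identity(length, differences):
    """Percent identity for two aligned sequences of equal length."""
    if length <= 0:
        raise ValueError("length must be positive")
    if not 0 <= differences <= length:
        raise ValueError("differences must lie between 0 and length")
    return 100.0 * (length - differences) / length


identity = percent_identity(607, 5)
print(round(identity, 1))  # 99.2
```

Rounded down to a whole percent, this agrees with the 99% quoted in the abstract.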

Virador, Victoria M; Reyes Grajeda, Juan P; Blanco-Labra, Alejandro; Mendiola-Olaya, Elizabeth; Smith, Gary M; Moreno, Abel; Whitaker, John R

2010-01-27

324

Functional and Structural Overview of G-Protein-Coupled Receptors Comprehensively Obtained from Genome Sequences  

Directory of Open Access Journals (Sweden)

An understanding of the functional mechanisms of G-protein-coupled receptors (GPCRs) is very important for GPCR-related drug design. We have developed an integrated GPCR database (SEVENS, http://sevens.cbrc.jp/) that includes 64,090 reliable GPCR genes comprehensively identified from 56 eukaryote genome sequences, and overviewed the sequences and structure spaces of the GPCRs. In vertebrates, the number of receptors for biological amines, peptides, etc. is conserved in most species, whereas the number of chemosensory receptors for odorant, pheromone, etc. significantly differs among species. The latter receptors tend to be single exon type or a few exon type and show a high ratio in the numbers of GPCRs, whereas some families, such as Class B and Class C receptors, have long lengths due to the presence of many exons. Statistical analyses of amino acid residues reveal that most of the conserved residues in Class A GPCRs are found in the cytoplasmic half regions of transmembrane (TM) helices, while residues characteristic of each subfamily are found in the extracellular half regions. A total of 69 Protein Data Bank (PDB) entries of complete or fragmentary structures could be mapped on the TM/loop regions of Class A GPCRs covering 14 subfamilies.

Makiko Suwa

2011-04-01

325

A more accurate relocation of the 2013 Ms 7.0 Lushan, Sichuan, China, earthquake sequence, and the seismogenic structure analysis  

Science.gov (United States)

We use a combined earthquake location technique to relocate the Ms 7.0 Lushan, Sichuan, China, earthquake sequence of April 20, 2013. A stepwise approach, employing three existing location methods (the HYPOINVERSE method, the Minimum 1-D model, and the Double Difference method), is used to improve location precision by iteratively revising the velocity model, station corrections, and hypocenter relocations throughout the process. Our stepwise approach has significantly improved the location precision of the Lushan earthquake sequence, yielding hypocenter locations with final errors of 359, 309, and 605 m in the E-W, N-S, and vertical directions, respectively, with average travel time residuals of 0.12 s. Furthermore, we analyzed the seismogenic structure surrounding the Lushan earthquake sequence by combining the results of the relocated hypocenter distribution with new focal mechanism solutions and information from regional geological and geophysical investigations. From our analysis, we conclude that the vast majority of the aftershocks of the Lushan earthquake sequence occurred at depths of 6-9 km, near the front of the southwestern segment of the NE-trending Longmenshan fault zone. Densely aligned hypocenters clearly suggest that the seismogenic structure of the mainshock consists of a set of basal thrust faults dipping to the NW at 40-50°, at a ramp of the deep basal décollement-thrust system at depths of 7-18 km. Focal mechanism solutions suggest that the seismogenic faults have produced almost pure thrusting. At least one SE-dipping back-thrust is also observed within the basement, as indicated by the hypocenter relocations, which points to either a secondary rupture plane during the mainshock or a plane of aftershock slips. A small number of minor events in the Lushan sequence are located at depths of 0-6 km, with a distribution suggesting that the three NE-trending faults with surface traces running through or passing close to the aftershock area are confined to the upper Mesozoic sedimentary cover, making them independent of the deeper thrust faults that ruptured during the mainshock. Therefore, the 2013 Ms 7.0 Lushan earthquake was a blind-thrust event generated on active thrust faults within the basement of the southwestern Longmenshan fault zone, with upper-limit estimates of the rupture length, average down-dip width, and rupture area of 40 km, 16 km, and 640 km², respectively.

Long, F.; Wen, X. Z.; Ruan, X.; Zhao, M.; Yi, G. X.

2015-02-01

326

Transitive Homology-Guided Structural Studies Lead to Discovery of Cro Proteins With 40% Sequence Identity But Different Folds  

Energy Technology Data Exchange (ETDEWEB)

Proteins that share common ancestry may differ in structure and function because of divergent evolution of their amino acid sequences. For a typical diverse protein superfamily, the properties of a few scattered members are known from experiment. A satisfying picture of functional and structural evolution in relation to sequence changes, however, may require characterization of a larger, well-chosen subset. Here, we employ a 'stepping-stone' method, based on transitive homology, to target sequences intermediate between two related proteins with known divergent properties. We apply the approach to the question of how new protein folds can evolve from preexisting folds and, in particular, to an evolutionary change in secondary structure and oligomeric state in the Cro family of bacteriophage transcription factors, initially identified by sequence-structure comparison of distant homologs from phages P22 and λ. We report crystal structures of two Cro proteins, Xfaso 1 and Pfl 6, with sequences intermediate between those of P22 and λ. The domains show 40% sequence identity but differ by switching of α-helix to β-sheet in a C-terminal region spanning ~25 residues. Sedimentation analysis also suggests a correlation between helix-to-sheet conversion and strengthened dimerization.

Roessler, C.G.; Hall, B.M.; Anderson, W.J.; Ingram, W.M.; Roberts, S.A.; Montfort, W.R.; Cordes, M.H.J.

2009-05-27
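
The 40% sequence identity quoted in this record is a statistic over aligned positions. As a minimal sketch, assuming a pre-computed pairwise alignment supplied as two equal-length gapped strings, and assuming identity is counted only over columns where neither sequence has a gap (one of several conventions; the abstract does not say which was used), it could be computed as follows. The function name and the toy sequences are hypothetical and are not the Cro proteins.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over ungapped columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0  # columns where both sequences have a residue
    matches = 0   # compared columns with identical residues
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # assumption: gapped columns are ignored
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy alignment for illustration only
    print(round(percent_identity("MKT-AYLQ", "MRTGAYIQ"), 1))
```

Other conventions divide by the alignment length or by the length of the shorter sequence, so a reported identity is only comparable when the denominator is stated.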

327

PyMod: sequence similarity searches, multiple sequence-structure alignments, and homology modeling within PyMOL  

OpenAIRE

Abstract Background In recent years, an exponentially growing number of tools for protein sequence analysis, editing and modeling tasks have been put at the disposal of the scientific community. Although the vast majority of these tools have been released as open source software, their steep learning curves often discourage even the most experienced users. Results A simple and intuitive interface, PyMod, between the popular molecular graphics system PyMOL and several other tools (i.e., [PSI-]BLA...

Bramucci Emanuele; Paiardini Alessandro; Bossa Francesco; Pascarella Stefano

2012-01-01

328

Complete sequence of the genome of the human isolate of Andes virus CHI-7913: comparative sequence and protein structure analysis  

Directory of Open Access Journals (Sweden)

Full Text Available We report here the complete genomic sequence of the Chilean human isolate of Andes virus CHI-7913. The S, M, and L genome segment sequences of this isolate are 1,802, 3,641 and 6,466 bases in length, with an overall GC content of 38.7%. These genome segments code for a nucleocapsid protein of 428 amino acids, a glycoprotein precursor protein of 1,138 amino acids and an RNA-dependent RNA polymerase of 2,152 amino acids. In addition, the genome also has other ORFs coding for putative proteins of 34 to 103 amino acids. The encoded proteins have greater than 98% overall similarity with the proteins of Andes virus isolates AH-1 and Chile R123. Among other sequenced hantaviruses, CHI-7913 is most closely related to Sin Nombre virus, with an overall protein similarity of 92%. The characteristics of the encoded proteins of this isolate, such as hydrophobic domains, glycosylation sites, and conserved amino acid motifs shared with other hantaviruses and other members of the Bunyaviridae family, are identified and discussed.

NICOLE D TISCHLER

2003-01-01
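
An overall GC content across several genome segments, such as the 38.7% quoted in this record, is naturally a length-weighted figure: G and C are counted over all segments and divided by the total number of bases. The sketch below illustrates that pooling under stated assumptions: the function name is hypothetical, the sequences are toy strings rather than the CHI-7913 segments, and ambiguous symbols such as N are simply left out of the denominator.

```python
from typing import Iterable


def pooled_gc_percent(segments: Iterable[str]) -> float:
    """GC percentage pooled over all segments (length-weighted)."""
    gc = 0
    total = 0
    for seq in segments:
        s = seq.upper()
        gc += s.count("G") + s.count("C")
        # Count only unambiguous bases; U is accepted for RNA input.
        total += sum(s.count(base) for base in "ACGTU")
    if total == 0:
        raise ValueError("no unambiguous bases found")
    return 100.0 * gc / total


if __name__ == "__main__":
    # Toy segments for illustration only
    print(pooled_gc_percent(["GGCC", "AATT"]))
```

Pooling differs from averaging the per-segment percentages whenever the segments have unequal lengths, which is the case for the S, M and L segments described here.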

329

Complete sequence of the genome of the human isolate of Andes virus CHI-7913: comparative sequence and protein structure analysis  

Scientific Electronic Library Online (English)

Full Text Available We report here the complete genomic sequence of the Chilean human isolate of Andes virus CHI-7913. The S, M, and L genome segment sequences of this isolate are 1,802, 3,641 and 6,466 bases in length, with an overall GC content of 38.7%. These genome segments code for a nucleocapsid protein of 428 amino acids, a glycoprotein precursor protein of 1,138 amino acids and an RNA-dependent RNA polymerase of 2,152 amino acids. In addition, the genome also has other ORFs coding for putative proteins of 34 to 103 amino acids. The encoded proteins have greater than 98% overall similarity with the proteins of Andes virus isolates AH-1 and Chile R123. Among other sequenced hantaviruses, CHI-7913 is most closely related to Sin Nombre virus, with an overall protein similarity of 92%. The characteristics of the encoded proteins of this isolate, such as hydrophobic domains, glycosylation sites, and conserved amino acid motifs shared with other hantaviruses and other members of the Bunyaviridae family, are identified and discussed.

NICOLE D, TISCHLER; JORGE, FERNÁNDEZ; ILSE, MÜLLER; RODRIGO, MARTÍNEZ; HÉCTOR, GALENO; ELIECER, VILLAGRA; JUDITH, MORA; EUGENIO, RAMÍREZ; MARIO, ROSEMBLATT; PABLO D.T., VALENZUELA.

330

Identification of novel DNA repair proteins via primary sequence, secondary structure, and homology  

Directory of Open Access Journals (Sweden)

Full Text Available Abstract Background DNA repair is the general term for the collection of critical mechanisms which repair many forms of DNA damage, such as that caused by methylation or ionizing radiation. DNA repair has mainly been studied in experimental and clinical situations, and relatively few information-based approaches to extracting new DNA repair knowledge exist. As a first step, automatic detection of DNA repair proteins in genomes via informatics techniques is desirable; however, there are many forms of DNA repair, and it is not straightforward to identify and classify repair proteins with a single optimal method. We perform a study of the ability of homology and machine learning-based methods to identify and classify DNA repair proteins, as well as to scan vertebrate genomes for the presence of novel repair proteins. Combinations of primary sequence polypeptide frequency, secondary structure, and homology information are used as feature information for input to a Support Vector Machine (SVM). Results We find that SVM techniques are capable of identifying portions of DNA repair protein datasets without admitting false positives; at low levels of false positive tolerance, homology can also identify and classify proteins with good performance. Secondary structure information provides improved performance compared to using primary structure alone. Furthermore, we observe that machine learning methods incorporating homology information perform best when data is filtered by some clustering technique. Applying these methodologies to the scanning of multiple vertebrate genomes confirms a positive correlation between the size of a genome and the number of DNA repair protein transcripts it is likely to contain, and simultaneously suggests that all organisms have a non-zero minimum number of repair genes. In addition, the scan result clusters several organisms' repair abilities in an evolutionarily consistent fashion. Analysis also identifies several functionally unconfirmed proteins that are highly likely to be involved in the repair process. A new web service, INTREPED, has been made available for the immediate search and annotation of DNA repair proteins in newly sequenced genomes. Conclusion Despite complexity due to a multitude of repair pathways, combinations of sequence, structure, and homology with Support Vector Machines offer good methods, in addition to existing homology searches, for DNA repair protein identification and functional annotation. Most importantly, this study has uncovered relationships between the size of a genome and a genome's available repair repertoire, and offers a number of new predictions as well as a prediction service, both of which reduce the search time and cost for novel repair genes and proteins.

Akutsu Tatsuya

2009-01-01

331

Main: 1D8U [RPSD[Archive  

Lifescience Database Archive (English)

Full Text Available 1D8U ?? Rice Oryza sativa L. Non-Symbiotic Hemoglobin ... 1 Name=Hb1; Synonyms=Glb1a; Oryza Sativa ... Molecule: Non-Symbiotic Hemoglobin ; Chain: A, B; Engineered: Yes Oxygen Storage/Trans ... lips Jr. Crystal Structure Of A Nonsymbiotic Plant Hemoglobin ... Structure V. 8 1005 2000 Globin, Bis-Histidyl, Hem ...

332

The impact of CRISPR repeat sequence on structures of a Cas6 protein-RNA complex  

Energy Technology Data Exchange (ETDEWEB)

The repeat-associated mysterious proteins (RAMPs) comprise the most abundant family of proteins involved in prokaryotic immunity against invading genetic elements conferred by the clustered regularly interspaced short palindromic repeat (CRISPR) system. Cas6 is one of the first characterized RAMP proteins and is a key enzyme required for CRISPR RNA maturation. Despite a strong structural homology with other RAMP proteins that bind hairpin RNA, Cas6 distinctly recognizes single-stranded RNA. Previous structural and biochemical studies show that Cas6 captures the 5' end while cleaving the 3' end of the CRISPR RNA. Here, we describe three structures and complementary biochemical analysis of a noncatalytic Cas6 homolog from Pyrococcus horikoshii bound to CRISPR repeat RNA of different sequences. Our study confirms the specificity of the Cas6 protein for single-stranded RNA and further reveals the importance of the bases at Positions 5-7 in Cas6-RNA interactions. Substitutions of these bases result in structural changes in the protein-RNA complex including its oligomerization state.

Wang, Ruiying; Zheng, Han; Preamplume, Gan; Shao, Yaming; Li, Hong (FSU)

2012-03-15

333

Genotyping by sequencing resolves shallow population structure to inform conservation of Chinook salmon (Oncorhynchus tshawytscha).  

Science.gov (United States)

Recent advances in population genomics have made it possible to detect previously unidentified structure, obtain more accurate estimates of demographic parameters, and explore adaptive divergence, potentially revolutionizing the way genetic data are used to manage wild populations. Here, we identified 10 944 single-nucleotide polymorphisms using restriction-site-associated DNA (RAD) sequencing to explore population structure, demography, and adaptive divergence in five populations of Chinook salmon (Oncorhynchus tshawytscha) from western Alaska. Patterns of population structure were similar to those of past studies, but our ability to assign individuals back to their region of origin was greatly improved (>90% accuracy for all populations). We also calculated effective size with and without removing physically linked loci identified from a linkage map, a novel method for nonmodel organisms. Estimates of effective size were generally above 1000 and were biased downward when physically linked loci were not removed. Outlier tests based on genetic differentiation identified 733 loci and three genomic regions under putative selection. These markers and genomic regions are excellent candidates for future research and can be used to create high-resolution panels for genetic monitoring and population assignment. This work demonstrates the utility of genomic data to inform conservation in highly exploited species with shallow population structure. PMID:24665338

Larson, Wesley A; Seeb, Lisa W; Everett, Meredith V; Waples, Ryan K; Templin, William D; Seeb, James E

2014-03-01

334

Structural mechanisms of the degenerate sequence recognition by Bse634I restriction endonuclease  

Science.gov (United States)

Restriction endonuclease Bse634I recognizes and cleaves the degenerate DNA sequence 5'-R/CCGGY-3' (R stands for A or G; Y for T or C, '/' indicates a cleavage position). Here, we report the crystal structures of the Bse634I R226A mutant complexed with cognate oligoduplexes containing ACCGGT and GCCGGC sites, respectively. In the crystal, all potential H-bond donor and acceptor atoms on the base edges of the conserved CCGG core are engaged in the interactions with Bse634I amino acid residues located on the α6 helix. In contrast, direct contacts between the protein and outer base pairs are limited to van der Waals contact between the purine nucleobase and Pro203 residue in the major groove and a single H-bond between the O2 atom of the outer pyrimidine and the side chain of the Asn73 residue in the minor groove. Structural data coupled with biochemical experiments suggest that both van der Waals interactions and indirect readout contribute to the discrimination of the degenerate base pair by Bse634I. Structure comparison between related enzymes Bse634I (R/CCGGY), NgoMIV (G/CCGGC) and SgrAI (CR/CCGGYG) reveals how different specificities are achieved within a conserved structural core. PMID:22495930

Manakova, Elena; Gražulis, Saulius; Zaremba, Mindaugas; Tamulaitiene, Giedre; Golovenko, Dmitrij; Siksnys, Virginijus

2012-01-01

335

Structural mechanisms of the degenerate sequence recognition by Bse634I restriction endonuclease.  

Science.gov (United States)

Restriction endonuclease Bse634I recognizes and cleaves the degenerate DNA sequence 5'-R/CCGGY-3' (R stands for A or G; Y for T or C, '/' indicates a cleavage position). Here, we report the crystal structures of the Bse634I R226A mutant complexed with cognate oligoduplexes containing ACCGGT and GCCGGC sites, respectively. In the crystal, all potential H-bond donor and acceptor atoms on the base edges of the conserved CCGG core are engaged in the interactions with Bse634I amino acid residues located on the α6 helix. In contrast, direct contacts between the protein and outer base pairs are limited to van der Waals contact between the purine nucleobase and Pro203 residue in the major groove and a single H-bond between the O2 atom of the outer pyrimidine and the side chain of the Asn73 residue in the minor groove. Structural data coupled with biochemical experiments suggest that both van der Waals interactions and indirect readout contribute to the discrimination of the degenerate base pair by Bse634I. Structure comparison between related enzymes Bse634I (R/CCGGY), NgoMIV (G/CCGGC) and SgrAI (CR/CCGGYG) reveals how different specificities are achieved within a conserved structural core. PMID:22495930

Manakova, Elena; Grazulis, Saulius; Zaremba, Mindaugas; Tamulaitiene, Giedre; Golovenko, Dmitrij; Siksnys, Virginijus

2012-08-01
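
The degenerate recognition sequence R/CCGGY in this record can be read as a small pattern: R matches A or G, Y matches C or T, and the CCGG core is fixed. As an illustrative sketch only (the helper name, the code table restricted to the symbols needed here, and the toy DNA strings are assumptions, and only the given strand is scanned), such sites could be located with a regular expression built from the degenerate codes:

```python
import re
from typing import List

# Subset of the IUPAC nucleotide codes needed for the sites quoted above
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "R": "[AG]", "Y": "[CT]"}


def find_sites(dna: str, site: str = "RCCGGY") -> List[int]:
    """Return 0-based start positions of every match of a degenerate site."""
    pattern = "".join(IUPAC[symbol] for symbol in site.upper())
    # A lookahead reports overlapping matches as well as disjoint ones.
    lookahead = re.compile(f"(?=({pattern}))")
    return [m.start() for m in lookahead.finditer(dna.upper())]


if __name__ == "__main__":
    # Both cognate sites named in the abstract, joined into one toy string
    print(find_sites("ACCGGTGCCGGC"))
```

Passing `site="GCCGGC"` restricts the search to the NgoMIV site quoted in the same record; a cut position after the first base, as indicated by the '/', would be the reported start plus one.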

336

Influence of loading sequence and stress ratio on Fatigue damage accumulation of a structural component  

Scientific Electronic Library Online (English)

Full Text Available This paper presents experimental results on the fatigue damage accumulation behaviour of a structural component made of P355NL1 steel. The structural component is a rectangular double-notched plate. Sequences of two and of multiple alternating constant-amplitude load blocks were applied for various combinations of remote stress ranges. Three stress ratios were investigated, namely R=0, R=0.15 and R=0.3. Variable-amplitude blocks, applied according to predefined stress spectra, were also investigated. Constant-amplitude data were also generated and used for the damage calculations. In general, the block loading demonstrates that fatigue damage evolves nonlinearly with the number of loading cycles, as a function of the load sequence, stress level and stress ratio. The application of variable-amplitude loading suggests an important stress ratio effect on fatigue damage accumulation. In particular, a clear load sequence effect is observed for two-block loading with a null stress ratio. For the other (higher) stress ratios, the load sequence effects are almost negligible; however, the damage evolution is still nonlinear.

Hélder F. S. G., Pereira; Abílio M.P. de, Jesus; Alfredo S., Ribeiro; António A., Fernandes.

2008-01-01
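
Nonlinear, sequence-dependent damage accumulation of the kind reported in this record is usually judged against the linear Palmgren-Miner sum, D = sum(n_i / N_i), where n_i is the number of cycles applied in block i and N_i is the constant-amplitude life at that block's stress level. The abstract does not state which damage model the authors used, so the following is only a baseline sketch of that standard rule; the function name and the cycle counts are invented for illustration.

```python
from typing import Iterable, Tuple


def miner_damage(blocks: Iterable[Tuple[float, float]]) -> float:
    """Linear Palmgren-Miner damage sum over (applied_cycles, cycles_to_failure)."""
    damage = 0.0
    for applied_cycles, cycles_to_failure in blocks:
        if applied_cycles < 0:
            raise ValueError("applied cycles must be non-negative")
        if cycles_to_failure <= 0:
            raise ValueError("cycles to failure must be positive")
        damage += applied_cycles / cycles_to_failure
    return damage


if __name__ == "__main__":
    # Hypothetical two-block sequence, applied in both orders
    high_low = [(10_000, 40_000), (50_000, 100_000)]
    low_high = list(reversed(high_low))
    print(miner_damage(high_low), miner_damage(low_high))
```

Because the sum is the same for either block order, the linear rule cannot represent a load sequence effect; a measured difference between the two orders is therefore evidence of nonlinear accumulation.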

337

Computer analysis of phytochrome sequences and reevaluation of the phytochrome secondary structure by Fourier transform infrared spectroscopy.  

Science.gov (United States)

A repertoire of various methods of computer sequence analysis was applied to phytochromes in order to gain new insights into their structure and function. A statistical analysis of 23 complete phytochrome sequences revealed regions of non-random amino acid composition, which are supposed to be of particular structural or functional importance. All phytochromes other than phyD and phyE from Arabidopsis have at least one such region at the N-terminus between residues 2 and 35. A sequence similarity search of current databases indicated striking homologies between all phytochromes and a hypothetical 84.2-kDa protein from the cyanobacterium Synechocystis. Furthermore, scanning the phytochrome sequences for the occurrence of patterns defined in the PROSITE database detected the signature of the WD repeats of the beta-transducin family within the functionally important 623-779 region (sequence numbering of phyA from Avena) in a number of phytochromes. A multiple sequence alignment performed with 23 complete phytochrome sequences is made available via the IMB Jena World-Wide Web server (http://www.imb-jena.de/PHYTO.html). It can be used as a working tool for future theoretical and experimental studies. Based on the multiple alignment striking sequence differences between phytochromes A and B were detected directly at the N-terminal end, where all phytochromes B have an additional stretch of 15-42 amino acids. There is also a variety of positions with totally conserved but different amino acids in phytochromes A and B. Most of these changes are found in the sequence segment 150-200. It is, therefore, suggested that this region might be of importance in determining the photosensory specificity of the two phytochromes. The secondary structure prediction based on the multiple alignment resulted in a small but significant beta-sheet content. This finding is confirmed by a reevaluation of the secondary structure using FTIR spectroscopy. PMID:9252112

Sühnel, J; Hermann, G; Dornberger, U; Fritzsche, H

1997-07-18

338

Use of Endogenous Retroviral Sequences (ERVs) and structural markers for retroviral phylogenetic inference and taxonomy  

Directory of Open Access Journals (Sweden)

Full Text Available Abstract Background Endogenous retroviral sequences (ERVs) are integral parts of most eukaryotic genomes and vastly outnumber exogenous retroviruses (XRVs). ERVs with a relatively complete structure were retrieved from the genetic archives of humans and chickens, diametrically opposite representatives of vertebrate retroviruses (over 3300 proviruses), and analyzed, using a bioinformatic program, RetroTector©, developed by us. This rich source of proviral information, accumulated in a local database, and a collection of XRV sequences from the literature, allowed the reconstruction of a Pol based phylogenetic tree, more extensive than previously possible. The aim was to find traits useful for classification and evolutionary studies of retroviruses. Some of these traits have been used by others, but they are here tested in a wider context than before. Results In the ERV collection we found sequences similar to the XRV-based genera: alpha-, beta-, gamma-, epsilon- and spumaretroviruses. However, the occurrence of intermediates between them indicated an evolutionary continuum and suggested that taxonomic changes eventually will be necessary. No delta or lentivirus representatives were found among ERVs. Classification based on Pol similarity is congruent with a number of structural traits. Acquisition of dUTPase occurred three times in retroviral evolution. Loss of one or two NC zinc fingers appears to have occurred several times during evolution. Nucleotide biases have been described earlier for lenti-, delta- and betaretroviruses and were here confirmed in a larger context. Conclusion Pol similarities and other structural traits contribute to a better understanding of retroviral phylogeny. "Global" genomic properties useful in phylogenies are i. translational strategy, ii. number of Gag NC zinc finger motifs, iii. presence of Pro N-terminal dUTPase (dUTPasePro), iv. presence of Pro C-terminal G-patch and v. presence of a GPY/F motif in the Pol integrase (IN) C-terminal domain. "Local" retroviral genomic properties useful for delineation of lower level taxa are i. host species range, ii. nucleotide compositional bias and iii. LTR lengths.

Sperber Göran O

2005-08-01

339

CPHmodels-3.0--remote homology modeling using structure-guided sequence profiles  

DEFF Research Database (Denmark)

CPHmodels-3.0 is a web server predicting protein 3D structure by use of single template homology modeling. The server employs a hybrid of the scoring functions of CPHmodels-2.0 and a novel remote homology-modeling algorithm. The server first attempts to model a query sequence using the fast CPHmodels-2.0 profile-profile scoring function, which is suitable for close homology modeling. The new, computationally costly remote homology-modeling algorithm is engaged only if no suitable PDB template is identified in the initial search. CPHmodels-3.0 was benchmarked in the CASP8 competition and produced models for 94% of the targets (117 out of 128); 74% were predicted as high-reliability models (87 out of 117). These achieved an average RMSD of 4.6 Å when superimposed on the 3D structure. The remaining 26% low-reliability models (30 out of 117) could be superimposed on the true 3D structure with an average RMSD of 9.3 Å. These performance values place the CPHmodels-3.0 method in the group of high-performing 3D prediction tools. Besides its accuracy, one of the important features of the method is its speed. For most queries, the response time of the server is

Nielsen, Morten; Lundegaard, Claus

2010-01-01

340

The interplay of peptide sequence and local structure in TiO2 biomineralization.  

Science.gov (United States)

Using cyclic constrained TiO2 binding peptides STB1 (CHKKPSKSC), RSTB1 (CHRRPSRSC) and linear peptide LSTB1 (AHKKPSKSA), it was shown that while affinity of the peptide to TiO2 is essential to enable TiO2 biomineralization, other factors such as biomineralization kinetics and peptide local structure need to be considered to predict biomineralization efficacy. Cyclic and linear TiO2 binding peptides show significantly different biomineralization activities. Cyclic STB1 and RSTB1 could induce TiO2 precipitation in the presence of titanium(IV)-bis-ammonium-lactato-dihydroxide (TiBALDH) precursor in water or tris buffer at pH 8. In contrast, linear LSTB1 was unable to mineralize TiO2 under the same experimental conditions, despite a high affinity to TiO2 comparable with that of STB1 and/or RSTB1. LSTB1, being a flexible molecule, could not bring about the stable condensation of TiBALDH precursor to form TiO2 particles. However, in the presence of phosphate buffer ions, the structure of LSTB1 is stabilized, leading to efficient condensation of TiBALDH and TiO2 particle formation. This study demonstrates that peptide-mediated TiO2 mineralization is governed by a complicated interplay of peptide sequence, local structure, kinetics and the presence of mineralizing aids such as phosphate ions. PMID:22922289

Choi, Noori; Tan, Lihan; Jang, Ji-ryang; Um, Yu Mi; Yoo, Pil J; Choe, Woo-Seok

2012-10-01

341

Chromatic dispersion compensation and Coherent Direct-Sequence OCDMA operation on a single super structured FBG.  

Science.gov (United States)

We have proposed, fabricated and demonstrated experimentally a set of Coherent Direct Sequence-OCDMA en/decoders based on Super Structured Fiber Bragg Gratings (SSFBGs) which are able to compensate the fiber chromatic dispersion while performing the en/decoding task. The proposed devices avoid the use of additional dispersion compensation stages, reducing system complexity and losses. This performance was evaluated for 5.4, 11.4 and 16.8 km of SSMF. The twofold performance was verified in the Low Reflectivity regime, employing only one GVD compensating device at the decoder or sharing the function between encoder and decoder devices. Shared functionality requires shorter SSFBG designs and also provides added flexibility to the optical network design. Moreover, dispersion-compensated en/decoders were also designed for the High Reflectivity regime using synthesis methods, achieving more than 9 dB reduction of insertion loss for each device. PMID:22714462

Baños, Rocío; Pastor, Daniel; Amaya, Waldimar; Garcia-Munoz, Victor

2012-06-18

342

Inferring action structure and causal relationships in continuous sequences of human action.  

Science.gov (United States)

In the real world, causal variables do not come pre-identified or occur in isolation, but instead are embedded within a continuous temporal stream of events. A challenge faced by both human learners and machine learning algorithms is identifying subsequences that correspond to the appropriate variables for causal inference. A specific instance of this problem is action segmentation: dividing a sequence of observed behavior into meaningful actions, and determining which of those actions lead to effects in the world. Here we present a Bayesian analysis of how statistical and causal cues to segmentation should optimally be combined, as well as four experiments investigating human action segmentation and causal inference. We find that both people and our model are sensitive to statistical regularities and causal structure in continuous action, and are able to combine these sources of information in order to correctly infer both causal relationships and segmentation boundaries. PMID:25527974

Buchsbaum, Daphna; Griffiths, Thomas L; Plunkett, Dillon; Gopnik, Alison; Baldwin, Dare

2015-02-01

343

Term structure of 4d-electron configurations and calculated spectrum in Sn-isonuclear sequence  

Science.gov (United States)

Theoretical calculations of term structure are carried out for the ground configurations 4d^w of atomic ions in the Sn isonuclear sequence. Atomic computations are performed to give a detailed account of the transitions in Sn6+ to Sn13+ ions. The spectrum is calculated for the most important excited configurations 4p^5 4d^(n+1), 4d^(n-1) 4f^1, and 4d^(n-1) 5p^1 with respect to the ground configuration 4d^n, with n = 8 to 1, respectively. The importance of the 4p-4d, 4d-4f, and 4d-5p transitions is stressed, as well as the need for a configuration-interaction (CI) treatment of the Δn=0 transitions. In the region of importance for extreme ultraviolet (EUV) lithography around 13.4 nm, the strongest lines were expected to be 4d^n - 4p^5 4d^(n+1) and 4d^n - 4d^(n-1) 4f^1.

Al-Rabban, Moza M.

2006-01-01

344

Term structure of 4d-electron configurations and calculated spectrum in Sn-isonuclear sequence  

International Nuclear Information System (INIS)

Theoretical calculations of term structure are carried out for the ground configurations 4d^w of atomic ions in the Sn isonuclear sequence. Atomic computations are performed to give a detailed account of the transitions in Sn6+ to Sn13+ ions. The spectrum is calculated for the most important excited configurations 4p^5 4d^(n+1), 4d^(n-1) 4f^1, and 4d^(n-1) 5p^1 with respect to the ground configuration 4d^n, with n = 8 to 1, respectively. The importance of the 4p-4d, 4d-4f, and 4d-5p transitions is stressed, as well as the need for a configuration-interaction (CI) treatment of the Δn=0 transitions. In the region of importance for extreme ultraviolet (EUV) lithography around 13.4 nm, the strongest lines were expected to be 4d^n - 4p^5 4d^(n+1) and 4d^n - 4d^(n-1) 4f^1.

345

Enzyme-free translation of DNA into sequence-defined synthetic polymers structurally unrelated to nucleic acids  

Science.gov (United States)

The translation of DNA sequences into corresponding biopolymers enables the production, function and evolution of the macromolecules of life. In contrast, methods to generate sequence-defined synthetic polymers with similar levels of control have remained elusive. Here, we report the development of a DNA-templated translation system that enables the enzyme-free translation of DNA templates into sequence-defined synthetic polymers that have no necessary structural relationship with nucleic acids. We demonstrate the efficiency, sequence-specificity and generality of this translation system by oligomerizing building blocks including polyethylene glycol, α-(D)-peptides, and β-peptides in a DNA-programmed manner. Sequence-defined synthetic polymers with molecular weights of 26 kDa containing 16 consecutively coupled building blocks and 90 densely functionalized β-amino acid residues were translated from DNA templates using this strategy. We integrated the DNA-templated translation system developed here into a complete cycle of translation, coding sequence replication, template regeneration and re-translation suitable for the iterated in vitro selection of functional sequence-defined synthetic polymers unrelated in structure to nucleic acids.

Niu, Jia; Hili, Ryan; Liu, David R.

2013-04-01

346

Detecting rare structural variation in evolving microbial populations from new sequence junctions using breseq  

Science.gov (United States)

New mutations leading to structural variation (SV) in genomes—in the form of mobile element insertions, large deletions, gene duplications, and other chromosomal rearrangements—can play a key role in microbial evolution. Yet, SV is considerably more difficult to predict from short-read genome resequencing data than single-nucleotide substitutions and indels (SN), so it is not yet routinely identified in studies that profile population-level genetic diversity over time in evolution experiments. We implemented an algorithm for detecting polymorphic SV as part of the breseq computational pipeline. This procedure examines split-read alignments, in which the two ends of a single sequencing read match disjoint locations in the reference genome, in order to detect structural variants and estimate their frequencies within a sample. We tested our algorithm using simulated Escherichia coli data and then applied it to 500- and 1000-generation population samples from the Lenski E. coli long-term evolution experiment (LTEE). Knowledge of genes that are targets of selection in the LTEE and mutations present in previously analyzed clonal isolates allowed us to evaluate the accuracy of our procedure. Overall, SV accounted for ~25% of the genetic diversity found in these samples. By profiling rare SV, we were able to identify many cases where alternative mutations in key genes transiently competed within a single population. We also found, unexpectedly, that mutations in two genes that rose to prominence at these early time points always went extinct in the long term. Because it is not limited by the base-calling error rate of the sequencing technology, our approach for identifying rare SV in whole-population samples may have a lower detection limit than similar predictions of SNs in these data sets. We anticipate that this functionality of breseq will be useful for providing a more complete picture of genome dynamics during evolution experiments with haploid microorganisms. 
PMID:25653667
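
The frequency estimate described in the abstract above can be illustrated with a minimal Python sketch. It assumes the simplest possible model, in which the frequency of a structural variant is the share of reads at a site that support the new junction rather than the reference sequence. The function names, the thresholds, and the input layout are hypothetical and are not taken from breseq, whose actual statistics are not given in the abstract.

```python
def junction_frequency(junction_reads: int, reference_reads: int) -> float:
    """Share of reads at a site that support the new sequence junction."""
    total = junction_reads + reference_reads
    if total == 0:
        raise ValueError("no reads cover this site")
    return junction_reads / total


def call_rare_variants(counts, min_junction_reads=3, min_frequency=0.01):
    """Keep candidate junctions with enough split-read support.

    counts maps a candidate name to (junction_reads, reference_reads).
    Both thresholds are illustrative assumptions, not breseq defaults.
    """
    calls = {}
    for name, (junction, reference) in counts.items():
        if junction < min_junction_reads:
            continue  # too few split reads to trust the junction
        frequency = junction_frequency(junction, reference)
        if frequency >= min_frequency:
            calls[name] = frequency
    return calls
```

For example, a candidate supported by 5 split reads against 95 reference reads would be reported at a frequency of 0.05, while a candidate seen in a single read would be dropped.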

Deatherage, Daniel E.; Traverse, Charles C.; Wolf, Lindsey N.; Barrick, Jeffrey E.

2014-01-01

347

Predicting deleterious nsSNPs: an analysis of sequence and structural attributes  

Directory of Open Access Journals (Sweden)

Full Text Available Abstract Background There has been an explosion in the number of single nucleotide polymorphisms (SNPs) within public databases. In this study we focused on non-synonymous protein coding single nucleotide polymorphisms (nsSNPs), some associated with disease and others which are thought to be neutral. We describe the distribution of both types of nsSNPs using structural and sequence based features and assess the relative value of these attributes as predictors of function using machine learning methods. We also address the common problem of balance within machine learning methods and show the effect of imbalance on nsSNP function prediction. We show that nsSNP function prediction can be significantly improved by 100% undersampling of the majority class. The learnt rules were then applied to make predictions of function on all nsSNPs within Ensembl. Results The measure of prediction success is greatly affected by the level of imbalance in the training dataset. We found the balanced dataset that included all attributes produced the best prediction. The performance as measured by the Matthews correlation coefficient (MCC) varied between 0.49 and 0.25 depending on the imbalance. As previously observed, the degree of sequence conservation at the nsSNP position is the single most useful attribute. In addition to conservation, structural predictions made using a balanced dataset can be of value. Conclusion The predictions for all nsSNPs within Ensembl, based on a balanced dataset using all attributes, are available as a DAS annotation. Instructions for adding the track to Ensembl are at http://www.brightstudy.ac.uk/das_help.html
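
The two quantities this abstract relies on, the Matthews correlation coefficient and class balancing by undersampling, can be sketched in a few lines of Python. The MCC formula is the standard one. The `undersample` helper is a hypothetical illustration that reads "100% undersampling" as shrinking the majority class to the size of the minority class, which is an assumption about the authors' procedure.

```python
import math
import random


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts."""
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0  # common convention when a row or column is empty
    return (tp * tn - fp * fn) / denominator


def undersample(majority, minority, seed=0):
    """Randomly shrink the majority class to the minority class size.

    Assumption: this is what full undersampling means in the abstract.
    """
    if len(minority) > len(majority):
        raise ValueError("minority class is larger than majority class")
    rng = random.Random(seed)
    return rng.sample(list(majority), len(minority)), list(minority)
```

A perfect classifier gives an MCC of 1, a classifier that is always wrong gives -1, and one that is no better than chance gives 0, which is why the measure is informative on imbalanced data.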

Saqi Mansoor AS

2006-04-01

348

Crystal Structure of Human Thymine DNA Glycosylase Bound to DNA Elucidates Sequence-Specific Mismatch Recognition  

Energy Technology Data Exchange (ETDEWEB)

Cytosine methylation at CpG dinucleotides produces m⁵CpG, an epigenetic modification that is important for transcriptional regulation and genomic stability in vertebrate cells. However, m⁵C deamination yields mutagenic G·T mispairs, which are implicated in genetic disease, cancer, and aging. Human thymine DNA glycosylase (hTDG) removes T from G·T mispairs, producing an abasic (or AP) site, and follow-on base excision repair proteins restore the G·C pair. hTDG is inactive against normal A·T pairs, and is most effective for G·T mispairs and other damage located in a CpG context. The molecular basis of these important catalytic properties has remained unknown. Here, we report a crystal structure of hTDG (catalytic domain, hTDG^cat) in complex with abasic DNA, at 2.8 Å resolution. Surprisingly, the enzyme crystallized in a 2:1 complex with DNA, one subunit bound at the abasic site, as anticipated, and the other at an undamaged (nonspecific) site. Isothermal titration calorimetry and electrophoretic mobility-shift experiments indicate that hTDG and hTDG^cat can bind abasic DNA with 1:1 or 2:1 stoichiometry. Kinetics experiments show that the 1:1 complex is sufficient for full catalytic (base excision) activity, suggesting that the 2:1 complex, if adopted in vivo, might be important for some other activity of hTDG, perhaps binding interactions with other proteins. Our structure reveals interactions that promote the stringent specificity for guanine versus adenine as the pairing partner of the target base and interactions that likely confer CpG sequence specificity. We find striking differences between hTDG and its prokaryotic ortholog (MUG), despite the relatively high (32%) sequence identity.

Maiti, A.; Morgan, M.T.; Pozharski, E.; Drohat, A.C.

2009-05-19

349

Structural dynamics of cereal mitochondrial genomes as revealed by complete nucleotide sequencing of the wheat mitochondrial genome  

OpenAIRE

The application of a new gene-based strategy for sequencing the wheat mitochondrial genome shows its structure to be a 452,528 bp circular molecule, and provides nucleotide-level evidence of intra-molecular recombination. Single, reciprocal and double recombinant products, and the nucleotide sequences of the repeats that mediate their formation have been identified. The genome has 55 genes with exons, including 35 protein-coding, 3 rRNA and 17 tRNA genes. Nucleotide sequences of seven wheat...

Ogihara, Yasunari; Yamazaki, Yukiko; Murai, Koji; Kanno, Akira; Terachi, Toru; Shiina, Takashi; Miyashita, Naohiko; Nasuda, Shuhei; Nakamura, Chiharu; Mori, Naoki; Takumi, Shigeo; Murata, Minoru; Futo, Satoshi; Tsunewaki, Koichiro

2005-01-01

350

From 1D to 3D single-crystal-to-single-crystal structural transformations based on linear polyanion [Mn4(H2O)18WZnMn2(H2O)2(ZnW9O34)2]4-.  

Science.gov (United States)

A 1D anionic polyoxometalate, [Mn4(H2O)18WZnMn2(H2O)2(ZnW9O34)2]4-, undergoes 1D to 3D single-crystal-to-single-crystal structural transformations that are induced by transition-metal cations (Co2+ and Cu2+) and solvent molecules. These solid materials present interesting catalytic activity for the oxidative aromatization of Hantzsch 1,4-dihydropyridines that is dependent on the inserted heterogeneous metal cations. PMID:22074312

Shi, Lian-Xu; Zhao, Wen-Feng; Xu, Xuan; Tang, Jing; Wu, Chuan-De

2011-12-19

351

GntR family of regulators in Mycobacterium smegmatis: a sequence and structure based characterization  

Directory of Open Access Journals (Sweden)

Full Text Available Abstract Background Mycobacterium smegmatis is a fast-growing, non-pathogenic mycobacterium. This organism has been widely used as a model organism to study the biology of other virulent and extremely slow-growing species like Mycobacterium tuberculosis. Based on the homology of the N-terminal DNA binding domain, the recently sequenced genome of M. smegmatis has been shown to possess several putative GntR regulators. A striking characteristic feature of this family of regulators is that they possess a conserved N-terminal DNA binding domain and a diverse C-terminal domain involved in effector binding and/or oligomerization. Since the physiological role of these regulators is critically dependent upon effector binding and operator sites, we have analysed and classified these regulators into their specific subfamilies and identified their potential binding sites. Results The sequence analysis of M. smegmatis putative GntRs has revealed that FadR, HutC, MocR and the YtrA-like regulators are encoded by 45, 8, 8 and 1 genes respectively. Further, of the 45 FadR-like regulators, 19 were classified into the FadR group and 26 into the VanR group. All these proteins showed similar secondary structural elements specific to their respective subfamilies except MSMEG_3959, which showed additional secondary structural elements. Using reciprocal BLAST searches, we further identified the orthologs of these regulators in Bacillus subtilis and other mycobacteria. Since the expression of many regulators is auto-regulatory, we have identified potential operator sites for a number of these GntR regulators by analyzing the upstream sequences. Conclusion This study helps in extending the annotation of M. smegmatis GntR proteins. It identifies the GntR regulators of M. smegmatis that could serve as a model for studying orthologous regulators from virulent as well as other saprophytic mycobacteria. 
This study also sheds some light on the nucleotide preferences in the target-motifs of GntRs thus providing important leads for initiating the experimental characterization of these proteins, construction of the gene regulatory network for these regulators and an understanding of the influence of these proteins on the physiology of the mycobacteria.

Ranjan Akash

2007-08-01

352

Analysis of Sequence Polymorphism and Population Structure of Tomato chlorotic dwarf viroid and Potato spindle tuber viroid in Viroid-Infected Tomato Plants  

OpenAIRE

The sequence polymorphism and population structure of Tomato chlorotic dwarf viroid (TCDVd) (isolate Trust) and Potato spindle tuber viroid (PSTVd) (isolate FN) in tomato plants were investigated. Of the 9 and 35 TCDVd clones sequenced from 2 different TCDVd-infected plants, 2 and 4 sequence variants were identified, respectively, leading to a total of 4 sequence variants of 360 nucleotides in length. Variant I was identical to AF162131, the first TCDVd sequence to be reported, and the rest e...

Nie, Xianzhou

2012-01-01

353

The investigation of the secondary structures of various peptide sequences of β-casein by the multicanonical simulation method  

Science.gov (United States)

The structural properties of the Arginine-Glutamic acid-Leucine-Glutamic acid-Glutamic acid-Leucine-Asparagine-Valine-Proline-Glycine (RELEELNVPG, in one-letter code), Glutamic acid-Glutamic acid-Glutamine-Glutamine-Glutamine-Threonine-Glutamic acid (EEQQQTE) and Glutamic acid-Aspartic acid-Glutamic acid-Leucine-Glutamine-Aspartic acid-Lysine-Isoleucine (EDELQDKI) peptide sequences of β-casein were studied by three-dimensional molecular modeling. In this work, the three-dimensional conformations of each peptide were obtained from their primary sequences by multicanonical simulations. Taking advantage of this simulation technique, Ramachandran plots were prepared and analysed to predict the relative occurrence probabilities of β-turn, γ-turn and helical structures. Structural predictions for these sequences of the β-casein molecule indicate the presence of a high level of helical structures and βIII-turns. The occurrence probabilities of inverse and classical γ-turns were low. The probability of helical structure of each sequence decreased significantly when the temperature increased. Our results show that these peptides have a highly helical structure, in good agreement with the results of spectroscopic techniques and other prediction methods.

Yaşar, F.; Çelik, S.; Köksel, H.

2006-05-01

354

Structure of the axial-vector meson $D_{s1}(2460)$ and the strong coupling constant $g_{D_{s1} D^* K}$ with the light-cone QCD sum rules  

OpenAIRE

In this article, we take the point of view that the charmed axial-vector meson $D_{s1}(2460)$ is the conventional $c\bar{s}$ meson and calculate the strong coupling constant $g_{D_{s1} D^* K}$ in the framework of the light-cone QCD sum rules approach. The numerical values of strong coupling constants $g_{D_{s1} D^* K}$ and $g_{D_{s0} D K}$ are very large, and support the hadronic dressing mechanism. Just like the scalar mesons $f_0(980)$ and $a_0(980)$, the scalar meson $D_{...

Wang, Z. G.

2006-01-01

355

Novel sequence variations in LAMA2 and SGCG genes modulating cis-acting regulatory elements and RNA secondary structure  

Scientific Electronic Library Online (English)

Full Text Available In this study, we detected new sequence variations in LAMA2 and SGCG genes in 5 ethnic populations, and analysed their effect on enhancer composition and mRNA structure. PCR amplification and DNA sequencing were performed and followed by bioinformatics analyses using ESEfinder as well as MFOLD software. We found 3 novel sequence variations in the LAMA2 (c.3174+22_23insAT and c.6085+12delA) and SGCG (c.*102A/C) genes. These variations were present in 210 tested healthy controls from Tunisian, Moroccan, Algerian, Lebanese and French populations, suggesting that they represent novel polymorphisms within the LAMA2 and SGCG gene sequences. ESEfinder showed that the c.*102A/C substitution created a new exon splicing enhancer in the 3'UTR of the SGCG gene, whereas the c.6085+12delA deletion was situated in the base pairing region between LAMA2 mRNA and the U1snRNA spliceosomal components. The RNA structure analyses showed that both variations modulated RNA secondary structure. Our results are suggestive of correlations between mRNA folding and the recruitment of spliceosomal components mediating splicing, including SR proteins. The contribution of common sequence variations to mRNA structural and functional diversity will contribute to a better study of gene expression.

Olfa, Siala; Ikhlass Hadj, Salem; Abdelaziz, Tlili; Imen, Ammar; Hanen, Belguith; Faiza, Fakhfakh.

356

X-ray sequence and crystal structure of luffaculin 1, a novel type 1 ribosome-inactivating protein  

Directory of Open Access Journals (Sweden)

Full Text Available Abstract Background Protein sequence can be obtained through Edman degradation, mass spectrometry, or cDNA sequencing. High resolution X-ray crystallography can also be used to derive protein sequence information, but faces the difficulty of distinguishing the Asp/Asn, Glu/Gln, and Val/Thr pairs. Luffaculin 1 is a new type 1 ribosome-inactivating protein (RIP) isolated from the seeds of Luffa acutangula. Besides rRNA N-glycosidase activity, luffaculin 1 also demonstrates activities including inhibiting tumor cells' proliferation and inducing tumor cells' differentiation. Results The crystal structure of luffaculin 1 was determined at 1.4 Å resolution. Its amino-acid sequence was derived from this high resolution structure using the following criteria: (1) high resolution electron density; (2) comparison of electron density between two molecules that exist in the same crystal; (3) evaluation of the chemical environment of residues to resolve the sequence assignment ambiguity in residue pairs Glu/Gln, Asp/Asn, and Val/Thr; (4) comparison with sequences of the homologous proteins. Using criteria 1 and 2, 66% of the residues can be assigned. By incorporating criterion 3, 86% of the residues were assigned, suggesting the effectiveness of chemical environment evaluation in resolving residue ambiguity. In total, 94% of the luffaculin 1 sequence was assigned with high confidence using this improved X-ray sequencing strategy. Two N-acetylglucosamine moieties, linked respectively to the residues Asn77 and Asn84, can be identified in the structure. Residues Tyr70, Tyr110, Glu159 and Arg162 define the active site of luffaculin 1 as an RNA N-glycosidase. Conclusion The X-ray sequencing method can be effective for deriving sequence information of proteins. The evaluation of the chemical environment of residues is a useful method to resolve the assignment ambiguity in Glu/Gln, Asp/Asn, and Val/Thr pairs. 
The sequence and the crystal structure confirm that luffaculin 1 is a new type 1 RIP.

Meehan Edward J

2007-04-01

357

Structural characterization of genomes by large scale sequence-structure threading: application of reliability analysis in structural genomics  

Directory of Open Access Journals (Sweden)

Full Text Available Abstract Background We establish that the occurrence of protein folds among genomes can be accurately described with a Weibull function. Systems which exhibit Weibull character can be interpreted with reliability theory commonly used in engineering analysis. For instance, Weibull distributions are widely used in reliability, maintainability and safety work to model time-to-failure of mechanical devices, mechanisms, building constructions and equipment. Results We have found that the Weibull function describes protein fold distribution within and among genomes more accurately than conventional power functions which have been used in a number of structural genomic studies reported to date. It has also been found that the Weibull reliability parameter β for protein fold distributions varies between genomes and may reflect differences in rates of gene duplication in evolutionary history of organisms. Conclusions The results of this work demonstrate that reliability analysis can provide useful insights and testable predictions in the fields of comparative and structural genomics.
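
As a rough illustration of how a Weibull shape parameter can be estimated, the Python sketch below uses the classical "Weibull plot" linearization: for the cumulative distribution F(x) = 1 - exp(-(x/scale)^shape), the quantity ln(-ln(1 - F)) is a straight line in ln(x) whose slope is the shape parameter. How the authors mapped protein fold occurrence onto x and F is not stated in the abstract, so the inputs here are generic, and the function names are hypothetical.

```python
import math


def weibull_cdf(x: float, shape: float, scale: float) -> float:
    """Two-parameter Weibull cumulative distribution function."""
    return 1.0 - math.exp(-((x / scale) ** shape))


def fit_weibull(xs, cdf_values):
    """Least-squares fit of (shape, scale) on the linearized Weibull plot.

    xs must be positive and cdf_values strictly between 0 and 1.
    """
    u = [math.log(x) for x in xs]
    v = [math.log(-math.log(1.0 - f)) for f in cdf_values]
    n = len(u)
    mean_u = sum(u) / n
    mean_v = sum(v) / n
    sxx = sum((a - mean_u) ** 2 for a in u)
    sxy = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    shape = sxy / sxx                     # slope of the line
    intercept = mean_v - shape * mean_u   # equals -shape * ln(scale)
    scale = math.exp(-intercept / shape)
    return shape, scale
```

On data generated exactly from a Weibull distribution, the fit recovers the generating parameters. On real counts, the quality of the straight-line fit is what would be compared against a power-law alternative.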

Brunham Robert C

2004-07-01

358

Impact of genomic structural variation in Drosophila melanogaster based on population-scale sequencing.  

Science.gov (United States)

Genomic structural variation (SV) is a major determinant for phenotypic variation. Although it has been extensively studied in humans, the nucleotide resolution structure of SVs within the widely used model organism Drosophila remains unknown. We report a highly accurate, densely validated map of unbalanced SVs comprising 8962 deletions and 916 tandem duplications in 39 lines derived from short-read DNA sequencing in a natural population (the "Drosophila melanogaster Genetic Reference Panel," DGRP). Most SVs (>90%) were inferred at nucleotide resolution, and a large fraction was genotyped across all samples. Comprehensive analyses of SV formation mechanisms using the short-read data revealed an abundance of SVs formed by mobile element and nonhomologous end-joining-mediated rearrangements, and clustering of variants into SV hotspots. We further observed a strong depletion of SVs overlapping genes, which, along with population genetics analyses, suggests that these SVs are often deleterious. We inferred several gene fusion events also highlighting the potential role of SVs in the generation of novel protein products. Expression quantitative trait locus (eQTL) mapping revealed the functional impact of our high-resolution SV map, with quantifiable effects at >100 genic loci. Our map represents a resource for population-level studies of SVs in an important model organism. PMID:23222910

Zichner, Thomas; Garfield, David A; Rausch, Tobias; Stütz, Adrian M; Cannavó, Enrico; Braun, Martina; Furlong, Eileen E M; Korbel, Jan O

2013-03-01

359

Visualization and probability-based scoring of structural variants within repetitive sequences  

OpenAIRE

Motivation: Repetitive sequences account for approximately half of the human genome. Accurately ascertaining sequences in these regions with next generation sequencers is challenging, and requires a different set of analytical techniques than for reads originating from unique sequences. Complicating the matter are repetitive regions subject to programmed rearrangements, as is the case with the antigen-binding domains in the Immunoglobulin (Ig) and T-cell receptor (TCR) loci.

Halper-Stromberg, Eitan; Steranka, Jared; Burns, Kathleen H.; Sabunciyan, Sarven; Irizarry, Rafael A.

2014-01-01

360

Identification and analysis of conserved sequence motifs in cytochrome P450 family 2. Functional and structural role of a motif 187RFDYKD192 in CYP2B enzymes.  

Science.gov (United States)

Using a multiple alignment of 175 cytochrome P450 (CYP) family 2 sequences, 20 conserved sequence motifs (CSMs) were identified with the program PCPMer. Functional importance of the CSM in CYP2B enzymes was assessed from available data on site-directed mutants and genetic variants. These analyses suggested an important role of the CSM 8, which corresponds to (187)RFDYKD(192) in CYP2B4. Further analysis showed that residues 187, 188, 190, and 192 have a very high rank order of conservation compared with 189 and 191. Therefore, eight mutants (R187A, R187K, F188A, D189A, Y190A, K191A, D192A, and a negative control K186A) were made in an N-terminal truncated and modified form of CYP2B4 with an internal mutation, which is termed 2B4dH/H226Y. Function was examined with the substrates 7-methoxy-4-(trifluoromethyl)coumarin (7-MFC), 7-ethoxy-4-(trifluoromethyl)coumarin (7-EFC), 7-benzyloxy-4-(trifluoromethyl)coumarin (7-BFC), and testosterone and with the inhibitors 4-(4-chlorophenyl)imidazole (4-CPI) and bifonazole (BIF). Compared with the template and K186A, the mutants R187A, R187K, F188A, Y190A, and D192A showed ≥2-fold altered substrate specificity, k(cat), K(m), and/or k(cat)/K(m) for 7-MFC and 7-EFC and 3- to 6-fold decreases in differential inhibition (IC(50,BIF)/IC(50,4-CPI)). Subsequently, these mutants displayed 5-12 °C decreases in thermal stability (T(m)) and 2-8 °C decreases in catalytic tolerance to temperature (T(50)) compared with the template and K186A. Furthermore, when R187A and D192A were introduced in CYP2B1dH, the P450 expression and thermal stability were decreased. In addition, R187A showed increased activity with 7-EFC and decreased IC(50,BIF)/IC(50,4-CPI) compared with 2B1dH. 
Analysis of long range residue-residue interactions in the CYP2B4 crystal structures indicated strong hydrogen bonds involving Glu(149)-Asn(177)-Arg(187)-Tyr(190) and Asp(192)-Val(194), which were significantly reduced or abolished by the Arg(187)→Ala and Asp(192)→Ala substitutions, respectively. PMID:18495666

Oezguen, Numan; Kumar, Santosh; Hindupur, Aditya; Braun, Werner; Muralidhara, B K; Halpert, James R

2008-08-01

361

Compression-based classification of biological sequences and structures via the Universal Similarity Metric: experimental assessment  

Directory of Open Access Journals (Sweden)

Full Text Available Abstract Background Similarity of sequences is a key mathematical notion for Classification and Phylogenetic studies in Biology. It is currently primarily handled using alignments. However, the alignment methods seem inadequate for post-genomic studies since they do not scale well with data set size and they seem to be confined only to genomic and proteomic sequences. Therefore, alignment-free similarity measures are actively pursued. Among those, USM (Universal Similarity Metric) has gained prominence. It is based on the deep theory of Kolmogorov Complexity and universality is its most novel striking feature. Since it can only be approximated via data compression, USM is a methodology rather than a formula quantifying the similarity of two strings. Three approximations of USM are available, namely UCD (Universal Compression Dissimilarity), NCD (Normalized Compression Dissimilarity) and CD (Compression Dissimilarity). Their applicability and robustness are tested on various data sets yielding a first massive quantitative estimate that the USM methodology and its approximations are of value. Despite the rich theory developed around USM, its experimental assessment has limitations: only a few data compressors have been tested in conjunction with USM and mostly at a qualitative level, no comparison among UCD, NCD and CD is available and no comparison of USM with existing methods, both based on alignments and not, seems to be available. Results We experimentally test the USM methodology by using 25 compressors, all three of its known approximations and six data sets of relevance to Molecular Biology. This offers the first systematic and quantitative experimental assessment of this methodology, that naturally complements the many theoretical and the preliminary experimental results available. Moreover, we compare the USM methodology both with methods based on alignments and not. We may group our experiments into two sets. 
The first one, performed via ROC (Receiver Operating Curve) analysis, aims at assessing the intrinsic ability of the methodology to discriminate and classify biological sequences and structures. A second set of experiments aims at assessing how well two commonly available classification algorithms, UPGMA (Unweighted Pair Group Method with Arithmetic Mean) and NJ (Neighbor Joining), can use the methodology to perform their task, their performance being evaluated against gold standards and with the use of well known statistical indexes, i.e., the F-measure and the partition distance. Based on the experiments, several conclusions can be drawn and, from them, novel valuable guidelines for the use of USM on biological data. The main ones are reported next. Conclusion UCD and NCD are indistinguishable, i.e., they yield nearly the same values of the statistical indexes we have used, across experiments and data sets, while CD is almost always worse than both. UPGMA seems to yield better classification results with respect to NJ, i.e., better values of the statistical indexes (10% difference or above), on a substantial fraction of experiments, compressors and USM approximation choices. The compression program PPMd, based on PPM (Prediction by Partial Matching), for generic data and Gencompress for DNA, are the best performers among the compression algorithms we have used, although the difference in performance, as measured by statistical indexes, between them and the other algorithms depends critically on the data set and may not be as large as expected. PPMd used with UCD or NCD and UPGMA, on sequence data is very close, although worse, in performance with the alignment methods (less than 2% difference on the F-measure). Yet, it scales well with data set size and it can work on data other than sequences. 
In summary, our quantitative analysis naturally complements the rich theory behind USM and supports the conclusion that the methodology is worth using because of its robustness, flexibility, scalability, and competitiveness with existing techniques. In particular, the methodology applies to all biological
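
To make the compression-based idea concrete, here is a minimal Python sketch of a normalized compression distance. It assumes the widely used formula NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C is the compressed size, and it assumes that this is the NCD approximation the abstract refers to. The standard-library `zlib` compressor is swapped in for convenience; it is not one of the best performers named in the abstract (PPMd, Gencompress), and the helper names are hypothetical.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input (maximum effort)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance between two byte strings.

    Values near 0 suggest that x and y share most of their content;
    values near 1 suggest that they share little.
    """
    cx = compressed_size(x)
    cy = compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)
```

A pairwise matrix of such values over a set of sequences could then be handed to a clustering method such as UPGMA or NJ, which is the kind of pipeline the abstract evaluates. Because zlib only looks back over a limited window, this particular compressor is suitable for short inputs only.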

Manzini Giovanni

2007-07-01

362

Conversion of human steroid 5β-reductase (AKR1D1) into 3β-hydroxysteroid dehydrogenase by single point mutation E120H: example of perfect enzyme engineering.  

Science.gov (United States)

Human aldo-keto reductase 1D1 (AKR1D1) and AKR1C enzymes are essential for bile acid biosynthesis and steroid hormone metabolism. AKR1D1 catalyzes the 5β-reduction of Δ(4)-3-ketosteroids, whereas AKR1C enzymes are hydroxysteroid dehydrogenases (HSDs). These enzymes share high sequence identity and catalyze 4-pro-(R)-hydride transfer from NADPH to an electrophilic carbon but differ in that one residue in the conserved AKR catalytic tetrad, His(120) (AKR1D1 numbering), is substituted by a glutamate in AKR1D1. We find that the AKR1D1 E120H mutant abolishes 5β-reductase activity and introduces HSD activity. However, the E120H mutant unexpectedly favors dihydrosteroids with the 5α-configuration and, unlike most of the AKR1C enzymes, shows a dominant stereochemical preference to act as a 3β-HSD as opposed to a 3α-HSD. The catalytic efficiency achieved for 3β-HSD activity is higher than that observed for any AKR to date. High resolution crystal structures of the E120H mutant in complex with epiandrosterone, 5β-dihydrotestosterone, and Δ(4)-androstene-3,17-dione elucidated the structural basis for this functional change. The glutamate-histidine substitution prevents a 3-ketosteroid from penetrating the active site so that hydride transfer is directed toward the C3 carbonyl group rather than the Δ(4)-double bond and confers 3β-HSD activity on the 5β-reductase. Structures indicate that stereospecificity of HSD activity is achieved because the steroid flips over to present its α-face to the A-face of NADPH. This is in contrast to the AKR1C enzymes, which can invert stereochemistry when the steroid swings across the binding pocket. These studies show how a single point mutation in AKR1D1 can introduce HSD activity with unexpected configurational and stereochemical preference. PMID:22437839

Chen, Mo; Drury, Jason E; Christianson, David W; Penning, Trevor M

2012-05-11

363

The HIVToolbox 2 Web System Integrates Sequence, Structure, Function and Mutation Analysis  

Science.gov (United States)

There is enormous interest in studying HIV pathogenesis for improving the treatment of patients with HIV infection. HIV infection has become one of the best-studied systems for understanding how a virus can hijack a cell. To help facilitate discovery, we previously built HIVToolbox, a web system for visual data mining. The original HIVToolbox integrated information for HIV protein sequence, structure, functional sites, and sequence conservation. This web system has been used for almost 40,000 searches. We report improvements to HIVToolbox including new functions and workflows, data updates, and updates for ease of use. HIVToolbox2 adds functionalities focused on HIV pathogenesis, including drug-binding sites, drug-resistance mutations, and immune epitopes. The integrated, interactive view enables visual mining to generate hypotheses that are not readily revealed by other approaches. Most HIV proteins form multimers, and there are posttranslational modification and protein-protein interaction sites at many of these multimerization interfaces. Analysis of protease drug binding sites reveals an anatomy of drug resistance with different types of drug-resistance mutations regionally localized on the surface of protease. Some of these drug-resistance mutations have a high prevalence in specific HIV-1 M subtypes. Finally, consolidation of Tat functional sites reveals a hotspot region where there appear to be 30 interactions or posttranslational modifications. A cursory analysis with HIVToolbox2 has helped to identify several global patterns for HIV proteins. An initial analysis with this tool identifies homomultimerization of almost all HIV proteins, functional sites that overlap with multimerization sites, a global drug resistance anatomy for HIV protease, and specific distributions of some drug-resistance mutations (DRMs) in specific HIV M subtypes. 
HIVToolbox2 is an open-access web application available at [http://hivtoolbox2.bio-toolkit.com]. PMID:24886930

Sargeant, David P.; Deverasetty, Sandeep; Strong, Christy L.; Alaniz, Izua J.; Bartlett, Alexandria; Brandon, Nicholas R.; Brooks, Steven B.; Brown, Frederick A.; Bufi, Flaviona; Chakarova, Monika; David, Roxanne P.; Dobritch, Karlyn M.; Guerra, Horacio P.; Hedden, Michael W.; Kumra, Rma; Levitt, Kelvy S.; Mathew, Kiran R.; Matti, Ray; Maza, Dorothea Q.; Mistry, Sabyasachy; Novakovic, Nemanja; Pomerantz, Austin; Portillo, Josue; Rafalski, Timothy F.; Rathnayake, Viraj R.; Rezapour, Noura; Songao, Sarah; Tuggle, Sean L.; Yousif, Sandy; Dorsky, David I.; Schiller, Martin R.

2014-01-01

364

Emergence of complex haplotypes from microevolutionary variation in sequence and structure of Colias phosphoglucose isomerase.  

Science.gov (United States)

A molecular evolutionary explanation of natural genetic variation requires analysis of specific variants' evolutionary dynamics. To pursue this for phosphoglucose isomerase (PGI) of Colias butterflies, whose polymorphism is maintained by strong natural selection, we assembled a large data set of wild haplotypes, highly variable at the amino acid and DNA levels. The most common electrophoretic, i.e., charge macrostate, allele class, 3, is conserved in its pattern of charged amino acid residues. The next most common macrostate, 4, has multiple patterns of charge, i.e., microstates, while less common (1, 2, 5, 6) macrostates are very diverse. Macrostate 4 shows significant linkage disequilibrium (LD) among its variants, especially for two groups of five haplotypes each. We find extensive intragenic recombination among all haplotypes except the two high-LD groups of macrostate 4, which display none. Phyletic relations among haplotypes are largely reticulate, again except for the high-LD groups of macrostate 4, which form clades with strong bootstrap support. Charge-changing and linked charge-neutral amino acid variants occur in diverse parts of PGI's sequence. Homology-based modeling of PGI's structure shows that these regions are related spatially in ways suggesting functional interaction. The high-LD groups of macrostate 4 display parallel amino acid variation in these regions. This pattern of haplotype clades with high LD among multiple varying sites, emerging from chaotically recombining variation, may be a "signature" of refinement of complex adaptive sequences by recombination and selection. It should be tested further in this study system and others as a possibly general feature of the evolution of living complexity. PMID:19424742

Wang, Baiqing; Watt, Ward B; Aakre, Christopher; Hawthorne, Noah

2009-05-01

365

Analysis of the Population Structure of Anaplasma phagocytophilum Using Multilocus Sequence Typing  

Science.gov (United States)

Anaplasma phagocytophilum is a Gram-negative obligate intracellular bacterium that replicates in neutrophils. It is transmitted via tick-bite and causes febrile disease in humans and animals. Human granulocytic anaplasmosis is regarded as an emerging infectious disease in North America, Europe and Asia. However, although increasingly detected, it is still rare in Europe. Clinically apparent A. phagocytophilum infections in animals are mainly found in horses, dogs, cats, sheep and cattle. Evidence from cross-infection experiments that A. phagocytophilum isolates of distinct host origin are not uniformly infectious for heterologous hosts has led to several approaches of molecular strain characterization. Unfortunately, the results of these studies are not always easily comparable, because different gene regions and fragment lengths were investigated. Multilocus sequence typing is a widely accepted method for molecular characterization of bacteria. We here provide for the first time a universal typing method that is easily transferable between different laboratories. We validated our approach on an unprecedentedly large data set of almost 400 A. phagocytophilum strains from humans and animals mostly from Europe. The typability was 74% (284/383). One major clonal complex containing 177 strains was detected. However, 54% (49/90) of the sequence types were not part of a clonal complex indicating that the population structure of A. phagocytophilum is probably semiclonal. All strains from humans, dogs and horses from Europe belonged to the same clonal complex. As canine and equine granulocytic anaplasmosis occurs frequently in Europe, human granulocytic anaplasmosis is likely to be underdiagnosed in Europe. Further, wild boars and hedgehogs may serve as reservoir hosts of the disease in humans and domestic animals in Europe, because their strains belonged to the same clonal complex. 
In contrast, as they were only distantly related, roe deer, voles and shrews are unlikely to harbor A. phagocytophilum strains infectious for humans, domestic or farm animals. PMID:24699849

Huhn, Christian; Winter, Christina; Wolfsperger, Timo; Wüppenhorst, Nicole; Strašek Smrdel, Katja; Skuballa, Jasmin; Pfäffle, Miriam; Petney, Trevor; Silaghi, Cornelia; Dyachenko, Viktor; Pantchev, Nikola; Straubinger, Reinhard K.; Schaarschmidt-Kiener, Daniel; Ganter, Martin; Aardema, Matthew L.; von Loewenich, Friederike D.

2014-01-01

366

Phosphorylation-dependent PIH1D1 interactions define substrate specificity of the R2TP cochaperone complex.  

Science.gov (United States)

The R2TP cochaperone complex plays a critical role in the assembly of multisubunit machines, including small nucleolar ribonucleoproteins (snoRNPs), RNA polymerase II, and the mTORC1 and SMG1 kinase complexes, but the molecular basis of substrate recognition remains unclear. Here, we describe a phosphopeptide binding domain (PIH-N) in the PIH1D1 subunit of the R2TP complex that preferentially binds to highly acidic phosphorylated proteins. A cocrystal structure of a PIH-N domain/TEL2 phosphopeptide complex reveals a highly specific phosphopeptide recognition mechanism in which Lys57 and 64 in PIH1D1, along with a conserved DpSDD phosphopeptide motif within TEL2, are essential and sufficient for binding. Proteomic analysis of PIH1D1 interactors identified R2TP complex substrates that are recruited by the PIH-N domain in a sequence-specific and phosphorylation-dependent manner suggestive of a common mechanism of substrate recognition. We propose that protein complexes assembled by the R2TP complex are defined by phosphorylation of a specific motif and recognition by the PIH1D1 subunit. PMID:24656813

Hořejší, Zuzana; Stach, Lasse; Flower, Thomas G; Joshi, Dhira; Flynn, Helen; Skehel, J Mark; O'Reilly, Nicola J; Ogrodowicz, Roksana W; Smerdon, Stephen J; Boulton, Simon J

2014-04-10

367

Structure and sequence of mutations induced by ionizing radiation at selectable loci in Chinese hamster ovary cells  

International Nuclear Information System (INIS)

The spectrum of mutations induced by ionizing radiation at two non-essential genetic loci varies markedly. Those at the adenine phosphoribosyl transferase (aprt) locus predominantly have no detectable alterations of gene structure on Southern blots, while those at the hypoxanthine guanine phosphoribosyl transferase (hprt) locus are largely massive deletions eliminating all coding sequence. Insertion mutations were detected at both loci. To characterize the sequence alterations producing the minor changes at the aprt locus, two mutant genes were cloned from lambda genomic libraries and sequenced. One of these mutants proved to be a 20 base-pair deletion formed between two short (3 base-pair) direct repeat sequences, while the second was the result of a 58 base-pair insertion accompanied by a 13 base-pair deletion. (author)

368

Correlation property of length sequences based on global structure of complete genome  

CERN Document Server

This paper considers three kinds of length sequences of the complete genome. Detrended fluctuation analysis, spectral analysis, and the mean distance spanned within time $L$ are used to discuss the correlation property of these sequences. Through comparing the appropriate exponents of the three methods, it is found that the exponent related to the mean distance is the best scale to characterise the correlation property of the time series. The values of the exponents of these three kinds of length sequences of bacteria indicate that the increments of the sequences are uncorrelated ($\gamma = 1.0 \pm 0.03$). It is also found that these sequences exhibit $1/f$ noise in some interval of frequency ($f>1$). The length of this interval of frequency depends on the length of the sequence. The shape of the periodogram in $f>1$ exhibits some periodicity. The period seems to depend on the length and the complexity of the length sequence.

Yu, Z G; Wang, B; Wang, Bin

2001-01-01
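
The detrended fluctuation analysis (DFA) named in the abstract above can be illustrated with a minimal sketch. This is not the authors' code: the function names, the window sizes, and the synthetic white-noise input are assumptions chosen for illustration, and only first-order (linear) detrending is shown. For an uncorrelated series the DFA exponent is expected to lie near 0.5.

```python
import math
import random


def linear_fit(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def dfa_exponent(series, window_sizes):
    """Estimate the DFA scaling exponent of a series (linear detrending)."""
    # Step 1: integrate the mean-subtracted series to obtain the profile.
    mean = sum(series) / len(series)
    profile, total = [], 0.0
    for value in series:
        total += value - mean
        profile.append(total)

    log_n, log_f = [], []
    for n in window_sizes:
        n_windows = len(profile) // n
        t = list(range(n))
        squared = 0.0
        # Step 2: remove a local linear trend in each non-overlapping window.
        for w in range(n_windows):
            segment = profile[w * n:(w + 1) * n]
            slope, intercept = linear_fit(t, segment)
            squared += sum((y - (slope * x + intercept)) ** 2
                           for x, y in zip(t, segment))
        # Step 3: root-mean-square fluctuation at this window size.
        fluctuation = math.sqrt(squared / (n_windows * n))
        log_n.append(math.log(n))
        log_f.append(math.log(fluctuation))

    # Step 4: the exponent is the slope of log F(n) against log n.
    exponent, _ = linear_fit(log_n, log_f)
    return exponent


# Hypothetical test input: Gaussian white noise with a fixed seed.
rng = random.Random(0)
white_noise = [rng.gauss(0.0, 1.0) for _ in range(4096)]
alpha = dfa_exponent(white_noise, [8, 16, 32, 64, 128, 256])
print("DFA exponent for white noise:", round(alpha, 2))
```

A real length sequence derived from a genome would replace `white_noise`; the window sizes should then be chosen relative to the length of that sequence.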

369

[Sequence and structure analysis of mitochondrial tRNApro and tRNAthr genes in domestic goose breeds].  

Science.gov (United States)

We report here the results of the sequence and structure analysis of mitochondrial tRNApro and tRNAthr genes in domestic goose breeds, obtained by sequencing the mitochondrial DNA from a total of 25 samples from 6 breeds of Chinese geese and 2 breeds of European domestic geese. Sequences and the cloverleaf structure of tRNApro (69 bp) and tRNAthr (68 bp) in domestic goose breeds were described and analysed. They were compared amongst the three domestic goose breeds as well as between Anseriformes (Anser cygnoides) and Galliformes (Gallus gallus domesticus, GenBank accession number NC001323). Both goose tRNApro and tRNAthr genes have normal cloverleaf secondary structures. The amino acid arm and the anticodon loop of the cloverleaf structure of tRNApro and tRNAthr are highly conserved among Anser albifrons, Anser anser and Anser cygnoides. The gene sequences in this study were deposited in GenBank under accession numbers AY427800-AY427805 and AY427812-AY427814. PMID:16818428

Liu, An-Fang; Wang, Ji-Wen; Zhu, Qing

2006-06-01

370

STING Millennium: a web-based suite of programs for comprehensive and simultaneous analysis of protein structure and sequence  

Science.gov (United States)

STING Millennium Suite (SMS) is a new web-based suite of programs and databases providing visualization and a complex analysis of molecular sequence and structure for the data deposited at the Protein Data Bank (PDB). SMS operates with a collection of both publicly available data (PDB, HSSP, Prosite) and its own data (contacts, interface contacts, surface accessibility). Biologists find SMS useful because it provides a variety of algorithms and validated data, wrapped up in a user-friendly web interface. Using SMS it is now possible to analyze sequence to structure relationships, the quality of the structure, nature and volume of atomic contacts of intra and inter chain type, relative conservation of amino acids at the specific sequence position based on multiple sequence alignment, indications of folding essential residue (FER) based on the relationship of the residue conservation to the intra-chain contacts and Cα–Cα and Cβ–Cβ distance geometry. Specific emphasis in SMS is given to interface forming residues (IFR), the amino acids that define the interactive portion of the protein surfaces. SMS may simultaneously display and analyze previously superimposed structures. PDB updates trigger SMS updates in a synchronized fashion. SMS is freely accessible for public data at http://www.cbi.cnptia.embrapa.br, http://mirrors.rcsb.org/SMS and http://trantor.bioc.columbia.edu/SMS. PMID:12824333

Neshich, Goran; Togawa, Roberto C.; Mancini, Adauto L.; Kuser, Paula R.; Yamagishi, Michel E. B.; Pappas, Georgios; Torres, Wellington V.; Campos, Tharsis Fonseca e; Ferreira, Leonardo L.; Luna, Fabio M.; Oliveira, Adilton G.; Miura, Ronald T.; Inoue, Marcus K.; Horita, Luiz G.; de Souza, Dimas F.; Dominiquini, Fabiana; Álvaro, Alexandre; Lima, Cleber S.; Ogawa, Fabio O.; Gomes, Gabriel B.; Palandrani, Juliana F.; dos Santos, Gabriela F.; de Freitas, Esther M.; Mattiuz, Amanda R.; Costa, Ivan C.; de Almeida, Celso L.; Souza, Savio; Baudet, Christian; Higa, Roberto H.

2003-01-01

371

Developing 1D nanostructure arrays for future nanophotonics  

Directory of Open Access Journals (Sweden)

Full Text Available Abstract There is intense and growing interest in one-dimensional (1-D) nanostructures from the perspective of their synthesis and unique properties, especially with respect to their excellent optical response and an ability to form heterostructures. This review discusses alternative approaches to preparation and organization of such structures, and their potential properties. In particular, molecular-scale printing is highlighted as a method for creating organized precursor structures for locating nanowires, as well as vapor–liquid–solid (VLS) templated growth using nano-channel alumina (NCA), and deposition of 1-D structures with glancing angle deposition (GLAD). As regards novel optical properties, we discuss, as an example, finite-size photonic crystal cavity structures formed from such nanostructure arrays, which possess high Q and small mode volume and are ideal for developing future nanolasers.

Cooke DG

2006-01-01

372

Term structure of 4d-electron configurations and calculated spectrum in Sn-isonuclear sequence  

Energy Technology Data Exchange (ETDEWEB)

Theoretical calculations of term structure are carried out for the ground configurations $4d^{w}$ of atomic ions in the Sn isonuclear sequence. Atomic computations are performed to give a detailed account of the transitions in Sn$^{6+}$ to Sn$^{13+}$ ions. The spectrum is calculated for the most important excited configurations $4p^{5}4d^{n+1}$, $4d^{n-1}4f^{1}$, and $4d^{n-1}5p^{1}$ with respect to the ground configuration $4d^{n}$, with n = 8-1, respectively. The importance of 4p-4d, 4d-4f, and 4d-5p transitions is stressed, as well as the need for the configuration-interaction (CI) treatment of the $\Delta n = 0$ transitions. In the region of importance for extreme ultraviolet (EUV) lithography around 13.4 nm, the strongest lines were expected to be $4d^{n}$-$4p^{5}4d^{n+1}$ and $4d^{n}$-$4d^{n-1}4f^{1}$.

Al-Rabban, Moza M. [Department of Physics, University of Qatar, PO Box 24905, Doha (Qatar)]. E-mail: mmalrabban@hotmail.com

2006-01-15

373

Crystal structure of actinomycin D bound to the CTG triplet repeat sequences linked to neurological diseases  

OpenAIRE

The potent anticancer drug actinomycin D (ActD) acts by binding to DNA GpC sequences, thereby interfering with essential biological processes including replication, transcription and topoisomerase. Certain neurological diseases are correlated with expansion of (CTG)n trinucleotide sequences, which contain many contiguous GpC sites separated by a single base pair. In order to characterize the binding of ActD to CTG triplet repeat sequences, we carried out heat denaturation and CD analyses, whi...

Hou, Ming-hon; Robinson, Howard; Gao, Yi-gui; Wang, Andrew H. -j

2002-01-01

374

Tripartite structure of the Saccharomyces cerevisiae arginase (CAR1) gene inducer-responsive upstream activation sequence.  

OpenAIRE

Arginase (CAR1) gene expression in Saccharomyces cerevisiae is induced by arginine. The 5' regulatory region of CAR1 contains four separable regulatory elements--two inducer-independent upstream activation sequences (UASs) (UASC1 and UASC2), an inducer-dependent UAS (UASI), and an upstream repression sequence (URS1) which negatively regulates CAR1 and many other yeast genes. Here we demonstrate that three homologous DNA sequences originally reported to be present in the inducer-responsive UAS...

Viljoen, M.; Kovari, L. Z.; Kovari, I. A.; Park, H. D.; Vuuren, H. J.; Cooper, T. G.

1992-01-01

375

Syntheses, structures and electrochemical properties of a class of 1-D double chain polyoxotungstate hybrids [H(2)dap][Cu(dap)(2)](0.5)[Cu(dap)(2)(H2O)][Ln(H(2)O)3(α-GeW(11)O(39))]·3H(2)O.  

Science.gov (United States)

A series of novel organic-inorganic hybrid 1-D double chain germanotungstates [H2dap][Cu(dap)2]0.5[Cu(dap)2(H2O)][Ln(H2O)3(α-GeW11O39)]·3H2O [Ln = La(III) (1), Pr(III) (2), Nd(III) (3), Sm(III) (4), Eu(III) (5), Tb(III) (6), Er(III) (7)] (dap = 1,2-diaminopropane) have been hydrothermally prepared and structurally characterized by elemental analyses, powder X-ray diffraction (PXRD), IR spectra, thermogravimetric (TG) analyses, X-ray photoelectron spectroscopy (XPS) and single-crystal X-ray diffraction. The most prominent structural feature of 1-7 is that the [Ln(H2O)3(α-GeW11O39)](5-) moieties are first connected with each other via W-O-Ln-O-W bridges, creating a 1-D {[Cu(dap)2(H2O)][Ln(H2O)3(α-GeW11O39)]}n(3n-) polymeric chain, and then two adjacent antiparallel 1-D polymeric chains are linked together through [Cu(dap)2](2+) linkages, giving rise to the rare organic-inorganic hybrid 1-D Cu(II)-Ln(III) heterometallic double-chain architectures. To the best of our knowledge, 1-7 represent the first 1-D double-chain Cu(II)-Ln(III) heterometallic germanotungstates. The variable-temperature magnetic susceptibilities of 2, 4 and 7 have been investigated. Furthermore, the solid-state electrochemical and electro-catalytic properties of 3 and 4 have been measured in 0.5 mol L(-1) Na2SO4 + H2SO4 aqueous solution by entrapping them in a carbon paste electrode. 3 and 4 display apparent electro-catalytic activities for nitrite, bromate and hydrogen peroxide reduction. PMID:24554042

Zhao, Jun-Wei; Li, Yan-Zhou; Ji, Fan; Yuan, Jing; Chen, Li-Juan; Yang, Guo-Yu

2014-04-21

376

A Revised Parallel-Sequence Morphological Classification of Galaxies: Structure and Formation of S0 and Spheroidal Galaxies  

CERN Document Server

We update van den Bergh's parallel sequence galaxy classification in which S0 galaxies form a sequence S0a-S0b-S0c that parallels the sequence Sa-Sb-Sc of spiral galaxies. The ratio B/T of bulge to total light defines the position of a galaxy in each sequence. Our classification makes one improvement. We extend the S0a-S0b-S0c sequence to spheroidal ("Sph") galaxies that are positioned in parallel to irregular galaxies in a similarly extended Sa-Sb-Sc-Im sequence. This provides a natural "home" for spheroidals, which previously were omitted from galaxy classifications. To motivate our juxtaposition of Sph and irregular galaxies, we present photometry and bulge-disk decompositions of Virgo S0s, including late-type S0s that bridge the gap between S0b and Sph galaxies. NGC 4762 is a SB0bc with B/T = 0.13. NGC 4452 is a SB0c galaxy with an even tinier pseudobulge. VCC 2048 and NGC 4638 have properties of both S0cs and Sphs. We update the structural parameter correlations of Sphs, irregulars, bulges, and disks. We s...

Kormendy, John

2011-01-01

377

Sequence based residue depth prediction using evolutionary information and predicted secondary structure  

Directory of Open Access Journals (Sweden)

Full Text Available Abstract Background Residue depth allows determining how deeply a given residue is buried, in contrast to the solvent accessibility that differentiates between buried and solvent-exposed residues. When compared with the solvent accessibility, the depth allows studying deep-level structures and functional sites, and formation of the protein folding nucleus. Accurate prediction of residue depth would provide valuable information for fold recognition, prediction of functional sites, and protein design. Results A new method, RDPred, for the real-value depth prediction from protein sequence is proposed. RDPred combines information extracted from the sequence, PSI-BLAST scoring matrices, and secondary structure predicted with PSIPRED. Three-fold/ten-fold cross validation based tests performed on three independent, low-identity datasets show that the distance based depth (computed using MSMS) predicted by RDPred is characterized by 0.67/0.67, 0.66/0.67, and 0.64/0.65 correlation with the actual depth, by mean absolute errors equal to 0.56/0.56, 0.61/0.60, and 0.58/0.57, and by mean relative errors equal to 17.0%/16.9%, 18.2%/18.1%, and 17.7%/17.6%, respectively. The mean absolute and the mean relative errors are shown to be statistically significantly better when compared with a method recently proposed by Yuan and Wang [Proteins 2008; 70:509–516]. The results show that three-fold cross validation underestimates the variability of the prediction quality when compared with the results based on the ten-fold cross validation. We also show that the hydrophilic and flexible residues are predicted more accurately than hydrophobic and rigid residues. Similarly, the charged residues that include Lys, Glu, Asp, and Arg are the most accurately predicted. 
Our analysis reveals that evolutionary information encoded using PSSM is characterized by stronger correlation with the depth for hydrophilic amino acids (AAs) and aliphatic AAs when compared with hydrophobic AAs and aromatic AAs. Finally, we show that the secondary structure of coils and strands is useful in depth prediction, in contrast to helices that have relatively uniform distribution over the protein depth. Application of the predicted residue depth to prediction of buried/exposed residues shows consistent improvements in detection rates of both buried and exposed residues when compared with the competing method. Finally, we contrasted the prediction performance among distance based (MSMS and DPX) and volume based (SADIC) depth definitions. We found that the distance based indices are harder to predict due to the more complex nature of the corresponding depth profiles. Conclusion The proposed method, RDPred, provides statistically significantly better predictions of residue depth when compared with the competing method. The predicted depth can be used to provide improved prediction of both buried and exposed residues. The prediction of exposed residues has implications in characterization/prediction of interactions with ligands and other proteins, while the prediction of buried residues could be used in the context of folding predictions and simulations.

Chen Ke

2008-09-01
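
The three quality measures quoted in the abstract above (correlation, mean absolute error, mean relative error) can be computed as in the following sketch. The depth values are invented for illustration and are not RDPred output. The abstract does not give the exact formula for the mean relative error, so the definition used here (absolute error divided by the actual depth, averaged and expressed in percent) is an assumption.

```python
import math


def pearson_correlation(actual, predicted):
    """Pearson correlation coefficient between two equally long lists."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)


def mean_absolute_error(actual, predicted):
    """Average of |predicted - actual| over all residues."""
    return sum(abs(p - a) for a, p in zip(actual, predicted)) / len(actual)


def mean_relative_error(actual, predicted):
    """Average of |predicted - actual| / actual, in percent (assumed definition)."""
    ratios = [abs(p - a) / a for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical per-residue depths, not taken from the paper.
actual_depth = [1.0, 2.0, 4.0, 5.0]
predicted_depth = [1.5, 2.0, 3.0, 5.5]

r = pearson_correlation(actual_depth, predicted_depth)
mae = mean_absolute_error(actual_depth, predicted_depth)
mre = mean_relative_error(actual_depth, predicted_depth)
print("correlation:", round(r, 3))
print("mean absolute error:", mae)
print("mean relative error (%):", mre)
```

In a cross-validation setting these functions would be applied to the held-out residues of each fold and the per-fold values then averaged.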

378

Genome Sequence, Structural Proteins, and Capsid Organization of the Cyanophage Syn5: A “Horned” Bacteriophage of Marine Synechococcus  

Science.gov (United States)

Marine Synechococcus spp and marine Prochlorococcus spp are numerically dominant photoautotrophs in the open oceans and contributors to the global carbon cycle. Syn5 is a short-tailed cyanophage isolated from the Sargasso Sea on Synechococcus strain WH8109. Syn5 has been grown in WH8109 to high titer in the laboratory and purified and concentrated retaining infectivity. Genome sequencing and annotation of Syn5 revealed that the linear genome is 46,214bp with a 237bp terminal direct repeat. Sixty-one open reading frames (ORFs) were identified. Based on genomic organization and sequence similarity to known protein sequences within GenBank, Syn5 shares features with T7-like phages. The presence of a putative integrase suggests access to a temperate life-cycle. Assignment of eleven ORFs to structural proteins found within the phage virion was confirmed by mass-spectrometry and N-terminal sequencing. Eight of these identified structural proteins exhibited amino acid sequence similarity to enteric phage proteins. The remaining three virion proteins did not resemble any known phage sequences in GenBank as of August 2006. Cryoelectron micrographs of purified Syn5 virions revealed that the capsid has a single “horn”, a novel fibrous structure protruding from the opposing end of the capsid from the tail of the virion. The tail appendage displayed an apparent three-fold rather than six-fold symmetry. An 18Å-resolution icosahedral reconstruction of the capsid revealed a T=7 lattice, but with an unusual pattern of surface knobs. This phage/host system should allow detailed investigation of the physiology and biochemistry of phage propagation in marine photosynthetic bacteria. PMID:17383677

Pope, Welkin H.; Weigele, Peter R.; Chang, Juan; Pedulla, Marisa L.; Ford, Michael E.; Houtz, Jennifer M.; Jiang, Wen; Chiu, Wah; Hatfull, Graham F.; Hendrix, Roger W.; King, Jonathan

2010-01-01
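
The T=7 lattice reported for the Syn5 capsid refers to the Caspar-Klug triangulation number, T = h² + hk + k² for non-negative integers h and k. The sketch below shows which lattice indices give T = 7 and what the idealized model implies for capsomer counts. The function names are hypothetical, and the counts describe the ideal icosahedral shell only; they are not measurements from the Syn5 study.

```python
import math


def triangulation_number(h, k):
    """Caspar-Klug triangulation number for lattice indices (h, k)."""
    return h * h + h * k + k * k


def lattice_indices(t_number):
    """All index pairs (h, k) with h >= k >= 0 that yield the given T."""
    pairs = []
    for h in range(math.isqrt(t_number) + 1):
        for k in range(h + 1):
            if triangulation_number(h, k) == t_number:
                pairs.append((h, k))
    return pairs


def ideal_capsid_counts(t_number):
    """Subunit and capsomer counts of an idealized icosahedral shell."""
    return {
        "subunits": 60 * t_number,        # 60T quasi-equivalent positions
        "pentamers": 12,                  # one per icosahedral vertex
        "hexamers": 10 * (t_number - 1),  # remaining capsomers
    }


indices_for_t7 = lattice_indices(7)
counts_for_t7 = ideal_capsid_counts(7)
print("(h, k) pairs for T=7:", indices_for_t7)
print("idealized counts for T=7:", counts_for_t7)
```

Because h and k differ for T = 7, the lattice is skewed and exists in two mirror-image forms; only the h >= k representative is listed here.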

379

Genome sequence, structural proteins, and capsid organization of the cyanophage Syn5: a "horned" bacteriophage of marine Synechococcus.  

Science.gov (United States)

Marine Synechococcus spp and marine Prochlorococcus spp are numerically dominant photoautotrophs in the open oceans and contributors to the global carbon cycle. Syn5 is a short-tailed cyanophage isolated from the Sargasso Sea on Synechococcus strain WH8109. Syn5 has been grown in WH8109 to high titer in the laboratory and purified and concentrated retaining infectivity. Genome sequencing and annotation of Syn5 revealed that the linear genome is 46,214 bp with a 237 bp terminal direct repeat. Sixty-one open reading frames (ORFs) were identified. Based on genomic organization and sequence similarity to known protein sequences within GenBank, Syn5 shares features with T7-like phages. The presence of a putative integrase suggests access to a temperate life cycle. Assignment of 11 ORFs to structural proteins found within the phage virion was confirmed by mass-spectrometry and N-terminal sequencing. Eight of these identified structural proteins exhibited amino acid sequence similarity to enteric phage proteins. The remaining three virion proteins did not resemble any known phage sequences in GenBank as of August 2006. Cryo-electron micrographs of purified Syn5 virions revealed that the capsid has a single "horn", a novel fibrous structure protruding from the opposing end of the capsid from the tail of the virion. The tail appendage displayed an apparent 3-fold rather than 6-fold symmetry. An 18 Å resolution icosahedral reconstruction of the capsid revealed a T=7 lattice, but with an unusual pattern of surface knobs. This phage/host system should allow detailed investigation of the physiology and biochemistry of phage propagation in marine photosynthetic bacteria. PMID:17383677

Pope, Welkin H; Weigele, Peter R; Chang, Juan; Pedulla, Marisa L; Ford, Michael E; Houtz, Jennifer M; Jiang, Wen; Chiu, Wah; Hatfull, Graham F; Hendrix, Roger W; King, Jonathan

2007-05-11

380

Sequence-Specific Assignment and Secondary Structure of the Catalytic Domain of Protein from Ubiquitination Pathway  

International Nuclear Information System (INIS)

Ubiquitination is a post-translational protein modification that plays an important role in a wide variety of cellular processes, including the cell cycle, DNA repair and apoptosis. It is well known that ubiquitination requires the sequential activity of three enzymes with different functions: activation, conjugation and ligation. Unfortunately, the three-dimensional structures of all three proteins responsible for these processes are not available at present, and the process of protein ubiquitination is still not understood in detail. In this communication, we present the first, preliminary NMR data for the sequence-specific assignments of a 112-amino-acid-residue domain of one of the proteins from the ubiquitination pathway. The NMR samples were prepared by dissolving either 15N-labeled or 15N,13C-double-labeled protein to 1 mM in 90%/10% H2O/D2O, 50 mM TRIS buffer, and 50 mM NaCl. The pH was adjusted to 6.5 (uncorrected value). All NMR measurements were performed on a Varian Unity+ 500 NMR spectrometer (11.7 T) equipped with three channels, a Performa II PFG unit and a 5 mm 1H, 13C, 15N triple-resonance probehead. The 1H, 15N, and 13C backbone resonances were assigned by standard methods using 3D heteronuclear HNCACB, CBCA(CO)NH, HNCA, HN(CO)CA, HNCO, and (HCA)CO(CA)NH NMR spectra collected at 303 K. The aliphatic 1H and 13C resonances were assigned on the basis of C(CO)NH, HBHA(CO)NH, and H(CO)NH experiments. After the assignment procedure was completed, the secondary structure of the studied protein was determined. The exact positions of the α-helices and β-strands were determined from analysis of cross-peaks between HN and Hα protons in the 3D 15N-edited NOESY-HSQC spectrum, 3JNHα coupling constants evaluated from the 3D HNHA experiment, and chemical shifts of backbone nuclei (TALOS software). The results will be used in the future to determine the high-resolution three-dimensional structure of the catalytic domain by NMR methods. (author)