CLIPS-1D: analysis of multiple sequence alignments to deduce for residue-positions a role in catalysis, ligand-binding, or protein structure  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Abstract Background One aim of the in silico characterization of proteins is to identify all residue-positions that are crucial for function or structure. Several sequence-based algorithms exist that predict functionally important sites. However, with respect to sequence information, many functionally and structurally important sites are hard to distinguish, and consequently a large number of incorrectly predicted functional sites has to be expected. This is why ...
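
As a rough illustration of the sequence-only signal that such predictors typically start from, the sketch below scores each column of a multiple sequence alignment by its Shannon entropy. This is a generic conservation measure, not the CLIPS-1D method; the function name `column_entropy` and the toy alignment are assumptions made for the example.

```python
import math
from collections import Counter


def column_entropy(alignment):
    """Return the Shannon entropy (in bits) of every alignment column.

    `alignment` is a list of equal-length aligned sequences. Gap
    characters ('-') are ignored. Low entropy means high conservation.
    """
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all sequences must have the same aligned length")

    entropies = []
    for i in range(length):
        counts = Counter(seq[i] for seq in alignment if seq[i] != "-")
        total = sum(counts.values())
        h = 0.0
        for c in counts.values():
            p = c / total
            h -= p * math.log2(p)
        entropies.append(h)
    return entropies


# Hypothetical four-sequence alignment, for illustration only.
toy_msa = ["HKDA", "HRDA", "HKEA", "HRE-"]
scores = column_entropy(toy_msa)
```

Columns with entropy near zero are strictly conserved, but as the abstract notes, conservation alone does not say whether a position matters for catalysis, ligand binding, or structure.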



𝒩-Structures Applied to Closed Ideals in BCH-Algebras  

Directory of Open Access Journals (Sweden)

Full Text Available The notions of 𝒩-subalgebras and 𝒩-closed ideals in BCH-algebras are introduced, and the relation between 𝒩-subalgebras and 𝒩-closed ideals is considered. Characterizations of 𝒩-subalgebras and 𝒩-closed ideals are provided. Using special subsets, 𝒩-subalgebras and 𝒩-closed ideals are constructed. A condition for an 𝒩-subalgebra to be an 𝒩-closed ideal is discussed. Given an 𝒩-structure, the greatest 𝒩-closed ideal which is contained in the 𝒩-structure is established.

Mehmet Ali Öztürk



Exome sequencing identifies somatic gain-of-function PPM1D mutations in brainstem gliomas. (United States)

Gliomas arising in the brainstem and thalamus are devastating tumors that are difficult to surgically resect. To determine the genetic and epigenetic landscape of these tumors, we performed exomic sequencing of 14 brainstem gliomas (BSGs) and 12 thalamic gliomas. We also performed targeted mutational analysis of an additional 24 such tumors and genome-wide methylation profiling of 45 gliomas. This study led to the discovery of tumor-specific mutations in PPM1D, encoding wild-type p53-induced protein phosphatase 1D (WIP1), in 37.5% of the BSGs that harbored hallmark H3F3A mutations encoding p.Lys27Met substitutions. PPM1D mutations were mutually exclusive with TP53 mutations in BSG and attenuated p53 activation in vitro. PPM1D mutations were truncating alterations in exon 6 that enhanced the ability of PPM1D to suppress the activation of the DNA damage response checkpoint protein CHK2. These results define PPM1D as a frequent target of somatic mutation and as a potential therapeutic target in brainstem gliomas. PMID:24880341

Zhang, Liwei; Chen, Lee H; Wan, Hong; Yang, Rui; Wang, Zhaohui; Feng, Jie; Yang, Shaohua; Jones, Siân; Wang, Sizhen; Zhou, Weixin; Zhu, Huishan; Killela, Patrick J; Zhang, Junting; Wu, Zhen; Li, Guilin; Hao, Shuyu; Wang, Yu; Webb, Joseph B; Friedman, Henry S; Friedman, Allan H; McLendon, Roger E; He, Yiping; Reitman, Zachary J; Bigner, Darell D; Yan, Hai



Tandem repeats modify the structure of the canine CD1D gene  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Among the CD1 proteins that present lipid antigens to T cells, CD1d is the only one that stimulates a population of T cells with an invariant T-cell receptor known as NKT cells. Sequencing of a 722 nucleotide gap in the dog (Canis lupus familiaris) genome revealed that the canine CD1D gene lacks a sequence homologous to exon 2 of human CD1D, coding for the start codon and signal peptide. Also, the canine CD1D gene contains three different short tandem repeats that disrupt the e...

Looringh Beeck, F. A.; Leegwater, P. A. J.; Herrmann, T.; Broere, F.; Rutten, V. P. M. G.; Willemse, T.; Rhijn, I.



SSViewer: Sequence Structure Viewer  

Directory of Open Access Journals (Sweden)

Full Text Available An important aspect of bioinformatics is sequence. A sequence is a discrete function that contains the combinations of amino acids in proteins and of nucleotides in DNA. An important function of amino acids is to serve as the building blocks of proteins, which are linear chains of amino acids. Amino acids can be linked together in varying sequences to form a vast variety of proteins. Twenty-two amino acids are naturally incorporated into polypeptides and are called proteinogenic or standard amino acids. Of these, 20 are encoded by the universal genetic code. A DNA sequence is represented by the letters A, T, G and C. This sequence information is analysed to determine genes that encode polypeptides (proteins), RNA genes, regulatory sequences, structural motifs and repetitive sequences. DNA sequences can be analysed accurately using computational techniques such as BLAST and FASTA, which is not possible manually. In the present study we developed a tool, written in Java and HTML, to visualize the 3D structure for a given sequence.

Shyam Perugu



A perfectly matched layer based technique for the scattering from 1-D periodic microstrip structures  

Digital Repository Infrastructure Vision for European Research (DRIVER)

An efficient technique is presented to compute the scattering from one-dimensional (1-D) periodic microstrip structures, illuminated by a plane wave under perpendicular incidence. The technique relies on a Mixed Potential Integral Equation (MPIE), discretized by the Method of Moments (MoM), solving for the unknown current density flowing within a unit cell of the periodic structure. The pertinent 1-D periodic Green's functions are obtained by invoking the Perfectly Matched Layer (PML)-paradig...

Vande Ginste, Dries; Rogier, Hendrik



Structure and Catalytic Mechanism of Human Steroid 5β-Reductase (AKR1D1)  

Energy Technology Data Exchange (ETDEWEB)

Human steroid 5β-reductase (aldo-keto reductase (AKR) 1D1) catalyzes reduction of Δ⁴-ene double bonds in steroid hormones and bile acid precursors. We have reported the structures of an AKR1D1-NADP⁺ binary complex, and AKR1D1-NADP⁺-cortisone, AKR1D1-NADP⁺-progesterone and AKR1D1-NADP⁺-testosterone ternary complexes at high resolutions. Recently, structures of AKR1D1-NADP⁺-5β-dihydroprogesterone complexes showed that the product is bound unproductively. Two quite different mechanisms of steroid double bond reduction have since been proposed. However, site-directed mutagenesis supports only one mechanism. In this mechanism, the 4-pro-R hydride is transferred from the re-face of the nicotinamide ring to C5 of the steroid substrate. E120, a unique substitution in the AKR catalytic tetrad, permits a deeper penetration of the steroid substrate into the active site to promote optimal reactant positioning. It participates with Y58 to create a 'superacidic' oxyanion hole for polarization of the C3 ketone. A role for K87 in the proton relay proposed using the AKR1D1-NADP⁺-5β-dihydroprogesterone structure is not supported.

Costanzo, L.; Drury, J; Christianson, D; Penning, T



Recent ARPES experiments on quasi-1D bulk materials and artificial structures. (United States)

The spectroscopy of quasi-one-dimensional (1D) systems has been a subject of strong interest since the first experimental observations of unusual line shapes in the early 1990s. Angle-resolved photoemission (ARPES) measurements performed with increasing accuracy have greatly broadened our knowledge of the properties of bulk 1D materials and, more recently, of artificial 1D structures. They have yielded a direct view of 1D bands, of open Fermi surfaces, and of characteristic instabilities. They have also provided unique microscopic evidence for the non-conventional, non-Fermi-liquid, behavior predicted by theory, and for strong and singular interactions. Here we briefly review some of the remarkable experimental results obtained in the last decade. PMID:21813968

Grioni, M; Pons, S; Frantzeskakis, E



An evaluation of LSU rDNA D1-D2 sequences for their use in species identification  

Directory of Open Access Journals (Sweden)

Full Text Available Abstract Background Identification of species via DNA sequences is the basis for DNA taxonomy and DNA barcoding. Currently there is a strong focus on using a mitochondrial marker for this purpose, in particular a fragment from the cytochrome oxidase I gene (COI). While there is ample evidence that this marker is indeed suitable across a broad taxonomic range to delineate species, it has also become clear that a complementation by a nuclear marker system could be advantageous. Ribosomal RNA genes could be suitable for this purpose, because of their global occurrence and the possibility to design universal primers. However, it has so far been assumed that these genes are too highly conserved to allow resolution at, or even beyond, the species level. On the other hand, it is known that ribosomal gene regions also harbour highly divergent parts. We explore here the information content of two adjacent divergence regions of the large subunit ribosomal gene, the D1-D2 region. Results Universal primers were designed to amplify the D1-D2 region from all metazoa. We show that amplification products in the size range of 800–1300 bp can be obtained across a broad range of animal taxa, provided some optimizations of the PCR procedure are implemented. Although the ribosomal genes occur in multiple copies in the genomes, we find generally very little intra-individual polymorphism (Cottus and genus Aphyosemion show that the D1-D2 LSU sequence can resolve even very closely related species with the same fidelity as COI sequences. In one case we can even show that a mitochondrial transfer must have occurred, since the nuclear sequence confirms the taxonomic assignment, while the mitochondrial sequence would have led to the wrong classification. We have further explored whether hybrids between species can be detected with the nuclear sequence and we show for a test case of natural hybrids among cyprinid fish species (Alburnus alburnus and Rutilus rutilus) that this is indeed possible. Conclusion The D1-D2 LSU region is a suitable marker region for applications in DNA-based species identification and should be considered for routine use as a marker complementing broad-scale studies based on mitochondrial markers.

Tautz Diethard



HERMES Precision Results on g1p, g1d and g1n and the First Measurement of the Tensor Structure Function b1d  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Final HERMES results on the proton, deuteron and neutron structure function g1 are presented in the kinematic range 0.0021structure function b1d are presented.

Riedl, Caroline; for the HERMES Collaboration



HERMES Precision Results on g1p, g1d and g1n and the First Measurement of the Tensor Structure Function b1d  

CERN Document Server

Final HERMES results on the proton, deuteron and neutron structure function g1 are presented in the kinematic range 0.0021structure function b1d are presented.

Riedl, C; Akopov, Z; Amarian, M; Ammosov, V V; Andrus, A; Aschenauer, E C; Augustyniak, W; Avakian, R; Avetisian, A; Avetissian, E; Bailey, P; Baturin, V; Baumgarten, C; Beckmann, M; Belostotskii, S; Bernreuther, S; Bianchi, N; Blok, H P; Böttcher, Helmut B; Borisov, A; Bouwhuis, M; Brack, J; Brüll, A; Bryzgalov, V V; Capitani, G P; Chiang, H C; Ciullo, G; Contalbrigo, M; Dalpiaz, P F; De Leo, R; De Nardo, L; De Sanctis, E; Devitsin, E G; Di Nezza, P; Düren, M; Ehrenfried, M; Elalaoui-Moulay, A; Elbakian, G M; Ellinghaus, F; Elschenbroich, U; Ely, J; Fabbri, R; Fantoni, A; Feshchenko, A; Felawka, L; Fox, B; Franz, J; Frullani, S; Gärber, Y; Gapienko, G; Gapienko, V; Garibaldi, F; Garrow, K; Garutti, E; Gaskell, D; Gavrilov, G E; Karibian, V; Graw, G; Grebenyuk, O; Greeniaus, L G; Hafidi, K; Hartig, M; Hasch, D; Heesbeen, D; Henoch, M; Hertenberger, R; Hesselink, W H A; Hillenbrand, A; Hoek, M; Holler, Y; Hommez, B; Iarygin, G; Ivanilov, A; Izotov, A; Jackson, H E; Jgoun, A; Kaiser, R; Kinney, E; Kiselev, A; Königsmann, K C; Kopytin, M; Korotkov, V A; Kozlov, V; Krauss, B; Krivokhizhin, V G; Lagamba, L; Lapikas, L; Laziev, A; Lenisa, P; Liebing, P; Lindemann, T; Lipka, K; Lorenzon, W; Lü, J; Maiheu, B; Makins, N C R; Marianski, B; Marukyan, H O; Masoli, F; Mexner, V; Meyners, N; Miklukho, O; Miller, C A; Miyachi, Y; Muccifora, V; Nagaitsev, A; Nappi, E; Naryshkin, Yu; Nass, A; Negodaev, M A; Nowak, Wolf-Dieter; Oganessyan, K; Ohsuga, H; Orlandi, G; Pickert, N; Potashov, S Yu; Potterveld, D H; Raithel, M; Reggiani, D; Reimer, P E; Reischl, A; Reolon, A R; Rith, K; Airapetian, A; Rosner, G; Rostomyan, A; Rubacek, L; Ryckbosch, D; Salomatin, Yu I; Sanjiev, I; Savin, I; Scarlett, C; Schäfer, A; Schill, C; Schnell, G; Schüler, K P; Schwind, A; Seele, J; Seidl, R; Seitz, B; Shanidze, R G; Shearer, C; Shibata, T A; Shutov, V B; Simani, M C; Sinram, K; Stancari, M D; Statera, M; Steffens, E; Steijger, J J M; Stewart, J; Stösslein, U; Tait, P; Tanaka, H; Taroian, S P; 
Tchuiko, B; Terkulov, A R; Tkabladze, A V; Trzcinski, A; Tytgat, M; Vandenbroucke, A; Van der Nat, P B; van der Steenhoven, G; Vetterli, Martin C; Vikhrov, V; Vincter, M G; Visser, J; Vogel, C; Vogt, M; Volmer, J; Weiskopf, C; Wendland, J; Wilbert, J; Ybeles-Smit, G V; Yen, S; Zihlmann, B; Zohrabyan, H G; Zupranski, P; Riedl, Caroline



Non-linear Finite-Frequency Waveform Inversion for 1-D Structures (United States)

One-dimensional velocity models are representative of regional tectonic units. They are important in determining the locations and focal mechanisms of earthquakes, and provide initial models for tomographic studies. We develop a new approach to the non-linear inversion of finite-frequency traveltimes and amplitudes for 1-D models. Frequency-dependent traveltime and amplitude anomalies are measured by cross-correlation of three-component synthetic and recorded waveforms windowed around body and surface waves. Sensitivity kernels to parameters involved in the 1-D model, such as P- and S-wave speeds and depths of seismic discontinuities, are computed numerically by perturbing the reference model and measuring the resulting traveltime and amplitude perturbations, thus avoiding the invocation of the Born approximation. An iterative inversion is carried out with updates of traveltime and amplitude measurements and sensitivity kernels following each iteration. We apply this new approach to the inversion of 1-D structures around the source region of the May 12, 2008, Wenchuan earthquake. Numerous moderate aftershocks (Mw=5-6) and densely deployed broadband stations provide plenty of records for obtaining 1-D models along a variety of source-receiver paths, revealing lateral structural variations in both the Tibetan Plateau and Sichuan Basin.
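
The traveltime measurement described above can be sketched as a search for the lag that maximises the cross-correlation between a synthetic and a recorded trace. This is a minimal time-domain version on a single component, without the windowing or frequency dependence the abstract mentions; the function name `cross_correlation_delay` and the sample traces are assumptions.

```python
def cross_correlation_delay(synthetic, recorded, dt):
    """Estimate the time shift (s) of `recorded` relative to `synthetic`.

    Both traces are equal-length lists sampled every `dt` seconds. The
    lag with the largest cross-correlation is returned; a positive value
    means the recorded waveform arrives later than the synthetic one.
    """
    n = len(synthetic)
    if len(recorded) != n:
        raise ValueError("traces must have the same number of samples")

    best_lag, best_value = 0, float("-inf")
    for lag in range(-(n - 1), n):
        value = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                value += synthetic[i] * recorded[j]
        if value > best_value:
            best_lag, best_value = lag, value
    return best_lag * dt


# Hypothetical traces: the same pulse, delayed by three samples.
synthetic = [0, 0, 1, 2, 1, 0, 0, 0, 0, 0]
recorded = [0, 0, 0, 0, 0, 1, 2, 1, 0, 0]
delay = cross_correlation_delay(synthetic, recorded, dt=0.5)
```

Repeating such a measurement after perturbing one model parameter gives a finite-difference estimate of the sensitivity of the traveltime to that parameter, which is the numerical-kernel idea the abstract describes.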

Wan, K.; Ni, S.; Zhao, L.



The Dynamic Structure Factor of the 1D Bose Gas near the Tonks-Girardeau Limit  

CERN Document Server

While the 1D Bose gas appears to exhibit superfluid response under certain conditions, it fails the Landau criterion according to the elementary excitation spectrum calculated by Lieb. The apparent riddle is solved by calculating the dynamic structure factor of the Lieb-Liniger 1D Bose gas. A pseudopotential Hamiltonian in the fermionic representation is used to derive a Hartree-Fock operator, which turns out to be well-behaved and local. The Random-Phase approximation for the dynamic structure factor based on this derivation is calculated analytically and is expected to be valid at least up to first order in $1/\gamma$, where $\gamma$ is the dimensionless interaction strength of the model. The dynamic structure factor in this approximation clearly indicates a crossover behavior from the non-superfluid Tonks to the superfluid weakly-interacting regime, which should be observable by Bragg scattering in current experiments.

Brand, Joachim; Cherny, Alexander Yu.



Structurally unstable regular dynamics in 1D piecewise smooth maps, and circle maps  

International Nuclear Information System (INIS)

Highlights: A discontinuous 1D map with two discontinuity points is considered. Dynamic behaviors are either periodic or quasiperiodic. Dynamics are always structurally unstable. Any small perturbation in one of the parameters leads to different dynamics. Abstract: In this work we consider a simple system of piecewise linear discontinuous 1D map with two discontinuity points: X′ = aX if |X| < z, X′ = bX if |X| > z, where a and b can take any real value, and may have several applications. We show that its dynamic behaviors are those of a linear rotation: either periodic or quasiperiodic, and always structurally unstable. A generalization to piecewise monotone functions, X′ = F(X) if |X| < z with a corresponding branch for |X| > z, is also given, proving the conditions leading to a homeomorphism of the circle.
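
A short sketch of how such a map can be iterated is given here. It assumes the two linear branches are split at |X| = z, with slope a inside and slope b outside, and it assigns the boundary point |X| = z to the outer branch; the boundary convention, the function names, and the parameter values are assumptions chosen only to show a periodic orbit.

```python
def piecewise_linear_map(x, a, b, z):
    """One step of the map: slope `a` for |x| < z, slope `b` otherwise."""
    return a * x if abs(x) < z else b * x


def orbit(x0, a, b, z, steps):
    """Return the list [x0, x1, ..., x_steps] obtained by iterating the map."""
    xs = [x0]
    for _ in range(steps):
        xs.append(piecewise_linear_map(xs[-1], a, b, z))
    return xs


# Hypothetical parameters giving a period-2 orbit: 0.5 -> 1.0 -> 0.5 -> ...
trajectory = orbit(0.5, a=2.0, b=0.5, z=1.0, steps=4)
```

With these values the orbit alternates between the two branches. The abstract's claim of structural instability means that an arbitrarily small change in one parameter would alter this behavior, which can be explored by varying `a` or `b` slightly.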



Quark-Hadron Duality in Spin Structure Functions g1p and g1d  

Digital Repository Infrastructure Vision for European Research (DRIVER)

New measurements of the spin structure functions of the proton and deuteron g1p(x,Q2) and g1d(x,Q2) in the nucleon resonance region are compared with extrapolations of target-mass-corrected next-to-leading-order (NLO) QCD fits to higher energy data. Averaged over the entire resonance region (W1.7 GeV2. This global duality appears to result from cancellations among the prominent lo...

Bosted, P. E.; Dharmawardane, K. V.; Dodge, G. E.; Forest, T. A.; Kuhn, S. E.; Prok, Y.; et al.



Hyperfine structures of the nd ¹D (n = 3 - 8) states of ³He I  

International Nuclear Information System (INIS)

We have used the beam-foil quantum beat method to measure the hyperfine structure separations F = 3/2 - 5/2 of the 1snd ¹D states (n = 3 - 8) of ³He I. We observed the single frequency modulated decay curves of the 1s2p ¹P - 1snd ¹D transitions for times after excitation up to 50 ns, corresponding to 4 to 5 modulation periods. The frequencies obtained (with a precision of 2 to 5%) are compared with other experiments and theory. The frequencies are determined mainly by the singlet-triplet energy separations and mixing factors for the He I D-states. The results agree with the same parameters obtained from other recent level-crossing measurements in strong magnetic field mixing of the singlet-triplet states.



Computational Study and Analysis of Structural Imperfections in 1D and 2D Photonic Crystals  

Energy Technology Data Exchange (ETDEWEB)

Dielectric reflectors that are periodic in one or two dimensions, also known as 1D and 2D photonic crystals, have been widely studied for many potential applications due to the presence of wavelength-tunable photonic bandgaps. However, the unique optical behavior of photonic crystals is based on theoretical models of perfect analogues. Little is known about the practical effects of dielectric imperfections on their technologically useful optical properties. In order to address this issue, a finite-difference time-domain (FDTD) code is employed to study the effect of three specific dielectric imperfections in 1D and 2D photonic crystals. The first imperfection investigated is dielectric interfacial roughness in quarter-wave tuned 1D photonic crystals at normal incidence. This study reveals that the reflectivity of some roughened photonic crystal configurations can change up to 50% at the center of the bandgap for RMS roughness values around 20% of the characteristic periodicity of the crystal. However, this reflectivity change can be mitigated by increasing the index contrast and/or the number of bilayers in the crystal. In order to explain these results, the homogenization approximation, which is usually applied to single rough surfaces, is applied to the quarter-wave stacks. The results of the homogenization approximation match the FDTD results extremely well, suggesting that the main role of the roughness features is to grade the refractive index profile of the interfaces in the photonic crystal rather than diffusely scatter the incoming light. This result also implies that the amount of incoherent reflection from the roughened quarterwave stacks is extremely small. This is confirmed through direct extraction of the amount of incoherent power from the FDTD calculations. Further FDTD studies are done on the entire normal incidence bandgap of roughened 1D photonic crystals. 
These results reveal a narrowing and red-shifting of the normal incidence bandgap with increasing RMS roughness. Again, the homogenization approximation is able to predict these results. The problem of surface scratches on 1D photonic crystals is also addressed. Although the reflectivity decreases are lower in this study, up to a 15% change in reflectivity is observed in certain scratched photonic crystal structures. However, this reflectivity change can be significantly decreased by adding a low index protective coating to the surface of the photonic crystal. Again, application of homogenization theory to these structures confirms its predictive power for this type of imperfection as well. Additionally, the problem of circular pores in 2D photonic crystals is investigated, showing that almost a 50% change in reflectivity can occur for some structures. Furthermore, this study reveals trends that are consistent with the 1D simulations: parameter changes that increase the absolute reflectivity of the photonic crystal will also increase its tolerance to structural imperfections. Finally, experimental reflectance spectra from roughened 1D photonic crystals are compared to the results predicted computationally in this thesis. Both the computed and experimental spectra correlate favorably, validating the findings presented herein.
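
For orientation, the dependence of reflectivity on index contrast and bilayer count can be seen in the ideal case with the standard closed-form result for a lossless quarter-wave stack at its design wavelength and normal incidence. The sketch below is not the FDTD code used in the thesis and ignores roughness entirely; the function name and the refractive indices are assumptions.

```python
def quarter_wave_stack_reflectivity(n_first, n_second, bilayers,
                                    n_incident=1.0, n_substrate=1.0):
    """Reflectivity of an ideal quarter-wave stack at its design wavelength.

    Each bilayer is a quarter-wave layer of index `n_first` (on the
    incident side) followed by one of index `n_second`. Multiplying the
    characteristic matrices of the layers gives an effective admittance
    Y = n_substrate * (n_first / n_second) ** (2 * bilayers), and the
    reflectivity is ((n_incident - Y) / (n_incident + Y)) ** 2.
    """
    admittance = n_substrate * (n_first / n_second) ** (2 * bilayers)
    r = (n_incident - admittance) / (n_incident + admittance)
    return r * r


# Hypothetical indices: more bilayers or higher contrast raise reflectivity.
r_one = quarter_wave_stack_reflectivity(2.0, 1.5, bilayers=1)
r_five = quarter_wave_stack_reflectivity(2.0, 1.5, bilayers=5)
r_high_contrast = quarter_wave_stack_reflectivity(2.5, 1.5, bilayers=5)
```

The same two knobs, contrast and number of bilayers, are the ones the abstract reports as mitigating the reflectivity loss caused by interfacial roughness.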

K.R. Maskaly



Protein Structure Predicted from Sequence  

CERN Document Server

The evolutionary trajectory of a protein through sequence space is constrained by function and three-dimensional (3D) structure. Residues in spatial proximity tend to co-evolve, yet attempts to invert the evolutionary record to identify these constraints and use them to computationally fold proteins have so far been unsuccessful. Here, we show that co-variation of residue pairs, observed in a large protein family, provides sufficient information to determine 3D protein structure. Using a data-constrained maximum entropy model of the multiple sequence alignment, we identify pairs of statistically coupled residue positions which are expected to be close in the protein fold, termed contacts inferred from evolutionary information (EICs). To assess the amount of information about the protein fold contained in these coupled pairs, we evaluate the accuracy of predicted 3D structures for proteins of 50-260 residues, from 15 diverse protein families, including a G-protein coupled receptor. These structure predictions ...
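
The idea of residue-pair co-variation can be illustrated with a much simpler statistic than the maximum entropy model the abstract describes: the mutual information between two alignment columns. The sketch below is a stand-in for illustration only, since plain mutual information does not separate direct from indirect couplings; the function name and the toy alignment are assumptions.

```python
import math
from collections import Counter


def mutual_information(alignment, i, j):
    """Mutual information (bits) between columns i and j of an alignment.

    `alignment` is a list of equal-length aligned sequences. High values
    indicate that the residues at the two positions vary together.
    """
    n = len(alignment)
    col_i = Counter(seq[i] for seq in alignment)
    col_j = Counter(seq[j] for seq in alignment)
    pairs = Counter((seq[i], seq[j]) for seq in alignment)

    mi = 0.0
    for (a, b), count in pairs.items():
        p_ab = count / n
        p_a = col_i[a] / n
        p_b = col_j[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi


# Hypothetical alignment: columns 0 and 1 co-vary, columns 0 and 3 do not.
toy_msa = ["AKLE", "AKLD", "GRLE", "GRLD"]
coupled = mutual_information(toy_msa, 0, 1)
independent = mutual_information(toy_msa, 0, 3)
```

Pairs with high scores would be candidate contacts; a global model of the kind used in the paper is needed to discount correlations that arise only through chains of intermediate positions.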

Marks, Debora S; Sheridan, Robert; Hopf, Thomas A; Pagnani, Andrea; Zecchina, Riccardo; Sander, Chris



Crystal Structures of Human TBC1D1 and TBC1D4 (AS160) RabGTPase-activating Protein (RabGAP) Domains Reveal Critical Elements for GLUT4 Translocation  

Energy Technology Data Exchange (ETDEWEB)

We have solved the x-ray crystal structures of the RabGAP domains of human TBC1D1 and human TBC1D4 (AS160), at 2.2 and 3.5 Å resolution, respectively. Like the yeast Gyp1p RabGAP domain, whose structure was solved previously in complex with mouse Rab33B, the human TBC1D1 and TBC1D4 domains both have 16 α-helices and no β-sheet elements. We expected the yeast Gyp1p RabGAP/mouse Rab33B structure to predict the corresponding interfaces between cognate mammalian RabGAPs and Rabs, but found that residues were poorly conserved. We further tested the relevance of this model by Ala-scanning mutagenesis, but only one of five substitutions within the inferred binding site of the TBC1D1 RabGAP significantly perturbed catalytic efficiency. In addition, substitution of TBC1D1 residues with corresponding residues from Gyp1p did not enhance catalytic efficiency. We hypothesized that biologically relevant RabGAP/Rab partners utilize additional contacts not described in the yeast Gyp1p/mouse Rab33B structure, which we predicted using our two new human TBC1D1 and TBC1D4 structures. Ala substitution of TBC1D1 Met⁹³⁰, corresponding to a residue outside of the Gyp1p/Rab33B contact, substantially reduced catalytic activity. GLUT4 translocation assays confirmed the biological relevance of our findings. Substitutions with lowest RabGAP activity, including catalytically dead RK and Met⁹³⁰ and Leu¹⁰¹⁹ predicted to perturb Rab binding, confirmed that biological activity requires contacts between cognate RabGAPs and Rabs beyond those in the yeast Gyp1p RabGAP/mouse Rab33B structure.

S Park; W Jin; S Shoelson



Inhibition of Human Steroid 5β-Reductase (AKR1D1) by Finasteride and Structure of the Enzyme-Inhibitor Complex  

Energy Technology Data Exchange (ETDEWEB)

The Δ⁴-3-ketosteroid functionality is present in nearly all steroid hormones apart from estrogens. The first step in functionalization of the A-ring is mediated in humans by steroid 5α- or 5β-reductase. Finasteride is a mechanism-based inactivator of 5α-reductase type 2 with subnanomolar affinity and is widely used as a therapeutic for the treatment of benign prostatic hyperplasia. It is also used for androgen deprivation in hormone-dependent prostate carcinoma, and it has been examined as a chemopreventive agent in prostate cancer. The effect of finasteride on steroid 5β-reductase (AKR1D1) has not been previously reported. We show that finasteride competitively inhibits AKR1D1 with low micromolar affinity but does not act as a mechanism-based inactivator. The structure of the AKR1D1·NADP⁺·finasteride complex determined at 1.7 Å resolution shows that it is not possible for NADPH to reduce the Δ¹⁻²-ene of finasteride because the cofactor and steroid are not proximal to each other. The C3-ketone of finasteride accepts hydrogen bonds from the catalytic residues Tyr-58 and Glu-120 in the active site of AKR1D1, providing an explanation for the competitive inhibition observed. This is the first reported structure of finasteride bound to an enzyme involved in steroid hormone metabolism.

Drury, J.; Di Costanzo, L; Penning, T; Christianson, D



[Synthesis, structure and special study of 1D cadmium sulfate-H2biim coordination polymer]. (United States)

A novel cadmium complex, {[Cd(H2biim)2(SO4)]·3H2O}n (1) (H2biim = 2,2'-biimidazole), was synthesized by hydrothermal reaction of 3CdSO4·8H2O and the 2,2'-biimidazole ligand. The complex was built up by [Cd(H2biim)2]2+ units bridged sequentially by SO4(2-) anions to form 1D zigzag chains parallel to the c-axis. The H2biim ligands were attached to the 1D chain as branches of the chain by coordinating to Cd2+ at two sides of the chain. The chains were held together by π-π interactions and O-H···O interactions, thus yielding a 3D extended supramolecular network. The responses of the symmetric stretching vibration of SO4(2-), the N-C single bond, the N=C double bond and the anti-symmetric stretching vibration of the N-C single bond were detected in 2D IR correlation spectra under thermal perturbation. The complex exhibited a strong blue emission peak at 498 nm (λex = 397 nm) that can be assigned to a ligand-to-metal charge-transfer (LMCT) band. PMID:23240422

Deng, Song; Ge, Su-Zhi; Liu, Qi; Hu, Heng-Bin; Sun, Yan-Qiong; Chen, Yi-Ping; Zhang, Han-Hui



Band structure and slow waves experimental and theoretical characterization in a high-frequency 1D phononic crystal  

Digital Repository Infrastructure Vision for European Research (DRIVER)

We present heterodyne detected transient grating (HD-TG) measurements on a 1D surface corrugated Phononic Crystal (PC) with a characteristic band edge wave vector of $\pi/5$ $\mu$m$^{-1}$. This experimental investigation enables both the direct band diagram characterization of surface waves and a direct measurement of the group velocity dispersion. The experimental data are compared to the simulations performed with the structural-mechanic module of a finite element method softw...

Malfanti, I.; Taschin, A.; Bartolini, P.; Bonello, B.; Torre, R.



The bovine CD1D gene has an unusual gene structure and is expressed but cannot present α-galactosylceramide with a C26 fatty acid  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Although CD1d and NKT cells have been proposed to have highly conserved functions in mammals, data on functions of CD1d and NKT cells in species other than humans and rodents are lacking. Upon stimulation with the CD1d-presented synthetic antigen α-galactosylceramide, human and rodent type I invariant NKT cells release large amounts of cytokines. The two bovine CD1D (boCD1D) genes have structural features that suggest that they cannot be translated into functional proteins exp...

Nguyen, T. K. A.; Koets, A. P.; Vordermeier, M.; Jervis, P. J.; Cox, L. R.; Graham, S. P.; Santema, W. J.; Moody, D. B.; Calenbergh, S.; Zajonc, D. M.; Besra, G. S.; Rhijn, I.



Development of input structure software for MARS 1D-3D graphic user interface  

International Nuclear Information System (INIS)

A user-friendly Input Software for MARS 1D-3D GUI called MARA (MARS Adjunct Reactor Assembler) has been developed. Extension of the current MARA to the overall input system for MARS will result in an integrated commercial GUI comparable to those for computational analysis codes ANSYS, ABAQUS, FLUENT and CFX. MARA will help accelerate marketing of MARS and other potential system analysis codes to developing countries in Southeast Asia planning to put nuclear power in their electrical grids. MARS code and associated developmental technology are in the process of being disseminated to twenty-two organizations spanning the industry, academia and laboratories across the country. MARA will find its way to practical applications in a variety of engineering problems



Predicting pseudoknotted structures across two RNA sequences  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Motivation: Laboratory RNA structure determination is demanding and costly, and thus computational structure prediction is an important task. Single sequence methods for RNA secondary structure prediction are limited by the accuracy of the underlying folding model. If a structure is supported by a family of evolutionarily related sequences, one can be more confident that the prediction is accurate. RNA pseudoknots are functional elements, which have highly conserved structures. However, few c...



iPBA: a tool for protein structure comparison using sequence alignment strategies.  

Digital Repository Infrastructure Vision for European Research (DRIVER)

With the immense growth in the number of available protein structures, fast and accurate structure comparison has become essential. We propose an efficient method for structure comparison, based on a structural alphabet. Protein Blocks (PBs) is a widely used structural alphabet with 16 pentapeptide conformations that can fairly approximate a complete protein chain. Thus a 3D structure can be translated into a 1D sequence of PBs. With a simple Needleman-Wunsch approach and a raw PB substitution ...
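
Once structures are encoded as 1D strings over the 16-letter alphabet, comparing them reduces to ordinary global sequence alignment. The following is a minimal Needleman-Wunsch scoring sketch under simplifying assumptions: it uses a flat match/mismatch score in place of a PB substitution matrix, a linear gap penalty, and hypothetical input strings, so it should not be read as the iPBA implementation.

```python
def needleman_wunsch_score(seq_a, seq_b, match=1, mismatch=-1, gap=-2):
    """Return the optimal global alignment score of two sequences.

    Uses the classic dynamic-programming recurrence with a linear gap
    penalty. `match`, `mismatch` and `gap` are illustrative values that
    stand in for a real substitution matrix.
    """
    rows, cols = len(seq_a) + 1, len(seq_b) + 1
    score = [[0] * cols for _ in range(rows)]

    # Aligning a prefix against the empty sequence costs one gap per symbol.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap

    for i in range(1, rows):
        for j in range(1, cols):
            pair = match if seq_a[i - 1] == seq_b[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + pair,  # align the two symbols
                score[i - 1][j] + gap,       # gap in seq_b
                score[i][j - 1] + gap,       # gap in seq_a
            )
    return score[-1][-1]


# Hypothetical encoded sequences; the letters are placeholders.
similarity = needleman_wunsch_score("mmdf", "mdf")
```

Swapping the `pair` lookup for a table indexed by the two symbols would turn this into a substitution-matrix-based comparison of the kind the abstract refers to.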



Temperature tuning of band-structure of 1D periodic elastic composites (United States)

In this paper we show that the bandstructure of a periodic elastic composite, in addition to being dependent upon the micro-constituents and their microarchitecture, may also be controlled by changing the temperature. The essential idea is to fabricate a periodic composite with constituent materials which have temperature dependent elastic properties. As temperature is changed, such a composite is expected to exhibit a bandstructure which changes with the temperature dependent properties of its micro-constituents. For our purpose, we use polyurea and steel to make a 1-D periodic composite. Ultrasonic measurements are done on the sample from 0.5 kHz to 1.5 MHz under changing temperature and the change in the second passband is studied. It is observed that the change in the bandstructure is significant when the temperature is changed from -50°C to 50°C. Experimental results are compared with the theoretical calculations and it is shown that good agreement exists for the observed bandstructure.

Sadeghi, H.; Srivastava, A.; Griswold, R.; Nemat-Nasser, S.



Spatially encoded phase-contrast MRI: 3D MRI movies of 1D and 2D structures at millisecond resolution. (United States)

This work demonstrates that the principles underlying phase-contrast MRI may be used to encode spatial rather than flow information along a perpendicular dimension, if this dimension contains an MRI-visible object at only one spatial location. In particular, the situation applies to 3D mapping of curved 2D structures, which requires only two projection images with different spatial phase-encoding gradients. These phase-contrast gradients define the field of view and mean spin-density positions of the object in the perpendicular dimension by respective phase differences. When combined with highly undersampled radial fast low angle shot (FLASH) and image reconstruction by regularized nonlinear inversion, spatial phase-contrast MRI allows for dynamic 3D mapping of 2D structures in real time. First examples include 3D MRI movies of the acting human hand at a temporal resolution of 50 ms. With an even simpler technique, 3D maps of curved 1D structures may be obtained from only three acquisitions of a frequency-encoded MRI signal with two perpendicular phase encodings. Here, 3D MRI movies of a rapidly rotating banana were obtained at 5 ms resolution or 200 frames per second. In conclusion, spatial phase-contrast 3D MRI of 2D or 1D structures is, respectively, two or four orders of magnitude faster than conventional 3D MRI. PMID:21842502

Merboldt, Klaus-Dietmar; Uecker, Martin; Voit, Dirk; Frahm, Jens



Structure elucidation of two new unusual monoterpene glycosides from Euphorbia decipiens, by 1D and 2D NMR experiments. (United States)

Two new unusual monoterpene glycosides, (Z)-3,6-dimethyl-3-(β-D-O-glucosylmethylene)cyclohept-4-ene-1-one (1) and 3,6-dimethyl-3-(β-D-O-glucosylmethylene)cycloheptanone (2), have been isolated along with five known compounds, 3-hydroxy-4-methoxybenzoic acid, 6,7-dihydroxycoumarin, luteolin, apigenin 5-O-α-L-rhamnoside, and pinocembrin-7-O-rutinoside, from the ethyl acetate extract of Euphorbia decipiens. The structures of the isolated compounds were elucidated by extensive 1D- and 2D-NMR, and mass spectroscopic analyses. PMID:21898586

Demirkiran, Ozlem; Topcu, Gulacti; Hussain, Javid; Ahmad, Viqar Uddin; Choudhary, M Iqbal



The phase structure of black D1/D5 (F/NS5) system in canonical ensemble  

CERN Document Server

In this paper, we explore means which can be used to change qualitatively the phase structure of charged black systems. For this, we consider a system of black D1/D5 (or its S-dual F/NS5). We find that the delocalized charged black D-strings (F-strings) alone share the same phase structure as the charged black D5 branes (NS5-branes), having no van der Waals-Maxwell liquid-gas type. However, when the two are combined to form D1/D5 (F/NS5), the resulting phase diagram has been changed dramatically to a richer one, containing now the above liquid-gas type. The effect of adding the charged D-strings (F-strings) on the phase structure can also be effectively described as a slight increase of the transverse dimensions to the original D5 (NS5). This may be viewed as a connection between a brane charge and a fraction of spatial dimension at least in a thermodynamical sense.

Lu, J X; Xu, Jianfei



Structure-based design of novel Chlamydomonas reinhardtii D1-D2 photosynthetic proteins for herbicide monitoring. (United States)

The D1-D2 heterodimer in the reaction center core of phototrophs binds the redox plastoquinone cofactors, Q(A) and Q(B), the terminal acceptors of the photosynthetic electron transfer chain in the photosystem II (PSII). This complex is the target of the herbicide atrazine, an environmental pollutant competitive inhibitor of Q(B) binding, and consequently it represents an excellent biomediator to develop biosensors for pollutant monitoring in ecosystems. In this context, we have undertaken a study of the Chlamydomonas reinhardtii D1-D2 proteins aimed at designing site directed mutants with increased affinity for atrazine. The three-dimensional structure of the D1 and D2 proteins from C. reinhardtii has been homology modeled using the crystal structure of the highly homologous Thermosynechococcus elongatus proteins as templates. Mutants of D1 and D2 were then generated in silico and the atrazine binding affinity of the mutant proteins has been calculated to predict mutations able to increase PSII affinity for atrazine. The computational approach has been validated through comparison with available experimental data and production and characterization of one of the predicted mutants. The latter analyses indicated an increase of one order of magnitude of the mutant sensitivity and affinity for atrazine as compared to the control strain. Finally, D1-D2 heterodimer mutants were designed and selected which, according to our model, increase atrazine binding affinity by up to 20 kcal/mol, representing useful starting points for the development of high affinity biosensors for atrazine. PMID:19693932

Rea, Giuseppina; Polticelli, Fabio; Antonacci, Amina; Scognamiglio, Viviana; Katiyar, Prashant; Kulkarni, Sudhir A; Johanningmeier, Udo; Giardi, Maria Teresa



Magnetic structure and interactions in the quasi-1D antiferromagnet CaV2O4  

International Nuclear Information System (INIS)

CaV2O4 is a spin-1 antiferromagnet where the magnetic vanadium ions are arranged on quasi-one-dimensional zig-zag chains with frustrated antiferromagnetic exchange interactions. Here we present high temperature susceptibility and single-crystal neutron diffraction measurements, which are used to deduce the magnetic structure, dominant exchange interactions and orbital configurations. The results suggest that at high temperatures the zig-zag chains of CaV2O4 behave as Haldane chains, but at low temperatures orbital ordering lifts the exchange frustration and the zig-zags become spin-1 ladders.



Local duality in spin structure functions g1(p) and g1(d)  

International Nuclear Information System (INIS)

Inclusive double spin asymmetries obtained by scattering polarized electrons off polarized protons and deuterons have been analyzed to address the issue of quark-hadron duality in the polarized spin structure functions g1(p) and g1(d). A polarized electron beam, solid polarized NH3 and ND3 targets and the CEBAF Large Acceptance Spectrometer (CLAS) in Hall B were used to collect the data. The resulting g1(p) and g1(d) were averaged over the nucleon resonance energy region (M < W < 2.00 GeV), and over the three lowest-lying resonances individually, for tests of global and local duality.



3D mechanical measurements with an atomic force microscope on 1D structures.  

DEFF Research Database (Denmark)

We have developed a simple method to characterize the mechanical properties of three-dimensional nanostructures, such as nanorods standing up from a substrate. With an atomic force microscope, the cantilever probe is used to deflect a horizontally aligned nanorod at different positions along the nanorod, using the apex of the cantilever itself rather than the tip normally used for probing surfaces. This enables accurate determination of the nanostructures' spring constant. From these measurements, Young's modulus is found for many individual nanorods with different geometrical and material structures in a short time. Based on this method, Young's modulus of carbon nanofibers and epitaxially grown III-V nanowires has been determined.

Kallesøe, Christian; Larsen, Martin Benjamin Barbour Spanget



Analysis of Phase Space Structure of A 1-D Discrete System Using Global and Local Symbolic Dynamics  

International Nuclear Information System (INIS)

Symbolic dynamics, in which the system trajectory is represented as a string of symbols, is a convenient method for analysing the properties of chaotic attractors. In this paper, we show that, using a noncanonical coding scheme based on a moving partition point, we are able to access such properties of the phase space of a dynamical system as the localisation of unstable periodic orbits and of their stable invariant manifolds. Applying different coding schemes enables us to extract different information about the phase space structure from the chaotic trajectory. A judicious choice of the method of symbolic coding allows one to obtain information which may be missing in the symbolic dynamics from the generating partition. We present results for the 1-D case, taking the logistic map as a numerical example. The extension to higher dimensions is also discussed. The theoretical background of the methods used is also given. (author)
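The basic idea of coding a logistic-map trajectory with a movable partition point can be sketched as follows. The abstract does not specify the authors' coding scheme, so the function name, the two-symbol alphabet `L`/`R`, the parameter value r = 4 and the starting point are assumptions chosen purely for illustration.

```python
def symbolic_sequence(x0, n, partition=0.5, r=4.0):
    """Code n points of the logistic map x -> r*x*(1-x) as 'L' or 'R'.

    A point is coded 'L' when it lies below the partition point and 'R'
    otherwise. partition=0.5 is the usual generating partition; any other
    value gives a different (noncanonical) coding of the same trajectory.
    """
    symbols = []
    x = x0
    for _ in range(n):
        symbols.append("L" if x < partition else "R")
        x = r * x * (1.0 - x)
    return "".join(symbols)


if __name__ == "__main__":
    # Same trajectory, two codings: only the partition point moves.
    print(symbolic_sequence(0.2, 5, partition=0.5))
    print(symbolic_sequence(0.2, 5, partition=0.7))
```

Comparing the strings produced for several partition points is one simple way to see that each coding exposes different features of the same orbit.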



Optimization of quasi-normal eigenvalues for 1-D wave equations in inhomogeneous media; description of optimal structures  

CERN Document Server

The paper is devoted to optimization of quasi-normal eigenvalues of a spectral problem associated with a 1-D wave equation in an inhomogeneous medium. The wave equation is equipped with a radiation boundary condition, and so the set of quasi-normal eigenvalues lies in $\C_+$. The problem is to design for a given $\alpha \in \R$ the structure of the inhomogeneous medium such that it generates a quasi-normal eigenvalue on the line $\alpha + \i \R$ with a minimal possible imaginary part. We consider the problem for three admissible families of structures. Two of these families have a natural mechanical interpretation as classes of Krein strings with total mass and static moment constraints. For these two classes we find optimal quasi-normal eigenvalues explicitly. The third class of admissible structures is connected with the problem of optimal design for photonic crystals. For this class, the paper gives a wider statement of the optimization problem, proves existence of optimal structures, and study their prope...

Karabash, Illya M



Syntheses, structures, and photoluminescence of d10 coordination architectures: From 1D to 3D complexes based on mixed ligands (United States)

Six new compounds, namely, {[Cd3(Himpy)3(tda)2]·3H2O}n (1), {[Zn3(bipy)2(tda)2(H2O)2]·4H2O}n (2), {[Cd3(bipy)3(tda)2]·4H2O}n (3), {[Cd3(tda)2(H2O)3Cl]·H2O}n (4), {[Zn2(tz)(tda)(H2O)2]·H2O}n (5) and {[Cd7(pz)(tda)4(OAc)(H2O)7]·3H2O}n (6) [H3tda = 1H-1,2,3-triazole-4,5-dicarboxylic acid, Himpy = 2-(1H-imidazol-2-yl)pyridine, bipy = 2,2'-bipyridine, Htz = 1H-1,2,4-triazole, H2pz = piperazine], have been prepared under hydrothermal conditions and characterized by elemental analyses, infrared spectroscopy, powder X-ray diffraction and single-crystal X-ray diffraction analyses. Compound 1 is a 1D column-like structure and displays a 3D supramolecular network via π···π stacking interactions. Compounds 2 and 3 exhibit similar 2D layer-like structures, which further extend to 3D supramolecular structures through π···π stacking interactions. Compounds 4-6 all display 3D frameworks with diverse topologies constructed from the tda3- ligands in different coordination modes and secondary ligands (or bridging atoms) connecting metal ions. Furthermore, the thermal stabilities and photoluminescent properties of compounds 1-6 were studied.

Yuan, Gang; Shao, Kui-Zhan; Du, Dong-Ying; Wang, Xin-Long; Su, Zhong-Min



Syntheses, structures, spectroscopic and electrochemical properties of two 1D organic-inorganic CuII-LnIII heterometallic germanotungstates (United States)

Two organic-inorganic hybrid copper-lanthanide heterometallic germanotungstates KNa2H7[enH2]3[Cu(en)2(H2O)]2[Cu(en)2]2{Cu(en)2[Eu(α-GeW11O39)2]2}·13H2O (1) and Na2H4[Cu(en)2(H2O)]2[Cu(en)2]6[Cu(en)2]{Cu(en)2[La(α-GeW11O39)2]2}·12H2O (2) have been hydrothermally synthesized by reaction of K8Na2[A-α-GeW9O34]·25H2O with CuCl2·2H2O and EuCl3/LaCl3 in the presence of en (en = ethylenediamine) and structurally characterized by elemental analyses, IR spectra and single-crystal X-ray diffraction. Compound 1 exhibits a 1D chain motif built from tetrameric {[Cu(en)2(H2O)]2[Cu(en)2]2{Cu(en)2[Eu(α-GeW11O39)2]2}}16- moieties linked through square antiprismatic K+ cations, while 2 displays a 1D architecture made of tetrameric [[Cu(en)2]6[Cu(en)2]{Cu(en)2[La(α-GeW11O39)2]2}]10- units linked via octahedral [Cu(en)2]2+ cations. Furthermore, the solid-state electrochemical and electrocatalytic properties of 1 have been investigated, and 1 shows good electrocatalytic activity for nitrite reduction. In addition, the photoluminescence properties of 1 have been investigated.

Zhang, Jingli; Li, Jie; Li, Lijie; Zhao, Haozhe; Ma, Pengtao; Zhao, Junwei; Chen, Lijuan



Differences in CD1d protein structure determine species-selective antigenicity of isoglobotrihexosylceramide (iGb3) to invariant natural killer T (iNKT) cells (United States)

Isoglobotrihexosylceramide (iGb3) has been identified as a potent CD1d-presented self-antigen for mouse iNKT cells. The role of iGb3 in humans remains unresolved, however, as there have been conflicting reports about iGb3-dependent human iNKT-cell activation, and humans lack iGb3 synthase, a key enzyme for iGb3 synthesis. Given the importance of human immune responses, we conducted a human-mouse cross-species analysis of iNKT-cell activation by iGb3-CD1d. Here we show that human and mouse iNKT cells were both able to recognise iGb3 presented by mouse CD1d (mCD1d), but not human CD1d (hCD1d), as iGb3-hCD1d was unable to support cognate interactions with the iNKT-cell TCR. The structural basis for this discrepancy was identified as a single amino acid variation between hCD1d and mCD1d, a glycine-to-tryptophan modification within the alpha2-helix that prevents flattening of the iGb3 headgroup upon TCR ligation. Mutation of the human residue, Trp153, to the mouse ortholog, Gly155, therefore allowed iGb3-hCD1d to stimulate human iNKT cells. In conclusion, our data indicate that iGb3 is unlikely to be a major antigen in human iNKT-cell biology.

Sanderson, Joseph P.; Brennan, Patrick J.; Mansour, Salah; Matulis, Gediminas; Patel, Onisha; Lissin, Nikolai; Godfrey, Dale I.; Kawahara, Kazuyoshi; Zahringer, Ulrich; Rossjohn, Jamie; Brenner, Michael B.; Gadola, Stephan D.



Predicting B-DNA structure from sequence  

Energy Technology Data Exchange (ETDEWEB)

This project developed a reliable method that is capable of predicting B-DNA duplex structure from sequence. From any given sequence, the method predicts a complete double helical structure at the atomic level. Tetramers are used as the basic unit for the study in order to include the sequence effects from the neighboring base pairs. The equilibrium structures of the 136 distinct tetramers are deduced from Monte Carlo simulations on a set of reduced coordinates developed at LANL. The prediction methods developed in this project can be used for searching and defining structural motifs in the functional regions of genes. We have constructed an atomic model of a 17 base-pair DNA operator (cro, from phage lambda) with the phosphorus structures solved by x-ray crystallography. With this predicted DNA structure and modeled structures of the alpha-3 helix based on the C-alpha atoms solved by x-ray crystallography, we were able to predict two specific interactions between the cro protein and the DNA (Ser-28 to Gua-14, Lys-32 to Gua-12). These interactions were partially verified by NMR using N-15 labeled DNA operator.

Tung, Chang-Shun; Hummer, G.; Soumpasis, D.M.
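The figure of 136 distinct tetramers quoted in this abstract follows from counting duplex tetramers up to reverse complementation: of the 4^4 = 256 single-strand tetramers, 16 are their own reverse complement, leaving (256 - 16)/2 + 16 = 136 distinct duplexes. A short sketch that enumerates them (the function names are illustrative, not taken from the project):

```python
from itertools import product

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq):
    """Return the sequence of the opposite strand, read 5' to 3'."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def distinct_duplex_tetramers():
    """Enumerate tetramers, treating a strand and its reverse complement as one duplex."""
    representatives = set()
    for bases in product("ACGT", repeat=4):
        seq = "".join(bases)
        # Keep the lexicographically smaller strand as the canonical name.
        representatives.add(min(seq, reverse_complement(seq)))
    return sorted(representatives)


if __name__ == "__main__":
    tetramers = distinct_duplex_tetramers()
    palindromes = [t for t in tetramers if t == reverse_complement(t)]
    print(len(tetramers), len(palindromes))
```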



Modified Jeener Solid-Echo Pulse Sequences for the Measurement of the Proton Dipolar Spin-Lattice Relaxation-Time (T1D) of Tissue Solid-like Macromolecular Components (United States)

Modified Jeener solid-echo pulse sequences are proposed for the measurement of the proton dipolar spin-lattice relaxation time, T1D, of motionally restricted (solid-like) components in the presence of mobile molecular species, such as encountered in biological tissue. A phase-cycled composite-pulse sequence was used for detection of the dipolar signal and cancellation of the Zeeman signal. A homospoil gradient pulse was added to the Jeener echo pulse sequence to enhance dephasing of the transverse magnetization components of mobile species, thereby aiding in elimination of the Zeeman signal during dipolar signal acquisition. A modified Jeener echo sequence incorporating water suppression is also proposed as a means to further depress the Zeeman signal arising from mobile components. The modified Jeener echo sequences were successfully used for the measurement of proton T1D values of solid 2,6-dimethylphenol and Sephadex gels of differing degrees of cross linking and hydration.

Yang, H.; Schleich, T.


Synthesis, crystal structures, magnetic and luminescent properties of unique 1D p-ferrocenylbenzoate-bridged lanthanide complexes (United States)

Treatments of p-ferrocenylbenzoate [p-NaOOCH4C6Fc, Fc = (η5-C5H5)Fe(η5-C5H4)] with Ln(NO3)3·nH2O afford seven p-ferrocenylbenzoate lanthanide complexes {[Ln(OOCH4C6Fc)2(μ2-OOCH4C6Fc)2(H2O)2](H3O)}n [Ln = Ce (1), Pr (2), Sm (3), Eu (4), Gd (5), Tb (6) and Dy (7)]. X-ray crystallographic analysis reveals that the isomorphous complexes {[Ce(OOCH4C6Fc)2(μ2-OOCH4C6Fc)2(H2O)2](H3O)}n (1) and {[Pr(OOCH4C6Fc)2(μ2-OOCH4C6Fc)2(H2O)2](H3O)}n (2) form a unique 1D double-bridged infinite chain structure bridged by μ2-OOCH4C6Fc groups. Each Ln(III) ion adopts a dodecahedral coordination environment with eight coordinated oxygen atoms from two terminal monodentate FcC6H4COO- units, two terminal monodentate H2O molecules and four μ2-OOCH4C6Fc units. The luminescence spectra reveal that only 4 and 6 exhibit the characteristic emissions of lanthanide ions, Eu(III) and Tb(III), respectively. The variable-temperature magnetic properties of 5 and 7 suggest that a ferromagnetic coupling between spin carriers may exist in 5.

Yan, P. F.; Zhang, F. M.; Li, G. M.; Zhang, J. W.; Sun, W. B.; Suda, M.; Einaga, Y.



Hydration structure of -NH2+ and -COO- of L-proline zwitterion from data of 1D-RISM integral equation method (United States)

The hydration structure of hydrophilic groups of L-proline zwitterion is studied by means of the 1D-RISM integral equation method. The structural parameters of hydration and features of hydrogen bonding between water and -NH2+ and -COO- groups are discussed.

Fedotova, M. V.; Dmitrieva, O. A.



2D-1D structural phase transformation of Co(II) 3,5-pyridinedicarboxylate frameworks with chromotropism. (United States)

Two new metal-organic frameworks [Co(pydc)(H(2)O)(2)](n) (1) and [Co(pydc)(H(2)O)(4)](n)(H(2)O)(n) (2), (pydc = 3,5-pyridinedicarboxylate) have been synthesized by a diffusion method and characterized by single-crystal X-ray diffraction. The structure of 1 reveals an infinite 2D layer with honeycomb-like cavities in which each pydc ligand bridges three Co(II) ions. The adjacent 2D layers are orderly packed in an ABAB-type array via intermolecular interactions of the combined π-π stacking and hydrogen bonds to form a 3D supramolecular architecture. Interestingly, compound 1 exhibits a water induced crystal-to-amorphous transformation with chromotropism confirmed by spectroscopic techniques, elemental analysis, TGA and XRPD. When this amorphous phase (1A) was exposed to water vapor, it was readily converted into the second crystalline phase 1B with a color change. Moreover, a reversible process between 1A and 1B was performed. In the case of compound 2, pydc acts as a didentate bridging ligand connecting two Co(II) ions, leading to a 1D zig-zag chain. Guest water molecules fill the gaps between chains and form hydrogen bonds with the host chains, stabilizing the 3D network of 2. Additionally, compound 2 also exhibits a water induced crystal-to-amorphous transformation with chromotropism, and the reversible process was also performed between the dehydrated (2A) and rehydrated (2') forms. Surprisingly, the IR and UV-vis spectra, elemental analysis, TGA curve and XRPD pattern of the rehydrated second phase 1B are found to be identical to those of 2 and 2'; these results confirm that 2, 2' and 1B are the same compound. PMID:22842509

Cheansirisomboon, Achareeya; Pakawatchai, Chaveng; Youngme, Sujittra



Augmented GARCH sequences: Dependence structure and asymptotics  

CERN Document Server

The augmented GARCH model is a unification of numerous extensions of the popular and widely used ARCH process. It was introduced by Duan and, besides ordinary (linear) GARCH processes, it contains exponential GARCH, power GARCH, threshold GARCH, asymmetric GARCH, etc. In this paper, we study the probabilistic structure of augmented $\mathrm{GARCH}(1,1)$ sequences and the asymptotic distribution of various functionals of the process occurring in problems of statistical inference. Instead of using the Markov structure of the model and implied mixing properties, we utilize independence properties of perturbed GARCH sequences to directly reduce their asymptotic behavior to the case of independent random variables. This method applies for a very large class of functionals and eliminates the fairly restrictive moment and smoothness conditions assumed in the earlier theory. In particular, we derive functional CLTs for powers of the augmented GARCH variables, derive the error rate in the CLT and obtain asymptotic res...

Hörmann, Siegfried
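As an illustration of how one recursion can cover several GARCH variants, the sketch below simulates a sequence y_t = sigma_t * eps_t whose volatility obeys Lambda(sigma_t^2) = c(eps_{t-1}) * Lambda(sigma_{t-1}^2) + g(eps_{t-1}). This form of the augmented GARCH(1,1) recursion is recalled from the general literature rather than stated in the abstract, so it should be read as an assumption; the function name, the Gaussian innovations and the parameter values are likewise illustrative.

```python
import math
import random


def simulate_augmented_garch(n, c, g, lam_inv, lam0, seed=0):
    """Simulate n steps of y_t = sigma_t * eps_t under the assumed recursion
    Lambda(sigma_t^2) = c(eps_{t-1}) * Lambda(sigma_{t-1}^2) + g(eps_{t-1}).

    lam_inv maps Lambda(sigma^2) back to sigma^2; lam0 is the starting value
    of Lambda(sigma^2). Returns (observations, conditional variances).
    """
    rng = random.Random(seed)
    lam = lam0
    eps_prev = rng.gauss(0.0, 1.0)
    ys, variances = [], []
    for _ in range(n):
        lam = c(eps_prev) * lam + g(eps_prev)
        sigma2 = lam_inv(lam)
        eps = rng.gauss(0.0, 1.0)
        ys.append(math.sqrt(sigma2) * eps)
        variances.append(sigma2)
        eps_prev = eps
    return ys, variances


if __name__ == "__main__":
    # Ordinary GARCH(1,1) as a special case: Lambda(x) = x,
    # c(e) = beta + alpha * e^2, g(e) = omega.
    omega, alpha, beta = 0.1, 0.1, 0.8
    ys, variances = simulate_augmented_garch(
        1000,
        c=lambda e: beta + alpha * e * e,
        g=lambda e: omega,
        lam_inv=lambda lam: lam,
        lam0=omega / (1.0 - alpha - beta),
    )
    print(sum(y * y for y in ys) / len(ys))
```

Under the same assumption, choosing Lambda as the logarithm (so `lam_inv=math.exp`) would give an exponential-GARCH-type recursion in place of the linear one.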



Secondary structure of the Tetrahymena ribosomal RNA intervening sequence: structural homology with fungal mitochondrial intervening sequences.  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Splicing of the ribosomal RNA precursor of Tetrahymena is an autocatalytic reaction, requiring no enzyme or other protein in vitro. The structure of the intervening sequence (IVS) appears to direct the cleavage/ligation reactions involved in pre-rRNA splicing and IVS cyclization. We have probed this structure by treating the linear excised IVS RNA under nondenaturing conditions with various single- and double-strand-specific nucleases and then mapping the cleavage sites by using sequencing ge...

Cech, T. R.; Tanner, N. K.; Tinoco, I.; Weir, B. R.; Zuker, M.; Perlman, P. S.



Electronic structure of Cr1-dS (d=0,0.17) with NiAs-type crystal structure  

CERN Multimedia

Valence-band and conduction-band electronic structure of CrS (d=0) and Cr5S6 (d=0.17) has been investigated by means of photoemission and inverse-photoemission spectroscopies. Bandwidth of the valence bands of Cr5S6 (8.5 eV) is wider than that of CrS (8.1 eV), though the Cr 3d partial density of states evaluated from the Cr 3p-3d resonant photoemission spectroscopy is almost unchanged between the two compounds with respect to the shapes including binding energies. The Cr 3d (t2g) exchange splitting energies of CrS and Cr5S6 are determined to be 3.9 and 3.3 eV, respectively.

Koyama, M; Ueda, Y; Hirai, C; Taniguchi, M



Global structure of integer partitions sequences  

CERN Multimedia

Integer partitions are deeply related to many phenomena in statistical physics. A question naturally arises which is of interest to physics both on "purely" theoretical and on practical, computational grounds. Is it possible to apprehend the global pattern underlying integer partition sequences and to express the global pattern compactly, in the form of a "matrix" giving all of the partitions of N into exactly M parts? This paper demonstrates that the global structure of integer partitions sequences (IPS) is that of a complex tree. By analyzing the structure of this tree, we derive a closed form expression for a map from (N, M) to the set of all partitions of a positive integer N into exactly M positive integer summands without regard to order. The derivation is based on the use of modular arithmetic to solve an isomorphic combinatoric problem, that of describing the global organization of the sequence of all ordered placements of N indistinguishable balls into M distinguishable non-empty bins or boxes. This ...

Chase, N M
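To make the object of study concrete, the partitions of N into exactly M positive parts can be enumerated with a short recursive generator. This is a textbook enumeration, not the closed-form map derived in the paper, and the function name and argument names are illustrative.

```python
def partitions_exact(n, m, max_part=None):
    """Yield the partitions of n into exactly m positive parts, largest part first.

    Each partition is a non-increasing tuple, so order is disregarded.
    """
    if max_part is None:
        max_part = n
    if m == 0:
        if n == 0:
            yield ()
        return
    # The largest part k must leave at least 1 for each of the other m-1 parts.
    for k in range(min(max_part, n - (m - 1)), 0, -1):
        if k * m < n:
            break  # m parts of size at most k can no longer add up to n
        for rest in partitions_exact(n - k, m - 1, k):
            yield (k,) + rest


if __name__ == "__main__":
    for p in partitions_exact(7, 3):
        print(p)
```

Listing the results for fixed N and increasing M gives the rows of the kind of "matrix" of partitions that the abstract refers to.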



Structure Prediction of Partial-Length Protein Sequences  

Directory of Open Access Journals (Sweden)

Protein structure information is essential to understand protein function. Computational methods to accurately predict protein structure from the sequence have primarily been evaluated on protein sequences representing full-length native proteins. Here, we demonstrate that top-performing structure prediction methods can accurately predict the partial structures of proteins encoded by sequences that contain approximately 50% or more of the full-length protein sequence. We hypothesize that structure prediction may be useful for predicting functions of proteins whose corresponding genes are mapped expressed sequence tags (ESTs) that encode partial-length amino acid sequences. Additionally, we identify a confidence score representing the quality of a predicted structure as a useful means of predicting the likelihood that an arbitrary polypeptide sequence represents a portion of a foldable protein sequence (“foldability”). This work has ramifications for the prediction of protein structure with limited or noisy sequence information, as well as genome annotation.

Ram Samudrala



Nonlinear deterministic structures and the randomness of protein sequences  

CERN Multimedia

To clarify the randomness of protein sequences, we make a detailed analysis of a set of typical protein sequences representing each structural class, using a nonlinear prediction method. No deterministic structures are found in these protein sequences, which implies that they behave as random sequences. We also give an explanation for the controversial results obtained in previous investigations.

Huang Yan Zhao



Modeling of Subject Specific Arterial Segments Using 3D Fluid Structure Interaction and a 1D-0D Arterial Tree Network Boundary Condition

Digital Repository Infrastructure Vision for European Research (DRIVER)

Modeling of Subject Specific Arterial Segments Using 3D Fluid Structure Interaction and a 1D-0D Arterial Tree Network Boundary Condition. Magnus Andersson, Jonas Lantz, Matts Karlsson. Department of Management and Engineering, Linköping University, SE-581 83 Linköping, Sweden. Introduction: In recent years it has been possible to simulate 3D blood flow through CFD including the dilatation effect in elastic arteries using Fluid-Structure Interaction (FSI) to better match in vivo data. ...

Andersson, Magnus; Lantz, Jonas; Karlsson, Matts




Full Text Available olli1d17 Clone name olli1d17 Library olli 5' end seq. ID olli1d17 [NBRP] Acc. of 5' end - Tissue nd CLSTF24799 [advanced search] Homology of 5' end xanthine dehydrogenase [Poecilia reticulata] Score of 5' en nd CLSTR20493 [advanced search] Homology of 3' end xanthine dehydrogenase [Poecilia reticulata] Score of 3' en



Full Text Available olte1d19 Clone name olte1d19 Library olte 5' end seq. ID olte1d19 [NBRP] Acc. of 5' end FS509029 nd CLSTF19532 [advanced search] Homology of 5' end pf16 protein [Ciona intestinalis] Score of 5' end 887 E


EURDYN-1D: a computer code for the one-dimensional non-linear dynamic analysis of structural systems. Description and users' manual (release 1)  

International Nuclear Information System (INIS)

The goal of the present report is to provide a comprehensive users' manual describing the capabilities of the computer code EURDYN-1D. It includes information and examples about the types of problems which can be solved with the code and explanations of how to prepare input data and how to interpret output results. The field of application of EURDYN-1D is the one-dimensional dynamic analysis of general structural systems, and the code is particularly suited for fast transient events involving propagation of longitudinal mechanical waves (subsonic) in structures. Both geometrical and physical non-linearities can be taken into account. Typical examples are impact problems, fast dynamic loading due to explosions or sudden release of initial loads due to failures, etc. To these classes belong many problems encountered in the reactor safety field as well as in more common and general technological applications.



Recurrent structural RNA motifs, Isostericity Matrices and sequence alignments  

Digital Repository Infrastructure Vision for European Research (DRIVER)

The occurrences of two recurrent motifs in ribosomal RNA sequences, the Kink-turn and the C-loop, are examined in crystal structures and systematically compared with sequence alignments of rRNAs from the three kingdoms of life in order to identify the range of the structural and sequence variations. Isostericity Matrices are used to analyze structurally the sequence variations of the characteristic non-Watson–Crick base pairs for each motif. We show that Isostericity Matrices for non-Watson...



Novel 1D coordination polymer {Tm(Piv)3}n: Synthesis, structure, magnetic properties and thermal behavior  

International Nuclear Information System (INIS)

The new 1D coordination polymer {Tm(Piv)3}n (1), where Piv = OOCBut−, was synthesized in high yield (>95%) by the reaction of thulium acetate with pivalic acid in air at 100 °C. According to the X-ray diffraction data, the metal atoms in compound 1 are in an octahedral ligand environment unusual for lanthanides. The magnetic and luminescence properties of polymer 1, its solid-phase thermal decomposition in air and under argon, and its thermal behavior in the temperature range of −50…+50 °C were investigated. The vaporization process of complex 1 was studied by the Knudsen effusion method combined with mass-spectrometric analysis of the gas-phase composition in the temperature range of 570–680 K. - Graphical Abstract: Novel 1D coordination polymer {Tm(Piv)3}n was synthesized and studied by X-ray diffraction. The magnetic and luminescence properties, the thermal behavior and the volatility of the compound {Tm(Piv)3}n were investigated. Highlights: We synthesized the coordination polymer {Tm(Piv)3}n. Tm atoms in the polymer have the coordination number 6. The polymer exhibits blue-color emission at room temperature. The polymer shows high thermal stability and volatility. The polymer has no phase transitions in the range of −50…+50 °C.



Integrating sequence and structural biology with DAS  

Directory of Open Access Journals (Sweden)

Background: The Distributed Annotation System (DAS) is a network protocol for exchanging biological data. It is frequently used to share annotations of genomes and protein sequences. Results: Here we present several extensions to the current DAS 1.5 protocol. These provide new commands to share alignments and three-dimensional molecular structure data, add the possibility of registration and discovery of DAS servers, and provide a convention for supplying different types of data plots. We present examples of web sites and applications that use the new extensions. We operate a public registry of DAS sources, which now includes entries for more than 250 distinct sources. Conclusion: Our DAS extensions are essential for the management of the growing number of services and the exchange of diverse biological data sets. In addition, the extensions allow new types of applications to be developed and scientific questions to be addressed. The registry of DAS sources is available at

Finn Robert D



Structural changes in quasi-1D many-electron systems: from linear to zig-zag and beyond

CERN Document Server

Many-electron systems confined to a quasi-1D geometry by a cylindrical distribution of positive charge have been investigated by density functional computations in the unrestricted local spin density approximation. Our investigations have been focused on the low density regime, in which electrons are localised. The results reveal a wide variety of different charge and spin configurations, including linear and zig-zag chains, single and double-strand helices, and twisted chains of dimers. The spin-spin coupling turns from weakly anti-ferromagnetic at relatively high density, to weakly ferromagnetic at the lowest densities considered in our computations. The stability of linear chains of localised charge has been investigated by analysing the radial dependence of the self-consistent potential and by computing the dispersion relation of low-energy harmonic excitations.

Ballone, R Cortes-Huerto M Paternostro P



1D zigzag chain and 0D monomer Cd(II)/Zn(II) compounds based on flexible phenylenediacetic ligand: Synthesis, crystal structures and fluorescent properties (United States)

Three novel Cd(II)/Zn(II) compounds, [Cd2(poda)2(phen)3(H2O)]n·nEtOH·3nH2O (1), [Zn(poda)2(bpy)(H2O)]n (2) and [Zn(Hpoda)2(bpy)] (3) (H2poda = 1,2-phenylenediacetic acid, phen = 1,10-phenanthroline, bpy = 2,2'-bipyridyl), have been synthesized and characterized by IR, TG, fluorescence spectroscopy and single-crystal X-ray diffraction techniques. In 1, poda2- anions link the adjacent Cd(II) centers to generate a 1D zigzag chain. Furthermore, an unprecedented four-footed "8-shaped" mixed water-ethanol (H2O)6(C2H5OH)2 cluster connects four double chains based on the 1D zigzag chain into a 3D supramolecular architecture. Through the bis(chelate-monodentate) fashion of the poda2- ligand, compound 2 exhibits 1D zigzag chains, which form a dense zipper-like 2D structure via strong π-π stacking interactions. Unlike 1 and 2, compound 3 has a mononuclear motif and displays a 3D 6-connected α-Po net hydrogen-bonded topology. The structure-related solid-state fluorescence spectra of compounds 1 and 2 have been determined.

Yang, Fang; Ren, Yixia; Li, Dongsheng; Fu, Feng; Qi, Guangcai; Wang, Yaoyu



Syntheses, crystal structures and properties of two 1-D cadmium(II) coordination polymers based on 1,1'-(1,3-propanediyl)bis-1H-benzimidazole  

International Nuclear Information System (INIS)

The combination of the framework-builders 1,1'-(1,3-propanediyl)bis-1H-benzimidazole (pbbm) and Cd(II) ion with the framework-regulator ClO4- or SO42- provides two new coordination polymers, [Cd(pbbm)2(ClO4)2]n (1) and {[Cd(pbbm)SO4(H2O)2].CH3OH}n (2). Both display a 1-D chain framework, but their detailed structures clearly differ from each other: 1 displays a 1-D ribbon-of-rings framework, whereas 2 features an interesting infinite 1-D looped chain structure composed of two kinds of rings, a smaller 8-membered ring and a larger 20-membered ring. The antimicrobial activities of the two polymers were tested by the agar diffusion method, and the results indicated that they exhibited antimicrobial activity against bacterial strains. Measurement of the non-isothermal kinetics of the thermal decomposition of 2 reveals that at least three steps occur in its decomposition process. - Graphical abstract: Two new Cd(II)-containing complexes have been synthesized and characterized by single-crystal X-ray diffraction. The antimicrobial activity and the non-isothermal kinetics of the thermal decomposition of the polymers were also investigated.



Syntheses, crystal structures and luminescent properties of two new 1D d10 coordination polymers constructed from 2,2'-bibenzimidazole and 1,4-benzenedicarboxylate  

International Nuclear Information System (INIS)

Two novel d10 metal coordination polymers, [Zn(H2bibzim)(BDC)]n (1) and [Cd(H2bibzim)(BDC)]n (2) [H2bibzim = 2,2'-bibenzimidazole, BDC = 1,4-benzenedicarboxylate], have been synthesized under solvothermal conditions and structurally characterized. Both 1 and 2 are constructed from infinite neutral zigzag-like one-dimensional (1D) chains. The π-π interactions and interchain hydrogen-bonding interactions further extend the 1D arrangement to generate a 3D supramolecular architecture for 1 and 2. Both complexes have high thermal stability and display strong blue fluorescent emissions in the solid state upon photo-excitation at 365 nm at room temperature. They are the first two examples in which 2,2'-bibenzimidazole has been introduced into a d10 coordination polymeric framework.



Histone and histone fold sequences and structures: a database.  

Digital Repository Infrastructure Vision for European Research (DRIVER)

A database of aligned histone protein sequences has been constructed based on the results of homology searches of the major public sequence databases. In addition, sequences of proteins identified as containing the histone fold motif and structures of all known histone and histone fold proteins have been included in the current release. Database resources include information on conflicts between similar sequence entries in different source databases, multiple sequence alignments, and links to...

Baxevanis, A. D.; Landsman, D.



Modular prediction of protein structural classes from sequences of twilight-zone identity with predicting sequences  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Abstract Background Knowledge of structural class is used by numerous methods for identification of structural/functional characteristics of proteins and could be used for the detection of remote homologues, particularly for chains that share twilight-zone similarity. In contrast to existing sequence-based structural class predictors, which target four major classes and which are designed for high identity sequences, we predict seven classes from sequences that share twilight...



Designing polymorphic ISSR primers in order to study gene sequences x and y types glutenin subunits in 1D locus controlling favourable baking quality in elite mutant lines of bread wheat  

International Nuclear Information System (INIS)

Baking quality is an important trait in the qualitative improvement of bread wheat. Gluten prolamins determine wheat flour quality for technological processes such as bread making. Among gluten proteins, the high molecular weight (HMW) glutenin group, and especially the d allele at the 1D locus with its x-type and y-type subunits, is very valuable for baking quality. In this study, amino acid sequences of the x-type subunits (2.1, 2.2, 2.2*, 5) and y-type subunits (10, 12) of the 1D locus were retrieved and compared using GeneDoc software. Alignment of the x-type and y-type subunit sequences showed that deletions, insertions (duplications) and point mutations in these subunits are involved in the biological function of the proteins. The most important insertion and deletion mutations were a 185-amino-acid insertion in the 2.2* subunit and a 102-amino-acid insertion in the x2.2 subunit at position 486, and a six-amino-acid deletion (IGQGQQ) at position 203 of the y10 subunit. Important point mutations include the conversion of serine to cysteine at position 118 of the x5 subunit and the substitution of glutamine by histidine at position 626 of the x5 subunit. Finally, polymorphic ISSR primers in repetitive domains were designed based on similarities and differences between the x-type and y-type subunits. These primers show good banding polymorphisms in elite mutant lines, standard commercial cultivars and F2 populations from crosses. (author)



Structural Insights into the Binding of Vascular Endothelial Growth Factor-B by VEGFR-1D2: RECOGNITION AND SPECIFICITY*  

Digital Repository Infrastructure Vision for European Research (DRIVER)

The formation of blood vessels (angiogenesis) is a highly orchestrated sequence of events involving crucial receptor-ligand interactions. Angiogenesis is critical for physiological processes such as development, wound healing, reproduction, tissue regeneration, and remodeling. It also plays a major role in sustaining tumor progression and chronic inflammation. Vascular endothelial growth factor (VEGF)-B, a member of the VEGF family of angiogenic growth factors, effects blood vessel formation ...

Iyer, Shalini; Darley, Paula I.; Acharya, K. Ravi



Electric structure of the Copahue Volcano (Neuquén Province, Argentina), from magnetotelluric soundings: 1D and 2D modellings (United States)

Four magnetotelluric soundings were carried out in 1993 in the region of the active Copahue volcano, located at the border between Chile and Argentina (37°45'S, 71°18'W). Three soundings were located inside the caldera of the ancient stratovolcano (east of Copahue) and the fourth outside it. The soundings inside the caldera were situated at about 6, 11, and 14 km from the volcano. Digital data were obtained covering the range of periods from 1 sec to 10,000 sec, using induction coils and a flux-gate magnetometer to obtain the magnetic data and Cu-SO4Cu electrodes for electric field measurements. The apparent resistivity curves corresponding to principal directions were analyzed in conjunction with the geological background in order to eliminate distortion, which is very important in this hot volcanic region. Then, 1D modellings were performed using the "normal" curves, i.e., curves without distortions. Using the apparent resistivity curves with distortions, 2D modelling was also performed along a profile perpendicular to the regional tectonic trend suggested by MT soundings into the caldera. Results show low resistivity values of about 3-15 Ωm between 9 km and 20 km depth in the crust, suggesting high temperatures, with minimum values of about 700°C, and partially melted zones in the upper crust between 9 km and 20 km depth under the caldera. The presence of a possible sulphide-carbonaceous layer (SC layer) in the upper basement could play an important role in lowering the electrical resistivities because of its high electronic conductivity.

Mamaní, M. J.; Borzotta, E.; Venencia, J. E.; Maidana, A.; Moyano, C. E.; Castiglione, B.



Oxygen and methanol mediated irreversible coordination polymer structural transformation from a 3D Cu(i)-framework to a 1D Cu(ii)-chain. (United States)

An interesting irreversible structural transformation visible to the naked eye occurs when a 3D Cu(I)-polymeric complex Cu2L(NO3)2(DMF)0.4 (1) is suspended in CH3OH in air to produce a 1D Cu(II) polymeric complex Cu(μ-OCH3)(L)(NO3) (2) (L = 1,2-bis[4-(pyrimidin-4-yl)phenoxy]ethane). The transformation mechanism from 1 to 2 was also investigated. PMID:24643413

Ge, Jing-Yuan; Wang, Jian-Cheng; Cheng, Jun-Yan; Wang, Peng; Ma, Jian-Ping; Liu, Qi-Kui; Dong, Yu-Bin



Benchmark problems for predictive FEM simulation of 1-D and 2-D guided waves for structural health monitoring with piezoelectric wafer active sensors (United States)

Predictive simulation of ultrasonic nondestructive evaluation and structural health monitoring (SHM) is challenging. This paper addresses this issue in the context of guided waves with piezoelectric wafer active sensors (PWAS). The principle of guided waves with PWAS transducers is studied, and an analytical model is developed to predict the waveform and its theoretical frequency content. Two benchmark problems, one 1-D and the other 2-D, are also proposed for achieving reliable and trustworthy predictive simulation of guided waves with the finite element method.

Gresil, M.; Shen, Y.; Giurgiutiu, V.



Application of 1D- and 2D-NMR techniques for the structural studies of glycoprotein-derived carbohydrates  

International Nuclear Information System (INIS)

The first part of this thesis (Chapters 1 to 4) describes the determination of the primary structure of a large number of oligosaccharide-alditols obtained from bronchial sputum of cystic fibrosis patients suffering from chronic bronchitis. The second part (Chapters 5 to 8) is devoted to the application of two-dimensional NMR methods for the structural analysis of oligosaccharides. (H.W.). 163 refs.; 50 figs.; 25 tabs



Charge and structural components of mitochondrial leader sequences  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Mitochondrial leader sequences have been found to be statistically enriched for positively charged residues, with only a few known leader sequences possessing negatively charged residues. Mutational studies that have introduced negatively charged residues into various leader sequences have shown a general, but not absolute, trend toward reduced import. The leader sequence of rat liver aldehyde dehydrogenase (ALDH) has been previously determined by NMR to form a helix-linker-helix structure. I...



Recurrent structural RNA motifs, Isostericity Matrices and sequence alignments (United States)

The occurrences of two recurrent motifs in ribosomal RNA sequences, the Kink-turn and the C-loop, are examined in crystal structures and systematically compared with sequence alignments of rRNAs from the three kingdoms of life in order to identify the range of the structural and sequence variations. Isostericity Matrices are used to analyze structurally the sequence variations of the characteristic non-Watson–Crick base pairs for each motif. We show that Isostericity Matrices for non-Watson–Crick base pairs provide important tools for deriving the sequence signatures of recurrent motifs, for scoring and refining sequence alignments, and for determining whether motifs are conserved throughout evolution. The systematic use of Isostericity Matrices identifies the positions of the insertion or deletion of one or more nucleotides relative to the structurally characterized examples of motifs and, most importantly, specifies whether these changes result in new motifs. Thus, comparative analysis coupled with Isostericity Matrices allows one to produce and refine structural sequence alignments. The analysis, based on both sequence and structure, permits therefore the evaluation of the conservation of motifs across phylogeny and the derivation of rules of equivalence between structural motifs. The conservations observed in Isostericity Matrices form a predictive basis for identifying motifs in sequences.

Lescoute, Aurelie; Leontis, Neocles B.; Massire, Christian; Westhof, Eric



Prediction of RNA Secondary Structure from Random Sequences using ZEM  

Directory of Open Access Journals (Sweden)

Full Text Available The biological role of many RNAs crucially depends on their structure. An in-depth understanding of the secondary structure of RNA would provide better insight into its functionality. Predicting the secondary structure of RNA is the most important factor in determining its 3D structure and functions. This work proposes a model for exploring the features of a number of RNA sequences simultaneously so that sequences can be compared and relevant sequences identified. The proposed model accepts RNA sequences in any valid biological file format. For each given sequence, the required number of random sequences is generated. The generated sequences should have the same base composition as the original sequence. ZEM (Zuker's Energy Minimization Algorithm) finds the biologically correct structure of each RNA sequence and its corresponding free energy value. The proposed prototype makes it possible to experiment with a number of RNA sequences and to study their features so that biologically relevant inferences can be made. An important area of application is the design of pharmaceutical products.

Cinita Mary Mathew
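
The abstract above requires random sequences with the same base composition as the original. A minimal sketch of one way to obtain them is a mononucleotide shuffle; this is an assumption, since the abstract does not say how the sequences are generated. The function name `shuffled_controls`, the seed parameter, and the example sequence are hypothetical, and the energy minimization step is not reproduced here.

```python
import random
from collections import Counter


def shuffled_controls(seq, n, seed=0):
    """Return n random permutations of seq (hypothetical helper).

    Shuffling the bases keeps the count of each base unchanged,
    so every control has the same base composition as seq.
    """
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    controls = []
    for _ in range(n):
        bases = list(seq)
        rng.shuffle(bases)
        controls.append("".join(bases))
    return controls


original = "GGGAAAUCCCUUAGC"  # made-up RNA sequence for illustration
controls = shuffled_controls(original, n=5)

# Every control has exactly the same base counts as the original.
same_composition = all(Counter(c) == Counter(original) for c in controls)
```

Each control could then be folded with an energy minimization tool and its free energy compared with that of the original sequence.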



Improving protein secondary structure prediction with aligned homologous sequences.  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Most recent protein secondary structure prediction methods use sequence alignments to improve the prediction quality. We investigate the relationship between the location of secondary structural elements, gaps, and variable residue positions in multiple sequence alignments. We further investigate how these relationships compare with those found in structurally aligned protein families. We show how such associations may be used to improve the quality of prediction of the secondary structure el...

Di Francesco, V.; Garnier, J.; Munson, P. J.



Two extensions of 1D Toda hierarchy  

CERN Document Server

The extended Toda hierarchy of Carlet, Dubrovin and Zhang is reconsidered in the light of a 2+1D extension of the 1D Toda hierarchy constructed by Ogawa. These two extensions of the 1D Toda hierarchy turn out to have a very similar structure, and the former may be thought of as a kind of dimensional reduction of the latter. In particular, this explains an origin of the mysterious structure of the bilinear formalism proposed by Milanov.

Takasaki, Kanehisa



Formatt: Correcting protein multiple structural alignments by incorporating sequence alignment  

Directory of Open Access Journals (Sweden)

Full Text Available Abstract Background The quality of multiple protein structure alignments is usually computed and assessed based on geometric functions of the coordinates of the backbone atoms from the protein chains. These purely geometric methods do not directly utilize protein sequence similarity, and in fact, determining the proper way to incorporate sequence similarity measures into the construction and assessment of protein multiple structure alignments has proved surprisingly difficult. Results We present Formatt, a multiple structure alignment program based on the purely geometric Matt multiple structure alignment program, that also takes into account sequence similarity when constructing alignments. We show that Formatt outperforms Matt and other popular structure alignment programs on the popular HOMSTRAD benchmark. For the SABMark twilight zone benchmark set, which captures more remote homology, Formatt and Matt outperform other programs; depending on the choice of embedded sequence aligner, Formatt produces either better sequence and structural alignments with a smaller core size than Matt, or similarly sized alignments with better sequence similarity, for a small cost in average RMSD. Conclusions Considering sequence information as well as purely geometric information seems to improve the quality of multiple structure alignments, though defining what constitutes the best alignment when sequence and structural measures would suggest different alignments remains a difficult open question.

Daniels Noah M



3D versus 1D quantum confinement in coherently strained CdS/ZnS quantum structures  

DEFF Research Database (Denmark)

Monolayer fluctuations in ultrathin, coherently strained CdS/ZnS quantum structures result in a very strong localization of excitons. The deepest localized excitons can be considered as individual, decoupled and three-dimensionally confined. Consequently, fingerprints of zero-dimensionality are found in the optical spectra like single, ultranarrow luminescence lines in micro-photoluminescence and spectrally broad optical gain in the deep blue spectral range. The exchange splitting is proven and a strong enhancement over the bulk value is observed.

Woggon, U.; Gindele, F.



Crystal Structures of Bovine CD1d Reveal Altered αGalCer Presentation and a Restricted A’ Pocket Unable to Bind Long-Chain Glycolipids  

Digital Repository Infrastructure Vision for European Research (DRIVER)

NKT cells play important roles in immune surveillance. They rapidly respond to pathogens by detecting microbial glycolipids when presented by the non-classical MHC I homolog CD1d. Previously, ruminants were considered to lack NKT cells due to the lack of a functional CD1D gene. However, recent data suggest that cattle express CD1d with unknown function. In an attempt to characterize the function of bovine CD1d, we assessed the lipid binding properties of recombinant Bos taurus CD1d (boCD1d) i...

Wang, Jing; Guillaume, Joren; Pauwels, Nora; Calenbergh, Serge; Rhijn, Ildiko; Zajonc, Dirk M.



Tails of the dynamical structure factor of 1D spinless fermions beyond the Tomonaga-Luttinger approximation  

International Nuclear Information System (INIS)

We consider one-dimensional interacting spinless fermions with a non-linear spectrum in a clean quantum wire (non-linear bosonization). We compute diagrammatically the one-dimensional dynamical structure factor, S(ω, q), beyond the Tomonaga-Luttinger approximation, focusing on its tails, i.e. |ω| ≫ vq. We provide a re-derivation, through diagrammatics, of the result of Pustilnik, Mishchenko, Glazman, and Andreev. We also extend their results to finite temperatures and long-range interactions. As applications we determine curvature and interaction corrections to the small-momentum, high-frequency conductivity and the electron-electron scattering rate. (author)



1D 13C-NMR Data as Molecular Descriptors in Spectra — Structure Relationship Analysis of Oligosaccharides  

Directory of Open Access Journals (Sweden)

Full Text Available Spectra-structure relationships were investigated for estimating the anomeric configuration, residues and type of linkages of linear and branched trisaccharides using 13C-NMR chemical shifts. For this study, 119 pyranosyl trisaccharides were used that are trimers of the α or β anomers of D-glucose, D-galactose, D-mannose, L-fucose or L-rhamnose residues bonded through α or β glycosidic linkages of types 1→2, 1→3, 1→4, or 1→6, as well as methoxylated and/or N-acetylated amino trisaccharides. Machine learning experiments were performed for: (1) classification of the anomeric configuration of the first unit, second unit and reducing end; (2) classification of the type of first and second linkages; (3) classification of the three residues: reducing end, middle and first residue; and (4) classification of the chain type. Our previous model for predicting the structure of disaccharides was incorporated in this new model with an improvement of the predictive power. The best results were achieved using Random Forests with 204 di- and trisaccharides for the training set; it could correctly classify 83%, 90%, 88%, 85%, 85%, 75%, 79%, 68% and 94% of the test set (69 compounds) for the nine tasks, respectively, on the basis of unassigned chemical shifts.

Florbela Pereira



Measurements of Spin Structure Function G1(P) and G1(D) for Proton and Deuteron at SLAC E143  

International Nuclear Information System (INIS)

E143 was a high precision measurement of the proton and deuteron spin structure functions g1 and g2 in SLAC's End Station A facility, with longitudinally and transversely polarized NH3 and ND3 targets, and a longitudinally polarized electron beam. The experiment was done at beam energies of 29, 16 and 9.7 GeV. The deeply inelastic scattered electrons were detected by two independent spectrometers at 4.5° and 7° relative to the incident electron beam. At a beam energy of 29 GeV, the measurements covered the Bjorken x range from 0.03 to 0.8, and the Q^2 range from 1.2 (GeV/c)^2 to 9.8 (GeV/c)^2. It was found that the integral $\int_0^1 g_1^p(x, Q^2)\,dx$ is more than two standard deviations away from the Ellis-Jaffe sum rule, and the corresponding deuteron integral is more than three standard deviations away from the Ellis-Jaffe sum rule, but the Bjorken sum rule is consistent with the experimental data. Tests of the sum rules at different values of Q^2, and the implications of these results for the quark-parton model, have also been done.



Combining Sequence and Structural Profiles for Protein Solvent Accessibility Prediction  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Solvent accessibility is an important structural feature for a protein. We propose a new method for solvent accessibility prediction that uses known structure and sequence information more efficiently. We first estimate the relative solvent accessibility of the query protein using fuzzy mean operator from the solvent accessibilities of known structure fragments that have similar sequences to the query protein. We then integrate the estimated solvent accessibility and the position specific sco...



1D to 3D heterobimetallic complexes tuned by cyanide precursors: synthesis, crystal structures, and magnetic properties. (United States)

Five new heterobimetallic complexes, namely, {[Ni(L)][Fe(bpb)(CN)2]}ClO4 (L = 2,12-dimethyl-3,7,11,17-tetraazabicyclo[11.3.1]heptadeca-1(17),13,15-triene, bpb2- = 1,2-bis(pyridine-2-carboxamido)benzenate) (1), {[Ni(L)]3[M(CN)6]2}·7H2O (M = Fe (2), Cr (3)), {[Ni(L)]2[Mo(CN)8]}·CH3CN·13H2O (4), and {[Ni(L)]2[W(CN)8]}·16H2O (5), were assembled from the polyaza macrocycle nickel(II) compound and five cyanidometalate precursors containing different numbers of cyanide groups. Single-crystal X-ray diffraction analysis reveals different structures: a cyanide-bridged cationic polymeric single chain for 1, two-dimensional networks for 2 and 3, and three-dimensional networks for 4 and 5. In addition, a systematic investigation of the magnetic properties of 1-3 indicates ferromagnetic coupling between neighboring Fe(III)/Cr(III) and Ni(II) ions through the bridging cyanide group. For complex 1, the magnetic susceptibility has been simulated by the Seiden model using the Hamiltonian $H = -J\sum_{i=0}^{N} S_i S_{i+1}$, leading to a magnetic coupling constant of J = 3.67 cm^-1. The two-dimensional magnetic complexes exhibit three-dimensional magnetic ordering behavior with a magnetic phase transition temperature of TC = 4.0 K for 2 and TN = 6.0 K for 3, respectively. PMID:24655013

Zhang, Daopeng; Si, Weijiang; Wang, Ping; Chen, Xia; Jiang, Jianzhuang



Nucleosome DNA sequence structure of isochores  

Directory of Open Access Journals (Sweden)

Full Text Available Abstract Background Significant differences in G+C content between different isochore types suggest that the nucleosome positioning patterns in DNA of the isochores should be different as well. Results Extraction of the patterns from the isochore DNA sequences by Shannon N-gram extension reveals that while the general motif YRRRRRYYYYYR is characteristic of all isochore types, the dominant positioning patterns of the isochores vary between TAAAAATTTTTA and CGGGGGCCCCCG due to the large differences in G+C composition. This is observed in human, mouse and chicken isochores, demonstrating that the variations of the positioning patterns are largely G+C dependent rather than species-specific. The species-specificity of nucleosome positioning patterns is revealed by dinucleotide periodicity analyses of isochore sequences. While human sequences show CG periodicity, chicken isochores display AG (CT) periodicity. Mouse isochores show only very weak CG periodicity. Conclusions The nucleosome positioning pattern as revealed by Shannon N-gram extension is strongly dependent on G+C content and differs between isochores. Species-specificity of the pattern is subtle. It is reflected in the choice of preferentially periodical dinucleotides.

Trifonov Edward N
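
The relation between the general motif and the two dominant patterns in the abstract above can be checked directly, since Y and R are the standard IUPAC codes for pyrimidine (C or T) and purine (A or G). The sketch below is illustrative only; the helper name `motif_to_regex` and the third, non-matching test sequence are hypothetical and not taken from the paper.

```python
import re

# IUPAC degenerate codes used in the motif: Y = pyrimidine, R = purine.
IUPAC = {"Y": "[CT]", "R": "[AG]"}


def motif_to_regex(motif):
    """Translate a Y/R motif into a compiled regular expression (hypothetical helper)."""
    return re.compile("".join(IUPAC.get(ch, ch) for ch in motif))


pattern = motif_to_regex("YRRRRRYYYYYR")

# The two dominant patterns from the abstract, plus a made-up counterexample
# whose last base is a pyrimidine where the motif requires a purine.
candidates = ("TAAAAATTTTTA", "CGGGGGCCCCCG", "TAAAAATTTTTT")
matches = {seq: bool(pattern.fullmatch(seq)) for seq in candidates}
```

Both the A+T-rich and the G+C-rich pattern satisfy the same purine/pyrimidine motif, which is why the abstract treats them as G+C-dependent variants of one general pattern.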



Language as structured sequences: a causal role of Broca's region in sequence processing  

Digital Repository Infrastructure Vision for European Research (DRIVER)

In this thesis I approach language as a neurobiological system. I defend a sequence processing perspective on language and on the function of Broca's region in the left inferior frontal gyrus (LIFG). This perspective provides a way to express common structural aspects of language, music and action, which all engage the LIFG. It also facilitates the comparison of human language and structured sequence processing in animals. Research on infants, song-birds and non-human primates suggests ...

Uddén, Julia



Massively Parallel Sequencing Approaches for Characterization of Structural Variation  

Digital Repository Infrastructure Vision for European Research (DRIVER)

The emergence of next-generation sequencing (NGS) technologies offers an incredible opportunity to comprehensively study DNA sequence variation in human genomes. Commercially available platforms from Roche (454), Illumina (Genome Analyzer and Hiseq 2000), and Applied Biosystems (SOLiD) have the capability to completely sequence individual genomes to high levels of coverage. NGS data is particularly advantageous for the study of structural variation (SV) because it offers the sensitivity to de...

Koboldt, Daniel C.; Larson, David E.; Chen, Ken; Ding, Li; Wilson, Richard K.



The genome sequence and structure of rice chromosome 1.  

Full Text Available The genome sequence and structure of rice chromosome 1. Sasaki T Nature. 2002 Nov 21;420(6


Swelfe: a detector of internal repeats in sequences and structures  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Summary: Intragenic duplications of genetic material have important biological roles because of their protein sequence and structural consequences. We developed Swelfe to find internal repeats at three levels. Swelfe quickly identifies statistically significant internal repeats in DNA and amino acid sequences and in 3D structures using dynamic programming. The associated web server also shows the relationships between repeats at each level and facilitates visualization of the results.

Abraham, Anne-laure; Rocha, Eduardo P. C.; Pothier, Joël



Two novel 1-D helical chains Zn(II)/Cd(II) polymers based on tetrazolate-1-acetic acid: Crystal structures, solid state fluorescence and thermal behaviors (United States)

Two new d10 metal complexes with tetrazolate-1-acetic acid, [Zn(1-tza)(Cl)(H2O)] (1) and [Cd(1-tza)(phen)(NO3)] (2) (1-Htza = tetrazole-1-acetic acid, phen = 1,10-phenanthroline), have been prepared, and their structures have been characterized by single-crystal X-ray diffraction. The flexibility of the 1-tza ligands results in 1-D helical chain structures for the two complexes, in which the 1-tza ligands adopt different coordination modes: 1 with μ2-κO1:κN4 and 2 with μ2-κO1,O2:κN3. Compound 1 exhibits a nonracemic enantiopure topology, while compound 2 proves to be a mesomeric structure. The crystal packing in 1 and 2 is controlled mainly by hydrogen bonds and face-to-face π-π stacking interactions, respectively. Photoluminescence studies show that 1 and 2 exhibit strong luminescence. Moreover, compound 1 exhibits a second-order nonlinear optical coefficient equal to that of potassium dihydrogen phosphate (KDP). The thermal stability of the two complexes has also been investigated.

Lu, Ying-Bing; Jin, Shuang; Jian, Fang-Mei; Xie, Yong-Rong; Luo, Guo-Tian



Percolation of annotation errors through hierarchically structured protein sequence databases. (United States)

Databases of protein sequences have grown rapidly in recent years as a result of genome sequencing projects. Annotating protein sequences with descriptions of their biological function ideally requires careful experimentation, but this work lags far behind. Instead, biological function is often imputed by copying annotations from similar protein sequences. This gives rise to annotation errors, and more seriously, to chains of misannotation. An earlier study [Percolation of annotation errors in a database of protein sequences (2002)] developed a probabilistic framework for exploring the consequences of this percolation of errors through protein databases, and applied its theory to a simple database model. Here we apply the theory to hierarchically structured protein sequence databases, and draw conclusions about database quality at different levels of the hierarchy. PMID:15748731

Gilks, Walter R; Audit, Benjamin; de Angelis, Daniela; Tsoka, Sophia; Ouzounis, Christos A



Lipophilic bismuth phosphates: a molecular tetradecanuclear cage and a 1D-coordination polymer. Synthesis, structure and conversion to BiPO4. (United States)

The reaction of the phosphate monoester {(ArO)PO(OH)2} (Ar = 2,6-i-Pr2C6H3) with BiPh3 in a 1 : 1 ratio in refluxing toluene afforded a tetradecabismuth-oxo-phosphate cage [{(ArO)PO3}10{(ArO)PO2OH}2(Bi14O10)·2(CH3OH)]·3C6H12·3CH3OH·2H2O (Ar = 2,6-i-Pr2C6H3) (1). On the other hand, the reaction of the phosphate diester {((t)BuO)2PO(OH)} with BiPh3 in a 1 : 1 ratio at room temperature in ethanol afforded the 1D-coordination polymer [Bi(C6H5)2((t)BuO)2PO2]n (2). The molecular structure of 1 reveals that the cage is comprised of a central planar Bi6 rim and two Bi4 poles. The entire aggregate is held together by multiple coordination of O(2-), [(ArO)P(O)(OH)](-), [(ArO)PO3](2-) and methanol ligands. 2 is a 1D-coordination polymer in which adjacent bismuth centers are bridged by isobidentate [((t)BuO)2PO2](-) ligands. In solution, however, 2 decomposes into the monomeric repeat unit [Ph2Bi{((t)BuO)2PO2}], as indicated by ESI-MS studies. Thermolysis of 1 and 2 at 700 °C affords a pure phase of BiPO4. PMID:23632600

Chandrasekhar, Vadapalli; Metre, Ramesh K; Suriya Narayanan, Ramakirushnan



Accuracy of structure-based sequence alignment of automatic methods  

Directory of Open Access Journals (Sweden)

Abstract Background Accurate sequence alignments are essential for homology searches and for building three-dimensional structural models of proteins. Since structure is better conserved than sequence, structure alignments have been used to guide sequence alignments and are commonly used as the gold standard for sequence alignment evaluation. Nonetheless, as far as we know, there is no report of a systematic evaluation of pairwise structure alignment programs in terms of the sequence alignment accuracy. Results In this study, we evaluate CE, DaliLite, FAST, LOCK2, MATRAS, SHEBA and VAST in terms of the accuracy of the sequence alignments they produce, using sequence alignments from NCBI's human-curated Conserved Domain Database (CDD) as the standard of truth. We find that 4 to 9% of the residues on average are either not aligned or aligned with more than 8 residues of shift error and that an additional 6 to 14% of residues on average are misaligned by 1–8 residues, depending on the program and the data set used. The fraction of correctly aligned residues generally decreases as the sequence similarity decreases or as the RMSD between the Cα positions of the two structures increases. It varies significantly across CDD superfamilies whether shift error is allowed or not. Also, alignments with different shift errors occur between proteins within the same CDD superfamily, leading to inconsistent alignments between superfamily members. In general, residue pairs that are more than 3.0 Å apart in the reference alignment are heavily (>= 25% on average) misaligned in the test alignments. In addition, each method shows a different pattern of relative weaknesses for different SCOP classes. CE gives relatively poor results for β-sheet-containing structures (all-β, α/β, and α+β classes), DaliLite for the "others" class where all but the major four classes are combined, and LOCK2 and VAST for all-β and "others" classes. 
Conclusion When the sequence similarity is low, structure-based methods produce better sequence alignments than methods that use sequence similarity alone. However, current structure-based methods still misalign 11–19% of the conserved core residues when compared to the human-curated CDD alignments. The alignment quality of each program depends on the protein structural type and similarity, with DaliLite showing the most agreement with CDD on average.

Lee Byungkook
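The shift-error measure described in the abstract above can be illustrated with a small sketch. It is not the authors' evaluation code: the function names, the gap character "-", and the representation of an alignment as a pair of gapped strings are all assumptions made for illustration. For every residue pair in a reference alignment, it reports how far the partner residue has moved in a test alignment, then groups residues into correct, shifted by 1 to 8 positions, and missed (unaligned or shifted by more than 8).

```python
def residue_map(aln_a: str, aln_b: str) -> dict:
    """Map each residue index of sequence A to its aligned residue index in B."""
    mapping = {}
    i = j = 0  # running residue indices in the ungapped sequences A and B
    for ca, cb in zip(aln_a, aln_b):
        a_is_residue = ca != "-"
        b_is_residue = cb != "-"
        if a_is_residue and b_is_residue:
            mapping[i] = j
        if a_is_residue:
            i += 1
        if b_is_residue:
            j += 1
    return mapping


def shift_errors(ref: tuple, test: tuple) -> dict:
    """For residues aligned in the reference, return the shift in the test
    alignment (0 = correct), or None if the residue is unaligned in the test."""
    ref_map = residue_map(*ref)
    test_map = residue_map(*test)
    return {
        i: (abs(test_map[i] - j) if i in test_map else None)
        for i, j in ref_map.items()
    }


def summarize(errors: dict, max_shift: int = 8) -> dict:
    """Fractions of reference pairs that are correct, shifted by 1..max_shift,
    or missed (unaligned or shifted by more than max_shift)."""
    n = len(errors)
    if n == 0:
        return {"correct": 0.0, "shifted": 0.0, "missed": 0.0}
    correct = sum(1 for e in errors.values() if e == 0)
    shifted = sum(1 for e in errors.values() if e is not None and 1 <= e <= max_shift)
    return {
        "correct": correct / n,
        "shifted": shifted / n,
        "missed": (n - correct - shifted) / n,
    }


# Hypothetical toy alignments: the test alignment is offset by one column.
reference = ("ACDE", "ACDE")
test = ("ACDE-", "-ACDE")
print(summarize(shift_errors(reference, test)))
```

The default threshold of 8 mirrors the 1–8 residue window mentioned in the abstract; any other cutoff can be passed as max_shift.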



Structural variation from 1D to 3D: effects of ligands and solvents on the construction of lead(II)-organic coordination polymers. (United States)

A series of Pb(II) coordination polymers [Pb(ndc)(dpp)] (1), [Pb(ndc)(ptcp)].0.5 H2O (2), [Pb(ndc)(dppz)] (3), [Pb(ndc)(tcpn)(2)] (4), [Pb2(ndc)2(tcpp)] (5), [Pb(Hndc)2].H2O (6), [Pb(ndc)(dma)] (7), [Pb(bdc)(dma)] (8), [Pb(trans-chdc)(H2O)] (9), and [Pb2(cis-chdc)2].NH(CH3)2 (10), where ndc=1,4-naphthalenedicarboxylate, dpp=4,7-diphenyl-1,10-phenanthroline, ptcp=2-phenyl-1H-1,3,7,8-tetraazacyclopenta[l]phenanthrene, dppz=dipyrido[3,2-a:2',3'-c]phenazine, tcpn=2-(1H-1,3,7,8-tetraazacyclopenta[l]phenanthren-2-yl)naphthol, tcpp=4-(1H-1,3,7,8-tetraazacyclopenta[l]phenanthren-2-yl)phenol, dma=N,N-dimethylacetamide, bdc=1,4-benzenedicarboxylate, and chdc=1,4-cyclohexanedicarboxylate, have been synthesized from a hydrothermal or solvothermal reaction system by varying the ligands or the solvents. Compounds 1-5 crystallize with an N-donor chelating ligand and an aromatic dicarboxylate linker. Compounds 1-4 are 1D polymers with different pi-pi stacking interactions, whereas compound 5 consists of 2D layers. The structures of compounds 7, 8, and 10 are 3D frameworks formed by connection of the Pb(II) centers by organic acid ligands. Compound 7 is chiral although the ndc ligand is achiral, while the framework of 8 is a typical 3D (3,4)-connected net. Compound 10 is the first example of Pb(II) wheel cluster [Pb(8)O(8)] units bridged by carboxylate groups. Compound 6 contains 1D chains which are further extended to a 3D structure by pi-pi interactions. Compound 9 consists of a 2D network constructed by Pb(II) centers and trans-chdc ligands. The structural differences between 7 and 8 and between 9 and 10 indicate the importance of solvents for framework formation of the coordination polymers. By varying the solvent the cis and trans conformations of H(2)chdc in 9 and 10 were separated completely. The photoluminescence and nonlinear optical properties of the coordination polymers have also been investigated. PMID:17212363

Yang, Jin; Li, Guo-Dong; Cao, Jun-Jun; Yue, Qi; Li, Guang-Hua; Chen, Jie-Sheng



Synthesis, structures, and magnetic properties of novel mononuclear, tetranuclear, and 1D chain Mn(III) complexes involving three related asymmetrical trianionic ligands. (United States)

The manganese(III) complexes studied in this report derive from asymmetrical trianionic ligands abbreviated H(3)L(i) (i = 4-6). These ligands are obtained through reaction of salicylaldehyde with "half-units", the latter resulting from monocondensation of different diamines with phenylsalicylate. Upon deprotonation, L(i) (i = 4-6) possess an inner N(2)O(2) coordination site with one amido, one imine, and two phenoxo functions, and an outer amido oxygen donor. The trianionic character of such ligands yields original neutral complexes with the L/Mn stoichiometry. The crystal and molecular structures of three complexes have been determined at 190 K (1) or 180 K (2 and 3). Complex 1 crystallizes in the triclinic space group P-1 (No. 2): a = 7.8582(14) Å, b = 10.9225(16) Å, c = 12.4882(18) Å, alpha = 67.231(14) degrees, beta = 72.134(14) degrees, gamma = 82.589(13) degrees, V = 940.6(3) Å(3), Z = 2. Complex 2 crystallizes in the orthorhombic space group Pbcn (No. 60): a = 23.8283(15) Å, b = 11.1605(7) Å, c = 26.152(2) Å, V = 6954.8(8) Å(3), Z = 8, while complex 3 crystallizes in the monoclinic space group P2(1)/c (No. 14) with a = 11.7443(14) Å, b = 7.5996(10) Å, c = 18.029(2) Å, beta = 100.604(10) degrees, V = 1581.6(3) Å(3), Z = 4. Owing to hydrogen bonds and pi-pi stacking, the mononuclear neutral molecules of 1 are arranged in a 2D network while complexes 2 and 3 are tetranuclear and polymeric (1D chain) species, respectively, owing to the bridging ability of the oxygen atom of the amido function. The experimental magnetic susceptibilities of complexes 2 and 3 indicate the occurrence of similarly weak Mn(III)-Mn(III) antiferromagnetic interactions (J = -1.1 cm(-1)). 
Single ion zero-field splitting of manganese(III) must be taken into account for satisfactorily fitting the data by exact calculation of the energy levels associated with the spin Hamiltonian through diagonalization of the full matrix for axial symmetry in 2 (J = -1.1 cm(-1), D(1) = 2.2 cm(-1), D(2) = -2.8 cm(-1)), D(1) and D(2) being associated with the six- and five-coordinate Mn ions, respectively. A weaker antiferromagnetic interaction (J = -0.2 cm(-1)) operates through pi-pi stacking in complex 1. Complex 3 is a weak ferromagnet (ordering temperature approximately 7 K) as a result of the spin canting originating from the crystal packing. PMID:15074994

Costes, Jean-Pierre; Dahan, Françoise; Donnadieu, Bruno; Rodriguez Douton, Maria-Jesus; Fernandez Garcia, Maria-Isabel; Bousseksou, Azzedine; Tuchagues, Jean-Pierre



Sequence and Structural Analyses for Functional Non-coding RNAs (United States)

Analysis and detection of functional RNAs are currently important topics in both molecular biology and bioinformatics research. Several computational methods based on stochastic context-free grammars (SCFGs) have been developed for modeling and analysing functional RNA sequences. These grammatical methods have succeeded in modeling typical secondary structures of RNAs and are used for structural alignments of RNA sequences. Such stochastic models, however, are not sufficient to discriminate member sequences of an RNA family from non-members, and hence to detect non-coding RNA regions from genome sequences. Recently, the support vector machine (SVM) and kernel function techniques have been actively studied and proposed as a solution to various problems in bioinformatics. SVMs are trained from positive and negative samples and have strong, accurate discrimination abilities, and hence are more appropriate for the discrimination tasks. A few kernel functions that extend the string kernel to measure the similarity of two RNA sequences from the viewpoint of secondary structures have been proposed. In this article, we give an overview of recent progress in SCFG-based methods for RNA sequence analysis and novel kernel functions tailored to measure the similarity of two RNA sequences and developed for use with support vector machines (SVM) in discriminating members of an RNA family from non-members.

Sakakibara, Yasubumi; Sato, Kengo
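As a rough illustration of the string kernels that the overview above takes as its starting point, the sketch below implements a plain k-mer spectrum kernel. This is a deliberately simpler stand-in: it compares sequences by shared substrings only and ignores secondary structure, so it is not one of the structure-aware kernels the article surveys. The function names and the default k = 3 are assumptions.

```python
import math
from collections import Counter


def kmer_counts(seq: str, k: int) -> Counter:
    """Count every overlapping substring of length k in seq."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def spectrum_kernel(s: str, t: str, k: int = 3) -> int:
    """Inner product of the k-mer count vectors of s and t."""
    cs, ct = kmer_counts(s, k), kmer_counts(t, k)
    return sum(cs[m] * ct[m] for m in cs.keys() & ct.keys())


def normalized_kernel(s: str, t: str, k: int = 3) -> float:
    """Cosine-normalized spectrum kernel, in [0, 1] for non-empty spectra."""
    norm = math.sqrt(spectrum_kernel(s, s, k) * spectrum_kernel(t, t, k))
    return spectrum_kernel(s, t, k) / norm if norm else 0.0


# Hypothetical toy sequences, not taken from any RNA family.
print(normalized_kernel("GGGAAACCC", "GGGAUACCC"))
```

A kernel of this form yields a Gram matrix that can be passed to any SVM implementation accepting precomputed kernels; the structure-aware kernels discussed in the article replace the substring features with features derived from base pairing.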


Pairwise local structural alignment of RNA sequences with sequence similarity less than 40%  

DEFF Research Database (Denmark)

Motivation: Searching for non-coding RNA (ncRNA) genes and structural RNA elements (eleRNA) is a major challenge in gene finding today, as these are often conserved in structure rather than in sequence. Even though the number of available methods is growing, it is still of interest to detect pairs of genes with low sequence similarity, where the genes are part of a larger genomic region. Results: Here we present such an approach for pairwise local alignment which is based on FOLDALIGN and the Sankoff algorithm for simultaneous structural alignment of multiple sequences. We include the ability to conduct mutual scans of two sequences of arbitrary length while searching for common local structural motifs of some maximum length. This drastically reduces the complexity of the algorithm. The scoring scheme includes structural parameters corresponding to those available for free energy as well as for substitution matrices similar to RIBOSUM. The new FOLDALIGN implementation is tested on a dataset where the ncRNAs and eleRNAs have sequence similarity <40% and where the ncRNAs and eleRNAs are energetically indistinguishable from the surrounding genomic sequence context. The method is tested in two ways: (1) its ability to find the common structure between the genes only and (2) its ability to locate ncRNAs and eleRNAs in a genomic context. In case (1), it makes sense to compare with methods like Dynalign, and the performances are very similar, but FOLDALIGN is substantially faster. The structure prediction performance for a family is typically around 0.7 as measured by the Matthews correlation coefficient. In case (2), the algorithm is successful at locating RNA families with an average sensitivity of 0.8 and a positive predictive value of 0.9 using a BLAST-like hit selection scheme. Availability: The program is available online at Contact:

Havgaard, Jakob Hull; Lyngsø, Rune B.
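The three performance figures quoted in the abstract above (sensitivity, positive predictive value, and Matthews correlation coefficient) all derive from the same confusion counts. The sketch below gives the standard definitions. The counts in the usage lines are hypothetical, chosen only so that sensitivity and PPV come out at 0.8 and 0.9; they are not data from the study.

```python
import math


def sensitivity(tp: int, fn: int) -> float:
    """Fraction of true items that were found: TP / (TP + FN)."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def ppv(tp: int, fp: int) -> float:
    """Fraction of reported items that are true: TP / (TP + FP)."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def mcc(tp: int, fp: int, tn: int, fn: int) -> float:
    """Matthews correlation coefficient; defined as 0.0 when a margin is empty."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical counts for illustration only.
tp, fp, tn, fn = 36, 4, 51, 9
print(sensitivity(tp, fn), ppv(tp, fp), mcc(tp, fp, tn, fn))
```

Unlike sensitivity and PPV, the MCC also uses the true negatives, which is why it is commonly reported as a single balanced score for base-pair prediction.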



Data structures and compression algorithms for genomic sequence data  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Motivation: The continuing exponential accumulation of full genome data, including full diploid human genomes, creates new challenges not only for understanding genomic structure, function and evolution, but also for the storage, navigation and privacy of genomic data. Here, we develop data structures and algorithms for the efficient storage of genomic and other sequence data that may also facilitate querying and protecting the data.

Brandon, Marty C.; Wallace, Douglas C.; Baldi, Pierre



Implicit transfer of reversed temporal structure in visuomotor sequence learning. (United States)

Some spatio-temporal structures are easier to transfer implicitly in sequential learning. In this study, we investigated whether the consistent reversal of triads of learned components would support the implicit transfer of their temporal structure in visuomotor sequence learning. A triad comprised three sequential button presses ([1][2][3]) and seven consecutive triads comprised a sequence. Participants learned sequences by trial and error, until they could complete it 20 times without error. Then, they learned another sequence, in which each triad was reversed ([3][2][1]), partially reversed ([2][1][3]), or switched so as not to overlap with the other conditions ([2][3][1] or [3][1][2]). Even when the participants did not notice the alternation rule, the consistent reversal of the temporal structure of each triad led to better implicit transfer; this was confirmed in a subsequent experiment. These results suggest that the implicit transfer of the temporal structure of a learned sequence can be influenced by both the structure and consistency of the change. PMID:24215394

Tanaka, Kanji; Watanabe, Katsumi



Accurate multiple sequence-structure alignment of RNA sequences using combinatorial optimization  

Directory of Open Access Journals (Sweden)

Abstract Background The discovery of functional non-coding RNA sequences has led to an increasing interest in algorithms related to RNA analysis. Traditional sequence alignment algorithms, however, fail at computing reliable alignments of low-homology RNA sequences. The spatial conformation of RNA sequences largely determines their function, and therefore RNA alignment algorithms have to take structural information into account. Results We present a graph-based representation for sequence-structure alignments, which we model as an integer linear program (ILP). We sketch how we compute an optimal or near-optimal solution to the ILP using methods from combinatorial optimization, and present results on a recently published benchmark set for RNA alignments. Conclusion The implementation of our algorithm yields better alignments in terms of two published scores than the other programs that we tested; this is especially the case with an increasing number of input sequences. Our program LARA is freely available for academic purposes from

Klau Gunnar W



Dinuclear and 1D iron(III) Schiff base complexes bridged by 4-salicylideneamino-1,2,4-triazolate: X-ray structures and magnetic properties. (United States)

Four new iron(III) complexes were obtained by the reaction of 4-salicylideneamino-1,2,4-triazole (Hsaltrz) and selected dinuclear μ-oxo-bridged iron(III) Schiff base complexes [{FeL(4)}(2)(μ-O)], where L(4) represents a terminal tetradentate dianionic Schiff-base ligand. X-ray structural analysis revealed a novel bridging mode of κN,κO of the saltrz ligand to form dinuclear complexes [{Fe(salen)(μ-saltrz)}(2)]·CH(3)OH (1) (H(2)salen = N,N'-ethylenebis(salicylimine)) and [{Fe(salpn)(μ-saltrz)}(2)] (2) (H(2)salpn = N,N'-1,2-propylenebis(salicylimine)), whereas one-dimensional (1D) zig-zag chains were formed in the case of [{Fe(salch)(μ-saltrz)}·0.5CH(3)OH](n) (3) (H(2)salch = N,N'-cyclohexanebis(salicylimine)) and [Fe(salophen)(μ-saltrz)](n) (4) (H(2)salophen = N,N'-o-phenylenebis(salicylimine)). It was also shown that the rigidity of the terminal ligand L(4) can be considered as the key factor for the molecular dimensionality of the products. The thorough magnetic analysis based on SQUID experiments, including the isotropic exchange and the zero-field splitting of both temperature and field dependent data, was performed for dimeric (1 and 2) and also for polymeric compounds (3 and 4) and revealed weak antiferromagnetic exchange mediated by the saltrz anions with much larger D-parameter (|D| ≫ |J|). PMID:21968851

Herchel, Radovan; Pavelek, Lubomír; Trávníček, Zdeněk



1D to 2D Na+ Ion Diffusion Inherently Linked to Structural Transitions in Na0.7CoO2 (United States)

We report the observation of a stepwise “melting” of the low-temperature Na-vacancy order in the layered transition-metal oxide Na0.7CoO2. High-resolution neutron powder diffraction analysis indicates the existence of two first-order structural transitions, one at T1 ≈ 290 K followed by a second at T2 ≈ 400 K. Detailed analysis strongly suggests that both transitions are linked to changes in the Na mobility. Our data are consistent with a two-step disappearance of Na-vacancy order through the successive opening of first quasi-1D (T1 < T < T2) and then 2D (T > T2) Na diffusion paths. These results shed new light on previous, seemingly incompatible, experimental interpretations regarding the relationship between Na-vacancy order and Na dynamics in this material. They also represent an important step towards the tuning of physical properties and the design of tailored functional materials through an improved control and understanding of ionic diffusion.

Medarde, M.; Mena, M.; Gavilano, J. L.; Pomjakushina, E.; Sugiyama, J.; Kamazawa, K.; Pomjakushin, V. Yu.; Sheptyakov, D.; Batlogg, B.; Ott, H. R.; Månsson, M.; Juranyi, F.



Mapping and sequencing of structural variation from eight human genomes (United States)

Genetic variation among individual humans occurs on many different scales, ranging from gross alterations in the human karyotype to single nucleotide changes. Here we explore variation on an intermediate scale—particularly insertions, deletions and inversions affecting from a few thousand to a few million base pairs. We employed a clone-based method to interrogate this intermediate structural variation in eight individuals of diverse geographic ancestry. Our analysis provides a comprehensive overview of the normal pattern of structural variation present in these genomes, refining the location of 1,695 structural variants. We find that 50% were seen in more than one individual and that nearly half lay outside regions of the genome previously described as structurally variant. We discover 525 new insertion sequences that are not present in the human reference genome and show that many of these are variable in copy number between individuals. Complete sequencing of 261 structural variants reveals considerable locus complexity and provides insights into the different mutational processes that have shaped the human genome. These data provide the first high-resolution sequence map of human structural variation—a standard for genotyping platforms and a prelude to future individual genome sequencing projects.

Kidd, Jeffrey M.; Cooper, Gregory M.; Donahue, William F.; Hayden, Hillary S.; Sampas, Nick; Graves, Tina; Hansen, Nancy; Teague, Brian; Alkan, Can; Antonacci, Francesca; Haugen, Eric; Zerr, Troy; Yamada, N. Alice; Tsang, Peter; Newman, Tera L.; Tuzun, Eray; Cheng, Ze; Ebling, Heather M.; Tusneem, Nadeem; David, Robert; Gillett, Will; Phelps, Karen A.; Weaver, Molly; Saranga, David; Brand, Adrianne; Tao, Wei; Gustafson, Erik; McKernan, Kevin; Chen, Lin; Malig, Maika; Smith, Joshua D.; Korn, Joshua M.; McCarroll, Steven A.; Altshuler, David A.; Peiffer, Daniel A.; Dorschner, Michael; Stamatoyannopoulos, John; Schwartz, David; Nickerson, Deborah A.; Mullikin, James C.; Wilson, Richard K.; Bruhn, Laurakay; Olson, Maynard V.; Kaul, Rajinder; Smith, Douglas R.; Eichler, Evan E.



Evolutionarily consistent families in SCOP: sequence, structure and function  

Directory of Open Access Journals (Sweden)

Abstract Background SCOP is a hierarchical domain classification system for proteins of known structure. The superfamily level has a clear definition: Protein domains belong to the same superfamily if there is structural, functional and sequence evidence for a common evolutionary ancestor. Superfamilies are sub-classified into families; however, there is not such a clear basis for the family level groupings. Do SCOP families group together domains with sequence similarity, do they group domains with similar structure or by common function? It is these questions we answer, but most importantly, whether each family represents a distinct phylogenetic group within a superfamily. Results Several phylogenetic trees were generated for each superfamily: one derived from a multiple sequence alignment, one based on structural distances, and the final two from presence/absence of GO terms or EC numbers assigned to domains. The topologies of the resulting trees and confidence values were compared to the SCOP family classification. Conclusions We show that SCOP family groupings are evolutionarily consistent to a very high degree with respect to classical sequence phylogenetics. The trees built from (automatically generated) structural distances correlate well, but are not always consistent with SCOP (hand annotated) groupings. Trees derived from functional data are less consistent with the family level than those from structure or sequence, though the majority still agree. Much of GO and EC annotation applies directly to one family or subset of the family; relatively few terms apply at the superfamily level. Maximum sequence diversity within a family is on average 22% but close to zero for superfamilies.

Pethica Ralph B



Synthesis and Structure of 1D Na6 Cluster Chain with Short Na-Na Distance: Organic like Aromaticity in Inorganic Metal Cluster  

CERN Document Server

A unique 1D chain of sodium clusters containing (Na6) rings stabilized by a molybdenum-containing metalloligand has been synthesized and characterized. DFT calculations show a striking resemblance in their aromatic behaviour to the corresponding hydrocarbon analogues.

Bhattacharjee, Manish; Chattaraj, Pratim K.; Khatua, Snehadrinarayan; Roy, Debesh R.



1D Aging  

CERN Document Server

We derive exact expressions for a number of aging functions that are scaling limits of non-equilibrium correlations, R(tw,tw+t) as tw --> infinity with t/tw --> theta, in the 1D homogenous q-state Potts model for all q with T=0 dynamics following a quench from infinite temperature. One such quantity is (the two-point, two-time correlation function) when n/sqrt(tw) --> z. Exact, closed-form expressions are also obtained when one or more interludes of infinite temperature dynamics occur. Our derivations express the scaling limit via coalescing Brownian paths and a ``Brownian space-time spanning tree,'' which also yields other aging functions, such as the persistence probability of no spin flip at 0 between tw and tw+t.

Fontes, L R; Newman, C M; Stein, D L



Statistical mechanics of secondary structures formed by random RNA sequences  

CERN Document Server

The formation of secondary structures by a random RNA sequence is studied as a model system for the sequence-structure problem omnipresent in biopolymers. Several toy energy models are introduced to allow detailed analytical and numerical studies. First, a two-replica calculation is performed. By mapping the two-replica problem to the denaturation of a single homogeneous RNA in 6-dimensional embedding space, we show that sequence disorder is perturbatively irrelevant, i.e., an RNA molecule with weak sequence disorder is in a molten phase where many secondary structures with comparable total energy coexist. A numerical study of various models at high temperature reproduces behaviors characteristic of the molten phase. On the other hand, a scaling argument based on the extremal statistics of rare regions can be constructed to show that the low temperature phase is unstable to sequence disorder. We performed a detailed numerical study of the low temperature phase using the droplet theory as a guide, and characte...

Bundschuh, R



Polymorphisms in CD1d affect antigen presentation and the activation of CD1d-restricted T cells  

Digital Repository Infrastructure Vision for European Research (DRIVER)

CD1 proteins constitute a distinct lineage of antigen-presenting molecules specialized for the presentation of lipid antigens to T cells. In contrast to the extensive sequence polymorphism characteristic of classical MHC molecules, CD1 proteins exhibit limited sequence diversity. Here, we describe the identification and characterization of CD1d alleles in wild-derived mouse strains. We demonstrate that polymorphisms in CD1d affect the presentation of endogenous and exogenous ligands to CD1d-r...

Zimmer, Michael I.; Nguyen, Hanh P.; Wang, Bin; Xu, Honglin; Colmone, Angela; Felio, Kyrie; Choi, Hak-jong; Zhou, Ping; Alegre, Maria-luisa; Wang, Chyung-ru



Alignment of multiple protein structures based on sequence and structure features  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Comparing the structures of proteins is crucial to gaining insight into protein evolution and function. Here, we align the sequences of multiple protein structures by a dynamic programming optimization of a scoring function that is a sum of an affine gap penalty and terms dependent on various sequence and structure features (SALIGN). The features include amino acid residue type, residue position, residue accessible surface area, residue secondary structure state and the conformation of a shor...



From RNA Sequences to Folding Pathways and Structures: a Perspective (United States)

My talk today concerns RNA folding. Our group is trying to understand the process of RNA folding, going from its primary nucleotide sequence to its secondary structure. The approach we are developing is somewhat complementary to those Michael Zuker and Peter Schuster presented earlier today ...

Isambert, Hervé



Methodical study on the estimation of strain in shearing and rotating structures using radio frequency ultrasound based on 1-D and 2-D strain estimation techniques. (United States)

This simulation study is concerned with: 1) the feasibility of measuring rotation and 2) the assessment of the performance of strain estimation in shearing and rotating structures. The performance of 3 different radio frequency (RF) based methods is investigated. Linear array ultrasound data of a deforming block were simulated (axial shear strain = 2.0, 4.0, and 6.0%, vertical strain = 0.0, 1.0, and 2.0%). Furthermore, data of a rotating block were simulated over an angular range of 0.5 degrees to 10 degrees. Local displacements were estimated using a coarse-to-fine algorithm using 1-D and 2-D precompression kernels. A new estimation method was developed in which axial displacements were used to correct the search area for local axial motion. The study revealed that this so-called free-shape 2-D method outperformed the other 2 methods and produced more accurate displacement images. For higher axial shear strains, the variance of the axial strain and the axial shear strain was reduced by a factor of 4 to 5. Rotations could be accurately measured up to 4.0 to 5.0 degrees. Again, the free-shape 2-D method yielded the most accurate results. After reconstruction of the rotation angle, the mean angles were slightly underestimated. The precision of the strain estimates decreased with increasing rotation angles. In conclusion, the proposed free-shape 2-D method enhances the measurement of (axial shear) strains and rotation. Experimental validation of the new method still has to be performed. PMID:20378448

Lopata, Richard; Hansen, Hendrik; Nillesen, Maartje; Thijssen, Johan; Kapusta, Livia; de Korte, Chris



Sequence, structure, function, immunity: structural genomics of costimulation  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Costimulatory receptors and ligands trigger the signaling pathways that are responsible for modulating the strength, course and duration of an immune response. High-resolution structures have provided invaluable mechanistic insights by defining the chemical and physical features underlying costimulatory receptor/ligand specificity, affinity, oligomeric state, and valency. Furthermore, these structures revealed general architectural features that are important for the integration of these inte...

Chattopadhyay, Kausik; Lazar-molnar, Eszter; Yan, Qingrong; Rubinstein, Rotem; Zhan, Chenyang; Vigdorovich, Vladimir; Ramagopal, Udupi A.; Bonanno, Jeffrey; Nathenson, Stanley G.; Almo, Steven C.



Formation of 1D hierarchical structures composed of Ni{sub 3}S{sub 2} nanosheets on CNTs backbone for supercapacitors and photocatalytic H{sub 2} production  

Energy Technology Data Exchange (ETDEWEB)

One-dimensional (1D) hierarchical structures composed of Ni{sub 3}S{sub 2} nanosheets grown on carbon nanotube (CNT) backbone (denoted as CNT@Ni{sub 3}S{sub 2}) are fabricated by a rational multi-step transformation route. The first step involves coating the CNT backbone with a layer of silica to form CNT@SiO{sub 2}, which serves as the substrate for the growth of nickel silicate (NiSilicate) nanosheets in the second step to form CNT@SiO{sub 2}@NiSilicate core-double shell 1D structures. Finally the as-formed CNT@SiO{sub 2}@NiSilicate 1D structures are converted into CNT-supported Ni{sub 3}S{sub 2} nanosheets via hydrothermal treatment in the presence of Na{sub 2}S. Simultaneously the intermediate silica layer is eliminated during the hydrothermal treatment, leading to the formation of CNT@Ni{sub 3}S{sub 2} nanostructures. Because of the unique hybrid nano-architecture, the as-prepared 1D hierarchical structure is shown to exhibit excellent performance in both supercapacitors and photocatalytic H{sub 2} production. (Copyright © 2012 WILEY-VCH Verlag GmbH and Co. KGaA, Weinheim)

Zhu, Ting; Wu, Hao Bin; Wang, Yabo; Xu, Rong; Lou, Xiong Wen [David] [School of Chemical and Biomedical Engineering, Nanyang Technological University, 70 Nanyang Drive, Singapore 637457 (Singapore)



Unusual structures of the tandem repetitive DNA sequences located at human centromeres. (United States)

The presence of the highly conserved repetitive DNA sequence d(AATGG)n.d(CCATT)n in human centromeres argues for a special role for this sequence in recognition, most probably through the formation of an unusual structure during mitosis. Quantitative one- and two-dimensional nuclear magnetic resonance (1D/2D NMR) spectroscopic studies reveal that the Watson-Crick duplex d(AATGG)n.d(CCATT)n adopts the usual B-DNA conformation as illustrated by taking d(AATGG)3.d(CCATT)3 as an example, whereas the d(CCATT)n strand is essentially a random coil. In contrast, the d(AATGG)n strand adopts an unusual stem-loop motif for repeat lengths n = 2, 3, 4, and 6. In addition to normal Watson-Crick A.T pairs, the stem-loop structures are stabilized by mismatched A.G and G.G pairs in the stem and G-G-A stacking in the loop. Stem-loop structures of d(AATGG)n are independently verified by gel electrophoresis and nuclease digestion studies and were also previously shown to be as stable as the corresponding Watson-Crick duplex d(AATGG)n.d(CCATT)n [Grady et al. (1992) Proc. Natl. Acad. Sci. U.S.A. 89, 1695-1699]. Therefore, the sequence d(AATGG)n can, indeed, nucleate a stem-loop structure at little free energy cost, and if, during mitosis, it is located on the chromosome surface, it can provide specific recognition sites for kinetochore function. PMID:8142384

Catasti, P; Gupta, G; Garcia, A E; Ratliff, R; Hong, L; Yau, P; Moyzis, R K; Bradbury, E M



High-Throughput Sequencing Based Methods of RNA Structure Investigation  

DEFF Research Database (Denmark)

In this thesis we describe the development of four related methods for RNA structure probing that utilize massively parallel sequencing. Using them, we were able to gather structural data for multiple, long molecules simultaneously. First, we have established an easy-to-follow experimental and computational protocol for detecting the reverse transcription termination sites (RTTS-Seq). This protocol was subsequently applied to hydroxyl radical footprinting of three-dimensional RNA structures to give a probing signal that correlates well with the RNA backbone solvent accessibility. Moreover, we applied RTTS-Seq to detect antisense oligonucleotide binding sites within a transcriptome. In this case, we applied an enrichment strategy to greatly reduce the background. Finally, we have modified the RTTS-Seq to study the secondary structure of 3′ untranslated regions. In the course of this thesis we describe several computational methods: one that alleviates PCR bias by estimating the number of unique molecules existing before the amplification, and two methods for data normalization, one applicable when paired-end sequencing is performed, and the other that works with single-read sequencing with known priming sites.

Kielpinski, Lukasz Jan



Metal-binding loop length and not sequence dictates structure. (United States)

The C-terminal copper-binding loop in the beta-barrel fold of the cupredoxin azurin has been replaced with a range of sequences containing alanine, glycine, and valine residues to assess the importance of amino acid composition and the length of this region. The introduction of 2 and 4 alanines between the coordinating Cys, His, and Met results in loop structures matching those in naturally occurring proteins with the same loop lengths. A loop with 4 alanines between the Cys and His and 3 between the His and Met ligands has a structure identical to that of the WT protein, whose loop is the same length. Loop structure is dictated by length and not sequence allowing the properties of the main surface patch for interactions with partners, to which the loop is a major contributor, to be optimized. Loops with 2 amino acids between the ligands using glycine, alanine, and valine residues have been compared. An empirical relationship is found between copper site protection by the loop and reduction potential. A loop adorned with 4 methyl groups is sufficient to protect the copper ion, enabling most sequences to adequately perform this task. The mutant with 3 alanine residues between the ligands forms a strand-swapped dimer in the crystal structure, an arrangement that has not, to our knowledge, been seen previously for this family of proteins. Cupredoxins function as redox shuttles and are required to be monomeric; therefore, none have evolved with a metal-binding loop of this length. PMID:19299503

Sato, Katsuko; Li, Chan; Salard, Isabelle; Thompson, Andrew J; Banfield, Mark J; Dennison, Christopher



Sequence and structural conservation in RNA ribose zippers  

Energy Technology Data Exchange (ETDEWEB)

The ribose zipper, an important element of RNA tertiary structure, is characterized by consecutive hydrogen-bonding interactions between ribose 2'-hydroxyls from different regions of an RNA chain or between RNA chains. These tertiary contacts have previously been observed to also involve base-backbone and base-base interactions (A-minor type). We searched for ribose zipper tertiary interactions in the crystal structures of the large ribosomal subunit RNAs of Haloarcula marismortui and Deinococcus radiodurans, and the small ribosomal subunit RNA of Thermus thermophilus and identified a total of 97 ribose zippers. Of these, 20 were found in T. thermophilus 16 S rRNA, 44 in H. marismortui 23 S rRNA (plus 2 bridging 5 S and 23 S rRNAs) and 30 in D. radiodurans 23 S rRNA (plus 1 bridging 5 S and 23 S rRNAs). These were analyzed in terms of sequence conservation, structural conservation and stability, location in secondary structure, and phylogenetic conservation. Eleven types of ribose zippers were defined based on ribose-base interactions. Of these 11, seven were observed in the ribosomal RNAs. The most common of these is the canonical ribose zipper, originally observed in the P4-P6 group I intron fragment. All ribose zippers were formed by antiparallel chain interactions and only a single example extended beyond two residues, forming an overlapping ribose zipper of three consecutive residues near the small subunit A-site. Almost all ribose zippers link stem (Watson-Crick duplex) or stem-like (base-paired) chain segments with loop (external, internal, or junction) chain segments. About two-thirds of the observed ribose zippers interact with ribosomal proteins. Most of these ribosomal proteins bridge the ribose zipper chain segments with basic amino acid residues hydrogen bonding to the RNA backbone. Proteins involved in crucial ribosome function and in early stages of ribosomal assembly also stabilize ribose zipper interactions. All ribose zippers show strong sequence conservation both within these three ribosomal RNA structures and in a large database of aligned prokaryotic sequences. The physical basis of the sequence conservation is stacked base triples formed between consecutive base-pairs on the stem or stem-like segment with bases (often adenines) from the loop-side segment. These triples have previously been characterized as Type I and Type II A-minor motifs and are stabilized by base-base and base-ribose hydrogen bonds. The sequence and structure conservation of ribose zippers can be directly used in tertiary structure prediction and may have applications in molecular modeling and design.

Tamura, Makio; Holbrook, Stephen R.



The structure of verbal sequences analyzed with unsupervised learning techniques  

CERN Document Server

Data mining allows the exploration of sequences of phenomena, whereas one usually tends to focus on isolated phenomena or on the relation between two phenomena. It offers invaluable tools for theoretical analyses and exploration of the structure of sentences, texts, dialogues, and speech. We report here the results of an attempt at using it for inspecting sequences of verbs from French accounts of road accidents. This analysis comes from an original approach of unsupervised training allowing the discovery of the structure of sequential data. The entries of the analyzer were only made of the verbs appearing in the sentences. It provided a classification of the links between two successive verbs into four distinct clusters, allowing thus text segmentation. We give here an interpretation of these clusters by applying a statistical analysis to independent semantic annotations.

Recanati, Catherine; Bennani, Younès



Sequence-structure analysis of FAD-containing proteins  

Digital Repository Infrastructure Vision for European Research (DRIVER)

We have analyzed structure-sequence relationships in 32 families of flavin adenine dinucleotide (FAD)-binding proteins, to prepare for genomic-scale analyses of this family. Four different FAD-family folds were identified, each containing two or more protein families. Three of these families, exemplified by glutathione reductase (GR), ferredoxin reductase (FR), and p-cresol methylhydroxylase (PCMH) were previously defined, and a family represented by pyruvate oxidase (PO) is newly de...

Dym, Orly; Eisenberg, David



Finding Structure in Text, Genome and Other Symbolic Sequences  

Digital Repository Infrastructure Vision for European Research (DRIVER)

The statistical methods derived and described in this thesis provide new ways to elucidate the structural properties of text and other symbolic sequences. Generically, these methods allow detection of a difference in the frequency of a single feature, the detection of a difference between the frequencies of an ensemble of features and the attribution of the source of a text. These three abstract tasks suffice to solve problems in a wide variety of settings. Furthermore, the ...

Dunning, Ted



Chromatin structure characteristics of pre-miRNA genomic sequences  

Directory of Open Access Journals (Sweden)

Background: MicroRNAs (miRNAs) are non-coding RNAs with important roles in regulating gene expression. Recent studies indicate that transcription and cleavage of miRNA are coupled, and that chromatin structure may influence miRNA transcription. However, little is known about the relationship between the chromatin structure and cleavage of pre-miRNA from pri-miRNA. Results: By analysis of genome-wide nucleosome positioning data sets from human and Caenorhabditis elegans (C. elegans), we found an enrichment of positioned nucleosomes on pre-miRNA genomic sequences, which is highly correlated with GC content within pre-miRNA. In addition, obvious enrichments of three histone modifications (H2BK5me1, H3K36me3 and H4K20me1) as well as RNA Polymerase II (RNAPII) were observed on pre-miRNA genomic sequences corresponding to the active-promoter miRNAs and expressed miRNAs. Conclusion: Our results revealed the chromatin structure characteristics of pre-miRNA genomic sequences, and implied potential mechanisms that can recognize these characteristics, thus improving pre-miRNA cleavage.

Teng Mingxiang



Early-Stage Folding in Proteins (In Silico) Sequence-to-Structure Relation  

Directory of Open Access Journals (Sweden)

A sequence-to-structure library has been created based on the complete PDB database. The tetrapeptide was selected as a unit representing a well-defined structural motif. Seven structural forms were introduced for structure classification. The early-stage folding conformations were used as the objects for structure analysis and classification. The degree of determinability was estimated for the sequence-to-structure and structure-to-sequence relations. Probability calculus and informational entropy were applied for quantitative estimation of the mutual relation between them. The structural motifs representing different forms of loops and bends were found to favor particular sequences in structure-to-sequence analysis.

Brylinski Michał
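
One way to read the "degree of determinability" described in this abstract is as a conditional entropy: how many bits of uncertainty about the structural class remain once the tetrapeptide is known, and vice versa. The sketch below is only an illustration of that idea, not the authors' procedure. The function names, the tetrapeptides, and the single-letter class labels are all hypothetical, and the four observations are toy data.

```python
import math
from collections import Counter
from typing import Dict, Hashable, Iterable, List, Tuple


def shannon_entropy(counts: Iterable[int]) -> float:
    """Shannon entropy (bits) of the distribution given by raw counts."""
    positive = [c for c in counts if c > 0]
    total = sum(positive)
    return -sum((c / total) * math.log2(c / total) for c in positive)


def conditional_entropy(pairs: Iterable[Tuple[Hashable, Hashable]]) -> float:
    """H(Y | X) in bits, estimated from observed (x, y) pairs."""
    observed: List[Tuple[Hashable, Hashable]] = list(pairs)
    by_x: Dict[Hashable, Counter] = {}
    for x, y in observed:
        by_x.setdefault(x, Counter())[y] += 1
    n = len(observed)
    # Weight the entropy of each conditional distribution by P(x).
    return sum(
        (sum(c.values()) / n) * shannon_entropy(c.values())
        for c in by_x.values()
    )


# Hypothetical toy observations: (tetrapeptide, structural class label).
observations = [
    ("AAAA", "C"),
    ("AAAA", "C"),
    ("GPGS", "A"),
    ("GPGS", "E"),
]

# Sequence-to-structure: uncertainty about the class given the tetrapeptide.
seq_to_struct = conditional_entropy(observations)
# Structure-to-sequence: uncertainty about the tetrapeptide given the class.
struct_to_seq = conditional_entropy((s, q) for q, s in observations)
```

A value of zero means one side fully determines the other in the sample; larger values mean weaker determinability. With real data the counts would come from a fragment library rather than a hand-written list.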



Inferences from structural comparison: flexibility, secondary structure wobble and sequence alignment optimization  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Abstract Background Work on protein structure prediction is very useful in biological research. To evaluate their accuracy, experimental protein structures or their derived data are used as the 'gold standard'. However, as proteins are dynamic molecular machines with structural flexibility such a standard may be unreliable. Results To investigate the influence of the structure flexibility, we analysed 3,652 protein structures of 137 unique sequences from 24 prot...

Zhang Gaihua; Su Zhen



Sequence-structure matching in globular proteins: application to supersecondary and tertiary structure determination.  

Digital Repository Infrastructure Vision for European Research (DRIVER)

A methodology designed to address the inverse globular protein-folding problem (the identification of which sequences are compatible with a given three-dimensional structure) is described. By using a library of protein finger-prints, defined by the side chain interaction pattern, it is possible to match each structure to its own sequence in an exhaustive data base search. It is shown that this is a permissive requirement for the validation of the methodology. To pass the more rigorous test of...

Godzik, A.; Skolnick, J.



Data Structures: Sequence Problems, Range Queries, and Fault Tolerance  

DEFF Research Database (Denmark)

The focus of this dissertation is on algorithms, in particular data structures that give provably efficient solutions for sequence analysis problems, range queries, and fault tolerant computing. The work presented in this dissertation is divided into three parts. In Part I we consider algorithms for a range of sequence analysis problems that have arisen from applications in pattern matching, bioinformatics, and data mining. On a high level, each problem is defined by a function and some constraints and the job at hand is to locate subsequences that score high with this function and are not invalidated by the constraints. Many variants and similar problems have been proposed leading to several different approaches and algorithms. We consider problems where the function is the sum of the elements in the sequence and the constraints only bound the length of the subsequences considered. We give optimal algorithms for several variants of the problem based on a simple idea and classic algorithms and data structures. In Part II we consider range query data structures. This is a category of problems where the task is to preprocess an input sequence using as little time and space as possible such that one can efficiently compute a certain function on the elements in a given query subsequence. There are many types of functions that have been considered in connection with input from different sources. The input could be ip-data sorted by ip-address, real estate prices sorted by zip code, advertising cost sorted by time etc. We consider data structures for two classic statistics functions, namely median and mode. Finally, Part III investigates fault tolerant algorithms and data structures. This deals with the trend of avoiding elaborate error checking and correction circuitry that would impose non-negligible costs in terms of hardware performance and money in the design of today's high speed memory technologies. Hardware, power failures, and environmental conditions such as cosmic rays and alpha particles can all alter the memory in unpredictable ways. In applications where large memory capacities are needed at low cost, it makes sense to assume that the algorithms themselves are in charge of dealing with memory faults. We investigate searching, sorting and counting algorithms and data structures that provably return sensible information in spite of memory corruptions.

Jørgensen, Allan Grønlund
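
The Part I problem described above (the score is the sum of the elements, and the constraints only bound the subsequence length) can be illustrated with a standard linear-time technique: prefix sums combined with a sliding-window minimum kept in a monotone deque. This is a minimal sketch under that reading, and it is not claimed to be the dissertation's own algorithm. The function name `max_sum_segment` and the parameters `lo` and `hi` are assumptions introduced here.

```python
from collections import deque
from typing import Deque, List, Sequence, Tuple


def max_sum_segment(xs: Sequence[float], lo: int, hi: int) -> Tuple[float, int, int]:
    """Best contiguous slice xs[start:end] with lo <= end - start <= hi.

    Returns (sum, start, end). Runs in O(len(xs)) time.
    """
    n = len(xs)
    if not (1 <= lo <= hi) or lo > n:
        raise ValueError("need 1 <= lo <= hi and lo <= len(xs)")

    # prefix[i] holds the sum of xs[:i], so sum(xs[s:e]) = prefix[e] - prefix[s].
    prefix: List[float] = [0.0] * (n + 1)
    for i, x in enumerate(xs):
        prefix[i + 1] = prefix[i] + x

    best = (float("-inf"), 0, 0)
    # Candidate start indices; their prefix values increase from left to right,
    # so the front of the deque is always the smallest admissible prefix.
    window: Deque[int] = deque()
    for end in range(lo, n + 1):
        newest_start = end - lo
        while window and prefix[window[-1]] >= prefix[newest_start]:
            window.pop()
        window.append(newest_start)
        # Drop starts that would make the slice longer than hi.
        while window[0] < end - hi:
            window.popleft()
        start = window[0]
        total = prefix[end] - prefix[start]
        if total > best[0]:
            best = (total, start, end)
    return best


result = max_sum_segment([2, -5, 3, 4, -1, 6, -9], lo=2, hi=3)
```

Each index enters and leaves the deque at most once, which is where the linear bound comes from. A brute-force double loop over all admissible slices is a convenient cross-check on small inputs.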



The sequence, structure and evolutionary features of HOTAIR in mammals  

Directory of Open Access Journals (Sweden)

Background: An increasing number of long noncoding RNAs (lncRNAs) have been identified recently. Unlike the others, which function in cis to regulate local gene expression, the newly identified HOTAIR is located between HoxC11 and HoxC12 in the human genome and regulates HoxD expression in multiple tissues. Like the well-characterised lncRNA Xist, HOTAIR binds to polycomb proteins to methylate histones at multiple HoxD loci, but unlike Xist, many details of its structure and function, as well as the trans regulation, remain unclear. Moreover, HOTAIR is involved in the aberrant regulation of gene expression in cancer. Results: To identify conserved domains in HOTAIR and study the phylogenetic distribution of this lncRNA, we searched the genomes of 10 mammalian and 3 non-mammalian vertebrates for matches to its 6 exons and the two conserved domains within the 1800 bp exon6 using Infernal. There was just one high-scoring hit for each mammal, but many low-scoring hits were found in both mammals and non-mammalian vertebrates. These hits and their flanking genes in four placental mammals and platypus were examined to determine whether HOTAIR contained elements shared by other lncRNAs. Several of the hits were within unknown transcripts or ncRNAs, many were within introns of, or antisense to, protein-coding genes, and conservation of the flanking genes was observed only between human and chimpanzee. Phylogenetic analysis revealed discrete evolutionary dynamics for orthologous sequences of HOTAIR exons. Exon1 at the 5' end and a domain in exon6 near the 3' end, which contain domains that bind to multiple proteins, have evolved faster in primates than in other mammals. Structures were predicted for exon1, two domains of exon6 and the full HOTAIR sequence. The sequence and structure of two fragments, in exon1 and the domain B of exon6 respectively, were found to occur robustly in predicted structures of exon1, domain B of exon6 and the full HOTAIR in mammals. Conclusions: HOTAIR exists in mammals, has poorly conserved sequences and considerably conserved structures, and has evolved faster than nearby HoxC genes. Exons of HOTAIR show distinct evolutionary features, and a 239 bp domain in the 1804 bp exon6 is especially conserved. These features, together with the absence of some exons and sequences in mouse, rat and kangaroo, suggest ab initio generation of HOTAIR in marsupials. Structure prediction identifies two fragments in the 5' end exon1 and the 3' end domain B of exon6, with sequence and structure invariably occurring in various predicted structures of exon1, the domain B of exon6 and the full HOTAIR.

Zhu Hao



Structural Correlates of Skilled Performance on a Motor Sequence Task  

Directory of Open Access Journals (Sweden)

The brain regions functionally engaged in motor sequence performance are well established, but the structural characteristics of these regions and the fibre pathways involved have been less well studied. In addition, relatively few studies have combined multiple magnetic resonance imaging (MRI) and behavioural performance measures in the same sample. Therefore, the current study used diffusion tensor imaging, probabilistic tractography, and voxel-based morphometry to determine the structural correlates of skilled motor performance. Further, we compared these findings with fMRI results in the same sample. We correlated final performance and rate of improvement measures on a temporal motor sequence task with skeletonised fractional anisotropy (FA) and whole brain grey matter (GM) volume. Final synchronisation performance was negatively correlated with FA in white matter underlying bilateral sensorimotor cortex – an effect that was mediated by a positive correlation with radial diffusivity. Multi-fibre tractography indicated that this region contained crossing fibres from the corticospinal tract and superior longitudinal fasciculus (SLF). The identified SLF pathway linked parietal and auditory cortical regions that have been shown to be functionally engaged in this task. Thus, we hypothesise that enhanced synchronisation performance on this task may be related to greater fibre integrity of the SLF. Rate of improvement on synchronisation was positively correlated with GM volume in cerebellar lobules HVI and V – regions that showed training-related decreases in activity in the same sample. Taken together, our results link individual differences in brain structure and function to motor sequence performance on the same task. Further, our study illustrates the utility of using multiple MR measures and analysis techniques to specify the interpretation of structural findings.




Primary sequence and domain structure of chicken vinculin. (United States)

We have determined the complete sequence of chick vinculin from two overlapping cDNA clones. The vinculin mRNA consists of 262 bp of 5' untranslated sequence, an open reading frame of 3195 bp (excluding the initiation codon) and a long 3' untranslated sequence (greater than 2 kb). Chick vinculin contains 1066 amino acid residues, and has a deduced molecular mass of 116,933 Da. Analysis of the domain structure of vinculin shows that the molecule can be cleaved by V8 proteinase into a 90 kDa globular head and a 32 kDa tail region, the latter of which could further be cleaved into a 27 kDa polypeptide. The 90 kDa globular head contains the N-terminus of vinculin, three 112-residue repeats (residues 259-589), and extends to approximately residue 850. Gel overlay experiments show that it also contains a binding site for the cytoskeletal protein talin. The talin-binding domain was further localized to the N-terminal 398 amino acid residues of the protein by expression in vitro of this region from a vinculin cDNA cloned into the Bluescript SK+ vector. The head and tail domains are apparently separated by a proline-rich region that contains V8-proteinase-cleavage sites and a candidate tyrosine (822)-phosphorylation site. Secondary-structure prediction suggests that the head and tail domains contain alpha-helical regions separated by short stretches of turn/coil. Comparison of the chick with a partial human sequence reveals that vinculin is a highly conserved protein. In chickens Southern-blot analysis is consistent with a single vinculin gene, and it is therefore likely that vinculin, and its higher-molecular-mass isoform termed metavinculin, arise through alternative splicing. PMID:2497736

Price, G J; Jones, P; Davison, M D; Patel, B; Bendori, R; Geiger, B; Critchley, D R



Structural Laplacian Eigenmaps for modeling sets of multivariate sequences. (United States)

A novel embedding-based dimensionality reduction approach, called structural Laplacian Eigenmaps, is proposed to learn models representing any concept that can be defined by a set of multivariate sequences. This approach relies on the expression of the intrinsic structure of the multivariate sequences in the form of structural constraints, which are imposed on dimensionality reduction process to generate a compact and data-driven manifold in a low dimensional space. This manifold is a mathematical representation of the intrinsic nature of the concept of interest regardless of the stylistic variability found in its instances. In addition, this approach is extended to model jointly several related concepts within a unified representation creating a continuous space between concept manifolds. Since a generated manifold encodes the unique characteristic of the concept of interest, it can be employed for classification of unknown instances of concepts. Exhaustive experimental evaluation on different datasets confirms the superiority of the proposed methodology to other state-of-the-art dimensionality reduction methods. Finally, the practical value of this novel dimensionality reduction method is demonstrated in three challenging computer vision applications, i.e., view-dependent and view-independent action recognition as well as human-human interaction classification. PMID:24144690

Lewandowski, Michal; Makris, Dimitrios; Velastin, Sergio A; Nebel, Jean-Christophe



Topological characterization of crystalline ice structures from coordination sequences  

CERN Document Server

Topological properties of crystalline ice structures are studied by considering ring statistics, coordination sequences, and topological density of different ice phases. The coordination sequences (number of sites at topological distance k from a reference site) have been obtained by direct enumeration until at least 40 coordination spheres for different ice polymorphs. This allows us to study the asymptotic behavior of the mean number of sites in the k-th shell, M_k, for high values of k: M_k ~ a k^2, a being a structure-dependent parameter. Small departures from a strict parabolic dependence have been studied by considering first and second differences of the series {M_k} for each structure. The parameter a ranges from 2.00 for ice VI to 4.27 for ice XII, and is used to define a topological density for these solid phases of water. Correlations between such topological density and the actual volume of ice phases are discussed. Ices Ih and Ic are found to depart from the general trend in this correlation due ...

Herrero, Carlos P
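
The coordination sequence defined in this abstract (the number of sites at topological distance k from a reference site) can be obtained by breadth-first enumeration of shells. The sketch below does this for a simple cubic lattice, chosen only because every site is equivalent and the answer is known in closed form (4k^2 + 2). It is a stand-in for illustration, not one of the ice networks studied, which have several sites per unit cell and need an explicit bond list. The names `coordination_sequence` and `CUBIC_NEIGHBOURS` are hypothetical.

```python
from typing import List, Sequence, Set, Tuple

Site = Tuple[int, int, int]

# Bond vectors of the simple cubic lattice (an assumed stand-in network).
CUBIC_NEIGHBOURS: List[Site] = [
    (1, 0, 0), (-1, 0, 0),
    (0, 1, 0), (0, -1, 0),
    (0, 0, 1), (0, 0, -1),
]


def coordination_sequence(offsets: Sequence[Site], shells: int) -> List[int]:
    """Number of sites at topological distance 1..shells from the origin."""
    origin: Site = (0, 0, 0)
    seen: Set[Site] = {origin}
    frontier: List[Site] = [origin]
    counts: List[int] = []
    for _ in range(shells):
        next_shell: Set[Site] = set()
        for x, y, z in frontier:
            for dx, dy, dz in offsets:
                site = (x + dx, y + dy, z + dz)
                # Sites already seen lie in a closer shell.
                if site not in seen:
                    next_shell.add(site)
        seen |= next_shell
        frontier = list(next_shell)
        counts.append(len(next_shell))
    return counts


shell_counts = coordination_sequence(CUBIC_NEIGHBOURS, 4)
# Ratio N_k / k^2 for each shell; for this lattice it tends to 4 as k grows.
ratios = [n / (k * k) for k, n in enumerate(shell_counts, start=1)]
```

For a network with inequivalent sites, the same enumeration would be repeated from each site in the unit cell and averaged to give the mean shell population M_k discussed in the abstract.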



Infinite Sequence of Poincare Group Extensions: Structure and Dynamics  

CERN Document Server

We study the structure and dynamics of the infinite sequence of extensions of the Poincare algebra outlined in arXiv:0808.2243. We give explicitly the Maurer-Cartan (MC) 1-forms of the extended Lie algebras up to level three. Using these forms, coupled to new dynamical parameters, to construct a relativistically invariant particle Lagrangian, we find that it describes the motion of a relativistic particle subject to an electromagnetic field-like force. The form of this field is determined by the extension, increasing in complexity with each level. New physical degrees of freedom apart from the particle position appear in a natural way, dictated by the extended group structure.

Bonanos, Sotirios



Knowledge of sequence structure prevents auditory distraction: an ERP study. (United States)

Infrequent, salient stimuli often capture attention despite their task-irrelevancy, and disrupt on-going goal-directed behavior. A number of studies show that presenting cues signaling forthcoming deviants reduces distraction, which may be a "by-product" of cue-processing interference or the result of direct preparatory processes for the forthcoming distracter. In the present study, instead of "bursts" of cue information, information on the temporal structure of the stimulus sequence was provided. Young adults performed a spatial discrimination task where complex tones moving left or right were presented. In the predictable condition, every 7th tone was a pitch-deviant, while in the random condition the position of deviants was random with a probability of 1/7. Whereas the early event-related potential correlates of deviance-processing (N1 and MMN) were unaffected by predictability, P3a amplitude was significantly reduced in the predictable condition, indicating that prevention of distraction was based on the knowledge about the temporal structure of the stimulus sequence. PMID:24657900

Volosin, Márta; Horváth, János



DNA Sequence-Directed Organization of Chromatin: Structure-Based Computational Analysis of Nucleosome-Binding Sequences  

Digital Repository Infrastructure Vision for European Research (DRIVER)

The folding of DNA on the nucleosome core particle governs many fundamental issues in eukaryotic molecular biology. In this study, an updated set of sequence-dependent empirical “energy” functions, derived from the structures of other protein-bound DNA molecules, is used to investigate the extent to which the architecture of nucleosomal DNA is dictated by its underlying sequence. The potentials are used to estimate the cost of deforming a collection of sequences known to bind or resist up...

Balasubramanian, Sreekala; Xu, Fei; Olson, Wilma K.



Wurst: a protein threading server with a structural scoring function, sequence profiles and optimized substitution matrices. (United States)

Wurst is a protein threading program with an emphasis on high quality sequence to structure alignments. Submitted sequences are aligned to each of about 3000 templates with a conventional dynamic programming algorithm, but using a score function with sophisticated structure and sequence terms. The structure terms are a log-odds probability of sequence to structure fragment compatibility, obtained from a Bayesian classification procedure. A simplex optimization was used to optimize the sequence-based terms for the goal of alignment and model quality and to balance the sequence and structural contributions against each other. Both sequence and structural terms operate with sequence profiles. PMID:15215443

Torda, Andrew E; Procter, James B; Huber, Thomas



PROMALS3D web server for accurate multiple protein sequence and structure alignments  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Multiple sequence alignments are essential in computational sequence and structural analysis, with applications in homology detection, structure modeling, function prediction and phylogenetic analysis. We report PROMALS3D web server for constructing alignments for multiple protein sequences and/or structures using information from available 3D structures, database homologs and predicted secondary structures. PROMALS3D shows higher alignment accuracy than a number of other advanced methods. In...

Pei, Jimin; Tang, Ming; Grishin, Nick V.



Dimension reduction for extracting geometrical structure of multidimensional phase space: Application to fast energy exchange in the reaction O(1D)+N2O→NO+NO  

International Nuclear Information System (INIS)

One of the most fundamental problems in studying general Hamiltonian systems with many degrees of freedom is to extract a low-dimensional subsystem including the essential dynamics. In this paper, a new partial normal form (PNF) method is developed to reduce the number of coupling terms in the Hamiltonian and to simplify the dynamics analyses. The PNF method allows one to decouple many unimportant bath modes as well as the reactive mode from the system by assessing the significance of the coupling terms. The method is applied to the chemical reaction O(1D)+N2O→NO+NO, which was found to exhibit efficient energy exchange between the two NO stretching modes despite the short lifetime of the reaction intermediate [S. Kawai et al., J. Chem. Phys. 124, 184315 (2006)]. Through the analysis of the two-dimensional PNF Hamiltonian subsystem, it is found that the motion of the subsystem preserves the 'normal mode picture' of the symmetric and antisymmetric NO stretching modes despite its high energy. Then the vibrational energy, initially localized in the newly formed NO bond, is transferred to the reactants' NO bond through the beating between the symmetric and antisymmetric stretching modes. The preservation of the normal mode picture and the short period of the beating explain the fast energy exchange between the two NO bonds. This successful application proves that the PNF method can extract the essential small subspace from many-degrees-of-freedom Hamiltonian systems.



Functional region prediction with a set of appropriate homologous sequences-an index for sequence selection by integrating structure and sequence information with spatial statistics  

Directory of Open Access Journals (Sweden)

Background: The detection of conserved residue clusters on a protein structure is one of the effective strategies for the prediction of functional protein regions. Various methods, such as Evolutionary Trace, have been developed based on this strategy. In such approaches, the conserved residues are identified through comparisons of homologous amino acid sequences. Therefore, the selection of homologous sequences is a critical step. It is empirically known that a certain degree of sequence divergence in the set of homologous sequences is required for the identification of conserved residues. However, the development of a method to select homologous sequences appropriate for the identification of conserved residues has not been sufficiently addressed. An objective and general method to select appropriate homologous sequences is desired for the efficient prediction of functional regions. Results: We have developed a novel index to select the sequences appropriate for the identification of conserved residues, and implemented the index within our method to predict the functional regions of a protein. The implementation of the index improved the performance of the functional region prediction. The index represents the degree of conserved residue clustering on the tertiary structure of the protein. For this purpose, the structure and sequence information were integrated within the index by the application of spatial statistics. Spatial statistics is a field of statistics in which not only the attributes but also the geometrical coordinates of the data are considered simultaneously. Higher degrees of clustering generate larger index scores. We adopted the set of homologous sequences with the highest index score, under the assumption that the best prediction accuracy is obtained when the degree of clustering is the maximum. The set of sequences selected by the index led to higher functional region prediction performance than the sets of sequences selected by other sequence-based methods. Conclusions: Appropriate homologous sequences are selected automatically and objectively by the index. Such sequence selection improved the performance of functional region prediction. As far as we know, this is the first approach in which spatial statistics have been applied to protein analyses. Such integration of structure and sequence information would be useful for other bioinformatics problems.

Nemoto Wataru



What is the impact of the sequence structure on implicit learning in children?  

Digital Repository Infrastructure Vision for European Research (DRIVER)

It is generally admitted that implicit learning abilities are efficient early in childhood. However, few studies have explored the impact of the structure of the sequence on children’s performance in implicit learning tasks. The current research was intended to examine sequence learning abilities in children by comparing sequences of different structural characteristics.

Lejeune, Caroline; Schmitz, Xavier; Lempereur, Stéphanie; Maillart, Christelle; Meulemans, Thierry; Gabriel, Audrey



Quadrupole oscillator strengths for the helium isoelectronic sequence: n 1S-m 1D, n 3S-m 3D, n 1P-m 1P, and n 3P-m 3P transitions with n  

International Nuclear Information System (INIS)

Quadrupole oscillator strengths (QOS) for He and the isoelectronic ions from Li+ to Ne8+ are reported for all possible n 1S-m 1D, n 3S-m 3D, n 1P-m 1P and n 3P-m 3P transitions involving states with m<7 and n<7. The calculations are based upon explicitly correlated wavefunctions that lead to variational energies only nano-Hartrees above the best literature values. The results extend significantly both the accuracy and range of the QOSs available for two-electron atomic species. (author)



Can Computationally Designed Protein Sequences Improve Secondary Structure Prediction. (United States)

Computational sequence design methods are used to engineer proteins with desired properties such as increased thermal stability and novel function. In addition, these algorithms can be used to identify an envelope of sequences that may be compatible with ...

Wallqvist, A.; Lee, M. S.; Bondugula, R.



Structural properties of replication origins in yeast DNA sequences  

International Nuclear Information System (INIS)

Sequence-dependent DNA flexibility is an important structural property originating from the DNA 3D structure. In this paper, we investigate the DNA flexibility of the budding yeast (S. cerevisiae) replication origins on a genome-wide scale using flexibility parameters from two different models, the trinucleotide and the tetranucleotide models. Based on analyzing average flexibility profiles of 270 replication origins, we find that yeast replication origins are significantly rigid compared with their surrounding genomic regions. To further understand the highly distinctive property of replication origins, we compare the flexibility patterns between yeast replication origins and promoters, and find that they both contain significantly rigid DNAs. Our results suggest that DNA flexibility is an important factor that helps proteins recognize and bind the target sites in order to initiate DNA replication. Inspired by the role of the rigid region in promoters, we speculate that the rigid replication origins may facilitate binding of proteins, including the origin recognition complex (ORC), Cdc6, Cdt1 and the MCM2-7 complex.



An Improved Protocol for Sequencing of Repetitive Genomic Regions and Structural Variations Using Mutagenesis and Next Generation Sequencing (United States)

The rise of Next Generation Sequencing (NGS) technologies has transformed de novo genome sequencing into an accessible research tool, but obtaining high quality eukaryotic genome assemblies remains a challenge, mostly due to the abundance of repetitive elements. These also make it difficult to study nucleotide polymorphism in repetitive regions, including certain types of structural variations. One solution proposed for resolving such regions is Sequence Assembly aided by Mutagenesis (SAM), which relies on the fact that introducing enough random mutations breaks the repetitive structure, making assembly possible. Sequencing many different mutated copies permits the sequence of the repetitive region to be inferred by consensus methods. However, this approach relies on molecular cloning in order to isolate and amplify individual mutant copies, making it hard to scale up the approach for use in conjunction with high-throughput sequencing technologies. To address this problem, we propose NG-SAM, a modified version of the SAM protocol that relies on PCR and dilution steps only, coupled to an NGS workflow. NG-SAM therefore has the potential to be scaled up, e.g. using emerging microfluidics technologies. We built a realistic simulation pipeline to study the feasibility of NG-SAM, and our results suggest that under appropriate experimental conditions the approach might be successfully put into practice. Moreover, our simulations suggest that NG-SAM is capable of robustly reconstructing a wide range of potential target sequences of varying lengths and repetitive structures.

Sipos, Botond; Massingham, Tim; Stutz, Adrian M.; Goldman, Nick
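
The consensus idea described in the SAM/NG-SAM abstract above can be illustrated with a toy simulation. This is only a minimal sketch under strong assumptions that are not taken from the paper: mutations are substitutions only, every mutated copy is already full-length and aligned to the others, and the function names (`mutate`, `consensus`), the mutation rate, and the number of copies are hypothetical choices made for illustration. It does not model read assembly, PCR, or dilution.

```python
import random
from collections import Counter

BASES = "ACGT"


def mutate(seq, rate, rng):
    """Return a copy of seq in which each base is independently
    replaced by a different base with probability `rate`."""
    out = []
    for base in seq:
        if rng.random() < rate:
            out.append(rng.choice([b for b in BASES if b != base]))
        else:
            out.append(base)
    return "".join(out)


def consensus(copies):
    """Column-wise majority vote over equal-length, pre-aligned copies."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*copies))


# Toy repetitive target: a short unique motif flanked by tandem repeats.
rng = random.Random(0)
target = "ACGT" * 10 + "TTGACA" + "ACGT" * 10

# Hypothetical parameters: 51 mutated copies, 10% substitution rate.
copies = [mutate(target, 0.1, rng) for _ in range(51)]
recovered = consensus(copies)
print(recovered == target)
```

Because each position is mutated independently in each copy, the original base remains the most frequent one in every column once enough copies are pooled, which is why a simple majority vote suffices in this idealized setting.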



Primary structure of a genomic zein sequence of maize.  

Digital Repository Infrastructure Vision for European Research (DRIVER)

The nucleotide sequence of a genomic clone (termed Z4) of the zein multigene family was compared to the nucleotide sequence of related cDNA clones of zein mRNAs. A tandem duplication of a 96-bp sequence is found in the genomic clone that is not present in the related cDNA clones. When the duplication is disregarded, the nucleotide sequence homology between Z4 and its related cDNAs was approximately 97%. The nucleotide sequence is also compared to other isolated cDNAs. No introns in the codin...

Hu, N. T.; Peifer, M. A.; Heidecker, G.; Messing, J.; Rubenstein, I.



Proteinase inhibitors and dendrotoxins. Sequence classification, structural prediction and structure/activity. (United States)

The amino acid sequences of four presynaptically active toxins from mamba snake venom (termed 'dendrotoxins') were compared systematically with homologous sequences of members of the proteinase inhibitor family (Kunitz). A comparison based on the complete sequences revealed that relatively few amino acid changes are necessary to abolish antiprotease activity and convert a proteinase inhibitor into a dendrotoxin. When comparison centred only on the sequence segments known to comprise the antiprotease site of bovine pancreatic trypsin inhibitor, the dendrotoxins were clearly classified apart from all the known inhibitors. Since the mode of action of the bovine pancreatic trypsin/kallikrein inhibitor involves beta sheet formation with the enzyme, predictions were obtained for this secondary structure in the region of the 'antiprotease site' throughout the homologues. Again, the dendrotoxins were clearly distinguished from the inhibitors. Structure/activity analyses, based on the crystal structures of inhibitor/enzyme complexes, suggest that unlike proteinase inhibitors, dendrotoxins might specifically co-ordinate the active-site 'catalytic' histidine residues of serine proteases. Although the significance of this remains to be studied, the presynaptic target is expected to involve an as yet uncharacterised member of the serine protease family. PMID:4076193

Dufton, M J



Towards comprehensive structural motif mining for better fold annotation in the "twilight zone" of sequence dissimilarity  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Abstract Background Automatic identification of structure fingerprints from a group of diverse protein structures is challenging, especially for proteins whose divergent amino acid sequences may fall into the "twilight-" or "midnight-" zones where pair-wise sequence identities to known sequences fall below 25% and sequence-based functional annotations often fail. Results Here we report a novel graph database mining method and demonstrate its appl...



Synthesis and structure elucidation of a series of pyranochromene chalcones and flavanones using 1D and 2D NMR spectroscopy and X-ray crystallography. (United States)

A series of novel pyranochromene chalcones and corresponding flavanones were synthesized. This is the first report on the confirmation of the absolute configuration of chromene-based flavanones using X-ray crystallography. These compounds were characterized by 2D NMR spectroscopy, and their assignments are reported herein. The 3D structure of the chalcone 3b and flavanone 4g was determined by X-ray crystallography, and the structure of the flavanone was confirmed to be in the S configuration at C-2. Copyright © 2014 John Wiley & Sons, Ltd. PMID:24623606

Pawar, Sunayna S; Koorbanally, Neil A



Structure and sequence of the human homeobox gene HOX7. (United States)

A cosmid containing the human sequence HOX7, homologous to the murine Hox-7 gene, was isolated from a genomic library, and the positions of the coding sequences were determined by hybridization. DNA sequence analysis demonstrated two exons that code for a homeodomain-containing protein of 297 amino acids. The open reading frame is interrupted by a single intron of approximately 1.6 kb, the splice donor and acceptor sites of which conform to known consensus sequences. The human HOX7 coding sequence has a very high degree of identity with the murine Hox-7 cDNA. Within the homeobox, the two sequences share 94% identity at the DNA level, all substitutions being silent. This high level of sequence similarity is not confined to the homeodomain; overall the human and murine HOX7 gene products show 80% identity at the amino acid level. Both the 5' and 3' untranslated regions also show significant similarity to the murine gene, with 79 and 70% sequence identity, respectively. The sequence upstream of the coding sequence of exon 1 contains a GC-rich putative promoter region. There is no TATA box, but a CCAAT and numerous GC boxes are present. The region encompassing the promoter region, exon 1, and the 5' region of exon 2 has a higher than expected frequency of CpG dinucleotides; numerous sites for rare-cutter restriction enzymes are present, a characteristic of HTF islands. PMID:1685479

Hewitt, J E; Clark, L N; Ivens, A; Williamson, R
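
The HOX7 abstract above refers to a "higher than expected frequency of CpG dinucleotides". The abstract does not say how this was quantified; one commonly used measure is the observed/expected CpG ratio, (number of CpG × sequence length) / (number of C × number of G). The sketch below computes that ratio and the GC fraction; the function names and the toy sequences are hypothetical and are not taken from the paper.

```python
def cpg_obs_exp(seq):
    """Observed/expected CpG ratio: (#CG * N) / (#C * #G).

    Returns 0.0 when the sequence contains no C or no G.
    """
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    if c == 0 or g == 0:
        return 0.0
    return cpg * n / (c * g)


def gc_fraction(seq):
    """Fraction of bases that are G or C (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


# Toy inputs only; not real HOX7 sequence.
for s in ("CGCGCG", "GCGC", "AATT"):
    print(s, cpg_obs_exp(s), gc_fraction(s))
```

A ratio near or above 1 in a GC-rich window is the usual signature of a CpG (HTF) island, whereas bulk vertebrate DNA typically shows a much lower ratio.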



Accurate prediction of protein structural classes using functional domains and predicted secondary structure sequences. (United States)

Protein structural class prediction is one of the challenging problems in bioinformatics. Previous methods directly based on the similarity of amino acid (AA) sequences have been shown to be insufficient for low-similarity protein data-sets. To improve the prediction accuracy for such low-similarity proteins, different methods have recently been proposed that explore novel feature sets based on predicted secondary structure propensities. In this paper, we focus on protein structural class prediction using combinations of the novel features, including secondary structure propensities as well as functional domain (FD) features extracted from the InterPro signature database. Our comprehensive experimental results based on several benchmark data-sets have shown that the integration of new FD features substantially improves the accuracy of structural class prediction for low-similarity proteins, as they capture meaningful relationships among AA residues that are far apart in the protein sequence. The proposed prediction method has also been tested on predicting structural classes for partially disordered proteins, with reasonable prediction accuracy; this is a more difficult problem than structural class prediction for commonly used benchmark data-sets and, to the best of our knowledge, has never been done before. In addition, to avoid overfitting with a large number of features, feature selection is applied to select discriminating features that contribute to achieving high prediction accuracy. The selected features have been shown to achieve stable prediction performance across different benchmark data-sets. PMID:22545993

Ahmadi Adl, Amin; Nowzari-Dalini, Abbas; Xue, Bin; Uversky, Vladimir N; Qian, Xiaoning



SCPRED: Accurate prediction of protein structural class for sequences of twilight-zone similarity with predicting sequences  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Abstract Background Protein structure prediction methods provide accurate results when a homologous protein is predicted, while poorer predictions are obtained in the absence of homologous templates. However, some protein chains that share twilight-zone pairwise identity can form similar folds and thus determining structural similarity without the sequence similarity would be desirable for the structure prediction. The folding type of a protein or its domain is defined as the...



The Holliday junction in an inverted repeat DNA sequence: Sequence effects on the structure of four-way junctions  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Holliday junctions are important structural intermediates in recombination, viral integration, and DNA repair. We present here the single-crystal structure of the inverted repeat sequence d(CCGGTACCGG) as a Holliday junction at the nominal resolution of 2.1 Å. Unlike the previous crystal structures, this DNA junction has B-DNA arms with all standard Watson–Crick base pairs; it therefore represents the intermediate proposed by Holliday as being involved in homologous recombination. The jun...

Eichman, Brandt F.; Vargason, Jeffrey M.; Mooers, Blaine H. M.; Ho, P. Shing



Phosphatidylinositol transfer proteins: sequence motifs in structural and evolutionary analyses  

Directory of Open Access Journals (Sweden)

Phosphatidylinositol transfer proteins (PITPs) are a family of monomeric proteins that bind and transfer phosphatidylinositol and phosphatidylcholine between membrane compartments. They are required for production of inositol and diacylglycerol second messengers, and are found in most metazoan organisms. While PITPs are known to carry out crucial cell-signaling roles in many organisms, the structure, function and evolution of the majority of family members remain unexplored, primarily because the ubiquity and diversity of the family thwart traditional methods of global alignment. To surmount this obstacle, we instead took a novel approach, using MEME and a parsimony-based analysis to create a cladogram of conserved sequence motifs in 56 PITP family proteins from 26 species. In keeping with previous functional annotations, three clades were supported within our evolutionary analysis: two classes of soluble proteins and a class of membrane-associated proteins. By focusing on conserved regions, the analysis allowed for in-depth queries regarding possible functional roles of PITP proteins in both intra- and extracellular signaling.

Gerald J. Wyckoff



Structure and neural expression of a zebrafish homeobox sequence. (United States)

A genomic library of zebrafish was constructed and screened with homeobox-containing probes. One of the positive clones contains a transcribed region which shares extensive sequence homology with the murine Hox-1.4 and Hox-2.6 genes and the human HHO.c13 gene. Characterization of this zebrafish homologue (ZF-13) with respect to expression demonstrated that it is transcribed during embryogenesis where a major RNA species of 2.5 kb and a minor transcript of 4.6 kb are detected. The highest concentration of both transcripts was found in embryos at the stage of somite formation. By in situ hybridization the spatial localization of expression was analysed in hatching embryos. Hybridization signals were mainly detected throughout the neural tube and in the brain. A small amount of RNA derived from ZF-13 was localized in differentiated muscle cells. Our results suggest that homeobox genes of distantly related vertebrate species are very similar with respect to structure and function. PMID:2468579

Njølstad, P R; Molven, A; Eiken, H G; Fjose, A



Syntheses and crystal structures of four 1-D or 2-D coordination polymers based on 1-((benzotriazol-1-yl)methyl)-1H-1,3-imidazole (United States)

In this paper, four coordination polymers, {[Ag(bmi)]·NO3}n (1), [Co(N3)2(bmi)2]n (2), [Cu(SCN)2(bmi)2]n (3), and {[Cu(bmi)2(CH3OH)(H2O)]·(ClO4)2}n (4), have been synthesized through the reactions of an unsymmetrical ligand, 1-((benzotriazol-1-yl)methyl)-1H-1,3-imidazole (bmi), with Ag(I), Co(II) and Cu(II) salts at room temperature. X-ray diffraction analyses showed that compound 1 exhibits a double-stranded helical chain. Compounds 2-4 display a 2-D rhombus grid network structure. The rhombus grid consists of 32-membered rings, with dimensions of ca. 8.9 × 8.9 Å for compound 2, ca. 10.1 × 10.1 Å for compound 3, and ca. 9.7 × 9.5 Å for compound 4. In addition, the 2-D layers of compound 3 are stacked into a 3-D structure via π-π interactions, while the 3-D architecture of compound 4 is realized through complicated hydrogen bonds and π-π interactions. The thermal analyses of compounds 1 and 3 indicate that they have high thermal stability and are stable up to 259 °C.

Zhou, Xiaoli; Li, Weiqiang; Jin, Guanghua; Zhao, Dong; Zhu, Xiaoqing; Meng, Xiangru; Hou, Hongwei



Designing to See and Share Structure in Number Sequences (United States)

This paper reports on a design experiment in the domain of number sequences conducted in the course of the "WebLabs" project. We iteratively designed and tested a set of activities and tools in which 10-14 year old students used the "ToonTalk" programming environment to construct models of sequences and series, and then shared their models and…

Mor, Yishay; Noss, Richard; Hoyles, Celia; Kahn, Ken; Simpson, Gordon



MMDB: annotating protein sequences with Entrez's 3D-structure database. (United States)

Three-dimensional (3D) structure is now known for a large fraction of all protein families. Thus, it has become rather likely that one will find a homolog with known 3D structure when searching a sequence database with an arbitrary query sequence. Depending on the extent of similarity, such neighbor relationships may allow one to infer biological function and to identify functional sites such as binding motifs or catalytic centers. Entrez's 3D-structure database, the Molecular Modeling Database (MMDB), provides easy access to the richness of 3D structure data and its large potential for functional annotation. Entrez's search engine offers several tools to assist biologist users: (i) links between databases, such as between protein sequences and structures, (ii) pre-computed sequence and structure neighbors, (iii) visualization of structure and sequence/structure alignment. Here, we describe an annotation service that combines some of these tools automatically, Entrez's 'Related Structure' links. For all proteins in Entrez, similar sequences with known 3D structure are detected by BLAST and alignments are recorded. The 'Related Structure' service summarizes this information and presents 3D views mapping sequence residues onto all 3D structures available in MMDB ( PMID:17135201

Wang, Yanli; Addess, Kenneth J; Chen, Jie; Geer, Lewis Y; He, Jane; He, Siqian; Lu, Shennan; Madej, Thomas; Marchler-Bauer, Aron; Thiessen, Paul A; Zhang, Naigong; Bryant, Stephen H



Structure and Active Site Residues of PglD, an N-Acetyltransferase from the Bacillosamine Synthetic Pathway Required for N-Glycan Synthesis in Campylobacter jejuni  

Energy Technology Data Exchange (ETDEWEB)

Campylobacter jejuni is highly unusual among bacteria in forming N-linked glycoproteins. The heptasaccharide produced by its pgl system is attached to protein Asn through its terminal 2,4-diacetamido-2,4,6-trideoxy-D-Glc (QuiNAc4NAc or N,N'-diacetylbacillosamine) moiety. The crucial, last part of this sugar's synthesis is the acetylation of UDP-2-acetamido-4-amino-2,4,6-trideoxy-D-Glc by the enzyme PglD, with acetyl-CoA as a cosubstrate. We have determined the crystal structures of PglD in CoA-bound and unbound forms, refined to 1.8 and 1.75 Å resolution, respectively. PglD is a trimer of subunits each comprised of two domains, an N-terminal α/β-domain and a C-terminal left-handed β-helix. Few structural differences accompany CoA binding, except in the C-terminal region following the β-helix (residues 189-195), which adopts an extended structure in the unbound form and folds to extend the β-helix upon binding CoA. Computational molecular docking suggests a different mode of nucleotide-sugar binding with respect to the acetyl-CoA donor, with the molecules arranged in an 'L-shape', compared with the 'in-line' orientation in related enzymes. Modeling indicates that the oxyanion intermediate would be stabilized by the NH group of Gly143', with His125' the most likely residue to function as a general base, removing H+ from the amino group prior to nucleophilic attack at the carbonyl carbon of acetyl-CoA. Site-specific mutations of active site residues confirmed the importance of His125', Glu124', and Asn118. We conclude that Asn118 exerts its function by stabilizing the intricate hydrogen bonding network within the active site and that Glu124' may function to increase the pKa of the putative general base, His125'.

Rangarajan, E.; Ruane, K.; Sulea, T.; Watson, D.; Proteau, A.; Leclerc, S.; Cygler, M.; Matte, A.; Young, N.



Structural determination of 3beta-stearyloxy-urs-12-ene from Maytenus salicifolia by 1D and 2D NMR and quantitative 13C NMR spectroscopy. (United States)

Six pentacyclic triterpenoids, 3beta-stearyloxy-urs-12-ene (1), friedelin (2), 3beta-friedelinol (3), alpha-amyrin (4), beta-amyrin (5), and lupeol (6), have been isolated from the hexane extract of Maytenus salicifolia Reissek (Celastraceae) leaves. The molecular and structural formula as well as the stereochemistry of a new pentacyclic triterpene (1) were determined using data obtained from 1H and 13C NMR spectra, DEPT135 and by 2D HSQC, HMBC, COSY and NOESY experiments. The molecular formula C48H84O2 was established using quantitative 13C NMR, and the molecular weight (692 Da) was confirmed by elemental analysis and mass spectrometry (GC-MS). PMID:16358293

Miranda, R R S; Silva, G D F; Duarte, L P; Fortes, I C P; Filho, S A Vieira



Moments of the Spin Structure Functions g_1^p and g_1^d for 0.05 < Q^2 < 3.0 GeV^2  

CERN Document Server

The spin structure functions g_1 for the proton and the deuteron have been measured over a wide kinematic range in x and Q^2 using 1.6 and 5.7 GeV longitudinally polarized electrons incident upon polarized NH_3 and ND_3 targets at Jefferson Lab. Scattered electrons were detected in the CEBAF Large Acceptance Spectrometer, for 0.05 < Q^2 < 5 GeV^2 and W < 3 GeV. The first moments of g_1 for the proton and deuteron are presented -- both have a negative slope at low Q^2, as predicted by the extended Gerasimov-Drell-Hearn sum rule. The first result for the generalized forward spin polarizability of the proton gamma_0^p is also reported, and shows evidence of scaling above Q^2 = 1.5 GeV^2. Although the first moments of g_1 are consistent with Chiral Perturbation Theory (ChPT) calculations up to approximately Q^2 = 0.06 GeV^2, a significant discrepancy is observed between the gamma_0^p data and ChPT for gamma_0^p, even at the lowest Q^2.

Prok, Y; Burkert, V D; Deur, A; Dharmawardane, K V; Dodge, G E; Griffioen, K A; Kuhn, S E; Minehart, R; Adams, G; Amaryan, M J; Anghinolfi, M; Asryan, G; Audit, G; Avakian, H; Bagdasaryan, H; Baillie, N; Ball, J P; Baltzell, N A; Barrow, S; Battaglieri, M; Beard, K; Bedlinskiy, I; Bektasoglu, M; Bellis, M; Benmouna, N; Berman, B L; Biselli, A S; Blaszczyk, L; Boiarinov, S; Bonner, B E; Bouchigny, S; Bradford, R; Branford, D; Briscoe, W J; Brooks, W K; Bültmann, S; Butuceanu, C; Calarco, J R; Careccia, S L; Carman, D S; Casey, L; Cazes, A; Chen, S; Cheng, L; Cole, P L; Collins, P; Coltharp, P; Cords, D; Corvisiero, P; Crabb, D; Credé, V; Cummings, J P; Dale, D; Dashyan, N; De Masi, R; De Vita, R; De Sanctis, E; Degtyarenko, P V; Denizli, H; Dennis, L; Dhuga, K S; Dickson, R; Djalali, C; Doughty, D; Dugger, M; Dytman, S; Dzyubak, O P; Egiyan, H; Egiyan, K S; El Fassi, L; Elouadrhiri, L; Eugenio, P; Fatemi, R; Fedotov, G; Feldman, G; Fersh, R G; Feuerbach, R J; Forest, T A; Fradi, A; Funsten, H; Garçon, M; Gavalian, G; Gevorgyan, N; Gilfoyle, G P; Giovanetti, K L; Girod, F X; Goetz, J T; Golovatch, E; Gothe, R W; Guidal, M; Guillo, M; Guler, N; Guo, L; Gyurjyan, V; Hadjidakis, C; Hafidi, K; Hakobyan, H; Hanretty, C; Hardie, J; Hassall, N; Heddle, D; Hersman, F W; Hicks, K; Hleiqawi, I; Holtrop, M; Huertas, M; Hyde-Wright, C E; Ilieva, Y; Ireland, D G; Ishkhanov, B S; Isupov, E L; Ito, M M; Jenkins, D; Jo, H S; Johnstone, J R; Joo, K; Jüngst, H G; Kalantarians, N; Keith, C D; Kellie, J D; Khandaker, M; Kim, K Y; Kim, K; Kim, W; Klein, A; Klein, F J; Klusman, M; Kossov, M; Krahn, Z; Kramer, L H; Kubarovski, V; Kühn, J; Kuleshov, S V; Kuznetsov, V; Lachniet, J; Laget, J M; Langheinrich, J; Lawrence, D; Ji Li; Lima, A C S; Livingston, K; Lu, H Y; Lukashin, K; MacCormick, M; Marchand, C; Markov, N; Mattione, P; McAleer, S; McKinnon, B; McNabb, J W C; Mecking, B A; Mestayer, M D; Meyer, C A; Mibe, T; Mikhailov, K; Mirazita, M; Miskimen, R; Mokeev, V; Morand, L; Moreno, B; 
Moriya, K; Morrow, S A; Moteabbed, M; Müller, J; Munevar, E; Mutchler, G S; Nadel-Turonski, P; Nasseripour, R; Niccolai, S; Niculescu, G; Niculescu, I; Niczyporuk, B B; Niroula, M R; Niyazov, R A; Nozar, M; O'Rielly, G V; Osipenko, M; Ostrovidov, A I; Park, K; Pasyuk, E; Paterson, C; Anefalos Pereira, S; Philips, S A; Pierce, J; Pivnyuk, N; Pocanic, D; Pogorelko, O; Popa, I; Pozdniakov, S; Preedom, B M; Price, J W; Procureur, S; Protopopescu, D; Qin, L M; Raue, B A; Riccardi, G; Ricco, G; Ripani, M; Ritchie, B G; Rosner, G; Rossi, P; Rowntree, D; Rubin, P D; Sabati, F; Salamanca, J; Salgado, C; Santoro, e J P; Sapunenko, V; Schumacher, R A; Seely, M L; Serov, V S; Sharabyan, Yu G; Sharov, D; Shaw, J; Shvedunov, N V; Skabelin, A V; Smith, E S; Smith, L C; Sober, D I; Sokhan, D; Stavinsky, A; Stepanyan, S S; Stepanyan, S; Stokes, B E; Stoler, P; Strakovsky, I I; Strauch, S; Suleiman, R; Taiuti, M; Tedeschi, D J; Tkabladze, A; Tkachenko, S; Todor, L; Ungaro, M; Vineyard, M F; Vlassov, A V; Watts, D P; Weinstein, L B; Weygand, D P; Williams, M; Wolin, E; Wood, M H; Yegneswaran, A; Yun, J; Zana, L; Zhang, J; Zhao, B; Zhao, Z W



Moments of the Spin Structure Functions g1p and g1d for 0.05 < Q2 < 3.0 GeV2  

Energy Technology Data Exchange (ETDEWEB)

The spin structure functions $g_1$ for the proton and the deuteron have been measured over a wide kinematic range in $x$ and $Q^2$ using 1.6 and 5.7 GeV longitudinally polarized electrons incident upon polarized NH$_3$ and ND$_3$ targets at Jefferson Lab. Scattered electrons were detected in the CEBAF Large Acceptance Spectrometer, for $0.05 < Q^2 < 5$ GeV$^2$ and $W < 3$ GeV. The first moments of $g_1$ for the proton and deuteron are presented -- both have a negative slope at low $Q^2$, as predicted by the extended Gerasimov-Drell-Hearn sum rule. The first result for the generalized forward spin polarizability of the proton $\gamma_0^p$ is also reported, and shows evidence of scaling above $Q^2 = 1.5$ GeV$^2$. Although the first moments of $g_1$ are consistent with Chiral Perturbation Theory (ChPT) calculations up to approximately $Q^2 = 0.06$ GeV$^2$, a significant discrepancy is observed between the $\gamma_0^p$ data and ChPT for $\gamma_0^p$, even at the lowest $Q^2$.

Prok, Yelena; Bosted, Peter; Burkert, Volker; Deur, Alexandre; Dharmawardane, Kahanawita; Dodge, Gail; Griffioen, Keith; Kuhn, Sebastian; Minehart, Ralph; Adams, Gary; Amaryan, Moscov; Amaryan, Moskov; Anghinolfi, Marco; Asryan, G.; Audit, Gerard; Avagyan, Harutyun; Baghdasaryan, Hovhannes; Baillie, Nathan; Ball, J.P.; Ball, Jacques; Baltzell, Nathan; Barrow, Steve; Battaglieri, Marco; Beard, Kevin; Bedlinskiy, Ivan; Bektasoglu, Mehmet; Bellis, Matthew; Benmouna, Nawal; Berman, Barry; Biselli, Angela; Blaszczyk, Lukasz; Boyarinov, Sergey; Bonner, Billy; Bouchigny, Sylvain; Bradford, Robert; Branford, Derek; Briscoe, William; Brooks, William; Bultmann, S.; Bueltmann, Stephen; Butuceanu, Cornel; Calarco, John; Careccia, Sharon; Carman, Daniel; Casey, Liam; Cazes, Antoine; Chen, Shifeng; Cheng, Lu; Cole, Philip; Collins, Patrick; Coltharp, Philip; Cords, Dieter; Corvisiero, Pietro; Crabb, Donald; Crede, Volker; Cummings, John; Dale, Daniel; Dashyan, Natalya; De Masi, Rita; De Vita, Raffaella; De Sanctis, Enzo; Degtiarenko, Pavel; Denizli, Haluk; Dennis, Lawrence; Dhuga, Kalvir; Dickson, Richard; Djalali, Chaden; Doughty, David; Dugger, Michael; Dytman, Steven; Dzyubak, Oleksandr; Egiyan, Hovanes; Egiyan, Kim; Elfassi, Lamiaa; Elouadrhiri, Latifa; Eugenio, Paul; Fatemi, Renee; Fedotov, Gleb; Feldman, Gerald; Fersch, Robert; Feuerbach, Robert; Forest, Tony; Fradi, Ahmed; Funsten, Herbert; Garcon, Michel; Gavalian, Gagik; Gevorgyan, Nerses; Gilfoyle, Gerard; Giovanetti, Kevin; Girod, Francois-Xavier; Goetz, John; Golovach, Evgeny; Gothe, Ralf; Guidal, Michel; Guillo, Matthieu; Guler, Nevzat; Guo, Lei; Gyurjyan, Vardan; Hadjidakis, Cynthia; Hafidi, Kawtar; Hakobyan, Hayk; Hanretty, Charles; Hardie, John; Hassall, Neil; Heddle, David; Hersman, F.; Hicks, Kenneth; Hleiqawi, Ishaq; Holtrop, Maurik; Huertas, Marco; Hyde, Charles; Ilieva, Yordanka; Ireland, David; Ishkhanov, Boris; Isupov, Evgeny; Ito, Mark; Jenkins, David; Jo, Hyon-Suk; Johnstone, John; Joo, Kyungseon; 
Juengst, Henry; Kalantarians, Narbe; Keith, Christopher; Kellie, James; Khandaker, Mahbubul; Kim, Kui; Kim, Kyungmo; Kim, Wooyoung; Klein, Andreas; Klein, Franz; Klusman, Mike; Kossov, Mikhail; Krahn, Zebulun; Kramer, Laird; Kubarovsky, Valery; Kuhn, Joachim; Kuleshov, Sergey; Kuznetsov, Viacheslav; Lachniet, Jeff; Laget, Jean; Langheinrich, Jorn; Lawrence, Dave; Lima, Ana; Livingston, Kenneth; Lu, Haiyun; Lukashin, K.; MacCormick, Marion; Marchand, Claude; Markov, Nikolai; Mattione, Paul; McAleer, Simeon; McKinnon, Bryan; McNabb, John; Mecking, Bernhard; Mestayer, Mac; Meyer, Curtis; Mibe, Tsutomu; Mikhaylov, Konstantin; Mirazita, Marco; Miskimen, Rory; Mokeev, Viktor; Morand, Ludyvine; Moreno, Brahim; Moriya, Kei; Morrow, Steven; Moteabbed, Maryam; Mueller, James; Munevar Espitia, Edwin; Mutchler, Gordon; Nadel-Turonski, Pawel; Nasseripour, Rakhsha; Niccolai, Silvia; Niculescu, Gabriel; Niculescu, Maria-Ioana; Niczyporuk, Bogdan; Niroula, Megh; Niyazov, Rustam; Nozar, Mina; O' Rielly, Grant; Osipenko, Mikhail; Ostrovidov, Alexander; Park, Kijun; Pasyuk, Evgueni; Paterson, Craig; Anefalos Pereira, S.; Philips, Sasha; Pierce, J.; Pivnyuk, Nikolay; Pocanic, Dinko; Pogorelko, Oleg; Popa, Iulian; Pozdnyakov, Sergey; Preedom, Barry; Price, John; Procureur, Sebastien; Protopopescu, Dan; Qin, Liming; Raue, Brian; Riccardi, Gregory; Ricco, Giovanni; Ripani, Marco; Ritchie, Barry; Rosner, Guenther; Rossi, Patrizia; Rowntree, David; Rubin, Philip; Sabatie, Franck; Salamanca, Julian; Salgado, Carlos; Santoro, Joseph; Sapunenko, Vladimir; Schumacher, Reinhard; Seely, Mikell; Serov, Vladimir; Sharabian, Youri; Sharov, Dmitri; Shaw, Jeffrey; Shvedunov, Nikolay; Skabelin, Alexander; Smith, Elton; Smith, Lee; Sober, Daniel; Sokhan, Daria; Stavinskiy, Aleksey; Stepanyan, Samuel; Stepanyan, Stepan; Stokes, Burnham; Stoler, Paul; Strakovski, Igor; Strauch, Steffen; Suleiman, Riad; Taiuti, Mauro; Tedeschi, David; Tkabladze, Avtandil; Tkachenko, Svyatoslav; Todor, Luminita; Ungaro, 
Maurizio; V



Intricate heterogeneous structures of the top 300 km of the Earth's inner core inferred from global array data: I. Regional 1D attenuation and velocity profiles (United States)

We apply a waveform inversion method based on simulated annealing to complex core phase data observed by globally deployed seismic arrays, and present regional variation of depth profiles of attenuation and velocity for the top half of the inner core. Whereas measured attenuation parameters exhibit consistent trends for data sampling the eastern hemisphere of the inner core, for the western hemisphere, there is a remarkable difference between data sampling the inner core beneath Africa (W1) and beneath North America (W2). The obtained attenuation profiles suggest that intricate heterogeneities are confined to the top 300 km. The profile for the eastern hemisphere has a high attenuation zone in the top 150 km that gradually diminishes with depth. Conversely, for the western hemisphere, the profile for W1 shows constant low attenuation and that for W2 shows a gradual increase from the inner core boundary to a peak at around 200 km depth. Velocity profiles, obtained from differential traveltimes between PKP(DF) and PKP(CD, BC) phases, for the eastern and western hemispheres are respectively about 0.8% faster and 0.6% slower than the reference model at the top of the inner core, and the difference nearly disappears at about 200 km depth. Our result suggests the presence of intricate quasi-hemispherical structures in the top ~200-300 km of the inner core.

Iritani, R.; Takeuchi, N.; Kawakatsu, H.



RDNAnalyzer: A tool for DNA secondary structure prediction and sequence analysis (United States)

RDNAnalyzer is an innovative computer-based tool designed for DNA secondary structure prediction and sequence analysis. It can randomly generate a DNA sequence, or users can upload sequences of their own interest in RAW format. It uses and extends the Nussinov dynamic programming algorithm and has various applications in sequence analysis. It predicts the DNA secondary structure and base pairings. It also provides tools for sequence analyses routinely performed by biological scientists, such as DNA replication, reverse complement generation, transcription, translation, sequence-specific information (the total number of nucleotide bases and the A, T, G and C base contents along with their respective percentages), and a sequence cleaner. RDNAnalyzer is a unique tool developed in Microsoft Visual Studio 2008 using Microsoft Visual C# and Windows Presentation Foundation, and it provides a user-friendly environment for sequence analysis. It is freely available. Abbreviations: RDNAnalyzer - Random DNA Analyser, GUI - Graphical user interface, XAML - Extensible Application Markup Language.
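The abstract names the Nussinov dynamic programming algorithm as the basis of the structure prediction. As a rough illustration only, the following is a minimal sketch of the classic Nussinov base-pair maximization for a DNA string; it is not RDNAnalyzer's extended version (the tool itself is written in C#), and the function name `nussinov`, the Watson-Crick-only pairing set, and the `min_loop` parameter are assumptions made for this example.

```python
# Classic Nussinov recursion: maximize the number of nested base pairs.
# Assumptions (not from the abstract): only A-T and G-C pairs are allowed,
# and at least `min_loop` unpaired bases must separate paired positions.
DNA_PAIRS = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G")}


def nussinov(seq, min_loop=3):
    """Return (max_pairs, dot_bracket) for the DNA string `seq`."""
    n = len(seq)
    if n == 0:
        return 0, ""
    # dp[i][j] = maximum number of pairs within seq[i..j] (inclusive)
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case 1: position j stays unpaired
            for k in range(i, j - min_loop):  # case 2: j pairs with k
                if (seq[k], seq[j]) in DNA_PAIRS:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best

    # Traceback: recover one optimal structure in dot-bracket notation.
    structure = ["."] * n
    stack = [(0, n - 1)]
    while stack:
        i, j = stack.pop()
        if j - i <= min_loop:
            continue  # too short to contain any pair
        if dp[i][j] == dp[i][j - 1]:
            stack.append((i, j - 1))
            continue
        for k in range(i, j - min_loop):
            if (seq[k], seq[j]) in DNA_PAIRS:
                left = dp[i][k - 1] if k > i else 0
                if left + 1 + dp[k + 1][j - 1] == dp[i][j]:
                    structure[k], structure[j] = "(", ")"
                    if k > i:
                        stack.append((i, k - 1))
                    stack.append((k + 1, j - 1))
                    break
    return dp[0][n - 1], "".join(structure)


if __name__ == "__main__":
    pairs, fold = nussinov("GGGAAATCCC")
    print(pairs, fold)
```

The recursion runs in cubic time in the sequence length, which is acceptable for short sequences; the traceback returns only one of possibly several optimal structures.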

Afzal, Muhammad; Shahid, Ahmad Ali; Shehzadi, Abida; Nadeem, Shahid; Husnain, Tayyab



Synchronous visual analysis and editing of RNA sequence and secondary structure alignments using 4SALE  

Directory of Open Access Journals (Sweden)

Full Text Available Abstract Background The function of a noncoding RNA sequence is mainly determined by its secondary structure, and therefore a family of noncoding RNA sequences is much more conserved on the structural level than on the sequence level. Understanding the function of noncoding RNA sequence families requires two things: a hand-crafted or hand-improved alignment and detailed analyses of the secondary structures. Several tools are available that help perform these tasks, but all of them are specialized and focus on only one aspect, either editing the alignment or plotting the secondary structure. The problem is that both tasks need to be performed simultaneously. Findings 4SALE is designed to handle sequence and secondary structure information of RNAs synchronously. By including a completely new method for simultaneously visualizing and editing RNA sequences and secondary structure information, 4SALE makes it much easier to improve alignments and to understand RNA sequence and secondary structure evolution. Conclusion 4SALE is a step further toward simultaneously handling RNA sequence and secondary structure information. It provides a completely new way of visually monitoring different structural aspects while editing the alignment. The software is freely available and distributed from its website at

Dandekar Thomas



Resolution and characterization of the structural polymorphism of a single quadruplex-forming sequence  

Digital Repository Infrastructure Vision for European Research (DRIVER)

The remarkable structural polymorphism of quadruplex-forming sequences has been a considerable impediment in the elucidation of quadruplex folds. Sequence modifications have commonly been used to perturb and purportedly select a particular form out of the ensemble of folds for nuclear magnetic resonance (NMR) or X-ray crystallographic analysis. Here we report a simple chromatographic technique that separates the individual folds without need for sequence modification. The sequence d(GGTGGTGGT...

Dailey, Magdalena M.; Miller, M. Clarke; Bates, Paula J.; Lane, Andrew N.; Trent, John O.



Phase Transitions in Sequence Matches and Nucleic Acid Structure (United States)

Analyses of phase transitions in biopolymers have previously been restricted to studies of average behavior along macromolecules. Extremal properties, such as longest helical region, can now be studied with a new family of probability distributions [Arratia, R., Gordon, L. & Waterman, M. S. (1986) Ann. Stat. 14, 971-993]. Not only is such extremal behavior analyzed with great precision, but new phase transitions are determined. One phase transition occurs when behavior of the free energy of the longest helical region abruptly changes from proportional to logarithm of the sequence length to proportional to sequence length. The annealing of two single-stranded molecules and the melting of a double helix are both considered. These results, initially suggested by studies of optimal matching of random DNA sequences [Smith, T. F., Waterman, M. S. & Burks, C. (1985) Nucleic Acids Res. 13, 645-656], also have importance for significance tests in comparison of nucleic acid or protein sequences.
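The logarithmic regime mentioned in this abstract can be illustrated with the simplest extremal statistic: the longest run of consecutive matches between two aligned random sequences. The sketch below is not the Arratia-Gordon-Waterman analysis itself; it only uses the classical result that, when positions match independently with probability p, the longest run grows roughly like the base-1/p logarithm of the sequence length. The function names and the uniform four-letter model are assumptions made for this example.

```python
import math
import random


def longest_match_run(a, b):
    """Length of the longest run of consecutive matching positions."""
    best = current = 0
    for x, y in zip(a, b):
        current = current + 1 if x == y else 0
        best = max(best, current)
    return best


def log_scale(n, p=0.25):
    """Leading-order growth log_{1/p}(n) of the longest run."""
    return math.log(n) / math.log(1.0 / p)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    for n in (1_000, 10_000, 100_000):
        a = rng.choices("ACGT", k=n)
        b = rng.choices("ACGT", k=n)
        print(n, longest_match_run(a, b), round(log_scale(n), 2))
```

Running the loop shows the observed longest run tracking the logarithmic scale rather than growing linearly with n, which is the regime the abstract contrasts with the linear one.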

Waterman, Michael S.; Gordon, Louis; Arratia, Richard



4SALE – A tool for synchronous RNA sequence and secondary structure alignment and editing  

Directory of Open Access Journals (Sweden)

Full Text Available Abstract Background In sequence analysis, the multiple alignment forms the foundation of all subsequent analyses. Errors in an alignment could strongly influence all succeeding analyses and therefore could lead to wrong predictions. Hand-crafted and hand-improved alignments are necessary and have become good common practice. For RNA sequences, the primary sequence as well as a secondary structure consensus is often well known, e.g., the cloverleaf structure of the tRNA. Recently, some alignment editors have been proposed that are able to include and model both kinds of information. However, with the advent of a large amount of reliable RNA sequences together with their solved secondary structures (available from, e.g., the ITS2 Database), we are faced with the problem of handling sequences and their associated secondary structures synchronously. Results 4SALE fills this gap. The application allows fast sequence and synchronous secondary structure alignment for large data sets and, for the first time, synchronous manual editing of aligned sequences and their secondary structures. This study describes an algorithm for the synchronous alignment of sequences and their associated secondary structures as well as the main features of 4SALE used for further analyses and editing. 4SALE builds an optimal and unique starting point for every RNA sequence and structure analysis. Conclusion 4SALE, which provides a user-friendly and intuitive interface, is a comprehensive toolbox for RNA analysis based on sequence and secondary structure information. The program connects sequence and structure databases like the ITS2 Database to phylogeny programs such as the CBCAnalyzer. 4SALE is written in Java and is therefore platform independent. The software is freely available and distributed from the website at

Schultz Jörg



ArachnoServer 2.0, an updated online resource for spider toxin sequences and structures  

Digital Repository Infrastructure Vision for European Research (DRIVER)

ArachnoServer is a manually curated database providing information on the sequence, structure and biological activity of protein toxins from spider venoms. These proteins are of interest to a wide range of biologists due to their diverse applications in medicine, neuroscience, pharmacology, drug discovery and agriculture. ArachnoServer currently manages 1078 protein sequences, 759 nucleic acid sequences and 56 protein structures. Key features of ArachnoServer include a...

Herzig, Volker; Wood, David L. A.; Newell, Felicity; Chaumeil, Pierre-Alain; Kaas, Quentin; Binford, Greta J.; Nicholson, Graham M.; Gorse, Dominique; King, Glenn F.



Rapid protein domain assignment from amino acid sequence using predicted secondary structure  

Digital Repository Infrastructure Vision for European Research (DRIVER)

The elucidation of the domain content of a given protein sequence in the absence of determined structure or significant sequence homology to known domains is an important problem in structural biology. Here we address how successfully the delineation of continuous domains can be accomplished in the absence of sequence homology using simple baseline methods, an existing prediction algorithm (Domain Guess by Size), and a newly developed method (DomSSEA). The study was undertaken with a view to ...

Marsden, Russell L.; McGuffin, Liam J.; Jones, David T.



Resolution-optimized NMR measurement of $^{1}D_{CH}$, $^{1}D_{CC}$ and $^{2}D_{CH}$ residual dipolar couplings in nucleic acid bases  

Energy Technology Data Exchange (ETDEWEB)

New methods are described for accurate measurement of multiple residual dipolar couplings in nucleic acid bases. The methods use TROSY-type pulse sequences for optimizing resolution and sensitivity, and rely on the E.COSY principle to measure the relatively small two-bond $^{2}D_{CH}$ couplings at high precision. Measurements are demonstrated for a 24-nt stem-loop RNA sequence, uniformly enriched in $^{13}$C, and aligned in Pf1. The recently described pseudo-3D method is used to provide homonuclear $^{1}$H-$^{1}$H decoupling, which minimizes cross-correlation effects and optimizes resolution. Up to seven $^{1}$H-$^{13}$C and $^{13}$C-$^{13}$C couplings are measured for pyrimidines (U and C), including $^{1}D_{C5H5}$, $^{1}D_{C6H6}$, $^{2}D_{C5H6}$, $^{2}D_{C6H5}$, $^{1}D_{C5C4}$, $^{1}D_{C5C6}$, and $^{2}D_{C4H5}$. For adenine, four base couplings ($^{1}D_{C2H2}$, $^{1}D_{C8H8}$, $^{1}D_{C4C5}$, and $^{1}D_{C5C6}$) are readily measured whereas for guanine only three couplings are accessible at high relative accuracy ($^{1}D_{C8H8}$, $^{1}D_{C4C5}$, and $^{1}D_{C5C6}$). Only three dipolar couplings are linearly independent in planar structures such as nucleic acid bases, permitting cross validation of the data and evaluation of their accuracies. For the vast majority of dipolar couplings, the error is found to be less than ±3% of their possible range, indicating that the measurement accuracy is not limiting when using these couplings as restraints in structure calculations. Reported isotropic values of the one- and two-bond J couplings cluster very tightly for each type of nucleotide.

Boisbouvier, Jerome; Bryce, David L.; O'Neil-Cabello, Erin [Laboratory of Chemical Physics, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health (United States)]; Nikonowicz, Edward P. [Rice University, Department of Biochemistry and Cell Biology (United States)]; Bax, Ad [Laboratory of Chemical Physics, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health (United States)]



Combinatorial variation of structure in considerations of compound lumping in one- and two-dimensional property representations of condensable atmospheric organic compounds. 1. Lumping by 1-D volatility with n fixed (United States)

Many current models that aim to predict urban and regional levels of organic particulate matter (OPM) use either the 2 product (2p) framework for secondary organic aerosol (SOA) formation, or a static 1-D volatility basis set (1-D-VBS). These approaches assume that: 1) the compounds involved in OPM condensation/evaporation can be lumped simply by volatility with no specificity regarding carbon number $n_C$, MW, or polar functionality; 2) water uptake does not occur; and 3) the compounds are non-ionizing. This work considers the consequences for uniphasic PM caused by the first two assumptions due to effects of the condensed-phase mean molecular weight $\overline{MW}$ and activity coefficients ($\zeta_i$), including when RH (relative humidity) > 0. Setting n = 10 for all bins, multiple chemical structures were developed for each bin of a 1-D-VBS for un-aged SOA in the α-pinene/ozone system. For each bin, a group-contribution vapor pressure ($p_L^o$) prediction method was used to find multiple structures such that the groups-based $\log p_L^o$ for n = 10 and variable numbers of aldehyde, ketone, hydroxyl, and carboxylic acid groups agrees, within ±0.5, with the bin volatility. The number of possible combinations with one structure taken from each bin was 17,640. The Raster-Roulette Organic Aerosol (RROA) model was used to calculate the equilibrium mass concentrations (μg m$^{-3}$) of OPM ($M_o$) and co-condensed water ($M_w$) at 25 °C for each combination for ranges of RH and ΔHC (change in parent hydrocarbon concentration). UNIFAC was used to determine the needed values of $\zeta_i$. Frequency distributions from RROA for $M_o$, $M_w$, and the O:C ratio were developed. For $M_o$ levels typical of the ambient atmosphere, with the 1-D-VBS and all bins constrained at n = 10, significant RH-induced enhancement of OPM condensation was observed in the distributions. The spread of the distributions was found to increase rapidly as the level of OPM decreased. The within-bin spread of ±0.5 log units in the groups-based estimates of $\log p_{L,i}^o$ was found to cause significant spread in the distributions at lower $M_o$ values. At the chosen n (=10), the groups-based $\log p_{L,i}^o$ values show a spread of ±2 log units in a plot of $\log p_{L,i}^o$ vs. O:C. When seeking to advance to 2-D-grid predictive modeling of atmospheric OPM, use of an O:C vs. n grid will therefore require reliable information (or at least empirical calibration) as to the distributions of the likely structures at each gridpoint.

Pankow, James F.; Niakan, Negar; Asher, William E.



MSACompro: protein multiple sequence alignment using predicted secondary structure, solvent accessibility, and residue-residue contacts  

Directory of Open Access Journals (Sweden)

Full Text Available Abstract Background Multiple Sequence Alignment (MSA) is a basic tool for bioinformatics research and analysis. It has been used in almost all bioinformatics tasks such as protein structure modeling, gene and protein function prediction, DNA motif recognition, and phylogenetic analysis. Therefore, improving the accuracy of multiple sequence alignment is important for advancing many bioinformatics fields. Results We designed and developed a new method, MSACompro, to synergistically incorporate predicted secondary structure, relative solvent accessibility, and residue-residue contact information into the currently most accurate posterior probability-based MSA methods to improve the accuracy of multiple sequence alignments. The method differs from multiple sequence alignment methods (e.g. 3D-Coffee) that use the tertiary structure information of some sequences, since the structural information in our method is fully predicted from sequences. To the best of our knowledge, applying predicted relative solvent accessibility and contact maps to multiple sequence alignment is novel. Rigorous benchmarking of our method on the standard benchmarks (i.e. BAliBASE, SABmark and OXBENCH) clearly demonstrated that incorporating predicted protein structural information improves multiple sequence alignment accuracy over the leading multiple protein sequence alignment tools that do not use this information, such as MSAProbs, ProbCons, Probalign, T-coffee, MAFFT and MUSCLE. The performance of the method is comparable to, with slightly lower scores than, the state-of-the-art method PROMALS, which uses structural features and additional homologous sequences. Conclusion MSACompro is an efficient and reliable multiple protein sequence alignment tool that can effectively incorporate predicted protein structural information into multiple sequence alignment. The software is available at

Deng Xin



Iterative refinement of structure-based sequence alignments by Seed Extension  

Directory of Open Access Journals (Sweden)

Full Text Available Abstract Background Accurate sequence alignment is required in many bioinformatics applications but, when sequence similarity is low, it is difficult to obtain accurate alignments based on sequence similarity alone. The accuracy improves when the structures are available, but current structure-based sequence alignment procedures still mis-align substantial numbers of residues. In order to correct such errors, we previously explored the possibility of replacing the residue-based dynamic programming algorithm in structure alignment procedures with the Seed Extension algorithm, which does not use a gap penalty. Here, we describe a new procedure called RSE (Refinement with Seed Extension) that iteratively refines a structure-based sequence alignment. Results RSE uses SE (Seed Extension) at its core, which is an algorithm that we reported recently for obtaining a sequence alignment from two superimposed structures. The RSE procedure was evaluated by comparing the correctly aligned fractions of residues before and after the refinement of the structure-based sequence alignments produced by popular programs. CE, DaliLite, FAST, LOCK2, MATRAS, MATT, TM-align, SHEBA and VAST were included in this analysis, and the NCBI's CDD root node set was used as the reference alignments. RSE improved the average accuracy of sequence alignments for all programs tested when no shift error was allowed. The amount of improvement varied depending on the program. The average improvements were small for DaliLite and MATRAS but about 5% for CE and VAST. More substantial improvements have been seen in many individual cases. The additional computation times required for the refinements were negligible compared to the times taken by the structure alignment programs. Conclusion RSE is a computationally inexpensive way of improving the accuracy of a structure-based sequence alignment. It can be used as a standalone procedure following a regular structure-based sequence alignment or to replace the traditional iterative refinement procedures based on residue-level dynamic programming algorithms in many structure alignment programs.

Lee Byungkook



The chemical structure of DNA sequence signals for RNA transcription (United States)

The proposed recognition sites for RNA transcription for E. coli RNA polymerase, bacteriophage T7 RNA polymerase, and eukaryotic RNA polymerase Pol II are evaluated in the light of the requirements for efficient recognition. It is shown that although there is good experimental evidence that specific nucleic acid sequence patterns are involved in transcriptional regulation in bacteria and bacterial viruses, among the sequences now available, only in the case of the promoters recognized by bacteriophage T7 polymerase does it seem likely that the pattern is sufficient. It is concluded that the eukaryotic pattern that is investigated is not restrictive enough to serve as a recognition site.

George, D. G.; Dayhoff, M. O.



Improving protein structure similarity searches using domain boundaries based on conserved sequence information  

Directory of Open Access Journals (Sweden)

Full Text Available Abstract Background The identification of protein domains plays an important role in protein structure comparison. Domain query size and composition are critical to structure similarity search algorithms such as the Vector Alignment Search Tool (VAST), the method employed for computing related protein structures in the NCBI Entrez system. Currently, domains identified on the basis of structural compactness are used for VAST computations. In this study, we have investigated how alternative definitions of domains derived from conserved sequence alignments in the Conserved Domain Database (CDD) would affect the domain comparisons and structure similarity search performance of VAST. Results Alternative domains, which have significantly different secondary structure composition from those based on structurally compact units, were identified based on the alignment footprints of curated protein sequence domain families. Our analysis indicates that domain boundaries disagree on roughly 8% of protein chains in the medium redundancy subset of the Molecular Modeling Database (MMDB). These conflicting sequence-based domain boundaries perform slightly better than structure domains in structure similarity searches, and there are interesting cases in which structure similarity search performance is markedly improved. Conclusion Structure similarity searches using domain boundaries based on conserved sequence information can provide an additional method for investigators to identify interesting similarities between proteins with known structures. Because of the improvement in performance of structure similarity searches using sequence domain boundaries, we are in the process of implementing their inclusion into the VAST search and MMDB resources in the NCBI Entrez system.

Madej Tom



Compression of structured high-throughput sequencing data. (United States)

Large biological datasets are being produced at a rapid pace and create substantial storage challenges, particularly in the domain of high-throughput sequencing (HTS). Most approaches currently used to store HTS data are either unable to quickly adapt to the requirements of new sequencing or analysis methods (because they do not support schema evolution), or fail to provide state of the art compression of the datasets. We have devised new approaches to store HTS data that support seamless data schema evolution and compress datasets substantially better than existing approaches. Building on these new approaches, we discuss and demonstrate how a multi-tier data organization can dramatically reduce the storage, computational and network burden of collecting, analyzing, and archiving large sequencing datasets. For instance, we show that spliced RNA-Seq alignments can be stored in less than 4% the size of a BAM file with perfect data fidelity. Compared to the previous compression state of the art, these methods reduce dataset size more than 40% when storing exome, gene expression or DNA methylation datasets. The approaches have been integrated in a comprehensive suite of software tools that support common analyses for a range of high-throughput sequencing assays. PMID:24260313

Campagne, Fabien; Dorff, Kevin C; Chambwe, Nyasha; Robinson, James T; Mesirov, Jill P



Nucleotide sequence determination and secondary structure of Xenopus U3 snRNA.  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Using a combination of RNA sequencing and construction of cDNA clones followed by DNA sequencing, we have determined the primary nucleotide sequence of U3 snRNA in Xenopus laevis and Xenopus borealis. This molecule has a length of 219 nucleotides. Alignment of the Xenopus sequences with U3 snRNA sequences from other organisms reveals three evolutionarily conserved blocks. We have probed the secondary structure of U3 snRNA in intact Xenopus laevis nuclei using single-strand specific chemical r...



ZnO1-x nanorod arrays/ZnO thin film bilayer structure: from homojunction diode and high-performance memristor to complementary 1D1R application. (United States)

We present a ZnO(1-x) nanorod array (NR)/ZnO thin film (TF) bilayer structure synthesized at a low temperature, exhibiting a uniquely rectifying characteristic as a homojunction diode and a resistive switching behavior as memory at different biases. The homojunction diode is due to asymmetric Schottky barriers at interfaces of the Pt/ZnO NRs and the ZnO TF/Pt, respectively. The ZnO(1-x) NRs/ZnO TF bilayer structure also shows an excellent resistive switching behavior, including a reduced operation power and enhanced performances resulting from supplements of confined oxygen vacancies by the ZnO(1-x) NRs for rupture and recovery of conducting filaments inside the ZnO TF layer. A hydrophobic behavior with a contact angle of ~125° can be found on the ZnO(1-x) NRs/ZnO TF bilayer structure, demonstrating a self-cleaning effect. Finally, a successful demonstration of complementary 1D1R configurations can be achieved by simply connecting two identical devices back to back in series, realizing the possibility of a low-temperature all-ZnO-based memory system. PMID:22900519

Huang, Chi-Hsin; Huang, Jian-Shiou; Lin, Shih-Ming; Chang, Wen-Yuan; He, Jr-Hau; Chueh, Yu-Lun



Quantifying the relationship between sequence and three-dimensional structure conservation in RNA  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Abstract Background In recent years, the number of available RNA structures has grown rapidly, reflecting the increased interest in RNA biology. Similarly to the studies carried out two decades ago for proteins, which gave the fundamental grounds for developing comparative protein structure prediction methods, we are now able to quantify the relationship between sequence and structure conservation in RNA. Results Here we introduce an all-against-all sequence- and...



MMDB: annotating protein sequences with Entrez's 3D-structure database  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Three-dimensional (3D) structure is now known for a large fraction of all protein families. Thus, it has become rather likely that one will find a homolog with known 3D structure when searching a sequence database with an arbitrary query sequence. Depending on the extent of similarity, such neighbor relationships may allow one to infer biological function and to identify functional sites such as binding motifs or catalytic centers. Entrez's 3D-structure database, the Molecular Modeling Databa...

Wang, Yanli; Addess, Kenneth J.; Chen, Jie; Geer, Lewis Y.; He, Jane; He, Siqian; Lu, Shennan; Madej, Thomas; Marchler-Bauer, Aron; Thiessen, Paul A.; Zhang, Naigong; Bryant, Stephen H.



Structator: fast index-based search for RNA sequence-structure patterns  

Directory of Open Access Journals (Sweden)

Full Text Available Abstract Background The secondary structure of RNA molecules is intimately related to their function and often more conserved than the sequence. Hence, the important task of searching databases for RNAs requires matching sequence-structure patterns. Unfortunately, current tools for this task have, in the best case, a running time that is only linear in the size of sequence databases. Furthermore, established index data structures for fast sequence matching, like suffix trees or arrays, cannot benefit from the complementarity constraints introduced by the secondary structure of RNAs. Results We present a novel method and readily applicable software for time-efficient matching of RNA sequence-structure patterns in sequence databases. Our approach is based on affix arrays, a recently introduced index data structure, preprocessed from the target database. Affix arrays support bidirectional pattern search, which is required for efficiently handling the structural constraints of the pattern. Structural patterns like stem-loops can be matched inside out, such that the loop region is matched first and then the pairing bases on the boundaries are matched consecutively. This allows base pairing information to be exploited for search space reduction and leads to an expected running time that is sublinear in the size of the sequence database. The incorporation of a new chaining approach in the search of RNA sequence-structure patterns enables the description of molecules folding into complex secondary structures with multiple ordered patterns. The chaining approach removes spurious matches from the set of intermediate results, in particular those of patterns with little specificity. In benchmark experiments on the Rfam database, our method runs up to two orders of magnitude faster than previous methods. Conclusions The presented method's sublinear expected running time makes it well suited for RNA sequence-structure pattern matching in large sequence databases. RNA molecules containing several stem-loop substructures can be described by multiple sequence-structure patterns, and their matches are efficiently handled by a novel chaining method. Beyond our algorithmic contributions, we provide with Structator a complete and robust open-source software solution for index-based search of RNA sequence-structure patterns. The Structator software is available at
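The inside-out matching order described in this abstract (loop first, then the pairing bases outward) can be sketched without any index structure. The code below is a naive linear scan, not Structator's affix-array search, so it does not have the sublinear running time the authors report; the function name `find_stem_loops`, the fixed stem length, and the inclusion of G-U wobble pairs are assumptions made for this illustration.

```python
# Pairing rule assumed for this sketch: Watson-Crick pairs plus G-U wobble.
RNA_PAIRS = {
    ("A", "U"), ("U", "A"),
    ("G", "C"), ("C", "G"),
    ("G", "U"), ("U", "G"),
}


def find_stem_loops(seq, loop, stem_len):
    """Find hairpins whose loop equals `loop` and whose stem has
    `stem_len` consecutive base pairs.

    Returns half-open (start, end) spans into `seq`.
    """
    hits = []
    start = seq.find(loop)  # step 1: match the loop region first
    while start != -1:
        left, right = start - 1, start + len(loop)
        paired = 0
        # step 2: extend outward, one complementary pair at a time
        while (
            paired < stem_len
            and left >= 0
            and right < len(seq)
            and (seq[left], seq[right]) in RNA_PAIRS
        ):
            paired += 1
            left -= 1
            right += 1
        if paired == stem_len:
            hits.append((left + 1, right))
        start = seq.find(loop, start + 1)
    return hits


if __name__ == "__main__":
    rna = "AAGGCGAAAGCCUU"
    for begin, end in find_stem_loops(rna, "GAAA", 3):
        print(begin, end, rna[begin:end])
```

Matching the loop first means a candidate is abandoned as soon as one outward pair fails to be complementary, which is the pruning effect that a bidirectional index can exploit at scale.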

Will Sebastian



Structure, sequence and expression of the hepatitis delta (δ) viral genome (United States)

Biochemical and electron microscopic data indicate that the human hepatitis δ viral agent contains a covalently closed circular and single-stranded RNA genome that has certain similarities with viroid-like agents from plants. The sequence of the viral genome (1,678 nucleotides) has been determined and an open reading frame within the complementary strand has been shown to encode an antigen that binds specifically to antisera from patients with chronic hepatitis δ viral infections.

Wang, Kang-Sheng; Choo, Qui-Lim; Weiner, Amy J.; Ou, Jing-Hsiung; Najarian, Richard C.; Thayer, Richard M.; Mullenbach, Guy T.; Denniston, Katherine J.; Gerin, John L.; Houghton, Michael



Identifying time-clustering structures in lightning sequences  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Two years of lightning data, measured during 2002 and 2003 in an area of southern Italy, have been analyzed in order to reveal scaling behaviour in their time dynamics. We used the Allan Factor method to evidence the presence of time-clustering in the lightning data. We found that i) the sequence of lightning time-occurrences is characterized by two scaling regions, which reveal intra-cluster (inside an individual thunderstorm) and inter-cluster (among successive thunderstorms) time-clusterin...
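The Allan Factor named in this abstract has a standard definition: divide the record into contiguous windows of length T, count events N_k in each window, and take AF(T) = <(N_{k+1} - N_k)^2> / (2 <N_k>). The sketch below computes it for one window length; the function name and argument names are assumptions for this example, and it is not the authors' analysis code. For a homogeneous Poisson process AF stays near 1 at all T, while growth of AF with T indicates clustering.

```python
def allan_factor(times, window, t_start=None, t_end=None):
    """Allan Factor of an event sequence for one window length.

    times   : iterable of event occurrence times
    window  : window length T (same units as `times`)
    t_start, t_end : optional observation interval; defaults to the
                     first and last event time
    """
    times = list(times)
    if not times:
        raise ValueError("no events")
    t0 = min(times) if t_start is None else t_start
    t1 = max(times) if t_end is None else t_end
    n_windows = int((t1 - t0) // window)
    if n_windows < 2:
        raise ValueError("need at least two complete windows")

    # Count events in each contiguous window; events beyond the last
    # complete window are ignored.
    counts = [0] * n_windows
    for t in times:
        k = int((t - t0) // window)
        if 0 <= k < n_windows:
            counts[k] += 1

    mean_count = sum(counts) / n_windows
    if mean_count == 0:
        raise ValueError("no events inside the complete windows")
    sq_diffs = [(counts[k + 1] - counts[k]) ** 2 for k in range(n_windows - 1)]
    return (sum(sq_diffs) / len(sq_diffs)) / (2.0 * mean_count)


if __name__ == "__main__":
    regular = [0.5 + i for i in range(100)]  # one event per unit time
    bursts = [i * 0.4 for i in range(20)] + [20 + i * 0.4 for i in range(20)]
    print(allan_factor(regular, 10, 0, 100))
    print(allan_factor(bursts, 10, 0, 40))
```

In practice the function would be evaluated over a range of window lengths and the result plotted on log-log axes, so that scaling regions appear as straight-line segments.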

Telesca, L.; Bernardi, M.; Rovelli, C.



RNA global alignment in the joint sequence-structure space using elastic shape analysis. (United States)

The functions of RNAs, like proteins, are determined by their structures, which, in turn, are determined by their sequences. Comparison/alignment of RNA molecules provides an effective means to predict their functions and understand their evolutionary relationships. For RNA sequence alignment, most methods developed for protein and DNA sequence alignment can be directly applied. RNA 3-dimensional structure alignment, on the other hand, tends to be more difficult than protein structure alignment due to the lack of regular secondary structures as observed in proteins. Most of the existing RNA 3D structure alignment methods use only the backbone geometry and ignore the sequence information. Using both the sequence and backbone geometry information in RNA alignment may not only produce more accurate classification, but also deepen our understanding of the sequence-structure-function relationship of RNA molecules. In this study, we developed a new RNA alignment method based on elastic shape analysis (ESA). ESA treats RNA structures as three dimensional curves with sequence information encoded on additional dimensions so that the alignment can be performed in the joint sequence-structure space. The similarity between two RNA molecules is quantified by a formal distance, geodesic distance. Based on ESA, a rigorous mathematical framework can be built for RNA structure comparison. Means and covariances of full structures can be defined and computed, and probability distributions on spaces of such structures can be constructed for a group of RNAs. Our method was further applied to predict functions of RNA molecules and showed superior performance compared with previous methods when tested on benchmark datasets. The programs are available at ?jinfeng/ESA.html. PMID:23585278

Laborde, Jose; Robinson, Daniel; Srivastava, Anuj; Klassen, Eric; Zhang, Jinfeng



The Study of Correlation Structures of DNA Sequences A Critical Review  

CERN Multimedia

The study of correlation structure in the primary sequences of DNA is reviewed. The issues reviewed include: symmetries among 16 base-base correlation functions, accurate estimation of correlation measures, the relationship between $1/f$ and Lorentzian spectra, heterogeneity in DNA sequences, different modeling strategies of the correlation structure of DNA sequences, the difference of correlation structure between coding and non-coding regions (besides the period-3 pattern), and source of broad distribution of domain sizes. Although some of the results remain controversial, a body of work on this topic constitutes a good starting point for future studies.
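The 16 base-base correlation functions mentioned in this review are commonly defined, for bases α and β at separation d, as Γ_αβ(d) = P_αβ(d) - P_α P_β, where P_αβ(d) is the frequency of finding α followed d positions later by β. The sketch below estimates them from a single sequence; estimating P_α from the whole sequence is one of several possible conventions and is an assumption here, as are the function and variable names.

```python
from collections import Counter
from itertools import product

BASES = "ACGT"


def base_correlations(seq, d):
    """Estimate the 16 correlation functions at separation d.

    Returns a dict mapping (alpha, beta) to P_ab(d) - P_a * P_b.
    """
    n_pairs = len(seq) - d
    if d < 0 or n_pairs <= 0:
        raise ValueError("separation d must satisfy 0 <= d < len(seq)")
    # Joint frequencies of (base at i, base at i + d)
    pair_counts = Counter(zip(seq[:n_pairs], seq[d:]))
    # Single-base frequencies taken over the whole sequence (assumption)
    base_counts = Counter(seq)
    total = len(seq)
    gamma = {}
    for a, b in product(BASES, repeat=2):
        p_ab = pair_counts[(a, b)] / n_pairs
        p_a = base_counts[a] / total
        p_b = base_counts[b] / total
        gamma[(a, b)] = p_ab - p_a * p_b
    return gamma


if __name__ == "__main__":
    periodic = "ACGT" * 25  # period-4 toy sequence
    g = base_correlations(periodic, 4)
    print(g[("A", "A")], g[("A", "C")])
```

For a sequence containing only the four bases, both the joint and the product frequencies sum to one over the 16 pairs, so the 16 estimated values sum to zero; this is one simple constraint of the kind the review discusses under symmetries.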

Li, W
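
As a rough illustration of the quantities this review is concerned with, the sketch below computes the 16 base-base correlation functions of a DNA string at a given distance. The definition used here, Gamma_ab(d) = P_ab(d) - P_a * P_b, is one common convention and is an assumption on our part; the abstract does not state which estimator the review recommends, and the function names are hypothetical.

```python
from itertools import product

BASES = "ACGT"

def base_frequencies(seq):
    """Return the relative frequency of each base in seq."""
    n = len(seq)
    return {b: seq.count(b) / n for b in BASES}

def correlation(seq, a, b, d):
    """Assumed definition: Gamma_ab(d) = P_ab(d) - P_a * P_b.

    P_ab(d) is the fraction of position pairs (i, i + d) that carry
    base a at i and base b at i + d.
    """
    pairs = len(seq) - d
    if pairs <= 0:
        raise ValueError("distance d must be smaller than the sequence length")
    joint = sum(1 for i in range(pairs) if seq[i] == a and seq[i + d] == b) / pairs
    freq = base_frequencies(seq)
    return joint - freq[a] * freq[b]

def all_correlations(seq, d):
    """Return all 16 base-base correlations at distance d as a dict."""
    return {(a, b): correlation(seq, a, b, d) for a, b in product(BASES, repeat=2)}

# A strictly periodic toy sequence: A recurs every 4 bases.
seq = "ACGT" * 25
corr = all_correlations(seq, 4)
```

For the periodic toy sequence, the A-A correlation is positive at distance 4 and negative at distance 1, which is the kind of signal a period-3 pattern would produce at multiples of 3 in coding DNA.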



Folding pathways of proteins with increasing degree of sequence identities but different structure and function  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Much experimental work has been devoted to comparing the folding behavior of proteins sharing the same fold but different sequence. The recent design of proteins displaying very high sequence identities but different 3D structure allows the unique opportunity to address the protein-folding problem from a complementary perspective. Here we explored by Φ-value analysis the pathways of folding of three different heteromorphic pairs, displaying increasingly high-sequence identity (namely, 30%, 7...

Giri, Rajanish; Morrone, Angela; Travaglini-Allocatelli, Carlo; Jemth, Per; Brunori, Maurizio; Gianni, Stefano



Modelling fibrinolysis: 1D continuum models. (United States)

Fibrinolysis is the enzymatic degradation of the fibrin mesh that stabilizes blood clots. Experiments have shown that coarse clots made of thick fibres sometimes lyse more quickly than fine clots made of thin fibres, despite the fact that individual thick fibres lyse more slowly than individual thin fibres. This paper aims at using a 1D continuum reaction-diffusion model of fibrinolysis to elucidate the mechanism by which coarse clots lyse more quickly than fine clots. Reaction-diffusion models have been the standard tool for investigating fibrinolysis, and have been successful in capturing the wave-like behaviour of lysis seen in experiments. These previous models treat the distribution of fibrin within a clot as homogeneous, and therefore cannot be used directly to study the lysis of fine and coarse clots. In our model, we include a spatially heterogeneous fibrin concentration, as well as a more accurate description of the role of fibrin as a cofactor in the activation of the lytic enzyme. Our model predicts spatio-temporal protein distributions in reasonable quantitative agreement with experimental data. The model also predicts observed behaviour such as a front of lysis moving through the clot with an accumulation of lytic proteins at the front. In spite of the model improvements, however, we find that 1D continuum models are unable to accurately describe the observed differences in lysis behaviour between fine and coarse clots. Features of the problems that lead to the inaccuracy of 1D continuum models are discussed. We conclude that higher-dimensional, multiscale models are required to investigate the effect of clot structure on lysis behaviour. PMID:23220467

Bannish, Brittany E; Keener, James P; Woodbury, Michael; Weisel, John W; Fogelson, Aaron L



Determining 3-D motion and structure from image sequences (United States)

A method of determining three-dimensional motion and structure from two image frames is presented. The method requires eight point correspondences between the two frames, from which motion and structure parameters are determined by solving a set of eight linear equations and a singular value decomposition of a 3x3 matrix. It is shown that the solution thus obtained is unique.

Huang, T. S.
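
The linear step described in the abstract above can be sketched as follows. This is a minimal illustration, not the author's implementation: it assumes normalized image coordinates, fixes the last entry of the 3x3 matrix E to 1 so that eight correspondences give exactly eight linear equations (which fails when that entry is truly zero), and omits the subsequent singular value decomposition that recovers rotation and translation. All function names are hypothetical.

```python
import math

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x

def essential_from_eight_points(pts1, pts2):
    """Estimate E (scaled so that E[2][2] == 1) from eight correspondences.

    Each correspondence satisfies p2^T E p1 = 0 with p = (x, y, 1).
    """
    rows, rhs = [], []
    for (x1, y1), (x2, y2) in zip(pts1, pts2):
        rows.append([x2 * x1, x2 * y1, x2, y2 * x1, y2 * y1, y2, x1, y1])
        rhs.append(-1.0)  # the E[2][2] term moved to the right-hand side
    e = solve_linear(rows, rhs) + [1.0]
    return [e[0:3], e[3:6], e[6:9]]

def epipolar_residual(E, p1, p2):
    """Return p2^T E p1 for image points given as (x, y)."""
    v1 = (p1[0], p1[1], 1.0)
    v2 = (p2[0], p2[1], 1.0)
    return sum(v2[i] * E[i][j] * v1[j] for i in range(3) for j in range(3))

def project(point, angle, t):
    """Rotate a 3D point about the y axis, translate by t, and project."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    xr, yr, zr = c * x + s * z + t[0], y + t[1], -s * x + c * z + t[2]
    return (xr / zr, yr / zr)

# Synthetic scene: nine 3D points seen before and after a known motion.
points = [(-1.5, 0.7, 5.0), (1.2, -0.9, 6.5), (0.3, 1.4, 4.2), (-0.8, -1.3, 7.1),
          (1.9, 0.2, 5.6), (-0.4, 0.9, 8.3), (0.6, -0.5, 4.8), (-1.1, 1.6, 6.0),
          (0.9, 1.1, 7.4)]
angle, t = 0.3, (1.0, 0.5, 0.2)
pts1 = [(x / z, y / z) for x, y, z in points]
pts2 = [project(p, angle, t) for p in points]

E = essential_from_eight_points(pts1[:8], pts2[:8])
held_out = epipolar_residual(E, pts1[8], pts2[8])  # ninth point, not used in the fit
```

The ninth point is held out of the fit, so a near-zero residual for it indicates that the recovered matrix encodes the motion rather than merely interpolating the eight inputs.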



The structurally constrained protein evolution model accounts for sequence patterns of the LβH superfamily  

Directory of Open Access Journals (Sweden)

Abstract Background Structure conservation constrains evolutionary sequence divergence, resulting in observable sequence patterns. Most current models of protein evolution do not take structure into account explicitly and are therefore unsuitable for investigating the effects of structure conservation on sequence divergence. To this end, we recently developed the Structurally Constrained Protein Evolution (SCPE) model. The model starts with the coding sequence of a protein with known three-dimensional structure. At each evolutionary time-step of an SCPE simulation, a trial sequence is generated by introducing a random point mutation in the current coding DNA sequence. Then, a "score" for the trial sequence is calculated and the mutation is accepted only if its score is under a given cutoff. The SCPE score measures the distance between the trial sequence and a given reference sequence, given the structure. In our first brief report we used a "global score", in which the same reference sequence, the ancestral one, was used at each evolutionary step. Here, we introduce a new scoring function, the "local score", in which the sequence accepted at the previous evolutionary time-step is used as the reference. We assess the model on the UDP-N-acetylglucosamine acyltransferase (LPXA) family, as in our previous report, and we extend this study to all other members of the left-handed parallel beta helix fold (LβH) superfamily whose structure has been determined. Results We studied site-dependent entropies, amino acid probability distributions, and substitution matrices predicted by SCPE and compared them with experimental data for several members of the LβH superfamily. We also evaluated structure conservation during simulations. Overall, SCPE outperforms JTT in the description of sequence patterns observed in structurally constrained sites. Maximum Likelihood calculations show that the local-score and global-score SCPE substitution matrices obtained for LPXA outperform the JTT model for the LPXA family and for the structurally constrained sites of class i of other members within the LβH superfamily. Conclusion We extended the SCPE model by introducing a new scoring function, the local score. We performed a thorough assessment of the SCPE model on the LPXA family and extended it to all other members of known structure of the LβH superfamily.

Echave, Julián
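
The accept/reject loop described in the abstract above, with its "global" and "local" reference choices, can be sketched as follows. This is only a toy under stated assumptions: the real SCPE score depends on the protein structure, whereas the `hamming` function here is a hypothetical stand-in that simply counts differing nucleotides, and all names and parameters are illustrative.

```python
import random

def point_mutation(seq, rng):
    """Return seq with one randomly chosen position changed to a different base."""
    pos = rng.randrange(len(seq))
    new_base = rng.choice([b for b in "ACGT" if b != seq[pos]])
    return seq[:pos] + new_base + seq[pos + 1:]

def hamming(trial, reference):
    """Toy stand-in for the SCPE score: number of differing positions."""
    return sum(1 for a, b in zip(trial, reference) if a != b)

def simulate(ancestor, steps, cutoff, score=hamming, mode="global", seed=0):
    """Run the accept/reject loop and return the list of accepted sequences.

    mode="global": the ancestral sequence is the reference at every step.
    mode="local":  the previously accepted sequence is the reference.
    """
    rng = random.Random(seed)
    current = ancestor
    history = [ancestor]
    for _ in range(steps):
        trial = point_mutation(current, rng)
        reference = ancestor if mode == "global" else current
        # Accept the mutation only if its score is under the cutoff.
        if score(trial, reference) < cutoff:
            current = trial
            history.append(current)
    return history

ancestor = "ACGTACGTACGT"
global_run = simulate(ancestor, steps=50, cutoff=3, mode="global", seed=1)
local_run = simulate(ancestor, steps=20, cutoff=2, mode="local", seed=1)
```

With the toy score, the global mode keeps every accepted sequence within a fixed distance of the ancestor, while the local mode lets the sequence drift step by step; a structure-aware score would be needed to reproduce the behaviour reported in the abstract.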



Molecular characterization of pouched amphistome parasites (Trematoda: Gastrothylacidae) using ribosomal ITS2 sequence and secondary structures. (United States)

Members of the family Gastrothylacidae (Trematoda: Digenea: Paramphistomata) are parasitic in ruminants throughout Africa and Asia. In north-east India, five species of pouched amphistomes, namely Fischoederius cobboldi, F. elongatus, Gastrothylax crumenifer, Carmyerius spatiosus and Velasquezotrema tripurensis, belonging to this family have been reported so far. In the present study, the molecular phylogeny of these five gastrothylacid species is derived using the second internal transcribed spacer (ITS2) sequence and secondary structure analyses. ITS2 sequence analysis was carried out to see the occurrence of interspecific variations among the species. Phylogenetic analyses were performed for primary sequence data alone as well as the combined sequence-structure information using neighbour-joining and Bayesian approaches. The sequence analysis revealed that there exist considerable interspecific variations among the various gastrothylacid fluke species. In contrast, the inferred secondary structures for the five species using minimum free energy modelling showed structural identities, in conformity with the core four-helix domain structure that has been recently identified as common to almost all eukaryotic taxa. The phylogenetic tree reconstructed using combined sequence-structure data showed a better resolution, as compared to the one using sequence data alone, with the gastrothylacid species forming a monophyletic group that is well separated from members of the other family, Paramphistomidae, of the amphistomid flukes group. The study provides the molecular characterization based on primary sequence data of the rDNA ITS2 region of the gastrothylacid amphistome flukes. Results also demonstrate the phylogenetic utility of the ITS2 sequence-secondary structure data for inferences at higher taxonomic levels. PMID:21473796

Ghatani, S; Shylla, J A; Tandon, V; Chatterjee, A; Roy, B



Progressive structure-based alignment of homologous proteins: Adopting sequence comparison strategies.  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Comparison of multiple protein structures has a broad range of applications in the analysis of protein structure, function and evolution. Multiple structure alignment tools (MSTAs) are necessary to obtain a simultaneous comparison of a family of related folds. In this study, we have developed a method for multiple structure comparison largely based on sequence alignment techniques. A widely used Structural Alphabet named Protein Blocks (PBs) was used to transform the information on 3D protein...



Genome sequence, comparative analysis and haplotype structure of the domestic dog.  

Full Text Available Genome sequence, comparative analysis and haplotype structure of the domestic dog. Lindblad-Toh ???SNP _resource_portal/LATEST/workflow_images/


A 1-D dusty plasma photonic crystal  

Energy Technology Data Exchange (ETDEWEB)

It is demonstrated numerically that a 1-D plasma crystal made of micron-size cylindrical dust particles can, in principle, work as a photonic crystal for terahertz waves. The dust rods are parallel to each other and arranged in a linear string forming a periodic structure of dielectric-plasma regions. The dispersion equation is found by solving the wave equation with the boundary conditions at the dust-plasma interface and taking into account the dielectric permittivity of the dust material and plasma. The wavelength of the electromagnetic waves is in the range of a few hundred microns, close to the interparticle separation distance. The band gaps of the 1-D plasma crystal are numerically found for different types of dust materials, separation distances between the dust rods and rod diameters. The distance between levitated dust rods forming a string in rf plasma is shown experimentally to vary over a relatively wide range, from 650 µm to about 1350 µm, depending on the rf power fed into the discharge.

Mitu, M. L.; Ticoş, C. M. [National Institute for Laser, Plasma and Radiation Physics, 077125 Bucharest (Romania); Toader, D.; Banu, N.; Scurtu, A. [National Institute for Laser, Plasma and Radiation Physics, 077125 Bucharest (Romania); Department of Physics, University of Bucharest, 077125 Bucharest (Romania)



Structure of fault zones in cohesive volcanic sequences  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Normal fault systems are basic features in commercially important geological structures such as sedimentary basins. While the interpretation of seismic data sets reveals the structure of the strata and its offset by larger faults, the properties of the fault planes themselves remain undetermined. Information on, for example, the permeability of laterally confined fault systems is derived from outcrops or theoretical models. The scope of the faulting research focuses on softer unconsolidated materials as...

Holland, Marc



Implicit Structured Sequence Learning: An FMRI Study of the Structural Mere-Exposure Effect  

Directory of Open Access Journals (Sweden)

In this event-related FMRI study we investigated the effect of five days of implicit acquisition on preference classification by means of an artificial grammar learning (AGL) paradigm based on the structural mere-exposure effect and preference classification using a simple right-linear unification grammar. This allowed us to investigate implicit AGL in a proper learning design by including baseline measurements prior to grammar exposure. After 5 days of implicit acquisition, the FMRI results showed activations in a network of brain regions including the inferior frontal (centered on BA 44/45) and the medial prefrontal regions (centered on BA 8/32). Importantly, and central to this study, the inclusion of a naive preference FMRI baseline measurement allowed us to conclude that these FMRI findings were the intrinsic outcomes of the learning process itself and not a reflection of a preexisting functionality recruited during classification, independent of acquisition. Support for the implicit nature of the knowledge utilized during preference classification on day 5 comes from the fact that the basal ganglia, associated with implicit procedural learning, were activated during classification, while the medial temporal lobe system, associated with explicit declarative memory, was consistently deactivated. Thus, preference classification in combination with structural mere-exposure can be used to investigate structural sequence processing (syntax) in unsupervised AGL paradigms with proper learning designs.

Petersson, Karl Magnus



Implicit structured sequence learning: an fMRI study of the structural mere-exposure effect (United States)

In this event-related fMRI study we investigated the effect of 5 days of implicit acquisition on preference classification by means of an artificial grammar learning (AGL) paradigm based on the structural mere-exposure effect and preference classification using a simple right-linear unification grammar. This allowed us to investigate implicit AGL in a proper learning design by including baseline measurements prior to grammar exposure. After 5 days of implicit acquisition, the fMRI results showed activations in a network of brain regions including the inferior frontal (centered on BA 44/45) and the medial prefrontal regions (centered on BA 8/32). Importantly, and central to this study, the inclusion of a naive preference fMRI baseline measurement allowed us to conclude that these fMRI findings were the intrinsic outcomes of the learning process itself and not a reflection of a preexisting functionality recruited during classification, independent of acquisition. Support for the implicit nature of the knowledge utilized during preference classification on day 5 comes from the fact that the basal ganglia, associated with implicit procedural learning, were activated during classification, while the medial temporal lobe system, associated with explicit declarative memory, was consistently deactivated. Thus, preference classification in combination with structural mere-exposure can be used to investigate structural sequence processing (syntax) in unsupervised AGL paradigms with proper learning designs.

Folia, Vasiliki; Petersson, Karl Magnus



Structure and sequence variation of mink interleukin-6 gene  

International Nuclear Information System (INIS)

Aleutian disease (AD) is the number one disease threat to the survival and future of the mink industry in Nova Scotia and the world. Several ranchers have gone out of business in recent years in Nova Scotia as a direct result of AD. Currently, the control measure for AD consists of testing and slaughtering of infected mink. This practice has not been effective in controlling the disease. Finding a means of controlling AD is the number one priority for the mink industry in Nova Scotia. An effective control measure will have a long-term positive effect on the rural economy by improving production potential of mink and reducing production cost. It has been shown that antiviral antibodies produced by activated immune system cells sometimes combine with interleukin-6 (IL-6) to form immune complexes that cause AD in mink. There is evidence of a significant relationship between nucleotide variations in the IL-6 gene and the onset of certain human diseases with symptoms similar to those of AD. Furthermore, pathological symptoms of AD resemble those of other conditions, such as systemic lupus erythematosus (SLE) and Castleman disease in humans, where overproduction of IL-6 coincides with the severity of the disease. These findings suggest that IL-6 could be a candidate gene and warrant investigation vis-a-vis differences among mink genotypes in resistance or tolerance to ADV infection. The IL-6 gene in mink was sequenced, and the identified polymorphisms were used to evaluate the potential role of this gene in the immune system response to infections. The 4678 bp promoter region, five exons and four introns of the interleukin-6 (IL-6) gene were bi-directionally sequenced in four unrelated mink from each of the wild, black, brown, pastel and sapphire colour types (GenBank accession number EF620932). The 344 bp promoter region of the gene contained several transcription binding sites. One exonic and seven intronic single nucleotide polymorphisms (SNP) were detected by sequencing of the 20 mink and genotyping of an additional 82 animals from the five colour types. Only two intronic SNP were segregating at high frequencies, indicating that the level of polymorphisms in the mink IL-6 gene was low. A bi-allelic tetranucleotide repeat was detected in the promoter region, with the frequency of 0.0, 0.17, 0.25, 0.25 and 0.40 in the wild, black, pastel, brown and sapphire mink, respectively, suggesting that this locus may influence immune response to infection. A polymorphic (CA)16 with 10 alleles was also detected in intron 2. (author)



Sequence and structure space model of protein divergence driven by point mutations. (United States)

New folds of protein structures emerge in evolution as a result of insertions, deletions or shuffling of fragments of underlying gene sequences, and from aggregated effects of point mutations. The result of these evolutionary processes is a rich and complex universe of protein sequences and structures, with characteristic features such as heavy-tailed distribution of fold occurrences, and a distinct shape of relationship between sequence identity and structure similarity. Better understanding of how the protein universe evolved to its present form can be achieved by creating models of protein structure evolution. Here we introduce a stochastic model of evolution that involves residue substitutions as the sole source of structure innovation, and is nonetheless able to reproduce the diversity of the protein domains repertoire, its cluster structure with heavy-tailed distribution of family sizes, and presence of the twilight zone populated with remote homologs. PMID:23541620

Arodź, Tomasz; Płonka, Przemysław M



Multi-scale sequence correlations increase proteome structural disorder and promiscuity  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Numerous experiments demonstrate a high level of promiscuity and structural disorder in organismal proteomes. Here we ask the question what makes a protein promiscuous, i.e., prone to non-specific interactions, and structurally disordered. We predict that multi-scale correlations of amino acid positions within protein sequences statistically enhance the propensity for promiscuous intra- and inter-protein binding. We show that sequence correlations between amino acids of the ...

Afek, Ariel; Shakhnovich, Eugene I.; Lukatsky, David B.



Sequence Diversity, Predicted Two-Dimensional Protein Structure, and Epitope Mapping of Neisserial Opa Proteins  

Digital Repository Infrastructure Vision for European Research (DRIVER)

The sequence diversity of 45 Opa outer membrane proteins from Neisseria meningitidis, Neisseria gonorrhoeae, Neisseria sicca, and Neisseria flava indicates that horizontal genetic exchange of opa alleles has been rare between these species. A two-dimensional structural model containing four surface-exposed loops was constructed based on rules derived from porin crystal structure and on conservation of sequence homology within transmembrane β-strands. The minimal continuous epitopes recognize...

Malorny, Burkhard; Morelli, Giovanna; Kusecek, Barica; Kolberg, Jan; Achtman, Mark



Optimal sequence selection in proteins of known structure by simulated evolution.  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Rational design of protein structure requires the identification of optimal sequences to carry out a particular function within a given backbone structure. A general solution to this problem requires that a potential function describing the energy of the system as a function of its atomic coordinates be minimized simultaneously over all available sequences and their three-dimensional atomic configurations. Here we present a method that explicitly minimizes a semiempirical potential function s...

Hellinga, H. W.; Richards, F. M.



Sequence- and structure-based prediction of eukaryotic protein phosphorylation sites  

DEFF Research Database (Denmark)

Protein phosphorylation at serine, threonine or tyrosine residues affects a multitude of cellular signaling processes. How is specificity in substrate recognition and phosphorylation by protein kinases achieved? Here, we present an artificial neural network method that predicts phosphorylation sites in independent sequences with a sensitivity in the range from 69% to 96%. As an example, we predict novel phosphorylation sites in the p300/CBP protein that may regulate interaction with transcription factors and histone acetyltransferase activity. In addition, serine and threonine residues in p300/CBP that can be modified by O-linked glycosylation with N-acetylglucosamine are identified. Glycosylation may prevent phosphorylation at these sites, a mechanism named yin-yang regulation. The prediction server is available on the Internet at via e-mail to NetPhos@cbs. Copyright 1999 Academic Press.

Blom, Nikolaj; Gammeltoft, Steen



The ubiquitin domain superfold: structure-based sequence alignments and characterization of binding epitopes. (United States)

Ubiquitin-like domains are present, apart from ubiquitin-like proteins themselves, in many multidomain proteins involved in different signal transduction processes. The sequence conservation for all ubiquitin superfold family members is rather poor, even between subfamily members, leading to mistakes in sequence alignments using conventional sequence alignment methods. However, a correct alignment is essential, especially for in silico methods that predict binding partners on the basis of sequence and structure. In this study, using 3D-structural information we have generated and manually corrected sequence alignments for proteins of the five ubiquitin superfold subfamilies. On the basis of this alignment, we suggest domains for which structural information will be useful to allow homology modelling. In addition, we have analysed the energetic and electrostatic properties of ubiquitin-like domains in complex with various functional binding proteins using the protein design algorithm FoldX. On the basis of an in silico alanine-scanning mutagenesis, we provide a detailed binding epitope mapping of the hotspots of the ubiquitin domain fold, involved in the interaction with different domains and proteins. Finally, we provide a consensus fingerprint sequence that identifies all sequences described to belong to the ubiquitin superfold family. It is possible that the method that we describe may be applied to other domain families sharing a similar fold but having low levels of sequence homology. PMID:16310215

Kiel, Christina; Serrano, Luis



Structure- and sequence-based function prediction for non-homologous proteins. (United States)

The structural genomics projects have been accumulating an increasing number of protein structures, many of which remain functionally unknown. In parallel with experimental methods, computational methods are expected to make a significant contribution to the functional elucidation of such proteins. However, conventional computational methods that transfer functions from homologous proteins do not help much for these uncharacterized protein structures because they do not have apparent structural or sequence similarity with the known proteins. Here, we briefly review two avenues of computational function prediction methods, i.e. structure-based methods and sequence-based methods. The focus is on our recent developments of local structure-based and sequence-based methods, which can effectively extract function information from distantly related proteins. Two structure-based methods, Pocket-Surfer and Patch-Surfer, identify similar known ligand binding sites for pocket regions in a query protein without using global protein fold similarity information. Two sequence-based methods, protein function prediction and extended similarity group, make use of weakly similar sequences that are conventionally discarded in homology-based function annotation. We hope that computational methods, combined with experimental methods, will make a leading contribution to the functional elucidation of the protein structures. PMID:22270458

Sael, Lee; Chitale, Meghana; Kihara, Daisuke



Detection of structural DNA variation from next generation sequencing data: a review of informatic approaches. (United States)

Next generation sequencing (NGS), or massively parallel sequencing, refers to a collective group of methods in which numerous sequencing reactions take place simultaneously, resulting in enormous amounts of sequencing data for a small fraction of the cost of Sanger sequencing. Typically short (50-250 bp), NGS reads are first mapped to a reference genome, and then variants are called from the mapped data. While most NGS applications focus on the detection of single nucleotide variants (SNVs) or small insertions/deletions (indels), structural variation, including translocations, larger indels, and copy number variation (CNV), can be identified from the same data. Structural variation detection can be performed from whole genome NGS data or "targeted" data including exomes or gene panels. However, while targeted sequencing greatly increases sequencing coverage or depth of particular genes, it may introduce biases in the data that require specialized informatic analyses. In the past several years, there have been considerable advances in methods used to detect structural variation, and a full range of variants from SNVs to balanced translocations to CNV can now be detected with reasonable sensitivity from either whole genome or targeted NGS data. Such methods are being rapidly applied to clinical testing where they can supplement or in some cases replace conventional fluorescence in situ hybridization or array-based testing. Here we review some of the informatics approaches used to detect structural variation from NGS data. PMID:24405614

Abel, Haley J; Duncavage, Eric J



Four basic symmetry types in the universal 7-cluster structure of 143 complete bacterial genomic sequences  

CERN Multimedia

Coding information is the main source of heterogeneity (non-randomness) in the sequences of bacterial genomes. This information can be naturally modeled by analysing cluster structures in the "in-phase" triplet distributions of relatively short genomic fragments (200-400 bp). We found a universal 7-cluster structure in bacterial genomic sequences and explained its properties. We show that codon usage of bacterial genomes is a multi-linear function of their genomic G+C-content with high accuracy. Based on the analysis of 143 completely sequenced bacterial genomes available in Genbank in August 2004, we show that there are four "pure" types of the 7-cluster structure observed. Animated 3D scatter plots of the clusters for all 143 genomes are collected in a database that is made available on our web-site. The finding can be readily introduced into any software for gene prediction, sequence alignment or bacterial genomes classification.

Gorban, A N; Zinovyev, A Yu
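
The "in-phase" triplet distributions mentioned in the abstract above can be sketched as below: each short fragment is turned into a 64-dimensional vector of non-overlapping triplet frequencies, read from a chosen offset. The window width of 300 bp, the step, and the function names are assumptions chosen for illustration (the abstract only gives a 200-400 bp range), and the clustering of these vectors is not shown.

```python
from collections import Counter
from itertools import product

# All 64 triplets in a fixed order, so that every vector has the same layout.
TRIPLETS = ["".join(t) for t in product("ACGT", repeat=3)]

def triplet_frequencies(fragment, phase=0):
    """Return the 64 frequencies of non-overlapping triplets read from offset phase."""
    codons = [fragment[i:i + 3] for i in range(phase, len(fragment) - 2, 3)]
    if not codons:
        raise ValueError("fragment is too short for the requested phase")
    counts = Counter(codons)
    return [counts[t] / len(codons) for t in TRIPLETS]

def sliding_fragments(genome, width=300, step=150):
    """Yield fixed-width fragments of the genome (width and step are assumptions)."""
    for start in range(0, len(genome) - width + 1, step):
        yield genome[start:start + width]

# Toy example: the same fragment read in two different phases.
fragment = "ATGAAATGA"
phase0 = triplet_frequencies(fragment, 0)  # reads ATG, AAA, TGA
phase1 = triplet_frequencies(fragment, 1)  # reads TGA, AAT
vectors = [triplet_frequencies(f) for f in sliding_fragments("ACG" * 200)]
```

Fragments that overlap a gene in the correct reading frame, in a shifted frame, or not at all would be expected to yield different vectors, which is what makes a cluster analysis of such vectors informative.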



SPARCS: a web server to analyze (un)structured regions in coding RNA sequences. (United States)

More than a simple carrier of the genetic information, messenger RNA (mRNA) coding regions can also harbor functional elements that evolved to control different post-transcriptional processes, such as mRNA splicing, localization and translation. Functional elements in RNA molecules are often encoded by secondary structure elements. In this article, we introduce Structural Profile Assignment of RNA Coding Sequences (SPARCS), an efficient method to analyze the (secondary) structure profile of protein-coding regions in mRNAs. First, we develop a novel algorithm that enables us to sample uniformly the sequence landscape preserving the dinucleotide frequency and the encoded amino acid sequence of the input mRNA. Then, we use this algorithm to generate a set of artificial sequences that is used to estimate the Z-score of classical structural metrics such as the sum of base pairing probabilities and the base pairing entropy. Finally, we use these metrics to predict structured and unstructured regions in the input mRNA sequence. We applied our methods to study the structural profile of the ASH1 genes and recovered key structural elements. A web server implementing this discovery pipeline is available at together with the source code of the sampling algorithm. PMID:23748952

Zhang, Yang; Ponty, Yann; Blanchette, Mathieu; Lécuyer, Eric; Waldispühl, Jérôme
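
The general idea of scoring a region against a background of artificial sequences can be sketched as follows. This is not the SPARCS sampling algorithm: the shuffle below only permutes synonymous codons, so it preserves the encoded amino acid sequence and codon usage but not dinucleotide frequencies, and the windowed GC content is a hypothetical placeholder for the structural metrics named in the abstract. Sequences are assumed to be upper-case DNA with a length divisible by 3.

```python
import random
import statistics
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def split_codons(seq):
    """Split a coding sequence into codons."""
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return [seq[i:i + 3] for i in range(0, len(seq), 3)]

def translate(seq):
    """Translate a coding DNA sequence into one-letter amino acids."""
    return "".join(CODON_TABLE[c] for c in split_codons(seq))

def shuffle_synonymous(seq, rng):
    """Permute codons among positions that encode the same amino acid."""
    codons = split_codons(seq)
    by_aa = {}
    for idx, codon in enumerate(codons):
        by_aa.setdefault(CODON_TABLE[codon], []).append(idx)
    shuffled = codons[:]
    for positions in by_aa.values():
        pool = [codons[i] for i in positions]
        rng.shuffle(pool)
        for i, codon in zip(positions, pool):
            shuffled[i] = codon
    return "".join(shuffled)

def window_gc(seq, start, width):
    """Placeholder metric: GC fraction of one window."""
    window = seq[start:start + width]
    return sum(1 for b in window if b in "GC") / len(window)

def z_score(observed, background):
    """Standard score of observed relative to the background sample."""
    sd = statistics.stdev(background)
    if sd == 0:
        return 0.0
    return (observed - statistics.mean(background)) / sd

rng = random.Random(0)
seq = "ATGGCTGCCGCAGCGTTATTGCTTCTCTAA"
samples = [shuffle_synonymous(seq, rng) for _ in range(100)]
background = [window_gc(s, 3, 12) for s in samples]
z = z_score(window_gc(seq, 3, 12), background)
```

Swapping `window_gc` for a metric computed by an RNA folding package, and the shuffle for a sampler that also preserves dinucleotide frequencies, would bring the sketch closer to what the abstract describes.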



Structural and sequence characteristics of long alpha helices in globular proteins.  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Elucidation of the detailed structural features and sequence requirements for alpha helices of various lengths could be very important in understanding secondary structure formation in proteins and, hence, in the protein folding mechanism. An algorithm to characterize the geometry of an alpha helix from its C(alpha) coordinates has been developed and used to analyze the structures of long alpha helices (number of residues ≥ 25) found in globular proteins, the crystal structure coordinate...

Kumar, S.; Bansal, M.



Investigation of the protein osteocalcin of Camelops hesternus: Sequence, structure and phylogenetic implications (United States)

Ancient DNA sequences offer an extraordinary opportunity to unravel the evolutionary history of ancient organisms. Protein sequences offer another reservoir of genetic information that has recently become tractable through the application of mass spectrometric techniques. The extent to which ancient protein sequences resolve phylogenetic relationships, however, has not been explored. We determined the osteocalcin amino acid sequence from the bone of an extinct Camelid (21 ka, Camelops hesternus) excavated from Isleta Cave, New Mexico and three bones of extant camelids: bactrian camel (Camelus bactrianus); dromedary camel (Camelus dromedarius) and guanaco (Llama guanacoe) for a diagenetic and phylogenetic assessment. There was no difference in sequence among the four taxa. Structural attributes observed in both modern and ancient osteocalcin include a post-translation modification, Hyp 9, deamidation of Gln 35 and Gln 39, and oxidation of Met 36. Carbamylation of the N-terminus in ancient osteocalcin may result in blockage and explain previous difficulties in sequencing ancient proteins via Edman degradation. A phylogenetic analysis using osteocalcin sequences of 25 vertebrate taxa was conducted to explore osteocalcin protein evolution and the utility of osteocalcin sequences for delineating phylogenetic relationships. The maximum likelihood tree closely reflected generally recognized taxonomic relationships. For example, maximum likelihood analysis recovered rodents, birds and, within hominins, the Homo-Pan-Gorilla trichotomy. Within Artiodactyla, character state analysis showed that a substitution of Pro 4 for His 4 defines the Capra-Ovis clade. Homoplasy in our analysis indicated that osteocalcin evolution is not a perfect indicator of species evolution. Limited sequence availability prevented assigning functional significance to sequence changes.
Our preliminary analysis of osteocalcin evolution represents an initial step towards a complete character analysis aimed at determining the evolutionary history of this functionally significant protein. We emphasize that ancient protein sequencing and phylogenetic analyses using amino acid sequences must pay close attention to post-translational modifications, amino acid substitutions due to diagenetic alteration and the impacts of isobaric amino acids on mass shifts and sequence alignments.

Humpula, James F.; Ostrom, Peggy H.; Gandhi, Hasand; Strahler, John R.; Walker, Angela K.; Stafford, Thomas W.; Smith, James J.; Voorhies, Michael R.; George Corner, R.; Andrews, Phillip C.



Nucleotide sequence and secondary structure of citrus exocortis and chrysanthemum stunt viroid. (United States)

The complete nucleotide sequences of citrus exocortis viroid (CEV, propagated in Gynura) and chrysanthemum stunt viroid (CSV, propagated in Cineraria) have been established, using labelling in vitro and direct RNA sequencing methods and a new screening procedure for the rapid selection of suitable RNA fragments from limited digests. The covalently closed circular single-stranded viroid RNAs consist of 371 (CEV) and 354 (CSV) nucleotides, respectively. As previously shown for potato spindle tuber viroid (PSTV, 359 nucleotides), CEV and CSV also contain a long polypurine sequence. Maximal base-pairing of the established CEV and CSV sequences results in an extended rod-like secondary structure similar to that previously established for PSTV and as predicted from detailed physicochemical studies of all these viroids. Although the three viroid species sequenced to date differ in size and nucleotide sequence, there is 60-73% homology between them. Like PSTV, CEV and CSV also contain conserved complementary sequences which are separated from each other in the native secondary structure. We postulate that the resulting 'secondary' hairpins, being formed and observed in vitro during the complex process of thermal denaturation of viroid RNA, must have a vital, although yet unknown, function in vivo. The possible origin and function of viroids are discussed on the basis of the characteristic structural features and of a considerable homology with U1a RNA found for a region highly conserved in the three viroids. PMID:7060550

Gross, H J; Krupp, G; Domdey, H; Raba, M; Jank, P; Lossow, C; Alberty, H; Ramm, K; Sänger, H L



Assessment and refinement of eukaryotic gene structure prediction with gene-structure-aware multiple protein sequence alignment (United States)

Background Accurate computational identification of eukaryotic gene organization is a long-standing problem. Despite the fundamental importance of precise annotation of genes encoded in newly sequenced genomes, the accuracy of predicted gene structures has not been critically evaluated, mostly due to the scarcity of proper assessment methods. Results We present a gene-structure-aware multiple sequence alignment method for gene prediction using amino acid sequences translated from homologous genes from many genomes. The approach provides rich information concerning the reliability of each predicted gene structure. We have also devised an iterative method that attempts to improve the structures of suspiciously predicted genes based on a spliced alignment algorithm using consensus sequences or reliable homologs as templates. Application of our methods to cytochrome P450 and ribosomal proteins from 47 plant genomes indicated that 50-60% of the annotated gene structures are likely to contain some defects. Whereas more than half of the defect-containing genes may be intrinsically broken, i.e. they are pseudogenes or gene fragments, located in unfinished sequencing areas, or corresponding to non-productive isoforms, the defects found in a majority of the remaining gene candidates can be remedied by our iterative refinement method. Conclusions Refinement of eukaryotic gene structures mediated by gene-structure-aware multiple protein sequence alignment is a useful strategy to dramatically improve the overall prediction quality of a set of homologous genes. Our method will be applicable to various families of protein-coding genes if their domain structures are evolutionarily stable. It is also feasible to apply our method to gene families from all kingdoms of life, not just plants.



Sequence Analysis of the Protein Structure Homology Modeling of Growth Hormone Gene from Salmo trutta caspius  

Directory of Open Access Journals (Sweden)

The growth hormone protein of Salmo trutta caspius was investigated and characterized. The growth hormone gene of Salmo trutta caspius has six exons in its full length, with a molecular weight (kDa) of 64.98 for ssDNA and 129.6 for dsDNA. The protein has 210 amino acid residues. The assembled full-length DNA contains the open reading frame of the growth hormone gene, which contains 15 sequences in the full length. The average GC content is 47% and the AT content is 53%. Multiple protein alignment showed that this peptide is 100% identical to the corresponding homologous growth hormone protein sequences of Salmo salar (Accession number: AAA49558.1) and Rainbow trout (Salmo trutta) (Accession number: AAA49555.1). The protein sequence has been deposited in GenBank under Accession number AEK70940. We also analyzed the secondary and tertiary structures of the sequences reported in GenBank. The results show homology in secondary structure among the three sequences: Salmo trutta caspius, Salmo salar and Rainbow trout. Regarding tertiary structure, Salmo trutta caspius and Salmo salar are of the same type, but Rainbow trout differs from both. However, three parallel α helices were observed in the sequences, and in secondary structure there was almost the same percentage of β sheet.

Abolhasan Rezaei



Implications of some flow sedimentary structures within Miocene evaporitic sequence, Red Sea, Egypt  

Energy Technology Data Exchange (ETDEWEB)

Some typical flow sedimentary structures were clearly detected within the middle Miocene alternating gypsiferous and anhydritic clays of the evaporitic sequence in Ras Gemsa and Um El-Huweitat localities. Sedimentologic analyses of the different structural forms revealed that they were originally formed from unlithified sediments and due to submarine flowage. These structures were formed as a result of stress-load, compression, and rotation. Such a genetic approach is helpful in deducing the environmental conditions within which these sediments accumulated. Degrees of flowage and affected stresses on similar lithologic associations could be considered strong evidence for correlation within the extended Miocene evaporitic sequence along the Red Sea coast.

Wali, A.



Structural organization of the human glutathione reductase gene: determination of correct cDNA sequence and identification of a mitochondrial leader sequence. (United States)

The primary structure of the human glutathione reductase gene (GSR) was determined by genomic cloning. The human GSR gene spans 50 kb, consists of 13 exons, and was found to be highly similar to the mouse GSR gene. The coding sequence of human GSR resides on all 13 exons. An N-terminal arginine-rich mitochondrial leader sequence was present, with high homology to the murine leader sequence, between two in-frame start codons in the first exon. The 5' and 3' intron/exon splice junctions, with one exception, followed the general consensus sequences for intron splice donor and acceptor sites. PMID:10708558

Kelner, M J; Montoya, M A



Thousands of corresponding human and mouse genomic regions unalignable in primary sequence contain common RNA structure  

DEFF Research Database (Denmark)

Human and mouse genome sequences contain roughly 100,000 regions that are unalignable in primary sequence and neighbor corresponding alignable regions between both organisms. These pairs are generally assumed to be nonconserved, although the level of structural conservation between these has never been investigated. Owing to the limitations in computational methods, comparative genomics has been lacking the ability to compare such nonconserved sequence regions for conserved structural RNA elements. We have investigated the presence of structural RNA elements by conducting a local structural alignment, using FOLDALIGN, on a subset of these 100,000 corresponding regions and estimate that 1800 contain common RNA structures. Comparing our results with the recent mapping of transcribed fragments (transfrags) in human, we find that high-scoring candidates are twice as likely to be found in regions overlapped by transfrags than regions that are not overlapped by transfrags. To verify the coexpression between predicted candidates in human and mouse, we conducted expression studies by RT-PCR and Northern blotting on mouse candidates, which overlap with transfrags on human chromosome 20. RT-PCR results confirmed expression of 32 out of 36 candidates, whereas Northern blots confirmed four out of 12 candidates. Furthermore, many RT-PCR results indicate differential expression in different tissues. Hence, our findings suggest that there are corresponding regions between human and mouse, which contain expressed non-coding RNA sequences not alignable in primary sequence.

Torarinsson, Elfar; Sawera, Milena



(1 + d/dz)^(-1)  

CERN Multimedia

We investigate the structure of fully non-linear P.D.E.'s in holomorphic functions, with emphasis on the functorial generalisation of so called "irregular" O.D.E.'s. Highlights are an implicit function theorem removing the perturbation conditions of Nash-Moser type, best possible existence results when the singularity of the linearised P.D.E. is at worst bi-dimensional, and various, again optimal, corollaries on existence of centre manifolds and conjugation to normal form of 3-dimensional vector fields.

McQuillan, Michael



Wavelet Analysis of DNA Bending Profiles reveals Structural Constraints on the Evolution of Genomic Sequences  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Analyses of genomic DNA sequences have shown in previous works that base pairs are correlated at large distances with scale-invariant statistical properties. We show in the present study that these correlations between nucleotides (letters) result in fact from long-range correlations (LRC) between sequence-dependent DNA structural elements (words) involved in the packaging of DNA in chromatin. Using the wavelet transform technique, we perform a comparative analysis of the DNA text and of the ...

Audit, Benjamin; Vaillant, Cédric; Arnéodo, Alain; D'Aubenton-Carafa, Yves; Thermes, Claude



Structural and genetic organization of IS232, a new insertion sequence of Bacillus thuringiensis.  

Digital Repository Infrastructure Vision for European Research (DRIVER)

In the Bacillus thuringiensis strains toxic for the lepidopteran larvae, the delta-endotoxin genes cryIA are frequently found within a composite transposonlike structure flanked by two inverted repeat sequences. We report that these elements are true insertion sequences and designate them IS232. IS232 is a 2,184-bp element and is delimited by two imperfect inverted repeats (28 of 37 bp are identical). Two adjacent open reading frames, overlapping for three codons, span almost the entire seque...



Thoroughly sampling sequence space: Large-scale protein design of structural ensembles  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Modeling the inherent flexibility of the protein backbone as part of computational protein design is necessary to capture the behavior of real proteins and is a prerequisite for the accurate exploration of protein sequence space. We present the results of a broad exploration of sequence space, with backbone flexibility, through a novel approach: large-scale protein design to structural ensembles. A distributed computing architecture has allowed us to generate hundreds of thousands of diverse ...

Larson, Stefan M.; England, Jeremy L.; Desjarlais, John R.; Pande, Vijay S.



Evolutionary Analyses of DNA Sequences Subject to Constraints on Secondary Structure  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Evolutionary models appropriate for analyzing nucleotide sequences that are subject to constraints on secondary structure are developed. The models consider the evolution of pairs of nucleotides, and they incorporate the effects of base-pairing constraints on nucleotide substitution rates by introducing a new parameter to extensions of standard models of sequence evolution. To illustrate some potential uses of the models, a likelihood-ratio test is constructed for the null hypothesis that two...

Muse, S. V.



Sequence and structural requirements of a mitochondrial protein import signal defined by saturation cassette mutagenesis.  

Digital Repository Infrastructure Vision for European Research (DRIVER)

The Saccharomyces cerevisiae F1-ATPase beta subunit precursor contains redundant mitochondrial protein import information at its NH2 terminus (D. M. Bedwell, D. J. Klionsky, and S. D. Emr, Mol. Cell. Biol. 7:4038-4047, 1987). To define the critical sequence and structural features contained within this topogenic signal, one of the redundant regions (representing a minimal targeting sequence) was subjected to saturation cassette mutagenesis. Each of 97 different mutant oligonucleotide isolates...

Bedwell, D. M.; Strobel, S. A.; Yun, K.; Jongeward, G. D.; Emr, S. D.



Multilocus Sequence Typing Analysis of Staphylococcus lugdunensis Implies a Clonal Population Structure  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Staphylococcus lugdunensis is recognized as one of the major pathogenic species within the genus Staphylococcus, even though it belongs to the coagulase-negative group. A multilocus sequence typing (MLST) scheme was developed to study the genetic relationships and population structure of 87 S. lugdunensis isolates from various clinical and geographic sources by DNA sequence analysis of seven housekeeping genes (aroE, dat, ddl, gmk, ldh, recA, and yqiL). The number of alleles ranged from four ...
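The core bookkeeping of an MLST scheme can be illustrated with a minimal sketch. The abstract names the seven loci but does not describe how alleles or sequence types were numbered, so the numbering by order of first appearance and the function names `assign_alleles` and `assign_sequence_types` are assumptions made for illustration only.

```python
# Minimal MLST sketch: number alleles per locus, then number allelic profiles.
# The loci come from the abstract; everything else is a hypothetical illustration.
LOCI = ("aroE", "dat", "ddl", "gmk", "ldh", "recA", "yqiL")


def assign_alleles(isolates):
    """Map isolate name -> tuple of allele numbers, one per locus.

    `isolates` maps an isolate name to a dict of locus -> DNA sequence.
    Each distinct sequence at a locus gets the next free allele number.
    """
    allele_ids = {locus: {} for locus in LOCI}
    profiles = {}
    for name, seqs in isolates.items():
        profile = []
        for locus in LOCI:
            table = allele_ids[locus]
            # setdefault returns the existing number or stores the next one
            profile.append(table.setdefault(seqs[locus], len(table) + 1))
        profiles[name] = tuple(profile)
    return profiles


def assign_sequence_types(profiles):
    """Give every distinct allelic profile its own sequence type (ST) number."""
    st_ids = {}
    return {name: st_ids.setdefault(p, len(st_ids) + 1)
            for name, p in profiles.items()}
```

Isolates that share the same allele at all seven loci receive the same sequence type, which is the basis for inferring clonal relationships.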

Chassain, Benoît; Lemée, Ludovic; Didi, Jennifer; Thiberge, Jean-Michel; Brisse, Sylvain; Pons, Jean-Louis; Pestel-Caron, Martine



Characterization of the sequence spectrum of DNA based on the appearance frequency of the nucleotide sequences of the genome: a new method for analysis of genome structure  

Directory of Open Access Journals (Sweden)

The nucleotide (base) sequence of the genome might reflect biological information beyond the coding sequences. The appearance frequencies of successive base sequences (key sequences) were calculated for entire genomes. Based on the appearance frequency of the key sequences of the genome, any DNA sequence on the genome could be expressed as a sequence spectrum with the adjoining base sequences, which could be used to study the corresponding biological phenomena. In this paper, we used the 64 successive three-base sequences (triplets) as the key sequences, and determined and compared the spectra of specific genes to the chromosome, or specific genes to tRNA genes, in Saccharomyces cerevisiae, Schizosaccharomyces pombe and Escherichia coli. Based on these analyses, a gene and its corresponding position on the chromosome showed highly similar spectra with the same fold enlargement (approximately 400-fold) in the S. cerevisiae, S. pombe and E. coli genomes. In addition, the homologous structure of genes that encode proteins was also observed with appropriate tRNA gene(s) in the genome. This analytical method might faithfully reflect the encoded biological information, that is, conservation of the base sequences corresponds to conservation of the translated amino acid sequence in the coding region, and it might be universally applicable to other genomes, even those that consist of multiple chromosomes.
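The idea of a triplet-based sequence spectrum can be sketched as follows. This is not the author's implementation: the use of overlapping triplets, the normalization to relative frequencies, and the function names `triplet_frequencies` and `sequence_spectrum` are all assumptions made for illustration.

```python
from collections import Counter
from itertools import product


def triplet_frequencies(genome):
    """Relative frequency of each of the 64 triplets, counted with overlap.

    Triplets containing characters other than A, C, G, T are ignored.
    """
    genome = genome.upper()
    counts = Counter(genome[i:i + 3] for i in range(len(genome) - 2))
    keys = ["".join(p) for p in product("ACGT", repeat=3)]
    total = sum(counts[k] for k in keys)
    return {k: (counts[k] / total if total else 0.0) for k in keys}


def sequence_spectrum(region, genome_freq):
    """Replace each position of `region` by the genome-wide frequency
    of the triplet that starts at that position."""
    region = region.upper()
    return [genome_freq.get(region[i:i + 3], 0.0)
            for i in range(len(region) - 2)]
```

Under these assumptions, the spectrum of a gene and the spectrum of a larger chromosomal window could then be compared position by position or after rescaling.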

Masatoshi Nakahara



A simple and fast approach to prediction of protein secondary structure from multiply aligned sequences with accuracy above 70%.  

Digital Repository Infrastructure Vision for European Research (DRIVER)

To improve secondary structure predictions in protein sequences, the information residing in multiple sequence alignments of substituted but structurally related proteins is exploited. A database comprised of 70 protein families and a total of 2,500 sequences, some of which were aligned by tertiary structural superpositions, was used to calculate residue exchange weight matrices within alpha-helical, beta-strand, and coil substructures, respectively. Secondary structure predictions were made ...



Identification and structure analysis of endogenous proviral sequences in a Brown Leghorn chicken strain. (United States)

In a Brown Leghorn chicken strain, four endogenous proviral loci have been identified. The DNA mapping data show strong homology between their structures and that of the Rous-associated virus O (RAV-O) genome. Two of them seem similar to ev3 and ev6 loci previously described in White Leghorn chickens; the two others are unknown in White Leghorns. Using DNA amplification methods, envelope genes of these endogenous viral structures have been partially sequenced. The results demonstrate that subgroup-specific sequences of the endogenous loci were largely homologous with those of RAV-O. PMID:1659694

Ronfort, C; Afanassieff, M; Chebloune, Y; Dambrine, G; Nigon, V M; Verdier, G



Sequence determination and modeling of structural motifs for the smallest monomeric aminoacyl-tRNA synthetase.  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Polypeptide chains of 19 previously studied Escherichia coli aminoacyl-tRNA synthetases are as large as 951 amino acids and, depending on the enzyme, have quaternary structures of alpha, alpha 2, alpha 2 beta 2, and alpha 4. These enzymes have been organized into two classes which are defined by sequence motifs that are associated with specific three-dimensional structures. We isolated, cloned, and sequenced the previously uncharacterized gene for E. coli cysteine-tRNA synthetase (EC

Hou, Y. M.; Shiba, K.; Mottes, C.; Schimmel, P.



Bi3+/M2+ oxyphosphate: a continuous series of polycationic species from the 1D single chain to the 2D planes. Part 2: Crystal structure of three original structural types showing a combination of new ribbonlike polycations. (United States)

With the assistance of structural models deduced from the high-resolution electron microscope (HREM) investigation presented in Part 1 of this work, three new structural types were pointed out in Bi2O3-MO-P2O5 ternary systems. Their crystal structures are built on the arrangement of 2D polycationic ribbons formed of edge-sharing O(Bi,M)4 tetrahedra and isolated by PO4 groups. Prior to this study, materials with ribbons up to n = 3 tetrahedra wide have been discovered. The original structures presented here display longer n = 4-6 cases, which suggests a possible continuous series of polycationic entities that range from the single chain (one tetrahedron wide) to the infinite [Bi2O2]2+ Aurivillius layer. The ribbons with n > 3 show strong structural modifications that are able to bring a good ribbon-phosphate cohesion. In addition to these fascinating structural results, this work fully confirms the validity of the decoding established from HREM images of a single crystallite in inhomogeneous mixtures. PMID:16903715

Colmont, Marie; Huvé, Marielle; Mentré, Olivier



A sequence-based survey of the complex structural organization of tumor genomes (United States)

Background The genomes of many epithelial tumors exhibit extensive chromosomal rearrangements. All classes of genome rearrangements can be identified using end sequencing profiling, which relies on paired-end sequencing of cloned tumor genomes. Results In the present study brain, breast, ovary, and prostate tumors, along with three breast cancer cell lines, were surveyed using end sequencing profiling, yielding the largest available collection of sequence-ready tumor genome breakpoints and providing evidence that some rearrangements may be recurrent. Sequencing and fluorescence in situ hybridization confirmed translocations and complex tumor genome structures that include co-amplification and packaging of disparate genomic loci with associated molecular heterogeneity. Comparison of the tumor genomes suggests recurrent rearrangements. Some are likely to be novel structural polymorphisms, whereas others may be bona fide somatic rearrangements. A recurrent fusion transcript in breast tumors and a constitutional fusion transcript resulting from a segmental duplication were identified. Analysis of end sequences for single nucleotide polymorphisms revealed candidate somatic mutations and an elevated rate of novel single nucleotide polymorphisms in an ovarian tumor. Conclusion These results suggest that the genomes of many epithelial tumors may be far more dynamic and complex than was previously appreciated and that genomic fusions, including fusion transcripts and proteins, may be common, possibly yielding tumor-specific biomarkers and therapeutic targets.

Raphael, Benjamin J; Volik, Stanislav; Yu, Peng; Wu, Chunxiao; Huang, Guiqing; Linardopoulou, Elena V; Trask, Barbara J; Waldman, Frederic; Costello, Joseph; Pienta, Kenneth J; Mills, Gordon B; Bajsarowicz, Krystyna; Kobayashi, Yasuko; Sridharan, Shivaranjani; Paris, Pamela L; Tao, Quanzhou; Aerni, Sarah J; Brown, Raymond P; Bashir, Ali; Gray, Joe W; Cheng, Jan-Fang; de Jong, Pieter; Nefedov, Mikhail; Ried, Thomas; Padilla-Nash, Hesed M; Collins, Colin C



Simulation of Organic Solar Cells Using AMPS-1D Program  

Directory of Open Access Journals (Sweden)

The Analysis of Microelectronic and Photonic Structures in one dimension (AMPS-1D) program has been successfully used to study inorganic solar cells. In this work the program has been used to optimize the performance of organic solar cells. The cells considered consist of poly(2-methoxy-5-(3,7- dimethyloctyloxy-1,4-phenylenevinylene [MDMO-PPV

Samah G. Babiker



Relationship of sequence and structure to specificity in the alpha-amylase family of enzymes  

DEFF Research Database (Denmark)

The hydrolases and transferases that constitute the alpha-amylase family are multidomain proteins, but each has a catalytic domain in the form of a (beta/alpha)(8)-barrel, with the active site being at the C-terminal end of the barrel beta-strands. Although the enzymes are believed to share the same catalytic acids and a common mechanism of action, they have been assigned to three separate families - 13, 70 and 77 - in the classification scheme for glycoside hydrolases and transferases that is based on amino acid sequence similarities. Each enzyme has one glutamic acid and two aspartic acid residues necessary for activity, while most enzymes of the family also contain two histidine residues critical for transition state stabilisation. These five residues occur in four short sequences conserved throughout the family, and within such sequences some key amino acid residues are related to enzyme specificity. A table is given showing motifs distinctive for each specificity as extracted from 316 sequences, which should aid in identifying the enzyme from primary structure information. Where appropriate, existing problems with identification of some enzymes of the family are pointed out. For enzymes of known three-dimensional structure, action is discussed in terms of molecular architecture. The sequence-specificity and structure-specificity relationships described may provide useful pointers for rational protein engineering.

MacGregor, E. A.; Janecek, S.



Rapid assessment of contact-dependent secondary structure propensity: relevance to amyloidogenic sequences. (United States)

We have previously demonstrated that calculation of contact-dependent secondary structure propensity (CSSP) is highly sensitive in detecting non-native beta-strand propensities in the core sequences of known amyloidogenic proteins. Here we describe a CSSP method based on an artificial neural network that rapidly and accurately quantifies the influence of tertiary contacts (TCs) on secondary structure propensity in local regions of protein sequences. The present method exhibited 72% accuracy in predicting the alternate secondary structure adopted by chameleon sequences located in highly disparate TC regions. Analysis of 1930 nonhomologous protein domains reveals that the alpha-helix and the beta-strand largely share the same sequence context, and that tertiary context is a major determinant of the native conformation. Conversely, it appears that the propensity of random coils for either the alpha-helix or the beta-strand is largely invariant to tertiary effects. The present CSSP method successfully reproduced the amyloidogenic character observed in local regions of the human islet amyloid polypeptide (hIAPP). Furthermore, CSSP profiles were strongly correlated (r = 0.76) with the observed mutational effects on the aggregation rate of acylphosphatase. Taken together, these results provide compelling evidence in support of the present CSSP approach as a sensitive probe useful for analysis of full-length proteins and for detection of core sequences that may trigger amyloid fibril formation. The combined speed and simplicity of the CSSP method lends itself to proteome-wide analysis of the amyloidogenic nature of common proteins. PMID:15849755

Yoon, Sukjoon; Welsh, William J



Quadrant/octant sequencing and the role of coherent structures in bed load sediment entrainment (United States)

To permit the tracking of turbulent flow structures in an Eulerian frame from single-point measurements, we make use of a generalization of conventional two-dimensional quadrant analysis to three-dimensional octants. We characterize flow structures using the sequences of these octants and show how significance may be attached to particular sequences using statistical null models. We analyze an example experiment and show how a particular dominant flow structure can be identified from the conditional probability of octant sequences. The frequency of this structure corresponds to the dominant peak in the velocity spectra and exerts a high proportion of the total shear stress. We link this structure explicitly to the propensity for sediment entrainment and show that greater insight into sediment entrainment can be obtained by disaggregating those octants that occur within the identified macroturbulence structure from those that do not. Hence, this work goes beyond critiques of Reynolds stress approaches to bed load entrainment that highlight the importance of outward interactions, to identifying and prioritizing the quadrants/octants that define particular flow structures.
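A minimal sketch of octant labelling and of conditional probabilities between consecutive octants is given below. The octant numbering (a 3-bit code built from the signs of the three fluctuating velocity components), the assignment of zero fluctuations to the non-negative side, and the function names are assumptions for illustration; the abstract does not specify the authors' conventions or their null models.

```python
from collections import Counter


def octant(u, v, w):
    """Label 0..7 from the signs of three velocity fluctuations.

    Hypothetical convention: bit 4 for u >= 0, bit 2 for v >= 0, bit 1 for w >= 0.
    """
    return 4 * int(u >= 0) + 2 * int(v >= 0) + int(w >= 0)


def octant_sequence(us, vs, ws):
    """Subtract the time mean of each component, then label every sample."""
    n = len(us)
    mu, mv, mw = sum(us) / n, sum(vs) / n, sum(ws) / n
    return [octant(u - mu, v - mv, w - mw) for u, v, w in zip(us, vs, ws)]


def transition_probabilities(labels):
    """Estimate P(next octant | current octant) from consecutive samples."""
    pair_counts = Counter(zip(labels, labels[1:]))
    from_counts = Counter(labels[:-1])
    return {(a, b): c / from_counts[a] for (a, b), c in pair_counts.items()}
```

Comparing such conditional probabilities against those from a shuffled or otherwise randomized label series is one simple way to judge whether a particular octant sequence is over-represented.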

Keylock, Christopher J.; Lane, Stuart N.; Richards, Keith S.



Reading the three-dimensional structure of a protein from its amino acid sequence  

CERN Document Server

While all the information required for the folding of a protein is contained in its amino acid sequence, one has not yet learnt how to extract this information so as to predict the detailed, biological active, three-dimensional structure of a protein whose sequence is known. This situation is not particularly satisfactory, in keeping with the fact that while linear sequencing of the amino acids specifying a protein is relatively simple to carry out, the determination of the folded-native-conformation can only be done by an elaborate X-ray diffraction analysis performed on crystals of the protein or, if the protein is very small, by nuclear magnetic resonance techniques. Using insight obtained from lattice model simulations of the folding of small proteins (fewer than 100 residues), in particular of the fact that this phenomenon is essentially controlled by conserved contacts among strongly interacting amino acids, which also stabilize local elementary structures formed early in the folding process and leading...

Broglia, R A



The Structure of a Bernoulli Process Variation of the Fibonacci Sequence  

CERN Document Server

We consider the structure of a variation of the Fibonacci sequence which is determined by a Bernoulli process. The associated structure of all Bernoulli variations of the Fibonacci sequence can be represented by a directed binary tree, which we denote X, with vertex labels representing the specific state of the recurrence variation. Since X is a binary tree, we can consider the term of a sequence variation given by a finite traversal of X represented by a binary code t. We then prove that the traversal of X that is the reflection of the digits of t gives exactly the integer term corresponding to t. We consider how to further this result with the statement of an additional conjecture. Finally, we give connections to Fibonacci expansions, the Stern-Brocot tree, and we apply our methods to the Three Hat Problem as seen in "Puzzle Corner" of the "Technology Review" magazine.

Benson, Brian A



Fast Time-Space Tracking of Smoothly Moving Fine Structures in Image Sequences.  

Digital Repository Infrastructure Vision for European Research (DRIVER)

We address the problem of temporal tracking of fine point-like and filamentary structures exhibiting smooth motions in image sequences. By taking these specific restrictions into account, we propose an original tracking method based on the search for integral lines in time-space structure tensor fields. The method appears to be simple and very efficient regarding computation time and tracking precision, allowing a sub-pixel accuracy in both spatial and temporal domains. We suggest ...

Tschumperlé, David; Bentolila, Yohan; Martinot, Jen; Fadili, Jalal



RNA secondary structure prediction from sequence alignments using a network of k-nearest neighbor classifiers  

Digital Repository Infrastructure Vision for European Research (DRIVER)

We present a machine learning method (a hierarchical network of k-nearest neighbor classifiers) that uses an RNA sequence alignment in order to predict a consensus RNA secondary structure. The input to the network is the mutual information, the fraction of complementary nucleotides, and a novel consensus RNAfold secondary structure prediction of a pair of alignment columns and its nearest neighbors. Given this input, the network computes a prediction as to whether a particular pair of alignme...
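Two of the input features named above, the mutual information and the fraction of complementary nucleotides for a pair of alignment columns, can be sketched as follows. This is only an illustration of the standard definitions, not the authors' code: counting G-U wobble pairs as complementary, treating gap characters like any other symbol, and the function names are assumptions.

```python
import math
from collections import Counter

# Watson-Crick pairs plus G-U wobble (the wobble pairs are an assumption here).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def column(alignment, i):
    """Extract column i from a list of equal-length aligned sequences."""
    return [seq[i] for seq in alignment]


def mutual_information(col_i, col_j):
    """Mutual information in bits between two alignment columns."""
    n = len(col_i)
    pi, pj = Counter(col_i), Counter(col_j)
    pij = Counter(zip(col_i, col_j))
    mi = 0.0
    for (a, b), c in pij.items():
        p = c / n
        mi += p * math.log2(p / ((pi[a] / n) * (pj[b] / n)))
    return mi


def complementary_fraction(col_i, col_j):
    """Fraction of sequences whose residues in the two columns can base-pair."""
    return sum((a, b) in PAIRS for a, b in zip(col_i, col_j)) / len(col_i)
```

Columns that covary while staying complementary score high on both measures, which is the kind of signal a downstream classifier could use.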



Folding pathways of proteins with increasing degree of sequence identities but different structure and function (United States)

Much experimental work has been devoted to comparing the folding behavior of proteins sharing the same fold but different sequence. The recent design of proteins displaying very high sequence identities but different 3D structure allows the unique opportunity to address the protein-folding problem from a complementary perspective. Here we explored by Φ-value analysis the pathways of folding of three different heteromorphic pairs, displaying increasingly high sequence identity (namely, 30%, 77%, and 88%), but different structures called GA (a 3-α helix fold) and GB (an α/β fold). The analysis, based on 132 site-directed mutants, is fully consistent with the idea that protein topology is committed very early along the pathway of folding. Furthermore, the data reveal that when folding approaches a perfect two-state scenario, as in the case of the GA domains, the structural features of the transition state appear very robust to changes in sequence composition. On the other hand, when folding is more complex and multistate, as for the GBs, there are alternative nuclei or accessible pathways that can be alternatively stabilized by altering the primary structure. The implications of our results in the light of previous work on the folding of different members belonging to the same protein family are discussed.
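For readers unfamiliar with Φ-value analysis, the usual definition can be sketched in a few lines: the change in activation free energy of folding caused by a mutation, divided by the change in equilibrium stability. The function name, the units (kcal/mol), the default temperature and the example numbers are assumptions for illustration and are not taken from the study.

```python
import math

R = 1.987e-3  # gas constant in kcal/(mol*K)


def phi_value(kf_wt, kf_mut, ddG_eq, temperature=298.15):
    """Phi = ddG(transition state) / ddG(equilibrium).

    kf_wt, kf_mut: folding rate constants of wild type and mutant (same units).
    ddG_eq: destabilization of the native state by the mutation, in kcal/mol.
    The transition-state term is estimated as R*T*ln(kf_wt / kf_mut).
    """
    ddG_ts = R * temperature * math.log(kf_wt / kf_mut)
    return ddG_ts / ddG_eq


# Hypothetical numbers: a mutant that folds 3x more slowly and is
# destabilized by 1.2 kcal/mol.
example_phi = phi_value(kf_wt=300.0, kf_mut=100.0, ddG_eq=1.2)
```

A value near 1 is conventionally read as the mutated site being native-like in the transition state and a value near 0 as unstructured; values from very small `ddG_eq` are unreliable because the ratio becomes ill-conditioned.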

Giri, Rajanish; Morrone, Angela; Travaglini-Allocatelli, Carlo; Jemth, Per; Brunori, Maurizio; Gianni, Stefano



Structural instability of human tandemly repeated DNA sequences cloned in yeast artificial chromosome vectors. (United States)

The suitability of yeast artificial chromosome vectors (YACs) for cloning human Y chromosome tandemly repeated DNA sequences has been investigated. Clones containing DYZ3 or DYZ5 sequences were found in libraries at about the frequency anticipated on the basis of their abundance in the genome, but clones containing DYZ1 sequences were under-represented and the three clones examined contained junctions between DYZ1 and DYZ2. One DYZ3 clone was quite stable and had a long-range structure corresponding to genomic DNA. All other clones had long-range structures which either did not correspond to genomic DNA, or were too unstable to allow a simple comparison. The effects of the transformation process and host genotype on YAC structural stability were investigated. Gross structural rearrangements were often associated with re-transformation of yeast by a YAC. rad1-deficient yeast strains showed levels of instability similar to wild-type for all YAC clones tested. In rad52-deficient strains, DYZ5 containing YACs were as unstable as in the wild-type host, but DYZ1/DYZ2 or DYZ3 containing YACs were more stable. Thus the use of rad52 hosts for future library construction is recommended, but some sequences will still be unstable. PMID:2183192

Neil, D L; Villasante, A; Fisher, R B; Vetrie, D; Cox, B; Tyler-Smith, C



Accurate prediction of protein secondary structure and solvent accessibility by consensus combiners of sequence and structure information  

Directory of Open Access Journals (Sweden)

Full Text Available Abstract Background Structural properties of proteins such as secondary structure and solvent accessibility contribute to three-dimensional structure prediction, not only in the ab initio case but also when homology information to known structures is available. Structural properties are also routinely used in protein analysis even when homology is available, largely because homology modelling is lower throughput than, say, secondary structure prediction. Nonetheless, predictors of secondary structure and solvent accessibility are virtually always ab initio. Results Here we develop high-throughput machine learning systems for the prediction of protein secondary structure and solvent accessibility that exploit homology to proteins of known structure, where available, in the form of simple structural frequency profiles extracted from sets of PDB templates. We compare these systems to their state-of-the-art ab initio counterparts, and with a number of baselines in which secondary structures and solvent accessibilities are extracted directly from the templates. We show that structural information from templates greatly improves secondary structure and solvent accessibility prediction quality, and that, on average, the systems significantly enrich the information contained in the templates. For sequence similarity exceeding 30%, secondary structure prediction quality is approximately 90%, close to its theoretical maximum, and 2-class solvent accessibility roughly 85%. Gains are robust with respect to template selection noise, and significant for marginal sequence similarity and for short alignments, supporting the claim that these improved predictions may prove beneficial beyond the case in which clear homology is available. Conclusion The predictive systems are publicly available at the address

Vullo Alessandro



WebScipio: An online tool for the determination of gene structures using protein sequences  

Directory of Open Access Journals (Sweden)

Full Text Available Abstract Background Obtaining the gene structure for a given protein encoding gene is an important step in many analyses. Software suited for this task should be readily accessible, accurate, easy to handle and should provide the user with a coherent representation of the most probable gene structure. It should be rigorous enough to optimise features on the level of single bases and at the same time flexible enough to allow for cross-species searches. Results WebScipio, a web interface to the Scipio software, allows a user to obtain the corresponding coding sequence structure of a given query protein sequence that belongs to an already assembled eukaryotic genome. The resulting gene structure is presented in various human-readable formats like a schematic representation and a detailed alignment of the query and the target sequence highlighting any discrepancies. WebScipio can also be used to identify and characterise the gene structures of homologs in related organisms. In addition, it offers a web service for integration with other programs. Conclusion WebScipio is a tool that allows users to get a high-quality gene structure prediction from a protein query. It offers more than 250 eukaryotic genomes that can be searched and produces predictions that are close to what can be achieved by manual annotation, for in-species and cross-species searches alike. WebScipio is freely accessible at

Waack Stephan



Developing 1D nanostructure arrays for future nanophotonics  

Directory of Open Access Journals (Sweden)

Full Text Available Abstract There is intense and growing interest in one-dimensional (1-D) nanostructures from the perspective of their synthesis and unique properties, especially with respect to their excellent optical response and an ability to form heterostructures. This review discusses alternative approaches to preparation and organization of such structures, and their potential properties. In particular, molecular-scale printing is highlighted as a method for creating organized precursor structure for locating nanowires, as well as vapor–liquid–solid (VLS) templated growth using nano-channel alumina (NCA), and deposition of 1-D structures with glancing angle deposition (GLAD). As regards novel optical properties, we discuss, as an example, finite size photonic crystal cavity structures formed from such nanostructure arrays possessing high Q and small mode volume, and being ideal for developing future nanolasers.

Cooke DG



1D WCIP and FEM hybridization  

Digital Repository Infrastructure Vision for European Research (DRIVER)

The hybridization between two numerical methods, the 1D Wave Concept Iterative Procedure (WCIP) and the 2D Finite Element Method (FEM), is introduced. Preliminary numerical results are also presented.



Social exploration of 1D games  

DEFF Research Database (Denmark)

In this paper the apparently meaningless concept of a 1 dimensional computer game is explored, via netnography. A small number of games was designed and implemented, in close contact with online communities of players and developers, providing evidence that 1 dimension is enough to produce interesting gameplay, to allow for level design and even to leave room for artistic considerations on 1D rendering. General techniques to re-design classic 2D games into 1D are also emerging from this exploration.

Valente, Andrea; Marchetti, Emanuela



[Sequence variation and protein structure of pipo gene in Potato virus Y]. (United States)

The objectives of this study were to understand the sequence variation and the putative protein structure of the pipo gene in Potato virus Y (PVY) collected from Solanum tuberosum. The pipo gene in PVY was cloned using a pair of degenerate primers designed from its conserved region, and its sequences were used to reconstruct a phylogenetic tree of the genus Potyvirus by a Bayesian inference method. An expected fragment of 235 bp was amplified in all 20 samples by RT-PCR, and the pipo genes in the 20 samples assayed shared more than 92% nucleotide sequence similarity with the published sequences of PVY strains. Among the 20 pipo gene sequences, 13 polymorphic sites were detected, including 4 parsimony informative sites and 9 singleton variable sites. These results indicate that the PVY pipo gene is highly conserved but some sequence variations exist. Further analyses suggest that the pipo gene encodes a hydrophilic protein without signal peptide and transmembrane region. The protein has theoretical isoelectric points (pI) ranging from 11.26 to 11.62 and contains three highly conserved regions, especially between aa 10 and 59. The protein is likely located in the mitochondria and has α-helix secondary structure. Bayesian inference of phylogenetic trees reveals that PVY isolates are clustered in the same branch with high posterior probability, while Sunflower chlorotic mottle virus (SoCMoV) and Pepper severe mosaic virus (PepSMV) are closely related, consistent with the classification of the genus Potyvirus using other approaches. Our analyses suggest that the pipo gene can be a new marker for phylogenetic analysis of the genus. The results reported in this paper provide useful insights into the genetic variation and the evolution of PVY and can stimulate further research on the structure and function of the PIPO protein. PMID:24400487

Gao, Fang-Luan; Shen, Jian-Guo; Shi, Feng-Yang; Chang, Fei; Xie, Lian-Hui; Zhan, Jia-Sui
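
The split of polymorphic sites into parsimony-informative and singleton sites reported above (4 + 9 = 13) follows a standard column-wise rule: a variable column is parsimony-informative when at least two different nucleotides each occur in at least two sequences, and a singleton site otherwise. The sketch below applies that rule to a toy alignment; the sequences and the function name are invented for illustration and are not the study's data or software.

```python
from collections import Counter


def classify_sites(alignment):
    """Classify variable columns of an ungapped nucleotide alignment.

    Returns two lists of 0-based column indices:
    (parsimony_informative, singleton).
    """
    informative, singleton = [], []
    for col, column in enumerate(zip(*alignment)):
        # Count only unambiguous bases; gaps and N are ignored.
        counts = Counter(base for base in column if base in "ACGT")
        if len(counts) < 2:
            continue  # invariant column
        # Number of character states seen in at least two sequences.
        shared_states = sum(1 for n in counts.values() if n >= 2)
        if shared_states >= 2:
            informative.append(col)
        else:
            singleton.append(col)
    return informative, singleton


# Toy alignment (hypothetical sequences of equal length).
toy = ["ACGTA", "ACGTA", "ATGCA", "ATGTC"]
informative_cols, singleton_cols = classify_sites(toy)
```

For the toy alignment, column 1 is parsimony-informative (C and T each occur twice), while columns 3 and 4 are singleton sites because only one sequence differs there.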



The effect of disease associated point mutations on 5β-reductase (AKR1D1) enzyme function (United States)

The stereospecific 5β-reduction of Δ4-3-ketosterols is very difficult to achieve chemically and introduces a 90° bend between ring A and B of the planar steroid. In mammals, the reaction is catalyzed by steroid 5β-reductase, a member of the aldo-keto reductase (AKR) family. The human enzyme, AKR1D1, plays an essential role in bile-acid biosynthesis since the 5β-configuration is required for the emulsifying properties of bile. Deficient 5β-reductase activity can lead to cholestasis and neo-natal liver failure and is often lethal if it remains untreated. In five patients with 5β-reductase deficiency, sequencing revealed individual, non-synonymous point mutations in the AKR1D1 gene: L106F, P133R, G223E, P198L and R261C. However, mapping these mutations to the AKR1D1 crystal structure failed to reveal any obvious involvement in substrate or cofactor binding or catalytic mechanism, and it remained unclear whether these mutations could be causal for the observed disease. We analyzed the positions of the reported mutations and found that they reside in highly conserved portions of AKR1D1 and hypothesized that they would likely lead to changes in protein folding, and hence enzyme activity. Attempts to purify the mutant enzymes for further characterization by over-expression in E. coli yielded sufficient amounts of only one mutant (P133R). This enzyme exhibited reduced Km and kcat with the bile acid intermediate Δ4-cholesten-7α-ol-3-one as substrate, reminiscent of uncompetitive inhibition. In addition, P133R displayed no change in cofactor affinity but was more thermolabile as judged by CD-spectroscopy. When all AKR1D1 mutants were expressed in HEK 293 cells, protein expression levels and enzyme activity were dramatically reduced. Furthermore, cycloheximide treatment revealed decreased stability of several of the mutants compared to wild type. Our data show that all five mutations identified in patients with functional bile acid deficiency strongly affected AKR1D1 enzyme functionality and therefore may be causal for this disease.

Mindnich, Rebekka; Drury, Jason E.; Penning, Trevor M.
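
The remark that a simultaneous drop in Km and kcat is "reminiscent of uncompetitive inhibition" rests on textbook Michaelis-Menten kinetics: an uncompetitive inhibitor divides both apparent parameters by the same factor, so their ratio is unchanged. The sketch below illustrates that relationship only; the parameter values are hypothetical and are not the measured constants for AKR1D1 or P133R.

```python
def michaelis_menten_rate(substrate: float, kcat: float, km: float) -> float:
    """Turnover rate per enzyme molecule at a given substrate concentration."""
    return kcat * substrate / (km + substrate)


def uncompetitive_apparent(kcat: float, km: float, alpha_prime: float):
    """Apparent (kcat, Km) under uncompetitive inhibition.

    alpha_prime = 1 + [I]/Ki' >= 1; both parameters are scaled by 1/alpha_prime.
    """
    if alpha_prime < 1.0:
        raise ValueError("alpha_prime must be >= 1")
    return kcat / alpha_prime, km / alpha_prime


# Hypothetical wild-type-like parameters (arbitrary units).
kcat_wt, km_wt = 10.0, 4.0
kcat_app, km_app = uncompetitive_apparent(kcat_wt, km_wt, alpha_prime=2.0)

# Catalytic efficiency kcat/Km is the same before and after scaling.
efficiency_wt = kcat_wt / km_wt
efficiency_app = kcat_app / km_app
```

A mutant is of course not an inhibitor; the comparison only describes the pattern of the kinetic constants.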



Can Clustal-style progressive pairwise alignment of multiple sequences be used in RNA secondary structure prediction? (United States)

Background In ribonucleic acid (RNA) molecules whose function depends on their final, folded three-dimensional shape (such as those in ribosomes or spliceosome complexes), the secondary structure, defined by the set of internal basepair interactions, is more consistently conserved than the primary structure, defined by the sequence of nucleotides. Results The research presented here investigates the possibility of applying a progressive, pairwise approach to the alignment of multiple RNA sequences by simultaneously predicting an energy-optimized consensus secondary structure. We take an existing algorithm for finding the secondary structure common to two RNA sequences, Dynalign, and alter it to align profiles of multiple sequences. We then explore the relative successes of different approaches to designing the tree that will guide progressive alignments of sequence profiles to create a multiple alignment and prediction of conserved structure. Conclusion We have found that applying a progressive, pairwise approach to the alignment of multiple ribonucleic acid sequences produces highly reliable predictions of conserved basepairs, and we have shown how these predictions can be used as constraints to improve the results of a single-sequence structure prediction algorithm. However, we have also discovered that the amount of detail included in a consensus structure prediction is highly dependent on the order in which sequences are added to the alignment (the guide tree), and that if a consensus structure does not have sufficient detail, it is less likely to provide useful constraints for the single-sequence method.

Bellamy-Royds, Amelia B; Turcotte, Marcel



Tick-borne encephalitis virus genome. The nucleotide sequence coding for virion structural proteins. (United States)

RNA of a flavivirus, tick-borne encephalitis virus (TBEV; strain Sofjin), was subjected to reverse transcription and the DNA copy was transformed into double-stranded DNA by the action of E. coli DNA-polymerase I (Klenow fragment). This DNA was annealed with plasmid pBR322. The recombinant plasmids were cloned in E. coli K802. The nucleotide sequence of the inserts of the clones, covering the region coding for structural proteins C, M, E and nonstructural protein NS1, was determined by the Maxam-Gilbert method. The genes of structural proteins form a compact cluster. The homology of the TBEV sequences with the proteins and RNAs of other flaviviruses, yellow fever virus and West Nile virus, was studied, and a high degree of homology was found. PMID:3709796

Pletnev, A G; Yamshchikov, V F; Blinov, V M



Prediction of protein secondary structure by combining nearest-neighbor algorithms and multiple sequence alignments. (United States)

Recently Yi & Lander used a neural network and nearest-neighbor method with a scoring system that combined a sequence-similarity matrix with the local structural environment scoring scheme described by Bowie and co-workers for predicting protein secondary structure. We have improved their scoring system by taking into consideration N and C-terminal positions of alpha-helices and beta-strands and also beta-turns as distinctive types of secondary structure. Another improvement, which also decreases the time of computation, is achieved by restricting the database to a smaller subset of proteins that are similar to the query sequence. Using multiple sequence alignments rather than single sequences and a simple jury decision procedure, our method reaches a sustained overall three-state accuracy of 72.2%, which is better than that observed for the most accurate multilayered neural-network approach, tested on the same data set of 126 non-homologous protein chains. PMID:7897654

Salamov, A A; Solovyev, V V
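
The abstract above does not spell out its "simple jury decision procedure". One common reading is a per-residue majority vote over several three-state predictions (H = helix, E = strand, C = coil). The sketch below shows that reading only; the tie-breaking rule, the function name, and the example strings are assumptions made for illustration, not the authors' method.

```python
from collections import Counter


def jury_vote(predictions):
    """Combine equal-length secondary-structure strings by majority vote.

    Ties are resolved in favour of the earliest predictor in the list
    (an arbitrary choice made for this sketch).
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    if len({len(p) for p in predictions}) != 1:
        raise ValueError("all predictions must have the same length")
    consensus = []
    for states in zip(*predictions):
        counts = Counter(states)
        best = max(counts.values())
        # First state, in predictor order, that reaches the top count.
        consensus.append(next(s for s in states if counts[s] == best))
    return "".join(consensus)


# Three hypothetical predictions for a five-residue fragment.
combined = jury_vote(["HHHEC", "HHCEC", "HCCEE"])
```

Here `combined` is `"HHCEC"`: each position takes the state predicted by at least two of the three voters.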



Sequence context effect on the structure of nitrous acid induced DNA interstrand cross-links  

Digital Repository Infrastructure Vision for European Research (DRIVER)

In the preceding paper in this journal, we described the solution structure of the nitrous acid cross-linked dodecamer duplex [d(GCATCCGGATGC)]2 (the cross-linked guanines are underlined). The structure revealed that the cross-linked guanines form a nearly planar covalently linked ‘G:G base pair’, with the complementary partner cytidines flipped out of the helix. Here we explore the flanking sequence context effect on the structure of nitrous acid cross-links in [d(CG)]2 and the factors a...

Edfeldt, N. B. Fredrik; Harwood, Eric A.; Sigurdsson, Snorri Th; Hopkins, Paul B.; Reid, Brian R.



Polaron in a quasi 1D cylindrical quantum wire  

Directory of Open Access Journals (Sweden)

Full Text Available Polaron states in a quasi 1D cylindrical quantum wire with a parabolic confinement potential are investigated applying the Feynman variational principle. The effect of the wire radius on the polaron ground state energy level, the mass and the Fröhlich electron-phonon-coupling constant are obtained for the case of a quasi 1D cylindrical quantum wire. The effect of anisotropy of the structure on the polaron ground state energy level and the mass are also investigated. It is observed that as the wire radius tends to zero, the polaron mass and energy diverge logarithmically. The polaron mass and energy differ from the canonical strong-coupling behavior by the Fröhlich electron-phonon coupling constant and the radius of the quasi 1D cylindrical quantum wire that are expressed through a logarithmic function. Moreover, it is observed that the polaron energy and mass for strong coupling for the case of the quasi 1D cylindrical quantum wire are greater than those for bulk crystals. It is also observed that the anisotropy of the structure considerably affects both the polaron ground state energy level and the mass. It is found that as the radius of the cylindrical wire reduces, the regimes of the weak and intermediate coupling polaron shorten while the region of the strong coupling polaron broadens and extends into those of the weak and intermediate ones. Analytic expressions for the polaron ground state energy level and mass are derived for the case of strong coupling polarons.




The primary structure of Escherichia coli RNA polymerase. Nucleotide sequence of the rpoB gene and amino-acid sequence of the beta-subunit. (United States)

The combined structural study of proteins and of their corresponding genes utilizing the methods of both protein and nucleotide chemistry greatly accelerates and considerably simplifies both the nucleotide and protein structure determination and, in particular, enhances the reliability of the analysis. This approach has been successfully applied in the primary structure determination of the beta and beta' subunits of Escherichia coli DNA-dependent RNA polymerase and of their structural genes, yielding a continuous nucleotide sequence (4714 base pairs) that embraces the entire rpoB gene, the initial part of the rpoC gene and the intercistronic region, together with the total amino acid sequence of the beta subunit, comprising 1342 residues, and the N-terminal sequence of the beta' subunit (176 residues). PMID:6266829

Ovchinnikov, Y A; Monastyrskaya, G S; Gubanov, V V; Guryev, S O; Chertov OYu; Modyanov, N N; Grinkevich, V A; Makarova, I A; Marchenko, T V; Polovnikova, I N; Lipkin, V M; Sverdlov, E D



Sequences attaching loops of nuclear and mitochondrial DNA to underlying structures in human cells: the role of transcription units.  

Digital Repository Infrastructure Vision for European Research (DRIVER)

DNA sequences attaching loops of nuclear and mitochondrial DNA to underlying structures in HeLa cells have been cloned and 106 representative clones sequenced; 10 clones containing random genomic fragments served as controls. As chromatin is prone to rearrangement, care was taken to isolate sequences using 'physiological' conditions that did not create additional attachments. Comparison (by Southern blotting) of the concentration of each cloned sequence in 'total' and 'attached' fractions of ...



Structural and sequence analysis of imelysin-like proteins implicated in bacterial iron uptake. (United States)

Imelysin-like proteins define a superfamily of bacterial proteins that are likely involved in iron uptake. Members of this superfamily were previously thought to be peptidases and were included in the MEROPS family M75. We determined the first crystal structures of two remotely related, imelysin-like proteins. The Psychrobacter arcticus structure was determined at 2.15 Å resolution and contains the canonical imelysin fold, while higher resolution structures from the gut bacteria Bacteroides ovatus, in two crystal forms (at 1.25 Å and 1.44 Å resolution), have a circularly permuted topology. Both structures are highly similar to each other despite low sequence similarity and circular permutation. The all-helical structure can be divided into two similar four-helix bundle domains. The overall structure and the GxHxxE motif region differ from known HxxE metallopeptidases, suggesting that imelysin-like proteins are not peptidases. A putative functional site is located at the domain interface. We have now organized the known homologous proteins into a superfamily, which can be separated into four families. These families share a similar functional site, but each has family-specific structural and sequence features. These results indicate that imelysin-like proteins have evolved from a common ancestor, and likely have a conserved function. PMID:21799754

Xu, Qingping; Rawlings, Neil D; Farr, Carol L; Chiu, Hsiu-Ju; Grant, Joanna C; Jaroszewski, Lukasz; Klock, Heath E; Knuth, Mark W; Miller, Mitchell D; Weekes, Dana; Elsliger, Marc-André; Deacon, Ashley M; Godzik, Adam; Lesley, Scott A; Wilson, Ian A



Multi-scale coding of genomic information: From DNA sequence to genome structure and function  

International Nuclear Information System (INIS)

Understanding how chromatin is spatially and dynamically organized in the nucleus of eukaryotic cells and how this affects genome functions is one of the main challenges of cell biology. Since the different orders of packaging in the hierarchical organization of DNA condition the accessibility of DNA sequence elements to trans-acting factors that control the transcription and replication processes, there is actually a wealth of structural and dynamical information to learn in the primary DNA sequence. In this review, we show that when using concepts, methodologies, numerical and experimental techniques coming from statistical mechanics and nonlinear physics combined with wavelet-based multi-scale signal processing, we are able to decipher the multi-scale sequence encoding of chromatin condensation-decondensation mechanisms that play a fundamental role in regulating many molecular processes involved in nuclear functions.



Structure and patterns of sequence variation in the mitochondrial DNA control region of the great cats. (United States)

Mitochondrial DNA control region structure and variation were determined in the five species of the genus Panthera. Comparative analyses revealed two hypervariable segments, a central conserved region, and the occurrence of size and sequence heteroplasmy. As observed in the domestic cat, but not commonly seen in other animals, two repetitive sequence arrays (RS-2 with an 80-bp motif and RS-3 with a 6-10-bp motif) were identified. The 3' ends of RS-2 and RS-3 were highly conserved among species, suggesting that these motifs have different functional constraints. Control region sequences provided improved phylogenetic resolution grouping the sister taxa lion (Panthera leo) and leopard (Panthera pardus), with the jaguar (Panthera onca). PMID:16120284

Jae-Heup, K; Eizirik, E; O'Brien, S J; Johnson, W E



Genome sequence, comparative analysis and haplotype structure of the domestic dog. (United States)

Here we report a high-quality draft genome sequence of the domestic dog (Canis familiaris), together with a dense map of single nucleotide polymorphisms (SNPs) across breeds. The dog is of particular interest because it provides important evolutionary information and because existing breeds show great phenotypic diversity for morphological, physiological and behavioural traits. We use sequence comparison with the primate and rodent lineages to shed light on the structure and evolution of genomes and genes. Notably, the majority of the most highly conserved non-coding sequences in mammalian genomes are clustered near a small subset of genes with important roles in development. Analysis of SNPs reveals long-range haplotypes across the entire dog genome, and defines the nature of genetic diversity within and across breeds. The current SNP map now makes it possible for genome-wide association studies to identify genes responsible for diseases and traits, with important consequences for human and companion animal health. PMID:16341006

Lindblad-Toh, Kerstin; Wade, Claire M; Mikkelsen, Tarjei S; Karlsson, Elinor K; Jaffe, David B; Kamal, Michael; Clamp, Michele; Chang, Jean L; Kulbokas, Edward J; Zody, Michael C; Mauceli, Evan; Xie, Xiaohui; Breen, Matthew; Wayne, Robert K; Ostrander, Elaine A; Ponting, Chris P; Galibert, Francis; Smith, Douglas R; DeJong, Pieter J; Kirkness, Ewen; Alvarez, Pablo; Biagi, Tara; Brockman, William; Butler, Jonathan; Chin, Chee-Wye; Cook, April; Cuff, James; Daly, Mark J; DeCaprio, David; Gnerre, Sante; Grabherr, Manfred; Kellis, Manolis; Kleber, Michael; Bardeleben, Carolyne; Goodstadt, Leo; Heger, Andreas; Hitte, Christophe; Kim, Lisa; Koepfli, Klaus-Peter; Parker, Heidi G; Pollinger, John P; Searle, Stephen M J; Sutter, Nathan B; Thomas, Rachael; Webber, Caleb; Baldwin, Jennifer; Abebe, Adal; Abouelleil, Amr; Aftuck, Lynne; Ait-Zahra, Mostafa; Aldredge, Tyler; Allen, Nicole; An, Peter; Anderson, Scott; Antoine, Claudel; Arachchi, Harindra; Aslam, Ali; Ayotte, Laura; Bachantsang, Pasang; Barry, Andrew; Bayul, Tashi; Benamara, Mostafa; Berlin, Aaron; Bessette, Daniel; Blitshteyn, Berta; Bloom, Toby; Blye, Jason; Boguslavskiy, Leonid; Bonnet, Claude; Boukhgalter, Boris; Brown, Adam; Cahill, Patrick; Calixte, Nadia; Camarata, Jody; Cheshatsang, Yama; Chu, Jeffrey; Citroen, Mieke; Collymore, Alville; Cooke, Patrick; Dawoe, Tenzin; Daza, Riza; Decktor, Karin; DeGray, Stuart; Dhargay, Norbu; Dooley, Kimberly; Dooley, Kathleen; Dorje, Passang; Dorjee, Kunsang; Dorris, Lester; Duffey, Noah; Dupes, Alan; Egbiremolen, Osebhajajeme; Elong, Richard; Falk, Jill; Farina, Abderrahim; Faro, Susan; Ferguson, Diallo; Ferreira, Patricia; Fisher, Sheila; FitzGerald, Mike; Foley, Karen; Foley, Chelsea; Franke, Alicia; Friedrich, Dennis; Gage, Diane; Garber, Manuel; Gearin, Gary; Giannoukos, Georgia; Goode, Tina; Goyette, Audra; Graham, Joseph; Grandbois, Edward; Gyaltsen, Kunsang; Hafez, Nabil; Hagopian, Daniel; Hagos, Birhane; Hall, Jennifer; Healy, Claire; Hegarty, Ryan; Honan, 
Tracey; Horn, Andrea; Houde, Nathan; Hughes, Leanne; Hunnicutt, Leigh; Husby, M; Jester, Benjamin; Jones, Charlien; Kamat, Asha; Kanga, Ben; Kells, Cristyn; Khazanovich, Dmitry; Kieu, Alix Chinh; Kisner, Peter; Kumar, Mayank; Lance, Krista; Landers, Thomas; Lara, Marcia; Lee, William; Leger, Jean-Pierre; Lennon, Niall; Leuper, Lisa; LeVine, Sarah; Liu, Jinlei; Liu, Xiaohong; Lokyitsang, Yeshi; Lokyitsang, Tashi; Lui, Annie; Macdonald, Jan; Major, John; Marabella, Richard; Maru, Kebede; Matthews, Charles; McDonough, Susan; Mehta, Teena; Meldrim, James; Melnikov, Alexandre; Meneus, Louis; Mihalev, Atanas; Mihova, Tanya; Miller, Karen; Mittelman, Rachel; Mlenga, Valentine; Mulrain, Leonidas; Munson, Glen; Navidi, Adam; Naylor, Jerome; Nguyen, Tuyen; Nguyen, Nga; Nguyen, Cindy; Nguyen, Thu; Nicol, Robert; Norbu, Nyima; Norbu, Choe; Novod, Nathaniel; Nyima, Tenchoe; Olandt, Peter; O'Neill, Barry; O'Neill, Keith; Osman, Sahal; Oyono, Lucien; Patti, Christopher; Perrin, Danielle; Phunkhang, Pema; Pierre, Fritz; Priest, Margaret; Rachupka, Anthony; Raghuraman, Sujaa; Rameau, Rayale; Ray, Verneda; Raymond, Christina; Rege, Filip; Rise, Cecil; Rogers, Julie; Rogov, Peter; Sahalie, Julie; Settipalli, Sampath; Sharpe, Theodore; Shea, Terrance; Sheehan, Mechele; Sherpa, Ngawang; Shi, Jianying; Shih, Diana; Sloan, Jessie; Smith, Cherylyn; Sparrow, Todd; Stalker, John; Stange-Thomann, Nicole; Stavropoulos, Sharon; Stone, Catherine; Stone, Sabrina; Sykes, Sean; Tchuinga, Pierre; Tenzing, Pema; Tesfaye, Senait; Thoulutsang, Dawa; Thoulutsang, Yama; Topham, Kerri; Topping, Ira; Tsamla, Tsamla; Vassiliev, Helen; Venkataraman, Vijay; Vo, Andy; Wangchuk, Tsering; Wangdi, Tsering; Weiand, Michael; Wilkinson, Jane; Wilson, Adam; Yadav, Shailendra; Yang, Shuli; Yang, Xiaoping; Young, Geneva; Yu, Qing; Zainoun, Joanne; Zembek, Lisa; Zimmer, Andrew; Lander, Eric S



Phosphorylation-dependent PIH1D1 interactions define substrate specificity of the R2TP cochaperone complex. (United States)

The R2TP cochaperone complex plays a critical role in the assembly of multisubunit machines, including small nucleolar ribonucleoproteins (snoRNPs), RNA polymerase II, and the mTORC1 and SMG1 kinase complexes, but the molecular basis of substrate recognition remains unclear. Here, we describe a phosphopeptide binding domain (PIH-N) in the PIH1D1 subunit of the R2TP complex that preferentially binds to highly acidic phosphorylated proteins. A cocrystal structure of a PIH-N domain/TEL2 phosphopeptide complex reveals a highly specific phosphopeptide recognition mechanism in which Lys57 and 64 in PIH1D1, along with a conserved DpSDD phosphopeptide motif within TEL2, are essential and sufficient for binding. Proteomic analysis of PIH1D1 interactors identified R2TP complex substrates that are recruited by the PIH-N domain in a sequence-specific and phosphorylation-dependent manner suggestive of a common mechanism of substrate recognition. We propose that protein complexes assembled by the R2TP complex are defined by phosphorylation of a specific motif and recognition by the PIH1D1 subunit. PMID:24656813

Hořejší, Zuzana; Stach, Lasse; Flower, Thomas G; Joshi, Dhira; Flynn, Helen; Skehel, J Mark; O'Reilly, Nicola J; Ogrodowicz, Roksana W; Smerdon, Stephen J; Boulton, Simon J



1D design style implications for mask making and CEBL (United States)

At advanced nodes, CMOS logic is being designed in a highly regular design style because of the resolution limitations of optical lithography equipment. Logic and memory layouts using 1D Gridded Design Rules (GDR) have been demonstrated to nodes beyond 12nm.[1-4] Smaller nodes will require the same regular layout style but with multiple patterning for critical layers. One of the significant advantages of 1D GDR is the ease of splitting layouts into lines and cuts. A lines and cuts approach has been used to achieve good pattern fidelity and process margin to below 12nm.[4] Line scaling with excellent line-edge roughness (LER) has been demonstrated with self-aligned spacer processing.[5] This change in design style has important implications for mask making:
• The complexity of the masks will be greatly reduced from what would be required for 2D designs with very complex OPC or inverse lithography corrections.
• The number of masks will initially increase, as for conventional multiple patterning. But in the case of 1D design, there are future options for mask count reduction.
• The line masks will remain simple, with little or no OPC, at pitches (1x) above 80nm. This provides an excellent opportunity for continual improvement of line CD and LER. The line pattern will be processed through a self-aligned pitch division sequence to divide pitch by 2 or by 4.
• The cut masks can be done with "simple OPC" as demonstrated to beyond 12nm.[6] Multiple simple cut masks may be required at advanced nodes. "Coloring" has been demonstrated to below 12nm for two colors and to 8nm for three colors.
• Cut/hole masks will eventually be replaced by e-beam direct write using complementary e-beam lithography (CEBL).[7-11] This transition is gated by the availability of multiple column e-beam systems with throughput adequate for high-volume manufacturing.
A brief description of 1D and 2D design styles will be presented, followed by examples of 1D layouts. Mask complexity for 1D layouts patterned directly will be compared to mask complexity for lines and cuts at nodes larger than 20nm. No such comparison is possible below 20nm since single-patterning does not work below ~80nm pitch using optical exposure tools. Also discussed will be recently published wafer results for line patterns with pitch division by-2 and by-4 at sub-12nm nodes, plus examples of post-etch results for 1D patterns done with cut masks and compared to cuts exposed by a single-column e-beam direct write system.

Smayling, Michael C.



SARA-Coffee web server, a tool for the computation of RNA sequence and structure multiple alignments. (United States)

This article introduces the SARA-Coffee web server, a service allowing the online computation of 3D structure based multiple RNA sequence alignments. The server makes it possible to combine sequences with and without known 3D structures. Given a set of sequences, SARA-Coffee outputs a multiple sequence alignment along with a reliability index for every sequence, column and aligned residue. SARA-Coffee combines SARA, a pairwise structural RNA aligner, with the R-Coffee multiple RNA aligner in a way that has been shown to improve alignment accuracy over most sequence aligners when enough structural data is available. The server can be accessed from PMID:24972831

Tommaso, Paolo Di; Bussotti, Giovanni; Kemena, Carsten; Capriotti, Emidio; Chatzou, Maria; Prieto, Pablo; Notredame, Cedric



Multifractal properties of the structure factor of a class of substitutional sequences  

International Nuclear Information System (INIS)

We show how to estimate the multifractal generating functional of the structure factor of a class of substitutional sequences. These sequences are sequences of two elements, a and b, and are generated by the repetitive application of the rules a→σ1(a,b) and b→σ2(a,b), with the σ's consisting of strings of a's and b's. We restrict ourselves to the case in which σ1 and σ2 each contain R elements. This set includes the case of the Thue-Morse sequence. Subject to a technical assumption, we present a systematic approximation scheme for the multifractal generating functional, in which the lowest-order approximation is the generating functional of an R-scale Cantor set. As a by-product of our analysis, we demonstrate the existence of a discontinuity in the multifractal spectrum of the Thue-Morse sequence, and explain the origin of that discontinuity. We present explicit results for two examples, including the Thue-Morse case, and compare the results of our approximations for the Thue-Morse system with numerical simulations.
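
For orientation, the Thue-Morse sequence mentioned above is the R = 2 member of this class, generated by the substitution a→ab, b→ba. The sketch below builds the sequence by repeated substitution and evaluates a finite-size structure factor. The mapping a = +1, b = -1 and the normalisation by the sequence length are assumptions made for this illustration and are not taken from the paper.

```python
import cmath

# Substitution rules for the Thue-Morse sequence (each image has R = 2 symbols).
THUE_MORSE_RULES = {"a": "ab", "b": "ba"}


def substitute(seed: str, rules: dict, iterations: int) -> str:
    """Apply the substitution rules to every symbol, `iterations` times."""
    word = seed
    for _ in range(iterations):
        word = "".join(rules[symbol] for symbol in word)
    return word


def structure_factor(word: str, k: float) -> float:
    """|sum_n eps_n exp(i k n)|^2 / N with eps = +1 for 'a' and -1 for 'b'."""
    amplitude = sum(
        (1 if symbol == "a" else -1) * cmath.exp(1j * k * n)
        for n, symbol in enumerate(word)
    )
    return abs(amplitude) ** 2 / len(word)


third_generation = substitute("a", THUE_MORSE_RULES, 3)  # "abbabaab"
```

Because every generation after the seed contains equal numbers of a and b, the structure factor at k = 0 vanishes under this sign convention.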



High-resolution NMR structure of an AT-rich DNA sequence  

International Nuclear Information System (INIS)

We have determined, by proton NMR and complete relaxation matrix methods, the high-resolution structure of a DNA oligonucleotide in solution with nine contiguous AT base pairs. The stretch of AT pairs, TAATTATAA.TTATAATTA, is imbedded in a 27-nucleotide stem-and-loop construct, which is stabilized by terminal GC base pairs and an extraordinarily stable DNA loop GAA (Hirao et al., 1994, Nucleic Acids Res.22, 576-582). The AT-rich sequence has three repeated TAA.TTA motifs, one in the reverse orientation. Comparison of the local conformations of the three motifs shows that the sequence context has a minor effect here: atomic RMSD between the three TAA.TTA fragments is 0.4-0.5 A, while each fragment is defined within the RMSD of 0.3-0.4 A. The AT-rich stem also contains a consensus sequence for the Pribnow box, TATAAT. The TpA, ApT, and TpT.ApA steps have characteristic local conformations, a combination of which determines a unique sequence-dependent pattern of minor groove width variation. All three TpA steps are locally bent in the direction compressing the major groove of DNA. These bends, however, compensate each other, because of their relative position in the sequence, so that the overall helical axis is essentially straight
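
The sequence relationships stated in the abstract can be checked with a short script. This is a sketch under the assumption that the notation TAATTATAA.TTATAATTA lists the two strands of the duplex, each written 5' to 3'; the helper names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the complementary strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


def find_all(seq: str, motif: str) -> list[int]:
    """Return the 0-based start index of every (possibly overlapping) motif hit."""
    return [i for i in range(len(seq) - len(motif) + 1) if seq.startswith(motif, i)]


top = "TAATTATAA"
bottom = reverse_complement(top)

if __name__ == "__main__":
    print("complementary strand:", bottom)
    print("TAA starts on top strand:", find_all(top, "TAA"))
    print("TTA starts on top strand:", find_all(top, "TTA"))
    print("TATAAT starts on complementary strand:", find_all(bottom, "TATAAT"))
```

Under this reading, the top strand carries two TAA triplets and one TTA triplet (the motif in reverse orientation), and the Pribnow consensus TATAAT lies on the complementary strand.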



Describing sequencing results of structural chromosome rearrangements with a suggested next-generation cytogenetic nomenclature. (United States)

With recent rapid advances in genomic technologies, precise delineation of structural chromosome rearrangements at the nucleotide level is becoming increasingly feasible. In this era of "next-generation cytogenetics" (i.e., an integration of traditional cytogenetic techniques and next-generation sequencing), a consensus nomenclature is essential for accurate communication and data sharing. Currently, nomenclature for describing the sequencing data of these aberrations is lacking. Herein, we present a system called Next-Gen Cytogenetic Nomenclature, which is concordant with the International System for Human Cytogenetic Nomenclature (2013). This system starts with the alignment of rearrangement sequences by BLAT or BLAST (alignment tools) and arrives at a concise and detailed description of chromosomal changes. To facilitate usage and implementation of this nomenclature, we are developing a program designated BLA(S)T Output Sequence Tool of Nomenclature (BOSToN), a demonstrative version of which is accessible online. A standardized characterization of structural chromosomal rearrangements is essential both for research analyses and for application in the clinical setting. PMID:24746958

Ordulu, Zehra; Wong, Kristen E; Currall, Benjamin B; Ivanov, Andrew R; Pereira, Shahrin; Althari, Sara; Gusella, James F; Talkowski, Michael E; Morton, Cynthia C



Sequence, Structural and Expression Divergence of Duplicate Genes in the Bovine Genome (United States)

Gene duplication is a widespread phenomenon in genome evolution, and it has been proposed to serve as an engine of evolutionary innovation. In the present study, we performed the first comprehensive analysis of duplicate genes in the bovine genome. A total of 3131 putative duplicated gene pairs were identified, including 712 cattle-specific duplicate gene pairs unevenly distributed across the genome, which are significantly enriched for specific biological functions including immunity, growth, digestion, reproduction, embryonic development, inflammatory response, and defense response to bacterium. Around 97.1% (87.8%) of (cattle-specific) duplicate gene pairs were found to have distinct exon-intron structures. Analysis of gene expression by RNA-Seq and sequence divergence (synonymous or non-synonymous) revealed that expression divergence is correlated with sequence divergence, as has been previously observed in other species. This analysis also led to the identification of a subset of cattle-specific duplicate gene pairs exhibiting very high expression divergence. Interestingly, further investigation revealed a significant relationship between structural and expression divergence while controlling for the effect of synonymous sequence divergence. Together these results provide further insight into duplicate gene sequence and expression divergence in cattle, and their potential contributions to phenotypic divergence.

Liao, Xiaoping; Bao, Hua; Meng, Yan; Plastow, Graham; Moore, Stephen; Stothard, Paul



Syntheses, structures and electrochemical properties of a class of 1-D double chain polyoxotungstate hybrids [H(2)dap][Cu(dap)(2)](0.5)[Cu(dap)(2)(H2O)][Ln(H(2)O)3(α-GeW(11)O(39))]·3H(2)O. (United States)

A series of novel organic-inorganic hybrid 1-D double chain germanotungstates [H2dap][Cu(dap)2]0.5[Cu(dap)2(H2O)][Ln(H2O)3(α-GeW11O39)]·3H2O [Ln = La(III) (1), Pr(III) (2), Nd(III) (3), Sm(III) (4), Eu(III) (5), Tb(III) (6), Er(III) (7)] (dap = 1,2-diaminopropane) have been hydrothermally prepared and structurally characterized by elemental analyses, powder X-ray diffraction (PXRD), IR spectra, thermogravimetric (TG) analyses, X-ray photoelectron spectroscopy (XPS) and single-crystal X-ray diffraction. The most prominent structural feature of 1-7 is that the [Ln(H2O)3(α-GeW11O39)](5-) moieties are firstly connected with each other via the W-O-Ln-O-W bridges creating a 1-D {[Cu(dap)2(H2O)][Ln(H2O)3(α-GeW11O39)]}n(3n-) polymeric chain and then two adjacent antiparallel 1-D polymeric chains are linked together through [Cu(dap)2](2+) linkages giving rise to the rare organic-inorganic hybrid 1-D Cu(II)-Ln(III) heterometallic double-chain architectures. To the best of our knowledge, 1-7 represent the first 1-D double-chain Cu(II)-Ln(III) heterometallic germanotungstates. The variable-temperature magnetic susceptibilities of 2, 4 and 7 have been investigated. Furthermore, the solid-state electrochemical and electro-catalytic properties of 3 and 4 have been measured in 0.5 mol L(-1) Na2SO4 + H2SO4 aqueous solution by entrapping them in a carbon paste electrode. 3 and 4 display apparent electro-catalytic activities for nitrite, bromate and hydrogen peroxide reduction. PMID:24554042

Zhao, Jun-Wei; Li, Yan-Zhou; Ji, Fan; Yuan, Jing; Chen, Li-Juan; Yang, Guo-Yu



Sequence context effect on the structure of nitrous acid induced DNA interstrand cross-links. (United States)

In the preceding paper in this journal, we described the solution structure of the nitrous acid cross-linked dodecamer duplex [d(GCATCCGGATGC)]2 (the cross-linked guanines are underlined). The structure revealed that the cross-linked guanines form a nearly planar covalently linked 'G:G base pair', with the complementary partner cytidines flipped out of the helix. Here we explore the flanking sequence context effect on the structure of nitrous acid cross-links in [d(CG)]2 and the factors allowing the extrahelical cytidines to adopt such fixed positions in the minor groove. We have used NMR spectroscopy to determine the solution structure of a second cross-linked dodecamer duplex, [d(CGCTACGTAGCG)]2, which shows that the identity of the flanking base pairs significantly alters the stacking patterns and phosphate backbone conformations. The cross-linked guanines are now stacked well on adenines preceding the extrahelical cytidines, illustrating the importance of purine-purine base stacking. Observation of an imino proton resonance at 15.6 p.p.m. provides evidence for hydrogen bonding between the two cross-linked guanines. Preliminary structural studies on the cross-linked duplex [d(CGCGACGTCGCG)]2 show that the extrahelical cytidines are very mobile in this sequence context. We suggest that favorable van der Waals interactions between the cytidine and the adenine 2 bp away from the cross-link localize the cytidines in the previous cross-linked structures. PMID:15155848

Edfeldt, N B Fredrik; Harwood, Eric A; Sigurdsson, Snorri Th; Hopkins, Paul B; Reid, Brian R



De novo prediction of structured RNAs from genomic sequences  

DEFF Research Database (Denmark)

Growing recognition of the numerous, diverse and important roles played by non-coding RNA in all organisms motivates better elucidation of these cellular components. Comparative genomics is a powerful tool for this task and is arguably preferable to any high-throughput experimental technology currently available, because evolutionary conservation highlights functionally important regions. Conserved secondary structure, rather than primary sequence, is the hallmark of many functionally important RNAs, because compensatory substitutions in base-paired regions preserve structure. Unfortunately, such substitutions also obscure sequence identity and confound alignment algorithms, which complicates analysis greatly. This paper surveys recent computational advances in this difficult arena, which have enabled genome-scale prediction of cross-species conserved RNA elements. These predictions suggest that a wealth of these elements indeed exist

Gorodkin, Jan; Hofacker, Ivo L.



Sequence and structural characterization of Trx-Grx type of monothiol glutaredoxins from Ashbya gossypii (United States)

Glutaredoxins are small, ubiquitous, glutathione-dependent enzymatic antioxidants classified under the thioredoxin-fold superfamily. They fall into two types: dithiol and monothiol. Monothiol glutaredoxins, which carry the signature “CGFS” as a redox-active motif, are known for their role in oxidative stress inside the cell. In the present analysis, the 138-amino-acid monothiol glutaredoxin AgGRX1 from Ashbya gossypii was identified and analyzed. The multiple sequence alignment of the AgGRX1 protein sequence revealed the characteristic motif of typical monothiol glutaredoxins, as observed in various other organisms. The proposed structure of the AgGRX1 protein was used to analyze signature folds related to the thioredoxin superfamily. Further, the study highlighted the structural features pertaining to the complex mechanism of glutathione docking and the interacting residues.

Yadav, Saurabh; Kumari, Pragati; Kushwaha, Hemant Ritturaj



Efficient pairwise RNA structure prediction and alignment using sequence alignment constraints  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Abstract Background We are interested in the problem of predicting secondary structure for small sets of homologous RNAs, by incorporating limited comparative sequence information into an RNA folding model. The Sankoff algorithm for simultaneous RNA folding and alignment is a basis for approaches to this problem. There are two open problems in applying a Sankoff algorithm: development of a good unified scoring system for alignment and folding and development of practical heur...

Dowell Robin D; Eddy Sean R



Population Structure and Properties of Candida albicans, as Determined by Multilocus Sequence Typing

Digital Repository Infrastructure Vision for European Research (DRIVER)

We submitted a panel of 416 isolates of Candida albicans from separate sources to multilocus sequence typing (MLST). The data generated determined a population structure in which four major clades of closely related isolates were delineated, together with eight minor clades comprising five or more isolates. By Fisher's exact test, a statistically significant association was found between particular clades and the anatomical source, geographical source, ABC genotype, decade of isolation, and h...



A rostro-caudal gradient of structured sequence processing in the left inferior frontal gyrus  

Digital Repository Infrastructure Vision for European Research (DRIVER)

In this paper, we present two novel perspectives on the function of the left inferior frontal gyrus (LIFG). First, a structured sequence processing perspective facilitates the search for functional segregation within the LIFG and provides a way to express common aspects across cognitive domains including language, music and action. Converging evidence from functional magnetic resonance imaging and transcranial magnetic stimulation studies suggests that the LIFG is engaged in sequential proces...

Uddén, Julia; Bahlmann, Jörg



The Transmembrane Domain Sequence Affects the Structure and Function of the Newcastle Disease Virus Fusion Protein

Digital Repository Infrastructure Vision for European Research (DRIVER)

The role of specific sequences in the transmembrane (TM) domain of Newcastle disease virus (NDV) fusion (F) protein in the structure and function of this protein was assessed by replacing this domain with the F protein TM domains from two other paramyxoviruses, Sendai virus (SV) and measles virus (MV), or the TM domain of the unrelated glycoprotein (G) of vesicular stomatitis virus (VSV). Mutant proteins with the SV or MV F protein TM domains were expressed, transported to cell surfaces, and ...



Self-Optimizing Control Structures for Active Constraint Regions of a Sequence of Distillation Columns  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Investigating and mapping active constraint regions for processes, and subsequently finding control structures for each region, is vital for their optimal operation. In this work, active constraint regions of three different case studies for the distillation process have been investigated: • A single distillation column with constant product prices. • A single distillation column with purity dependent prices. • Two distillation columns in sequence with constant prices. The active constr...



Recombining Population Structure of Plesiomonas shigelloides (Enterobacteriaceae) Revealed by Multilocus Sequence Typing

Digital Repository Infrastructure Vision for European Research (DRIVER)

Plesiomonas shigelloides is an emerging pathogen that is widespread in the aquatic environment and is responsible for intestinal diseases and extraintestinal infections in humans and other animals. Virtually nothing is known about its genetic diversity, population structure, and evolution, which severely limits epidemiological control. We addressed these questions by developing a multilocus sequence typing (MLST) system based on five genes (fusA, leuS, pyrG, recG, and rpoB) and analyzing 77 e...

Salerno, Anna; Delétoile, Alexis; Lefevre, Martine; Ciznar, Ivan; Krovacek, Karel; Grimont, Patrick; Brisse, Sylvain



Revised Mimivirus major capsid protein sequence reveals intron-containing gene structure and extra domain  

Directory of Open Access Journals (Sweden)

Background: Acanthamoebae polyphaga Mimivirus (APM) is the largest known dsDNA virus. The viral particle has a nearly icosahedral structure with an internal capsid shell surrounded by a dense layer of fibrils. A Capsid protein sequence, D13L, was deduced from the APM L425 coding gene and was shown to be the most abundant protein found within the viral particle. However, this protein remained poorly characterised until now. A revised protein sequence deposited in a database suggested an additional N-terminal stretch of 142 amino acids missing from the original deduced sequence. This result led us to investigate the L425 gene structure and the biochemical properties of the complete APM major Capsid protein. Results: This study describes the full-length 3430 bp Capsid coding gene and characterises the corresponding 593-amino-acid Capsid protein 1. The recombinant full-length protein allowed the production of a specific monoclonal antibody able to detect the Capsid protein 1 within the viral particle. This protein appeared to be post-translationally modified by glycosylation and phosphorylation. We proposed a secondary structure prediction of APM Capsid protein 1 compared to the Capsid protein structure of Paramecium Bursaria Chlorella Virus 1, another member of the Nucleo-Cytoplasmic Large DNA virus family. Conclusion: The characterisation of the full-length L425 Capsid coding gene of Acanthamoebae polyphaga Mimivirus provides new insights into the structure of the main Capsid protein. The production of a full-length recombinant protein will be useful for further structural studies.

Suzan-Monti Marie



PETcofold : predicting conserved interactions and structures of two multiple alignments of RNA sequences  

DEFF Research Database (Denmark)

MOTIVATION: Predicting RNA-RNA interactions is essential for determining the function of putative non-coding RNAs. Existing methods for the prediction of interactions are all based on single sequences. Since comparative methods have already been useful in RNA structure determination, we assume that conserved RNA-RNA interactions also imply conserved function. Of these, we further assume that a non-negligible amount of the existing RNA-RNA interactions have also acquired compensating base changes throughout evolution. We implement a method, PETcofold, that can take covariance information in intra-molecular and inter-molecular base pairs into account to predict interactions and secondary structures of two multiple alignments of RNA sequences. RESULTS: PETcofold's ability to predict RNA-RNA interactions was evaluated on a carefully curated dataset of 32 bacterial small RNAs and their targets, which was manually extracted from the literature. For evaluation of both RNA-RNA interaction and structure prediction, we were able to extract only a few high-quality examples: one vertebrate small nucleolar RNA and four bacterial small RNAs. For these we show that the prediction can be improved by our comparative approach. Furthermore, PETcofold was evaluated on controlled data with phylogenetically simulated sequences enriched for covariance patterns at the interaction sites. We observed increased performance with increased amounts of covariance. AVAILABILITY: The program PETcofold is available as source code and can be downloaded from SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.

Seemann, Ernst Stefan; Richter, Andreas S.



Prediction of protein secondary structure with a reliability score estimated by local sequence clustering. (United States)

Most algorithms for protein secondary structure prediction are based on machine learning techniques, e.g. neural networks. Good architectures and learning methods have improved the performance continuously. The introduction of profile methods, e.g. PSI-BLAST, has been a major breakthrough in increasing the prediction accuracy to close to 80%. In this paper, a brute-force algorithm is proposed and the reliability of each prediction is estimated by a z-score based on local sequence clustering. This algorithm is intended to perform well for those secondary structures in a protein whose formation is mainly dominated by the neighboring sequences and short-range interactions. A reliability z-score has been defined to estimate the goodness of a putative cluster found for a query sequence in a database. The database for prediction was constructed by experimentally determined, non-redundant protein structures with nearest neighbor methods, performed very well within the expectation of previous methods and that the reliability z-score as defined was correlated with the reliability of prediction. This led to the possibility of making very accurate predictions for a few selected residues in a protein with an accuracy measure of Q3 > 80%. The further development of this algorithm, and a nucleation mechanism for protein folding are suggested. PMID:14560050

Jiang, Fan
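
The abstract quotes accuracy as Q3 without defining it. In common usage, Q3 is the percentage of residues whose predicted state matches the observed state when secondary structure is reduced to three classes (helix, strand, coil). A minimal sketch of that measure follows; the one-letter state codes H, E and C and the function name are assumptions for this example, and the paper's reliability z-score is not reproduced here.

```python
def q3(predicted: str, observed: str) -> float:
    """Percentage of positions where the predicted three-state label matches."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed strings must have equal length")
    if not observed:
        raise ValueError("cannot score an empty sequence")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)


if __name__ == "__main__":
    # Hypothetical 8-residue example: one strand residue is mispredicted as coil.
    print(q3("HHHEECCC", "HHHEEECC"))
```

Restricting the score to residues whose reliability estimate exceeds a chosen threshold is one way to obtain the kind of high-accuracy subset the abstract describes.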



From sequence to structure and back again: approaches for predicting protein-DNA binding  

Directory of Open Access Journals (Sweden)

Abstract: Gene regulation in higher organisms is achieved by a complex network of transcription factors (TFs). Modulating gene expression and exploring gene function are major aims in molecular biology. Furthermore, the identification of putative target genes for a certain TF serves as a powerful tool for specific targeting of rational drugs. Detecting the short and variable transcription factor binding sites (TFBSs) in genomic DNA is an intriguing challenge for computational and structural biologists. Fast and reliable computational methods for predicting TFBSs on a whole-genome scale offer several advantages compared to the current experimental methods, which are rather laborious and slow. Two main approaches are being explored: advanced sequence-based algorithms and structure-based methods. The aim of this review is to outline the computational and experimental methods currently being applied in the field of protein-DNA interactions. With a focus on the former, the current state of the art in modeling these interactions is discussed. Surveying sequence- and structure-based methods for predicting TFBSs, we conclude that in order to achieve a sound and specific method applicable to genomic sequences, it is desirable and important to bring these two approaches together.

Kohlbacher Oliver



Fast computational methods for predicting protein structure from primary amino acid sequence  

Energy Technology Data Exchange (ETDEWEB)

The present invention provides a method utilizing primary amino acid sequence of a protein, energy minimization, molecular dynamics and protein vibrational modes to predict three-dimensional structure of a protein. The present invention also determines possible intermediates in the protein folding pathway. The present invention has important applications to the design of novel drugs as well as protein engineering. The present invention predicts the three-dimensional structure of a protein independent of size of the protein, overcoming a significant limitation in the prior art.

Agarwal, Pratul Kumar (Knoxville, TN)



Quantitative Correlation between the protein primary sequences and secondary structures in spider dragline silks. (United States)

Synthetic spider silk holds great potential for use in various applications spanning medical uses to ultra lightweight armor; however, producing synthetic fibers with mechanical properties comparable to natural spider silk has eluded the scientific community. Natural dragline spider silks are commonly made from proteins that contain highly repetitive amino acid motifs, adopting an array of secondary structures. Before further advances can be made in the production of synthetic fibers based on spider silk proteins, it is imperative to know the percentage of each amino acid in the protein that forms a specific secondary structure. Linking these percentages to the primary amino acid sequence of the protein will establish a structural foundation for synthetic silk. In this study, nuclear magnetic resonance (NMR) techniques are used to quantify the percentage of Ala, Gly, and Ser that form both beta-sheet and helical secondary structures. The fraction of these three amino acids and their secondary structure are quantitatively correlated to the primary amino acid sequence for the proteins that comprise major and minor ampullate silk from the Nephila clavipes spider providing a blueprint for synthetic spider silks. PMID:20000730

Jenkins, Janelle E; Creager, Melinda S; Lewis, Randolph V; Holland, Gregory P; Yarger, Jeffery L



Ribosomal DNA sequence heterogeneity reflects intraspecies phylogenies and predicts genome structure in two contrasting yeast species. (United States)

The ribosomal RNA encapsulates a wealth of evolutionary information, including genetic variation that can be used to discriminate between organisms at a wide range of taxonomic levels. For example, the prokaryotic 16S rDNA sequence is very widely used both in phylogenetic studies and as a marker in metagenomic surveys and the internal transcribed spacer region, frequently used in plant phylogenetics, is now recognized as a fungal DNA barcode. However, this widespread use does not escape criticism, principally due to issues such as difficulties in classification of paralogous versus orthologous rDNA units and intragenomic variation, both of which may be significant barriers to accurate phylogenetic inference. We recently analyzed data sets from the Saccharomyces Genome Resequencing Project, characterizing rDNA sequence variation within multiple strains of the baker's yeast Saccharomyces cerevisiae and its nearest wild relative Saccharomyces paradoxus in unprecedented detail. Notably, both species possess single locus rDNA systems. Here, we use these new variation datasets to assess whether a more detailed characterization of the rDNA locus can alleviate the second of these phylogenetic issues, sequence heterogeneity, while controlling for the first. We demonstrate that a strong phylogenetic signal exists within both datasets and illustrate how they can be used, with existing methodology, to estimate intraspecies phylogenies of yeast strains consistent with those derived from whole-genome approaches. We also describe the use of partial Single Nucleotide Polymorphisms, a type of sequence variation found only in repetitive genomic regions, in identifying key evolutionary features such as genome hybridization events and show their consistency with whole-genome Structure analyses. 
We conclude that our approach can transform rDNA sequence heterogeneity from a problem to a useful source of evolutionary information, enabling the estimation of highly accurate phylogenies of closely related organisms, and discuss how it could be extended to future studies of multilocus rDNA systems. [concerted evolution; genome hybridisation; phylogenetic analysis; ribosomal DNA; whole genome sequencing; yeast]. PMID:24682414

West, Claire; James, Stephen A; Davey, Robert P; Dicks, Jo; Roberts, Ian N



Common interruptions in the repeating tripeptide sequence of non-fibrillar collagens: Sequence analysis and structural studies on triple-helix peptide models  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Interruptions in the repeating (Gly-X1-X2)n amino acid sequence pattern are found in the triple-helix domains of all non-fibrillar collagens, and perturbations to the triple-helix at such sites are likely to play a role in collagen higher order structure and function. This report defines the sequence features and structural consequences of the most common interruption, where one residue is missing in the tripeptide pattern, Gly-X1-X2-Gly-AA1-Gly-X1-X2, designated as G1G interruptions. Residue...

Thiagarajan, Geetha; Li, Yingjie; Mohs, Angela; Strafaci, Christopher; Popiel, Magdalena; Baum, Jean; Brodsky, Barbara



RNAG: a new Gibbs sampler for predicting RNA secondary structure for unaligned sequences (United States)

Motivation: RNA secondary structure plays an important role in the function of many RNAs, and structural features are often key to their interaction with other cellular components. Thus, there has been considerable interest in the prediction of secondary structures for RNA families. In this article, we present a new global structural alignment algorithm, RNAG, to predict consensus secondary structures for unaligned sequences. It uses a blocked Gibbs sampling algorithm, which has a theoretical advantage in convergence time. This algorithm iteratively samples from the conditional probability distributions P(Structure | Alignment) and P(Alignment | Structure). Not surprisingly, there is considerable uncertainty in the high-dimensional space of this difficult problem, which has so far received limited attention in this field. We show how the samples drawn from this algorithm can be used to more fully characterize the posterior space and to assess the uncertainty of predictions. Results: Our analysis of three publicly available datasets showed a substantial improvement in RNA structure prediction by RNAG over extant prediction methods. Additionally, our analysis of 17 RNA families showed that the RNAG sampled structures were generally compact around their ensemble centroids, and at least 11 families had at least two well-separated clusters of predicted structures. In general, the distance between a reference structure and our predicted structure was large relative to the variation among structures within an ensemble. Availability: The Perl implementation of the RNAG algorithm and the data necessary to reproduce the results described in Sections 3.1 and 3.2 are available at Contact: Supplementary information: Supplementary data are available at Bioinformatics online.

Wei, Donglai; Alpert, Lauren V.; Lawrence, Charles E.



Structure determination of nitro- and fluoro-substituted Schiff bases derived from 2-aminophenol using 1D and 2D NMR

Scientific Electronic Library Online (English)

SciELO Peru | Language: Spanish. Abstract: This work presents the synthesis of Schiff bases from 2-aminophenol with 4-nitrobenzaldehyde and 2-fluorobenzaldehyde. The products are characterized by microanalysis, infrared spectroscopy, 1H and 13C NMR spectroscopy, and two-dimensional NMR (COSY and HMBC) in order to determine their structures. In addition, the shifts of the carbon atoms are studied as a function of the type of aldehyde substituent in the Schiff base.

Sergio, Zamorano; Juan, Camus.


First Observation of Upsilon(1D) States  

CERN Document Server

The CLEO III experiment has recently accumulated a large statistics sample of 4.73 x 10^6 Upsilon(3S) decays. We present the first evidence for the production of the triplet Upsilon(1D) states in the four-photon cascade, Upsilon(3S) -> gamma chi_b(2P), chi_b(2P) -> gamma Upsilon(1D), Upsilon(1D) -> gamma chi_b(1P), chi_b(1P) -> gamma Upsilon(1S), followed by the Upsilon(1S) annihilation to e+ e- or mu+ mu-. The signal has a significance of 9.7 standard deviations. The measured product branching ratio for these five decays, (3.3 +- 0.6 +- 0.5) x 10^{-5}, is consistent with the theoretical estimates. We see a 6.8 standard deviation signal for a state with a mass of 10162.2 +- 1.6 MeV/c^2, consistent with the Upsilon(1D_2) assignment. We also present improved measurements of the Upsilon(3S) -> pi0 pi0 Upsilon(1S) branching ratio and the associated di-pion mass distribution.

Csorna, S E; Bonvicini, G; Cinabro, D; Dubrovin, M; McGee, S; Bornheim, A; Lipeles, E; Pappas, S P; Shapiro, A; Sun, W M; Weinstein, A J; Mahapatra, R; Briere, R A; Chen, G P; Ferguson, T; Tatishvili, G T; Vogel, H; Adam, N E; Alexander, J P; Berkelman, K; Boisvert, V; Cassel, David G; Drell, P S; Duboscq, J E; Ecklund, K M; Ehrlich, R; Galik, R S; Gibbons, L; Gittelman, B; Gray, S W; Hartill, D L; Heltsley, B K; Hsu, L; Jones, C D; Kandaswamy, J; Kreinick, D L; Magerkurth, A; Mahlke-Krüger, H; Meyer, T O; Mistry, N B; Nordberg, E; Patterson, J R; Peterson, D; Pivarski, J; Riley, D; Sadoff, A J; Schwarthoff, H; Shepherd, M R; Thayer, J G; Urner, D; Viehhauser, G; Warburton, A; Weinberger, M; Athar, S B; Avery, P; Breva-Newell, L; Potlia, V; Stöck, H; Yelton, J; Brandenburg, G; Kim, D Y J; Wilson, R; Benslama, K; Eisenstein, B I; Ernst, J; Gollin, G D; Hans, R M; Karliner, I; Lowrey, N; Plager, C; Sedlack, C; Selen, M; Thaler, J J; Williams, J; Edwards, K W; Ammar, R; Besson, D; Zhao, X; Anderson, S; Frolov, V V; Kubota, Y; Lee, S J; Li, S Z; Poling, R A; Smith, A; Stepaniak, C J; Urheim, J; Metreveli, Z V; Seth, K K; Tomaradze, A G; Zweber, P; Ahmed, S; Alam, M S; Jian, L; Saleem, M; Wappler, F; Eckhart, E; Gan, K K; Gwon, C; Hart, T; Honscheid, K; Hufnagel, D; Kagan, H; Kass, R; Pedlar, T K; Thayer, J B; Von Törne, E; Wilksen, T; Zoeller, M M; Muramatsu, H; Richichi, S J; Severini, H; Skubic, P L; Dytman, S A; Müller, J A; Nam, S; Savinov, V; Chen, S; Hinson, J W; Lee, J; Miller, D H; Pavlunin, V; Shibata, E I; Shipsey, I P J; Cronin-Hennessy, D; Lyon, A L; Park, C S; Park, W; Thorndike, E H; Coan, T E; Gao, Y S; Liu, F; Maravin, Y; Stroynowski, R; Artuso, M; Boulahouache, C; Bukin, K; Dambasuren, E; Khroustalev, K; Mountain, R; Nandakumar, R; Skwarnicki, T; Stone, S; Wang, J C; Mahmood, A H



1d WCIP and FEM hybridization  

Digital Repository Infrastructure Vision for European Research (DRIVER)

A hybridization of two numerical methods, the 1d Wave Concept Iterative Procedure (WCIP) and the 2d Finite Element Method (FEM), is developed. Using two examples, the new hybrid method is compared with an analytic solution, when available, or with the WCIP alone.



Preferential binding and structural distortion by Fe2+ at RGGG-containing DNA sequences correlates with enhanced oxidative cleavage at such sequences (United States)

Certain DNA sequences are known to be unusually sensitive to nicking via the Fe2+-mediated Fenton reaction. Most notable are a purine nucleotide followed by three or more G residues, RGGG, and purine nucleotides flanking a TG combination, RTGR. Our laboratory previously demonstrated that nicking in the RGGG sequences occurs preferentially 5′ to a G residue with the nicking probability decreasing from the 5′ to 3′ end of these sequences. Using 1H NMR to characterize Fe2+ binding within the duplex CGAGTTAGGGTAGC/GCTACCCTAACTCG and 7-deazaguanine-containing (Z) variants of it, we show that Fe2+ binds preferentially at the GGG sequence, most strongly towards its 5′ end. Substitutions of individual guanines with Z indicate that the high affinity Fe2+ binding at AGGG involves two adjacent guanine N7 moieties. Binding is accompanied by large changes in specific imino, aromatic and methyl proton chemical shifts, indicating that a locally distorted structure forms at the binding site that affects the conformation of the two base pairs 3′ to the GGG sequence. The binding of Fe2+ to RGGG contrasts with that previously observed for the RTGR sequence, which binds Fe2+ with negligible structural rearrangements.

Rai, Priyamvada; Wemmer, David E.; Linn, Stuart



Assessing Diversity of DNA Structure-Related Sequence Features in Prokaryotic Genomes (United States)

Prokaryotic genomes are diverse in terms of their nucleotide and oligonucleotide composition as well as presence of various sequence features that can affect physical properties of the DNA molecule. We present a survey of local sequence patterns which have a potential to promote non-canonical DNA conformations (i.e. different from standard B-DNA double helix) and interpret the results in terms of relationships with organisms' habitats, phylogenetic classifications, and other characteristics. Our present work differs from earlier similar surveys not only by investigating a wider range of sequence patterns in a large number of genomes but also by using a more realistic null model to assess significant deviations. Our results show that simple sequence repeats and Z-DNA-promoting patterns are generally suppressed in prokaryotic genomes, whereas palindromes and inverted repeats are over-represented. Representation of patterns that promote Z-DNA and intrinsic DNA curvature increases with increasing optimal growth temperature (OGT), and decreases with increasing oxygen requirement. Additionally, representations of close direct repeats, palindromes and inverted repeats exhibit clear negative trends with increasing OGT. The observed relationships with environmental characteristics, particularly OGT, suggest possible evolutionary scenarios of structural adaptation of DNA to particular environmental niches.

Huang, Yongjie; Mrazek, Jan



Genomic sequence diversity and population structure of Saccharomyces cerevisiae assessed by RAD-seq. (United States)

The budding yeast Saccharomyces cerevisiae is important for human food production and as a model organism for biological research. The genetic diversity contained in the global population of yeast strains represents a valuable resource for a number of fields, including genetics, bioengineering, and studies of evolution and population structure. Here, we apply a multiplexed, reduced genome sequencing strategy (restriction site-associated sequencing or RAD-seq) to genotype a large collection of S. cerevisiae strains isolated from a wide range of geographical locations and environmental niches. The method permits the sequencing of the same 1% of all genomes, producing a multiple sequence alignment of 116,880 bases across 262 strains. We find diversity among these strains is principally organized by geography, with European, North American, Asian, and African/S. E. Asian populations defining the major axes of genetic variation. At a finer scale, small groups of strains from cacao, olives, and sake are defined by unique variants not present in other strains. One population, containing strains from a variety of fermentations, exhibits high levels of heterozygosity and a mixture of alleles from European and Asian populations, indicating an admixed origin for this group. We propose a model of geographic differentiation followed by human-associated admixture, primarily between European and Asian populations and more recently between European and North American populations. The large collection of genotyped yeast strains characterized here will provide a useful resource for the broad community of yeast researchers. PMID:24122055

Cromie, Gareth A; Hyma, Katie E; Ludlow, Catherine L; Garmendia-Torres, Cecilia; Gilbert, Teresa L; May, Patrick; Huang, Angela A; Dudley, Aimée M; Fay, Justin C



Plasmonic Excitations of 1D Metal-Dielectric Interfaces in 2D Systems: 1D Surface Plasmon Polaritons (United States)

Surface plasmon-polariton (SPP) excitations of metal-dielectric interfaces are a fundamental light-matter interaction which has attracted interest as a route to spatial confinement of light far beyond that offered by conventional dielectric optical devices. Conventionally, SPPs have been studied in noble-metal structures, where the SPPs are intrinsically bound to a 2D metal-dielectric interface. Meanwhile, recent advances in the growth of hybrid 2D crystals, which comprise laterally connected domains of distinct atomically thin materials, provide the first realistic platform on which a 2D metal-dielectric system with a truly 1D metal-dielectric interface can be achieved. Here we show for the first time that 1D metal-dielectric interfaces support a fundamental 1D plasmonic mode (1DSPP) which exhibits cutoff behavior that provides dramatically improved light confinement in 2D systems. The 1DSPP constitutes a new basic category of plasmon as the missing 1D member of the plasmon family: 3D bulk plasmon, 2DSPP, 1DSPP, and 0D localized SP.

Mason, Daniel R.; Menabde, Sergey G.; Yu, Sunkyu; Park, Namkyoo



Hydrothermal synthesis and crystal structure of an infinite 1D ladderlike metal-organic compound: [Cu2(btec)(2,2'-bipy)2]∞ (btec=1,2,4,5-benzenetetracarboxylate) (United States)

A new metal-organic hybrid compound [Cu2(btec)(2,2'-bipy)2]∞ (btec=1,2,4,5-benzenetetracarboxylate) has been hydrothermally synthesized and characterized by elemental analyses, IR spectrum, TG analysis and single-crystal X-ray diffraction. The dark-blue crystals crystallize in the monoclinic system, space group P2(1)/n, a=7.2587(15) Å, b=12.396(3) Å, c=14.428(3) Å, β=103.87(3)°, V=1260.4(4) Å^3, Z=2, R1=0.0553, wR2=0.1746. The title compound exhibits a new infinite 1D ladderlike chain architecture constructed from the {Cu(2,2'-bipy)}2+ moieties and btec ligands. Furthermore, the adjacent chains are stacked into a 3D supramolecular framework via π-π stacking interactions of bipy groups and intermolecular hydrogen bonding interactions.

Hao, Na; Li, Yangguang; Wang, Enbo; Shen, Enhong; Hu, Changwen; Xu, Lin



Structure-function relationships in glycopolymers: effects of residue sequences, duplex, and triplex organization. (United States)

The importance of residue sequence and of duplex and triplex structures as a basis for establishing a molecular understanding of the structure-function relationships within glycopolymers is highlighted. The copolysaccharide alginate is the selected example for elucidating effects of residue sequence on functional properties like ionotropic gelation. Xanthan and comblike branched β-d-glucans are used as examples of the impact of duplex and triplex organization on global conformation and functional properties. Combined with further examples within self-interactions of mucins possessing different saccharide decorations, polyelectrolyte complexation and multilayer formation, the examples indicate that a molecular understanding of various properties related to the impact of residue sequences, duplex, and triplex organization can be established. Strategies similar to those included in the highlighted examples, also combined with novel tools, for example single-molecule approaches, interrogated by combination of experimental and theoretical/numerical approaches, and investigated closer to the native biological state, are expected to further advance the field. © 2013 Wiley Periodicals, Inc. Biopolymers 99: 757-771, 2013. PMID:23784702

Sletmoen, Marit; Stokke, Bjørn Torger



Predicting absolute contact numbers of native protein structure from amino acid sequence. (United States)

The contact number of an amino acid residue in a protein structure is defined by the number of C(beta) atoms around the C(beta) atom of the given residue, a quantity similar to, but different from, solvent accessible surface area. We present a method to predict the contact numbers of a protein from its amino acid sequence. The method is based on a simple linear regression scheme and predicts the absolute values of contact numbers. When single sequences are used for both parameter estimation and cross-validation, the present method predicts the contact numbers with a correlation coefficient of 0.555 on average. When multiple sequence alignments are used, the correlation increases to 0.627, which is a significant improvement over previous methods. In terms of discrete states prediction, the accuracies for 2-, 3-, and 10-state predictions are, respectively, 71.4%, 54.1%, and 18.9% with residue type-dependent unbiased thresholds, and 76.3%, 59.2%, and 21.8% with residue type-independent unbiased thresholds. The difference between accessible surface area and contact number from a prediction viewpoint and the application of contact number prediction to three-dimensional structure prediction are discussed. PMID:15523668

Kinjo, Akira R; Horimoto, Katsuhisa; Nishikawa, Ken
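
The contact-number definition in the abstract above (the number of C(beta) atoms around the C(beta) atom of a residue) can be illustrated with a minimal Python sketch. The 12 Å cutoff radius, the function name `contact_numbers`, and the toy coordinates are assumptions made for illustration; the abstract does not state the radius the authors used, and the sketch covers only the definition, not their regression-based predictor.

```python
import math

def contact_numbers(cb_coords, cutoff=12.0):
    """For each residue, count the other C-beta atoms within `cutoff` angstroms.

    `cutoff` is an illustrative assumption, not a value taken from the abstract.
    """
    counts = []
    for i, a in enumerate(cb_coords):
        n = 0
        for j, b in enumerate(cb_coords):
            # Skip the residue itself; count neighbors inside the sphere.
            if i != j and math.dist(a, b) <= cutoff:
                n += 1
        counts.append(n)
    return counts

# Toy C-beta coordinates (angstroms) along a line; the last residue is isolated.
coords = [(0.0, 0.0, 0.0), (5.0, 0.0, 0.0), (10.0, 0.0, 0.0), (30.0, 0.0, 0.0)]
print(contact_numbers(coords))  # [2, 2, 2, 0]
```

Unlike solvent accessible surface area, this count depends only on pairwise distances, which is one reason it is convenient as a prediction target.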



Functional and immunological relevance of Anaplasma marginale major surface protein 1a sequence and structural analysis. (United States)

Bovine anaplasmosis is caused by cattle infection with the tick-borne bacterium, Anaplasma marginale. The major surface protein 1a (MSP1a) has been used as a genetic marker for identifying A. marginale strains based on N-terminal tandem repeats and a 5'-UTR microsatellite located in the msp1a gene. The MSP1a tandem repeats contain immune relevant elements and functional domains that bind to bovine erythrocytes and tick cells, thus providing information about the evolution of host-pathogen and vector-pathogen interactions. Here we propose one nomenclature for A. marginale strain classification based on MSP1a. All tandem repeats among A. marginale strains were classified and the amino acid variability/frequency in each position was determined. The sequence variation at immunodominant B cell epitopes was determined and the secondary (2D) structure of the tandem repeats was modeled. A total of 224 different strains of A. marginale were classified, showing 11 genotypes based on the 5'-UTR microsatellite and 193 different tandem repeats with high amino acid variability per position. Our results showed phylogenetic correlation between MSP1a sequence, secondary structure, B-cell epitope composition and tick transmissibility of A. marginale strains. The analysis of MSP1a sequences provides relevant information about the biology of A. marginale to design vaccines with a cross-protective capacity based on MSP1a B-cell epitopes. PMID:23776456

Cabezas-Cruz, Alejandro; Passos, Lygia M F; Lis, Katarzyna; Kenneil, Rachel; Valdés, James J; Ferrolho, Joana; Tonk, Miray; Pohl, Anna E; Grubhoffer, Libor; Zweygarth, Erich; Shkap, Varda; Ribeiro, Mucio F B; Estrada-Peña, Agustín; Kocan, Katherine M; de la Fuente, José



Evolutionary conservation of sequence and secondary structures in CRISPR repeats  

Energy Technology Data Exchange (ETDEWEB)

Clustered Regularly Interspaced Palindromic Repeats (CRISPRs) are a novel class of direct repeats, separated by unique spacer sequences of similar length, that are present in approximately 40% of bacterial and all archaeal genomes analyzed to date. More than 40 gene families, called CRISPR-associated sequences (CAS), appear in conjunction with these repeats and are thought to be involved in the propagation and functioning of CRISPRs. It has been proposed that the CRISPR/CAS system samples, maintains a record of, and inactivates invasive DNA that the cell has encountered, and therefore constitutes a prokaryotic analog of an immune system. Here we analyze CRISPR repeats identified in 195 microbial genomes and show that they can be organized into multiple clusters based on sequence similarity. All individual repeats in any given cluster were inferred to form characteristic RNA secondary structure, ranging from non-existent to pronounced. Stable secondary structures included G:U base pairs and exhibited multiple compensatory base changes in the stem region, indicating evolutionary conservation and functional importance. We also show that the repeat-based classification corresponds to, and expands upon, a previously reported CAS gene-based classification including specific relationships between CRISPR and CAS subtypes.

Kunin, Victor; Sorek, Rotem; Hugenholtz, Philip



Combining classifiers for improved classification of proteins from sequence or structure  

Directory of Open Access Journals (Sweden)

Background: Predicting a protein's structural or functional class from its amino acid sequence or structure is a fundamental problem in computational biology. Recently, there has been considerable interest in using discriminative learning algorithms, in particular support vector machines (SVMs), for classification of proteins. However, because sufficiently many positive examples are required to train such classifiers, all SVM-based methods are hampered by limited coverage. Results: In this study, we develop a hybrid machine learning approach for classifying proteins, and we apply the method to the problem of assigning proteins to structural categories based on their sequences or their 3D structures. The method combines a full-coverage but lower accuracy nearest neighbor method with higher accuracy but reduced coverage multiclass SVMs to produce a full coverage classifier with overall improved accuracy. The hybrid approach is based on the simple idea of "punting" from one method to another using a learned threshold. Conclusion: In cross-validated experiments on the SCOP hierarchy, the hybrid methods consistently outperform the individual component methods at all levels of coverage. Code and data sets are available at

Leslie Christina S
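
The "punting" idea in the abstract above can be sketched in Python as follows. This is a sketch under stated assumptions, not the authors' implementation: it assumes the SVM prediction is kept when its score reaches the threshold and the nearest-neighbor prediction is used otherwise, and that the threshold is chosen to maximize accuracy on held-out examples. All names (`hybrid_predict`, `learn_threshold`) and the toy data are hypothetical.

```python
def hybrid_predict(svm_label, svm_score, nn_label, threshold):
    """Keep the SVM label if its score clears the threshold, else punt to nearest neighbor."""
    if svm_score >= threshold:
        return svm_label
    return nn_label

def hybrid_accuracy(examples, threshold):
    """Fraction of (svm_label, svm_score, nn_label, true_label) examples predicted correctly."""
    hits = sum(hybrid_predict(s, sc, n, threshold) == t for s, sc, n, t in examples)
    return hits / len(examples)

def learn_threshold(examples):
    """Pick the candidate threshold with the best held-out accuracy.

    Candidates are the observed SVM scores plus +inf (always punt).
    """
    candidates = sorted({sc for _, sc, _, _ in examples}) + [float("inf")]
    return max(candidates, key=lambda th: hybrid_accuracy(examples, th))

# Toy validation set: (svm_label, svm_score, nn_label, true_label).
validation = [
    ("a", 0.9, "b", "a"),  # confident SVM, correct
    ("b", 0.8, "a", "b"),  # confident SVM, correct
    ("a", 0.2, "b", "b"),  # unsure SVM is wrong, nearest neighbor is right
    ("b", 0.1, "b", "b"),  # both agree
]
threshold = learn_threshold(validation)
print(threshold, hybrid_accuracy(validation, threshold))
```

Because the fallback method always returns a label, the combined classifier keeps full coverage regardless of where the threshold lands.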



Comparative Analysis of Structure and Sequences of Oryza sativa Superoxide Dismutase  

Directory of Open Access Journals (Sweden)

One of the major classes of antioxidant enzymes, which protect the cellular and subcellular components against harmful reactive oxygen species (ROS), is superoxide dismutase (SOD). SODs play a pivotal role in scavenging highly reactive free oxygen radicals and protecting cells from toxic effects. In Oryza sativa, three types of SODs are found, distinguished by their metal content: Cu-Zn SOD, Mn SOD and Fe SOD. In the present study, attempts were made to critically assess the structure and phylogenetic relationship among Oryza sativa SODs. A sequence similarity search using local BLAST shows that Mn SODs and Fe SODs are more similar to each other than to Cu-Zn SODs. The multiple alignment reveals that seven amino acids are totally conserved. The secondary structure shows that Mn SODs and Fe SODs have similar helices, sheets, turns and coils, in contrast to Cu-Zn SODs. The comparative analysis also displays greater resemblance in the primary, secondary and tertiary structures of Fe SODs and Mn SODs. Comparison between the structure and sequence analyses reveals that Mn SOD and Fe SOD are closely related, whereas Cu-Zn SOD evolved independently.

Aiyar Balasubramanian



Identification of microRNA precursors with new sequence-structure features  

Directory of Open Access Journals (Sweden)

MicroRNAs are an important subclass of non-coding RNAs (ncRNAs) and serve as main players in RNA interference (RNAi). Mature microRNAs are derived from a stem-loop structure called the precursor. Identification of precursor microRNAs (pre-miRNAs) is an essential step in targeting microRNAs in a whole genome. The present work proposes 25 novel local features for identifying the stem-loop structure of pre-miRNAs, which capture characteristics of both the sequence and the structure. First, we pulled the stem of hairpins and aligned the bases in bulges and internal loops using ‘?’, and then counted 24 base pairs (‘AA’, ‘AU’, …, ‘?G’, except ‘??’) in the pulled stem (formalized by the length of the pulled stem) as the feature vector of a Support Vector Machine (SVM). The performance of three classifiers with our features and different kernels trained on human data was superior to that of the Triplet-SVM-classifier on positive and negative testing data sets. Moreover, we achieved higher prediction accuracy by combining 7 global sequence-structure features. The result indicates the validity of the novel local features.

Ying-Jie Zhao



A molecular basis for NKT cell recognition of CD1d-self antigen  

Digital Repository Infrastructure Vision for European Research (DRIVER)

The antigen receptor for natural killer T cells (NKT TCR) binds CD1d-restricted microbial and self lipid antigens, although the molecular basis of self-CD1d recognition is unclear. Here, we have characterized NKT TCR recognition of CD1d molecules loaded with natural self-antigens (Ags), and report the 2.3 Å resolution structure of an autoreactive NKT TCR-phosphatidylinositol-CD1d complex. NKT TCR recognition of self and foreign antigens was underpinned by a similar mode of germline-encoded re...

Mallevaey, Thierry; Clarke, Andrew J.; Scott-Browne, James; Young, Mary H.; Roisman, Laila C.; Pellicci, Daniel G.; Patel, Onisha; Vivian, Julian P.; Matsuda, Jennifer L.; McCluskey, James; Godfrey, Dale I.; Marrack, Philippa; Rossjohn, Jamie; Gapin, Laurent



Natural Lipid Ligands Associated with Human CD1d Targeted to Different Subcellular Compartments  

Digital Repository Infrastructure Vision for European Research (DRIVER)

CD1d is an MHC class I-like membrane glycoprotein that presents lipid antigens to NKT cells. Despite intensive biochemical, genetic and structural studies, the endogenous lipids associated with CD1d remain poorly defined because of the biochemical challenges posed by their hydrophobic nature. Here we report the generation of a protease-cleavable CD1d variant with a similar trafficking pattern to wild-type CD1d that can be purified in the absence of detergent and allows the characterization of...

Yuan, Weiming; Kang, Suk-jo; Evans, James E.; Cresswell, Peter



Loss of quaternary structure is associated with rapid sequence divergence in the OSBS family. (United States)

The rate of protein evolution is determined by a combination of selective pressure on protein function and biophysical constraints on protein folding and structure. Determining the relative contributions of these properties is an unsolved problem in molecular evolution with broad implications for protein engineering and function prediction. As a case study, we examined the structural divergence of the rapidly evolving o-succinylbenzoate synthase (OSBS) family, which catalyzes a step in menaquinone synthesis in diverse microorganisms and plants. On average, the OSBS family is much more divergent than other protein families from the same set of species, with the most divergent family members sharing <15% sequence identity. Comparing 11 representative structures revealed that loss of quaternary structure and large deletions or insertions are associated with the family's rapid evolution. Neither of these properties has been investigated in previous studies to identify factors that affect the rate of protein evolution. Intriguingly, one subfamily retained a multimeric quaternary structure and has small insertions and deletions compared with related enzymes that catalyze diverse reactions. Many proteins in this subfamily catalyze both OSBS and N-succinylamino acid racemization (NSAR). Retention of ancestral structural characteristics in the NSAR/OSBS subfamily suggests that the rate of protein evolution is not proportional to the capacity to evolve new protein functions. Instead, structural features that are conserved among proteins with diverse functions might contribute to the evolution of new functions. PMID:24872444

Odokonyero, Denis; Sakai, Ayano; Patskovsky, Yury; Malashkevich, Vladimir N; Fedorov, Alexander A; Bonanno, Jeffrey B; Fedorov, Elena V; Toro, Rafael; Agarwal, Rakhi; Wang, Chenxi; Ozerova, Nicole D S; Yew, Wen Shan; Sauder, J Michael; Swaminathan, Subramanyam; Burley, Stephen K; Almo, Steven C; Glasner, Margaret E



Mod-seq: high-throughput sequencing for chemical probing of RNA structure. (United States)

The functions of RNA molecules are intimately linked to their ability to fold into complex secondary and tertiary structures. Thus, understanding how these molecules fold is essential to determining how they function. Current methods for investigating RNA structure often use small molecules, enzymes, or ions that cleave or modify the RNA in a solvent-accessible manner. While these methods have been invaluable to understanding RNA structure, they can be fairly labor intensive and often focus on short regions of single RNAs. Here we present a new method (Mod-seq) and data analysis pipeline (Mod-seeker) for assaying the structure of RNAs by high-throughput sequencing. This technique can be utilized both in vivo and in vitro, with any small molecule that modifies RNA and consequently impedes reverse transcriptase. As proof-of-principle, we used dimethyl sulfate (DMS) to probe the in vivo structure of total cellular RNAs in Saccharomyces cerevisiae. Mod-seq analysis simultaneously revealed secondary structural information for all four ribosomal RNAs and 32 additional noncoding RNAs. We further show that Mod-seq can be used to detect structural changes in 5.8S and 25S rRNAs in the absence of ribosomal protein L26, correctly identifying its binding site on the ribosome. While this method is applicable to RNAs of any length, its high-throughput nature makes Mod-seq ideal for studying long RNAs and complex RNA mixtures. PMID:24664469

Talkish, Jason; May, Gemma; Lin, Yizhu; Woolford, John L; McManus, C Joel



The PETfold and PETcofold web servers for intra- and intermolecular structures of multiple RNA sequences  

DEFF Research Database (Denmark)

The function of non-coding RNA genes largely depends on their secondary structure and the interaction with other molecules. Thus, an accurate prediction of secondary structure and RNA-RNA interaction is essential for the understanding of biological roles and pathways associated with a specific RNA gene. We present web servers to analyze multiple RNA sequences for common RNA structure and for RNA interaction sites. The web servers are based on the recent PET (Probabilistic Evolutionary and Thermodynamic) models PETfold and PETcofold, but add user friendly features ranging from a graphical layer to interactive usage of the predictors. Additionally, the web servers provide direct access to annotated RNA alignments, such as the Rfam 10.0 database and multiple alignments of 16 vertebrate genomes with human. The web servers are freely available at:

Seemann, Ernst Stefan; Menzel, Karl Peter



Automated Aufbau of antibody structures from given sequences using Macromoltek's SmrtMolAntibody. (United States)

This study was a part of the second antibody modeling assessment. The assessment is a blind study of the performance of multiple software programs used for antibody homology modeling. In the study, research groups were given sequences for 11 antibodies and asked to predict their corresponding structures. The results were measured using root-mean-square deviation (rmsd) between the submitted models and X-ray crystal structures. In 10 of 11 cases, the results using SmrtMolAntibody show good agreement between the submitted models and X-ray crystal structures. In the first stage, the average rmsd was 1.4 Å; the average rmsd was 1.2 Å for the framework and 3.0 Å for the H3 loop. In stage two, there was a slight improvement, with an rmsd for the H3 loop of 2.9 Å. Proteins 2014; 82:1636-1645. © 2014 Wiley Periodicals, Inc. PMID:24777752

Berrondo, Monica; Kaufmann, Susana; Berrondo, Manuel
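
The root-mean-square deviation used as the quality measure in the abstract above can be computed as in the following minimal Python sketch. It assumes the model and reference coordinates are already superimposed and listed in matching atom order; the assessment's actual superposition protocol and atom selection are not described in the abstract, and the function name and coordinates are illustrative.

```python
import math

def rmsd(model, reference):
    """Root-mean-square deviation between two equally long lists of 3D points.

    Assumes both structures are already superimposed (no fitting is done here).
    """
    if len(model) != len(reference) or not model:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(math.dist(p, q) ** 2 for p, q in zip(model, reference))
    return math.sqrt(squared / len(model))

# Two atoms; the second is displaced by 2 angstroms along z.
model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 2.0)]
print(round(rmsd(model, reference), 3))  # 1.414
```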



Sequence-specific 1H NMR assignments and secondary structure of eglin c  

International Nuclear Information System (INIS)

Sequence-specific nuclear magnetic resonance assignments were obtained for eglin c, a polypeptide inhibitor of the granulocytic proteinases elastase and cathepsin G and some other proteinases. The protein consists of a single polypeptide chain of 70 residues. All proton resonances were assigned except for some labile protons of arginine side chains. The patterns of nuclear Overhauser enhancements and coupling constants and the observation of slow hydrogen exchange were used to characterize the secondary structure of the protein. The results indicate that the solution structure of the free inhibitor is very similar to the crystal structure reported for the same protein in the complex with subtilisin Carlsberg. However, a part of the binding loop seems to have a significantly different conformation in the free protein.



Large Scale Identification and Categorization of Protein Sequences Using Structured Logistic Regression (United States)

Background Structured Logistic Regression (SLR) is a newly developed machine learning tool first proposed in the context of text categorization. Current availability of extensive protein sequence databases calls for an automated method to reliably classify sequences and SLR seems well-suited for this task. The classification of P-type ATPases, a large family of ATP-driven membrane pumps transporting essential cations, was selected as a test-case that would generate important biological information as well as provide a proof-of-concept for the application of SLR to a large scale bioinformatics problem. Results Using SLR, we have built classifiers to identify and automatically categorize P-type ATPases into one of 11 pre-defined classes. The SLR-classifiers are compared to a Hidden Markov Model approach and shown to be highly accurate and scalable. Representing the bulk of currently known sequences, we analysed 9.3 million sequences in the UniProtKB and attempted to classify a large number of P-type ATPases. To examine the distribution of pumps on organisms, we also applied SLR to 1,123 complete genomes from the Entrez genome database. Finally, we analysed the predicted membrane topology of the identified P-type ATPases. Conclusions Using the SLR-based classification tool we are able to run a large scale study of P-type ATPases. This study provides proof-of-concept for the application of SLR to a bioinformatics problem and the analysis of P-type ATPases pinpoints new and interesting targets for further biochemical characterization and structural analysis.

Axelsen, Kristian B.; Palmgren, Michael G.; Nissen, Poul; Wiuf, Carsten; Pedersen, Christian N. S.



Short, synthetic and selectively 13C-labeled RNA sequences for the NMR structure determination of protein–RNA complexes  

Digital Repository Infrastructure Vision for European Research (DRIVER)

We report an optimized synthesis of all canonical 2′-O-TOM protected ribonucleoside phosphoramidites and solid supports containing [13C5]-labeled ribose moieties, their sequence-specific introduction into very short RNA sequences and their use for the structure determination of two protein–RNA complexes. These specifically labeled sequences facilitate RNA resonance assignments and are essential to assign a high number of sugar–sugar and intermolecular NOEs, which ultimately improve the ...

Wenter, Philipp; Reymond, Luc; Auweter, Sigrid D.; Allain, Frédéric H.-T.; Pitsch, Stefan



The Homeodomain Resource: a comprehensive collection of sequence, structure, interaction, genomic and functional information on the homeodomain protein family  

Digital Repository Infrastructure Vision for European Research (DRIVER)

The Homeodomain Resource is a curated collection of sequence, structure, interaction, genomic and functional information on the homeodomain family. The current version builds upon previous versions by the addition of new, complete sets of homeodomain sequences from fully sequenced genomes, the expansion of existing curated homeodomain information and the improvement of data accessibility through better search tools and more complete data integration. This release contains 1534 full-length hom...

Moreland, R. Travis; Ryan, Joseph F.; Pan, Christopher; Baxevanis, Andreas D.



Structural characterization of an engineered tandem repeat contrasts the importance of context and sequence in protein folding  

Digital Repository Infrastructure Vision for European Research (DRIVER)

To test a different approach to understanding the relationship between the sequence of part of a protein and its conformation in the overall folded structure, the amino acid sequence corresponding to an α-helix of T4 lysozyme was duplicated in tandem. The presence of such a sequence repeat provides the protein with “choices” during folding. The mutant protein folds with almost wild-type stability, is active, and crystallizes in two different space groups, one isomorphous with wild type a...

Sagermann, Martin; Baase, Walter A.; Matthews, Brian W.



Iron-based 1D nanostructures by electrospinning process  

International Nuclear Information System (INIS)

Iron-based 1D nanostructures have been successfully prepared using an electrospinning technique and varying the pyrolysis atmospheres. Hematite (Fe2O3) nanotubes and polycrystalline Fe3C nanofibers were obtained by simple air or mixed gas (H2, Ar) annealing treatments. Using the air annealing treatment, a high control of the morphology as well as of the wall thickness of the nanotubes was demonstrated with a direct influence of the starting polymer concentration. When mixed gases (H2 and Ar) were used for the annealing treatments, for the first time polycrystalline Fe3C nanofibers composed of carbon graphitic planes were obtained, ensuring Fe3C nanoparticle stability and nanofiber cohesion. The morphology and structural properties of all these iron-based 1D nanostructures were fully characterized by SEM, TEM, XRD and Raman spectroscopy.



Sandia reactor kinetics codes: SAK and PK1D  

International Nuclear Information System (INIS)

The Sandia Kinetics code (SAK) is a one-dimensional coupled thermal-neutronics transient analysis code for use in simulation of reactor transients. The time-dependent cross section routines allow arbitrary time-dependent changes in material properties. The one-dimensional heat transfer routines are for cylindrical geometry and allow arbitrary mesh structure, temperature-dependent thermal properties, radiation treatment, and coolant flow and heat-transfer properties at the surface of a fuel element. The Point Kinetics 1 Dimensional Heat Transfer Code (PK1D) solves the point kinetics equations and has essentially the same heat-transfer treatment as SAK. PK1D can address extended reactor transients with minimal computer execution time.
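
For readers unfamiliar with the point kinetics equations mentioned above, the following Python sketch integrates their standard one-delayed-group form, dn/dt = ((rho - beta)/Lambda) n + lambda C and dC/dt = (beta/Lambda) n - lambda C, with an explicit Euler step. It is not taken from SAK or PK1D: the single delayed group, the Euler scheme, the parameter values, and the function names are all illustrative assumptions, and no heat-transfer feedback is included.

```python
def point_kinetics_step(n, c, rho, dt, beta=0.0065, lam=0.08, gen_time=1e-4):
    """Advance neutron population n and precursor concentration c by one Euler step.

    rho: reactivity, beta: delayed-neutron fraction, lam: precursor decay
    constant (1/s), gen_time: neutron generation time (s). Values are illustrative.
    """
    dn_dt = (rho - beta) / gen_time * n + lam * c
    dc_dt = beta / gen_time * n - lam * c
    return n + dt * dn_dt, c + dt * dc_dt

def simulate(rho, steps, dt=1e-4, beta=0.0065, lam=0.08, gen_time=1e-4):
    """Start from equilibrium at n = 1 and apply a constant reactivity rho."""
    n = 1.0
    c = beta * n / (lam * gen_time)  # precursor level that balances n at rho = 0
    for _ in range(steps):
        n, c = point_kinetics_step(n, c, rho, dt, beta, lam, gen_time)
    return n

# Relative power after 0.1 s for zero, positive, and negative reactivity.
for rho in (0.0, 0.001, -0.001):
    print(rho, simulate(rho, steps=1000))
```

The small time step is needed because the prompt-neutron time scale (about gen_time/beta) is short; a production code would more likely use an implicit or otherwise stiff-stable integrator.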



Enhancement of accuracy and efficiency for RNA secondary structure prediction by sequence segmentation and MapReduce (United States)

Background Ribonucleic acid (RNA) molecules play important roles in many biological processes including gene expression and regulation. Their secondary structures are crucial for the RNA functionality, and the prediction of the secondary structures is widely studied. Our previous research shows that cutting long sequences into shorter chunks, predicting secondary structures of the chunks independently using thermodynamic methods, and reconstructing the entire secondary structure from the predicted chunk structures can yield better accuracy than predicting the secondary structure using the RNA sequence as a whole. The chunking, prediction, and reconstruction processes can use different methods and parameters, some of which produce more accurate predictions than others. In this paper, we study the prediction accuracy and efficiency of three different chunking methods using seven popular secondary structure prediction programs that apply to two datasets of RNA with known secondary structures, which include both pseudoknotted and non-pseudoknotted sequences, as well as a family of viral genome RNAs whose structures have not been predicted before. Our modularized MapReduce framework based on Hadoop allows us to study the problem in a parallel and robust environment. Results On average, the maximum accuracy retention values are larger than one for our chunking methods and the seven prediction programs over 50 non-pseudoknotted sequences, meaning that the secondary structure predicted using chunking is more similar to the real structure than the secondary structure predicted by using the whole sequence. We observe similar results for the 23 pseudoknotted sequences, except for the NUPACK program using the centered chunking method. The performance analysis for 14 long RNA sequences from the Nodaviridae virus family outlines how the coarse-grained mapping of chunking and predictions in the MapReduce framework exhibits shorter turnaround times for short RNA sequences. 
However, as the lengths of the RNA sequences increase, the fine-grained mapping can surpass the coarse-grained mapping in performance. Conclusions By using our MapReduce framework together with statistical analysis on the accuracy retention results, we observe how the inversion-based chunking methods can outperform predictions using the whole sequence. Our chunk-based approach also enables us to predict secondary structures for very long RNA sequences, which is not feasible with traditional methods alone.
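
The chunk, predict, and reconstruct idea described in this abstract can be sketched in a few lines. The sketch makes several assumptions: chunks are fixed-length and non-overlapping (the paper's own chunking methods, such as the centered and inversion-based ones, are not reproduced here), the predictor is a hypothetical placeholder passed in as a function, and "accuracy retention" is taken to be the ratio of chunked accuracy to whole-sequence accuracy, which matches the abstract's reading that values above one favour chunking.

```python
def chunk_sequence(seq, chunk_len):
    """Split seq into consecutive chunks of at most chunk_len nucleotides."""
    if chunk_len <= 0:
        raise ValueError("chunk_len must be positive")
    return [seq[i:i + chunk_len] for i in range(0, len(seq), chunk_len)]


def predict_chunked(seq, chunk_len, predict):
    """Predict each chunk independently and concatenate the dot-bracket strings.

    `predict` is any callable mapping a sequence to a dot-bracket string of
    the same length; it stands in for a real thermodynamic predictor.
    """
    return "".join(predict(chunk) for chunk in chunk_sequence(seq, chunk_len))


def accuracy_retention(acc_chunked, acc_whole):
    """Assumed definition: chunked accuracy divided by whole-sequence accuracy."""
    return acc_chunked / acc_whole


if __name__ == "__main__":
    def unpaired(s):
        # Placeholder predictor: reports every base as unpaired.
        return "." * len(s)

    print(chunk_sequence("ACGUACGUAC", 4))
    print(predict_chunked("ACGUACGUAC", 4, unpaired))
```

Because each chunk is predicted independently, the per-chunk calls map naturally onto parallel workers, which is the property a MapReduce framework exploits.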



PAIRpred: Partner-specific prediction of interacting residues from sequence and structure. (United States)

We present a novel partner-specific protein-protein interaction site prediction method called PAIRpred. Unlike most existing machine learning binding site prediction methods, PAIRpred uses information from both proteins in a protein complex to predict pairs of interacting residues from the two proteins. PAIRpred captures sequence and structure information about residue pairs through pairwise kernels that are used for training a support vector machine classifier. As a result, PAIRpred presents a more detailed model of protein binding, and offers state of the art accuracy in predicting binding sites at the protein level as well as inter-protein residue contacts at the complex level. We demonstrate PAIRpred's performance on Docking Benchmark 4.0 and recent CAPRI targets. We present a detailed performance analysis outlining the contribution of different sequence and structure features, together with a comparison to a variety of existing interface prediction techniques. We have also studied the impact of binding-associated conformational change on prediction accuracy and found PAIRpred to be more robust to such structural changes than existing schemes. As an illustration of the potential applications of PAIRpred, we provide a case study in which PAIRpred is used to analyze the nature and specificity of the interface in the interaction of human ISG15 protein with NS1 protein from influenza A virus. Python code for PAIRpred is available at Proteins 2014; 82:1142-1155. © 2013 Wiley Periodicals, Inc. PMID:24243399

Afsar Minhas, Fayyaz Ul Amir; Geiss, Brian J; Ben-Hur, Asa



Chaos in 1d edge plasmas  

International Nuclear Information System (INIS)

Radiative instabilities that can develop in plasmas subjected to external heating and radiative cooling are of great importance in edge plasmas of tokamaks and stellarators. They will be analyzed in this paper on the basis of the 1d heat conduction equation. Bifurcation and time evolution of temperature profiles along magnetic field lines between two target plates have been reported. The simple model functions used there are applied here together with methods proved to be useful in nonlinear theories of dynamical systems in order to investigate stable, unstable and chaotic solutions of the 1d heat conduction equation. We consider the model of a radiative plasma with periodically (period ?) injected impurities. In order to show the basic mechanism we discuss at first the time-dependent problem which leads to an equation that can be integrated piecewise exactly analogous to the equation of motion for the periodically kicked rotator. Solution and Lyapunov stability analysis of that one-dimensional radiative map show the existence of stable and unstable solutions. Calculating attractors and Lyapunov exponents in dependence of parameters like power input or period ? shows the appearance of periodical solutions followed by period doubling and finally resulting in chaos in the radiative plasma. Second, we consider 1d and time-dependent problems by calculating profiles and attractors. Enhancing the period ? starting from ? = 0 (stationary problem) rediscovers the known routes to chaos in spatial extension like period doubling or intermittence. (orig.)



An integrative probabilistic model for identification of structural variation in sequencing data. (United States)

Paired-end sequencing is a common approach for identifying structural variation (SV) in genomes. Discrepancies between the observed and expected alignments indicate potential SVs. Most SV detection algorithms use only one of the possible signals and ignore reads with multiple alignments. This results in reduced sensitivity to detect SVs, especially in repetitive regions. We introduce GASVPro, an algorithm combining both paired read and read depth signals into a probabilistic model which can analyze multiple alignments of reads. GASVPro outperforms existing methods with a 50-90% improvement in specificity on deletions and a 50% improvement on inversions. PMID:22452995

Sindi, Suzanne S; Onal, Selim; Peng, Luke C; Wu, Hsin-Ta; Raphael, Benjamin J



Genetic structure, transforming sequence, and gene product of avian sarcoma virus UR1.  

Digital Repository Infrastructure Vision for European Research (DRIVER)

We analyzed the genetic structure and gene products of the newly isolated avian sarcoma virus UR1, which recently has been shown to be replication defective and to contain no sequences homologous to the src gene of Rous sarcoma virus. The sizes of the genomic RNAs of UR1 and its associated helper virus, UR1AV, were determined to be 29S and 35S (5.9 and 8.5 kilobases), respectively, by gel electrophoresis and sucrose gradient sedimentation. RNase T1 oligonucleotide mapping of purified viral RN...

Wang, L. H.; Feldman, R.; Shibuya, M.; Hanafusa, H.; Notter, M. F.; Balduzzi, P. C.



Syntheses, crystal structure, spectroscopic and photoluminescence studies of mononuclear copper(II), manganese(II), cadmium(II), and a 1D polymeric Cu(II) complexes with a pyrimidine derived Schiff base ligand (United States)

The complexation behaviour of Schiff base ligand 2-((2-(4,6-dimethylpyrimidin-2-yl)hydrazono)methyl)phenol [HL] towards different metal centres is reported by the syntheses and characterization of three mononuclear Cu(II), Mn(II) and Cd(II) complexes, [Cu(L)(H2O)2](NO3)(H2O) (1), [Mn(L)2](CH3OH) (2), [Cd(L)2](CH3OH) (3) and a 1D polymeric Cu(II) complex, [Cu(L)(ClO4)(C2N2O2H)]n(CH3OH) (4) respectively. In the complexes 1-4 the deprotonated uninegative tridentate ligand serves as NNO donor where one pyrimidine ring N, the azomethine N and the salicyl hydroxyl oxygen atoms are coordinatively active. The complex 1 has almost square pyramidal geometry [τ = 0.2081] whereas the metal centres maintain distorted octahedral geometry in the remaining three complexes 2-4. All the complexes are characterized by X-ray crystallography. The Cd(II) complex has considerable fluorescence while the rest of the complexes and the ligand molecule are fluorescence-silent.

Ray, Sangita; Konar, Saugata; Jana, Atanu; Das, Kinsuk; Dhara, Anamika; Chatterjee, Sudipta; Kar, Susanta Kumar



Sequence diversity of Pseudomonas aeruginosa: impact on population structure and genome evolution. (United States)

Comparative sequencing of Pseudomonas aeruginosa genes oriC, citS, ampC, oprI, fliC, and pilA in 19 environmental and clinical isolates revealed the sequence diversity to be about 1 order of magnitude lower than in comparable housekeeping genes of Salmonella. In contrast to the low nucleotide substitution rate, the frequency of recombination among different P. aeruginosa genotypes was high, leading to the random association of alleles. The P. aeruginosa population consists of equivalent genotypes that form a net-like population structure. However, each genotype represents a cluster of closely related strains which retain their sequence signature in the conserved gene pool and carry a set of genotype-specific DNA blocks. The codon adaptation index, a quantitative measure of synonymous codon bias of genes, was found to be consistently high in the P. aeruginosa genome irrespective of the metabolic category and the abundance of the encoded gene product. Such uniformly high codon adaptation indices of 0.55 to 0.85 fit the ubiquitous lifestyle of P. aeruginosa. PMID:10809691

Kiewitz, C; Tümmler, B



Memory Efficient Sequence Analysis Using Compressed Data Structures (Metagenomics Informatics Challenges Workshop: 10K Genomes at a Time)  

Energy Technology Data Exchange (ETDEWEB)

Wellcome Trust Sanger Institute's Jared Simpson on "Memory efficient sequence analysis using compressed data structures" at the Metagenomics Informatics Challenges Workshop held at the DOE JGI on October 12-13, 2011

Simpson, Jared [Wellcome Trust Sanger Institute]



High quality protein sequence alignment by combining structural profile prediction and profile alignment using SABER-TOOTH  

Directory of Open Access Journals (Sweden)

Abstract Background Protein alignments are an essential tool for many bioinformatics analyses. While sequence alignments are accurate for proteins of high sequence similarity, they become unreliable as they approach the so-called 'twilight zone' where sequence similarity becomes indistinguishable from random. For such distant pairs, structure alignment is of much better quality. Nevertheless, sequence alignment is the only choice in the majority of cases where structural data is not available. This situation demands development of methods that extend the applicability of accurate sequence alignment to distantly related proteins. Results We develop a sequence alignment method that combines the prediction of a structural profile based on the protein's sequence with the alignment of that profile using our recently published alignment tool SABERTOOTH. In particular, we predict the contact vector of protein structures using an artificial neural network based on position-specific scoring matrices generated by PSI-BLAST and align these predicted contact vectors. The resulting sequence alignments are assessed using two different tests: First, we assess the alignment quality by measuring the derived structural similarity for cases in which structures are available. In a second test, we quantify the ability of the significance score of the alignments to recognize structural and evolutionary relationships. As a benchmark we use a representative set of the SCOP (structural classification of proteins) database, with similarities ranging from closely related proteins at SCOP family level, to very distantly related proteins at SCOP fold level. Comparing these results with some prominent sequence alignment tools, we find that SABERTOOTH produces sequence alignments of better quality than those of Clustal W, T-Coffee, MUSCLE, and PSI-BLAST. 
HHpred, one of the most sophisticated and computationally expensive tools available, outperforms our alignment algorithm at family and superfamily levels, while the use of SABERTOOTH is advantageous for alignments at fold level. Our alignment scheme will profit from future improvements of structural profiles prediction. Conclusions We present the automatic sequence alignment tool SABERTOOTH that computes pairwise sequence alignments of very high quality. SABERTOOTH is especially advantageous when applied to alignments of remotely related proteins. The source code is available at, free for academic users upon request.

Bastolla Ugo



Sequence-structure based phylogeny of GPCR Class A Rhodopsin receptors. (United States)

Current methods for the phylogenetic classification of G protein-coupled receptors (GPCRs) are sequence based and therefore inappropriate for highly divergent sequences that share low sequence identity. In this study, a sequence-structure profile-based alignment generated by PROMALS3D was used to understand the evolution of the GPCR Class A Rhodopsin superfamily using the MEGA 5 software. Phylogenetic analysis included a combination of the Neighbor-Joining method and the Maximum Likelihood method, with 1000 bootstrap replicates. Our study was able to identify potential ligand associations for Class A Orphans and putative/unclassified Class A receptors with no cognate ligand information: GPR21 and GPR52 with fatty acids; GPR75 with Neuropeptide Y; GPR82, GPR18, GPR141 with N-arachidonylglycine; GPR176 with Free fatty acids, GPR10 with Tachykinin & Neuropeptide Y; GPR85 with ATP, ADP & UDP glucose; GPR151 with Galanin; GPR153 and GPR162 with Adrenalin, Noradrenalin; GPR146, GPR139, GPR142 with Neuromedin, Ghrelin, Neuromedin U-25 & Thyrotropin-releasing hormone; GPR171 with ATP, ADP & UDP Glucose; GPR88, GPR135, GPR161, GPR101 with 11-cis-retinal; GPR83 with Tachykinin; GPR148 with Prostanoids, GPR109b, GPR81, GPR31 with ATP & UTP and GPR150 with GnRH I & GnRHII. Furthermore, we suggest that this study would prove useful in re-classification of receptors, selecting templates for homology modeling and identifying ligands which may show cross reactivity with other GPCRs, as signaling via multiple ligands plays a significant role in disease modulation. PMID:24503482

Kakarala, Kavita Kumari; Jamil, Kaiser



The Denatured State Dictates the Topology of Two Proteins with Almost Identical Sequence but Different Native Structure and Function*  

Digital Repository Infrastructure Vision for European Research (DRIVER)

The protein folding problem is often studied by comparing the mechanisms of proteins sharing the same structure but different sequence. The recent design of the two proteins GA88 and GB88, displaying different structures and functions while sharing 88% sequence identity (49 out of 56 amino acids), allows the unique opportunity for a complementary approach. At which stage of its folding pathway does a protein commit to a given topology? Which residues are crucial in directing folding mechanism...

Morrone, Angela; Mccully, Michelle E.; Bryan, Philip N.; Brunori, Maurizio; Daggett, Valerie; Gianni, Stefano; Travaglini-allocatelli, Carlo
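
The identity figure quoted in this abstract (49 identical residues out of 56) can be checked with a small helper. The function below is a generic sketch for two pre-aligned, equal-length sequences; the example strings are synthetic placeholders, not the GA88 and GB88 sequences.

```python
def percent_identity(a, b):
    """Percentage of positions at which two equal-length sequences agree."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("sequences must be non-empty")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


if __name__ == "__main__":
    # Synthetic 56-residue strings differing at 7 positions (49 matches).
    seq_a = "A" * 56
    seq_b = "A" * 49 + "G" * 7
    print(percent_identity(seq_a, seq_b))  # 49/56 of positions match
```

Forty-nine matches over 56 positions is 87.5%, which the abstract reports as 88%.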



Improving protein structure prediction using multiple sequence-based contact predictions. (United States)

Although residue-residue contact maps dictate the topology of proteins, sequence-based ab initio contact predictions have found little use in actual structure prediction because of their low accuracy. We developed a composite set of nine SVM-based contact predictors that are used in I-TASSER simulation in combination with sparse template contact restraints. When testing the strategy on 273 nonhomologous targets, remarkable improvements of I-TASSER models were observed for both easy and hard targets, with p value by Student's t test30%, which essentially converts "nonfoldable" targets into "foldable" ones. In CASP9, I-TASSER employed ab initio contact predictions, and generated models for 26 FM targets with a GDT-score 16% and 44% higher than the second and third best servers from other groups, respectively. These findings demonstrate a new avenue to improve the accuracy of protein structure prediction especially for free-modeling targets. PMID:21827953

Wu, Sitao; Szilagyi, Andras; Zhang, Yang



DNA breaks and repair in interstitial telomere sequences: Influence of chromatin structure  

International Nuclear Information System (INIS)

Interstitial Telomeric Sequences (ITS) are over-involved in spontaneous and radiation-induced chromosome aberrations in Chinese hamster cells. We have performed a study to investigate the origin of their instability, whether spontaneous or following low-dose irradiation. Our results demonstrate that ITS have a particular chromatin structure: short nucleotide repeat length, less compaction of the 30 nm chromatin fiber, presence of G-quadruplex structures. These features would modulate break production and would favour the recruitment of alternative DNA repair mechanisms, which are prone to produce chromosome aberrations. These pathways could be at the origin of chromosome aberrations in ITS, whereas the NHEJ and HR double-strand break repair pathways are instead required for correct repair in these regions. (author)



Focused Evolution of HIV-1 Neutralizing Antibodies Revealed by Structures and Deep Sequencing  

Energy Technology Data Exchange (ETDEWEB)

Antibody VRC01 is a human immunoglobulin that neutralizes about 90% of HIV-1 isolates. To understand how such broadly neutralizing antibodies develop, we used x-ray crystallography and 454 pyrosequencing to characterize additional VRC01-like antibodies from HIV-1-infected individuals. Crystal structures revealed a convergent mode of binding for diverse antibodies to the same CD4-binding-site epitope. A functional genomics analysis of expressed heavy and light chains revealed common pathways of antibody-heavy chain maturation, confined to the IGHV1-2*02 lineage, involving dozens of somatic changes, and capable of pairing with different light chains. Broadly neutralizing HIV-1 immunity associated with VRC01-like antibodies thus involves the evolution of antibodies to a highly affinity-matured state required to recognize an invariant viral structure, with lineages defined from thousands of sequences providing a genetic roadmap of their development.

Wu, Xueling; Zhou, Tongqing; Zhu, Jiang; Zhang, Baoshan; Georgiev, Ivelin; Wang, Charlene; Chen, Xuejun; Longo, Nancy S.; Louder, Mark; McKee, Krisha; O'Dell, Sijy; Perfetto, Stephen; Schmidt, Stephen D.; Shi, Wei; Wu, Lan; Yang, Yongping; Yang, Zhi-Yong; Yang, Zhongjia; Zhang, Zhenhai; Bonsignori, Mattia; Crump, John A.; Kapiga, Saidi H.; Sam, Noel E.; Haynes, Barton F.; Simek, Melissa; Burton, Dennis R.; Koff, Wayne C.; Doria-Rose, Nicole A.; Connors, Mark; Mullikin, James C.; Nabel, Gary J.; Roederer, Mario; Shapiro, Lawrence; Kwong, Peter D.; Mascola, John R. (Tumaini); (NIH); (Duke); (Kilimanjaro Repro.); (IAVI)



Sequence-Based Protein Crystallization Propensity Prediction for Structural Genomics: Review and Comparative Analysis  

Directory of Open Access Journals (Sweden)

Structural genomics (SG) is an international effort that aims at solving three-dimensional shapes of important biological macromolecules, with a primary focus on proteins. One of the main bottlenecks in SG is the ability to produce diffraction-quality crystals for X-ray crystallography-based protein structure determination. SG pipelines allow for certain flexibility in target selection, which motivates development of in silico methods for sequence-based prediction/assessment of the protein crystallization propensity. We overview existing SG databanks that are used to derive these predictive models and we discuss analytical results concerning protein sequence properties that were discovered to correlate with the ability to form crystals. We also contrast and empirically compare modern sequence-based predictors of crystallization propensity including OB-Score, ParCrys, XtalPred and CRYSTALP2. Our analysis shows that these methods provide useful and complementary predictions. Although their average accuracy is similar at around 70%, we show that application of a simple majority-vote based ensemble improves accuracy to almost 74%. The best improvements are achieved by combining XtalPred with CRYSTALP2 while OB-Score and ParCrys methods overlap to a larger extent, although they still complement the other two predictors. We also demonstrate that 90% of the protein chains can be correctly predicted by at least one of these methods, which suggests that more accurate ensembles could be built in the future. We believe that current protein crystallization propensity predictors could provide useful input for the target selection procedures utilized by the SG centers.

Marcin J. Mizianty
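
The majority-vote ensemble mentioned in this abstract can be sketched as follows. This is an illustration only: the abstract does not say how ties among the four predictors are broken, so the `tie_value` default is an assumption, and the votes and labels are made-up toy data rather than outputs of OB-Score, ParCrys, XtalPred or CRYSTALP2.

```python
def majority_vote(votes, tie_value=False):
    """Combine boolean predictions; ties fall back to tie_value (an assumption)."""
    yes = sum(1 for v in votes if v)
    no = len(votes) - yes
    if yes == no:
        return tie_value
    return yes > no


def ensemble_accuracy(per_chain_votes, labels):
    """Fraction of chains where the majority vote matches the true label."""
    correct = sum(majority_vote(v) == y for v, y in zip(per_chain_votes, labels))
    return correct / len(labels)


def covered_by_any(per_chain_votes, labels):
    """Fraction of chains predicted correctly by at least one predictor."""
    hits = sum(any(v == y for v in votes)
               for votes, y in zip(per_chain_votes, labels))
    return hits / len(labels)


if __name__ == "__main__":
    # Toy data: four hypothetical predictors voting on four protein chains.
    votes = [
        (True, True, True, False),
        (False, False, True, False),
        (True, False, False, True),   # 2-2 tie, resolved to tie_value
        (False, True, True, True),
    ]
    labels = [True, False, True, True]
    print(ensemble_accuracy(votes, labels), covered_by_any(votes, labels))
```

The gap between the two numbers mirrors the abstract's observation: the share of chains that at least one method gets right bounds what any ensemble of those methods could reach.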



Exome sequencing improves genetic diagnosis of structural fetal abnormalities revealed by ultrasound. (United States)

The genetic etiology of non-aneuploid fetal structural abnormalities is typically investigated by karyotyping and array-based detection of microscopically detectable rearrangements, and submicroscopic copy-number variants (CNVs), which collectively yield a pathogenic finding in up to 10% of cases. We propose that exome sequencing may substantially increase the identification of underlying etiologies. We performed exome sequencing on a cohort of 30 non-aneuploid fetuses and neonates (along with their parents) with diverse structural abnormalities first identified by prenatal ultrasound. We identified candidate pathogenic variants with a range of inheritance models, and evaluated these in the context of detailed phenotypic information. We identified 35 de novo single-nucleotide variants (SNVs), small indels, deletions or duplications, of which three (accounting for 10% of the cohort) are highly likely to be causative. These are de novo missense variants in FGFR3 and COL2A1, and a de novo 16.8 kb deletion that includes most of OFD1. In five further cases (17%) we identified de novo or inherited recessive or X-linked variants in plausible candidate genes, which require additional validation to determine pathogenicity. Our diagnostic yield of 10% is comparable to, and supplementary to, the diagnostic yield of existing microarray testing for large chromosomal rearrangements and targeted CNV detection. The de novo nature of these events could enable couples to be counseled as to their low recurrence risk. This study outlines the way for a substantial improvement in the diagnostic yield of prenatal genetic abnormalities through the application of next-generation sequencing. PMID:24476948

Carss, Keren J; Hillman, Sarah C; Parthiban, Vijaya; McMullan, Dominic J; Maher, Eamonn R; Kilby, Mark D; Hurles, Matthew E



Structural Analysis of Single-Point Mutations Given an RNA Sequence: A Case Study with RNAMute  

Directory of Open Access Journals (Sweden)

We introduce here for the first time the RNAMute package, a pattern-recognition-based utility to perform mutational analysis and detect vulnerable spots within an RNA sequence that affect structure. Mutations in these spots may lead to a structural change that directly relates to a change in functionality. Previously, the concept was tried on RNA genetic control elements called "riboswitches" and other known RNA switches, without an organized utility that analyzes all single-point mutations and can be further expanded. The RNAMute package allows a comprehensive categorization, given an RNA sequence that has functional relevance, by exploring the patterns of all single-point mutants. For illustration, we apply the RNAMute package on an RNA transcript for which individual point mutations were shown experimentally to inactivate spectinomycin resistance in Escherichia coli. Functional analysis of mutations on this case study was performed experimentally by creating a library of point mutations using PCR and screening to locate those mutations. With the availability of RNAMute, preanalysis can be performed computationally before conducting an experiment.

Churkin Alexander
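
To make "all single-point mutants" concrete, the sketch below enumerates them for a short RNA sequence. It covers only the enumeration step; the structure prediction and pattern-recognition stages of RNAMute are not shown, and the function name and the label format (wild-type base, 1-based position, mutant base) are assumptions made for illustration.

```python
def single_point_mutants(seq, alphabet="ACGU"):
    """Yield (label, mutant) for every single-nucleotide substitution of seq.

    A sequence of length n over a 4-letter alphabet has 3*n such mutants.
    The label format, e.g. "G1A", is a hypothetical convention.
    """
    seq = seq.upper()
    for i, wild_type in enumerate(seq):
        for nt in alphabet:
            if nt != wild_type:
                yield f"{wild_type}{i + 1}{nt}", seq[:i] + nt + seq[i + 1:]


if __name__ == "__main__":
    for label, mutant in single_point_mutants("GCA"):
        print(label, mutant)
```

Each mutant would then be folded and compared with the wild-type structure to flag positions where a substitution changes the predicted fold.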



ParaPep: a web resource for experimentally validated antiparasitic peptide sequences and their structures. (United States)

ParaPep is a repository of antiparasitic peptides, which provides comprehensive information related to experimentally validated antiparasitic peptide sequences and their structures. The data were collected and compiled from published research papers, patents and from various databases. The current release of ParaPep holds 863 entries among which 519 are unique peptides. In addition to peptides having natural amino acids, ParaPep also consists of peptides having d-amino acids and chemically modified residues. In ParaPep, most of the peptides have been evaluated for growth inhibition of various species of Plasmodium, Leishmania and Trypanosoma. We have provided comprehensive information about these peptides that include peptide sequence, chemical modifications, stereochemistry, antiparasitic activity, origin, nature of peptide, assay types, type of parasite, mode of action and hemolytic activity. Structures of peptides consisting of natural, as well as modified amino acids have been determined using state-of-the-art software, PEPstr. To facilitate users, various user-friendly web tools, for data fetching, analysis and browsing, have been integrated. We hope that ParaPep will be advantageous in designing therapeutic peptides against parasitic diseases. Database URL: PMID:24923818

Mehta, Divya; Anand, Priya; Kumar, Vineet; Joshi, Anshika; Mathur, Deepika; Singh, Sandeep; Tuknait, Abhishek; Chaudhary, Kumardeep; Gautam, Shailendra K; Gautam, Ankur; Varshney, Grish C; Raghava, Gajendra P S



Partial primary structure of human pregnancy zone protein: extensive sequence homology with human alpha 2-macroglobulin  

DEFF Research Database (Denmark)

Human pregnancy zone protein (PZP) is a major pregnancy-associated protein. Its quaternary structure (two covalently bound 180-kDa subunits, which are further non-covalently assembled into a tetramer of 720 kDa) is similar to that of human alpha 2-macroglobulin (alpha 2M). Here we show, from the results of complete or partial sequence determination of a random selection of 38 tryptic peptides covering 685 residues of the subunit of PZP, that PZP and alpha 2M indeed are extensively homologous. In the stretches of PZP sequenced so far, the degree of identically placed residues in the two proteins is 68%, indicating a close evolutionary relationship between PZP and alpha 2M. Although the function of PZP in pregnancy is largely unknown, its close structural relationship to alpha 2M suggests analogous proteinase binding properties and a potential for being taken up in cells by receptor-mediated endocytosis. In this regard our studies indicate a bait region in PZP significantly different from that present in alpha 2M. PZP could be the human equivalent of the acute-phase alpha-macroglobulins (e.g., rat alpha 2M and rabbit alpha 1M) described earlier

Sottrup-Jensen, Lars; Folkersen, J



Functional and Structural Overview of G-Protein-Coupled Receptors Comprehensively Obtained from Genome Sequences  

Directory of Open Access Journals (Sweden)

An understanding of the functional mechanisms of G-protein-coupled receptors (GPCRs) is very important for GPCR-related drug design. We have developed an integrated GPCR database (SEVENS) that includes 64,090 reliable GPCR genes comprehensively identified from 56 eukaryote genome sequences, and overviewed the sequence and structure spaces of the GPCRs. In vertebrates, the number of receptors for biological amines, peptides, etc. is conserved in most species, whereas the number of chemosensory receptors for odorants, pheromones, etc. differs significantly among species. The latter receptors tend to be of the single-exon or few-exon type and account for a high proportion of all GPCRs, whereas some families, such as Class B and Class C receptors, are long because they contain many exons. Statistical analyses of amino acid residues reveal that most of the conserved residues in Class A GPCRs are found in the cytoplasmic half regions of transmembrane (TM) helices, while residues characteristic of each subfamily are found in the extracellular half regions. Sixty-nine Protein Data Bank (PDB) entries of complete or fragmentary structures could be mapped onto the TM/loop regions of Class A GPCRs, covering 14 subfamilies.

Makiko Suwa



Examining Prebiotic Chemistry Using O(^1D) Insertion Reactions (United States)

Aminomethanol, methanediol, and methoxymethanol are all prebiotic molecules expected to form via photo-driven grain surface chemistry in the interstellar medium (ISM). These molecules are expected to be precursors for larger, biologically-relevant molecules in the ISM such as sugars and amino acids. These three molecules have not yet been detected in the ISM because of the lack of available rotational spectra. A high resolution (sub)millimeter spectrometer coupled to a molecular source is being used to study these molecules using O(^1D) insertion reactions. The O(^1D) chemistry is initiated using an excimer laser, and the products of the insertion reactions are adiabatically cooled using a supersonic expansion. Experimental parameters are being optimized by examination of methanol formed from O(^1D) insertion into methane. Theoretical studies of the structure and reaction energies for aminomethanol, methanediol, and methoxymethanol have been conducted to guide the laboratory studies once the methanol experiment has been optimized. The results of the calculations and initial experimental results will be presented.

Hays, Brian M.; Laas, Jacob C.; Weaver, Susanna L. Widicus



1D localisation and the symmetric group  

Energy Technology Data Exchange (ETDEWEB)

Earlier investigations making use of a density matrix formalism to study the average resistance of a disordered solid are extended in the 1D case to calculate moments of the resistance within a diagonal-disorder model. Application of the symmetric group greatly simplifies the problem, reducing the matrices from order 2^(2N) to order 2N + 1 for the Nth moment. The symmetry-reduced matrix enables a transparent discussion of the analytic properties of the mean conductance, which is shown to have a number of remarkable properties. Finally, some analytic results are presented in various limiting cases and compared with earlier work.

Pendry, J.B.
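
The size reduction quoted in the abstract above can be made concrete with a little arithmetic. The sketch below only tabulates the two matrix orders stated there (2^(2N) before and 2N + 1 after symmetry reduction) for the first few moments; the function name `matrix_orders` is a hypothetical helper, and nothing here reproduces the paper's actual matrices.

```python
def matrix_orders(n_moment):
    """Return (full_order, reduced_order) for the Nth moment of the resistance.

    full_order    = 2**(2N), the order quoted before symmetry reduction
    reduced_order = 2N + 1,  the order quoted after applying the symmetric group
    """
    full_order = 2 ** (2 * n_moment)
    reduced_order = 2 * n_moment + 1
    return full_order, reduced_order


# Tabulate the two orders for the first five moments.
for n in range(1, 6):
    full, reduced = matrix_orders(n)
    print(f"N={n}: full order {full}, reduced order {reduced}")
```

The exponential-versus-linear growth is the point: by the fifth moment the quoted orders are already 1024 and 11.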



Bi3+/M2+ oxyphosphate: a continuous series of polycationic species from the 1D single chain to the 2D planes. Part 1: From HREM images to crystal-structure deduction. (United States)

This work deals with the crystal-structure deduction of new structural types of Bi3+-M2+ oxyphosphates (M is a transition element) from HREM images. Previous studies showed the unequivocal attribution of particular HREM contrasts to the corresponding Bi/M/O-based polycationic species in similar materials. On this basis, the examination of isolated crystallites of polyphased samples led to new HREM contrasts assigned to new polycationic species in three new structural types. This helped us to solve one crystal structure, and the two other forms have been deduced through HREM image decoding. It helped to model the investigated materials from the structural point of view as well as the chemical one. The three assumed crystal structures are formed by polycationic ribbons, n tetrahedra wide, surrounded by PO4 groups, as already encountered in these series of oxyphosphates. However, here we deal with the original n= 4-6 cases, whereas, up to this work, only the n= 1-3 ribbons have been reported. The greater size of ribbons is associated with particular structural modifications responsible for complex HREM contrasts. The validity of the proposed models is verified in Part 2 of this work. PMID:16903714

Huvé, M; Colmont, M; Mentré, O



Identification of novel DNA repair proteins via primary sequence, secondary structure, and homology  

Directory of Open Access Journals (Sweden)

Abstract Background DNA repair is the general term for the collection of critical mechanisms which repair many forms of DNA damage, such as that caused by methylation or ionizing radiation. DNA repair has mainly been studied in experimental and clinical situations, and relatively few information-based approaches to extracting new DNA repair knowledge exist. As a first step, automatic detection of DNA repair proteins in genomes via informatics techniques is desirable; however, there are many forms of DNA repair and it is not a straightforward process to identify and classify repair proteins with a single optimal method. We perform a study of the ability of homology and machine learning-based methods to identify and classify DNA repair proteins, as well as scan vertebrate genomes for the presence of novel repair proteins. Combinations of primary sequence polypeptide frequency, secondary structure, and homology information are used as feature information for input to a Support Vector Machine (SVM). Results We find that SVM techniques are capable of identifying portions of DNA repair protein datasets without admitting false positives; at low levels of false positive tolerance, homology can also identify and classify proteins with good performance. Secondary structure information provides improved performance compared to using primary structure alone. Furthermore, we observe that machine learning methods incorporating homology information perform best when data is filtered by some clustering technique. Applying these methodologies to the scanning of multiple vertebrate genomes confirms a positive correlation between the size of a genome and the number of DNA repair protein transcripts it is likely to contain, and simultaneously suggests that all organisms have a non-zero minimum number of repair genes. In addition, the scan result clusters several organisms' repair abilities in an evolutionarily consistent fashion. Analysis also identifies several functionally unconfirmed proteins that are highly likely to be involved in the repair process. A new web service, INTREPED, has been made available for the immediate search and annotation of DNA repair proteins in newly sequenced genomes. Conclusion Despite complexity due to a multitude of repair pathways, combinations of sequence, structure, and homology with Support Vector Machines offer good methods in addition to existing homology searches for DNA repair protein identification and functional annotation. Most importantly, this study has uncovered relationships between the size of a genome and a genome's available repair repertoire, and offers a number of new predictions as well as a prediction service, both of which reduce the search time and cost for novel repair genes and proteins.

Akutsu Tatsuya
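
The abstract above mentions "primary sequence polypeptide frequency" as one of the SVM input features without giving details. As a rough illustration only, the sketch below assumes this means k-mer (here dipeptide) frequencies over the 20 standard amino acids; the function name `kmer_frequencies`, the choice k=2 and the example sequence are all hypothetical and are not taken from the paper.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def kmer_frequencies(sequence, k=2):
    """Return a fixed-length list of k-mer frequencies for a protein sequence.

    The vector has 20**k entries, ordered lexicographically over AMINO_ACIDS.
    k-mers containing non-standard letters are ignored.
    """
    sequence = sequence.upper()
    kmers = ["".join(p) for p in product(AMINO_ACIDS, repeat=k)]
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    total = sum(counts[m] for m in kmers)
    if total == 0:
        # Sequence too short (or no standard k-mers): return an all-zero vector.
        return [0.0] * len(kmers)
    return [counts[m] / total for m in kmers]


# Hypothetical example sequence; the resulting vector could be passed to any classifier.
features = kmer_frequencies("MKKLLPTAAAGLLLLAAQPAMA")
```

A fixed-length vector of this kind is what makes sequences of different lengths comparable as classifier input; secondary-structure or homology features would be appended as further columns.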



Genotyping by sequencing resolves shallow population structure to inform conservation of Chinook salmon (Oncorhynchus tshawytscha) (United States)

Recent advances in population genomics have made it possible to detect previously unidentified structure, obtain more accurate estimates of demographic parameters, and explore adaptive divergence, potentially revolutionizing the way genetic data are used to manage wild populations. Here, we identified 10 944 single-nucleotide polymorphisms using restriction-site-associated DNA (RAD) sequencing to explore population structure, demography, and adaptive divergence in five populations of Chinook salmon (Oncorhynchus tshawytscha) from western Alaska. Patterns of population structure were similar to those of past studies, but our ability to assign individuals back to their region of origin was greatly improved (>90% accuracy for all populations). We also calculated effective size with and without removing physically linked loci identified from a linkage map, a novel method for nonmodel organisms. Estimates of effective size were generally above 1000 and were biased downward when physically linked loci were not removed. Outlier tests based on genetic differentiation identified 733 loci and three genomic regions under putative selection. These markers and genomic regions are excellent candidates for future research and can be used to create high-resolution panels for genetic monitoring and population assignment. This work demonstrates the utility of genomic data to inform conservation in highly exploited species with shallow population structure.

Larson, Wesley A; Seeb, Lisa W; Everett, Meredith V; Waples, Ryan K; Templin, William D; Seeb, James E



Sequence-specific 1H NMR assignments and secondary structure of porcine motilin  

Energy Technology Data Exchange (ETDEWEB)

The solution structure of the 22-residue peptide hormone motilin has been studied by circular dichroism and two-dimensional 1H nuclear magnetic resonance spectroscopy. Circular dichroism spectra indicate the presence of α-helical secondary structure in aqueous solution, and the secondary structure can be stabilized with hexafluoro-2-propanol. Sequence-specific assignments of the proton NMR spectrum of porcine motilin in 30% hexafluoro-2-propanol have been made by using two-dimensional NMR techniques. All backbone proton resonances (NH and αCH) and most of the side-chain resonances have been assigned by using double-quantum-filtered COSY, RELAYED-COSY, and NOESY experiments. Simulations of NOESY cross-peak intensities as a function of mixing time indicate that spin diffusion has a relatively small effect in peptides the size of motilin, thereby allowing the use of long mixing times to confidently make assignments and delineate secondary structure. Sequential αCH-NH and NH-NH NOESY connectivities were observed over a significant portion of the length of the peptide. The intensities of selected NOESY cross-peaks relative to corresponding diagonal peaks were used to estimate a rotational correlation time of approximately 2.5 ns for the peptide, indicating that the peptide exists as a monomer in solution under the conditions used here.

Khan, N.; Graslund, A.; Shriver, J. (Southern Illinois Univ., Carbondale (USA)); Ehrenberg, A. (Univ. of Stockholm (Sweden))



Sequence-specific 1H NMR assignments and secondary structure of porcine motilin  

International Nuclear Information System (INIS)

The solution structure of the 22-residue peptide hormone motilin has been studied by circular dichroism and two-dimensional 1H nuclear magnetic resonance spectroscopy. Circular dichroism spectra indicate the presence of α-helical secondary structure in aqueous solution, and the secondary structure can be stabilized with hexafluoro-2-propanol. Sequence-specific assignments of the proton NMR spectrum of porcine motilin in 30% hexafluoro-2-propanol have been made by using two-dimensional NMR techniques. All backbone proton resonances (NH and αCH) and most of the side-chain resonances have been assigned by using double-quantum-filtered COSY, RELAYED-COSY, and NOESY experiments. Simulations of NOESY cross-peak intensities as a function of mixing time indicate that spin diffusion has a relatively small effect in peptides the size of motilin, thereby allowing the use of long mixing times to confidently make assignments and delineate secondary structure. Sequential αCH-NH and NH-NH NOESY connectivities were observed over a significant portion of the length of the peptide. The intensities of selected NOESY cross-peaks relative to corresponding diagonal peaks were used to estimate a rotational correlation time of approximately 2.5 ns for the peptide, indicating that the peptide exists as a monomer in solution under the conditions used here.



Wavelet Analysis of DNA Bending Profiles reveals Structural Constraints on the Evolution of Genomic Sequences. (United States)

Analyses of genomic DNA sequences have shown in previous works that base pairs are correlated at large distances with scale-invariant statistical properties. We show in the present study that these correlations between nucleotides (letters) result in fact from long-range correlations (LRC) between sequence-dependent DNA structural elements (words) involved in the packaging of DNA in chromatin. Using the wavelet transform technique, we perform a comparative analysis of the DNA text and of the corresponding bending profiles generated with curvature tables based on nucleosome positioning data. This exploration through the optics of the so-called 'wavelet transform microscope' reveals a characteristic scale of 100-200 bp that separates two regimes of different LRC. We focus here on the existence of LRC in the small-scale regime (< 200 bp). Analysis of genomes in the three kingdoms reveals that this regime is specifically associated with the presence of nucleosomes. Indeed, small-scale LRC are observed in eukaryotic genomes and to a lesser extent in archaeal genomes, in contrast with their absence in eubacterial genomes. Similarly, this regime is observed in eukaryotic but not in bacterial viral DNA genomes. There is one exception for genomes of Poxviruses, the only animal DNA viruses that do not replicate in the cell nucleus and do not present small-scale LRC. Furthermore, no small-scale LRC are detected in the genomes of all examined RNA viruses, with one exception in the case of retroviruses. Altogether, these results strongly suggest that small-scale LRC are a signature of the nucleosomal structure. Finally, we discuss possible interpretations of these small-scale LRC in terms of the mechanisms that govern the positioning, the stability and the dynamics of the nucleosomes along the DNA chain. 
This paper is mainly devoted to a pedagogical presentation of the theoretical concepts and physical methods which are well suited to perform a statistical analysis of genomic sequences. We review the results obtained with the so-called wavelet-based multifractal analysis when investigating the DNA sequences of various organisms in the three kingdoms. Some of these results have been announced in B. Audit et al. [1, 2]. PMID:23345861

Audit, Benjamin; Vaillant, Cédric; Arnéodo, Alain; d'Aubenton-Carafa, Yves; Thermes, Claude



Influence of loading sequence and stress ratio on Fatigue damage accumulation of a structural component  

Scientific Electronic Library Online (English)

This paper presents experimental results on the fatigue damage accumulation behaviour of a structural component made of P355NL1 steel. The structural component is a rectangular double-notched plate. Sequences of two and of multiple alternating constant-amplitude load blocks were applied for various combinations of remote stress ranges. Three stress ratios were investigated, namely R=0, R=0.15 and R=0.3. Variable-amplitude blocks, applied according to a predefined load spectrum, were also investigated. The study was complemented by constant-amplitude test data, which were used for the damage accumulation calculations. In general, the block loading demonstrates that fatigue damage evolves nonlinearly with the number of loading cycles, as a function of the load sequence, stress level and stress ratio. The variable-amplitude loading suggests an important stress ratio effect on fatigue damage accumulation. In particular, a clear load sequence effect is observed for the two-block loading with null stress ratio. For the other (higher) stress ratios, the load sequence effects are almost negligible; however, the damage evolution is still nonlinear.

Pereira, Hélder F. S. G.; de Jesus, Abílio M. P.; Ribeiro, Alfredo S.; Fernandes, António A.
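
For context on why a load sequence effect is noteworthy, the sketch below implements the classical linear Palmgren-Miner rule, which is a textbook baseline and not the damage model of the paper above. Under this rule the damage sum is the same whatever the block order, so any observed sequence effect implies nonlinear accumulation. The cycle counts and lives are hypothetical illustration values, not data from the study.

```python
def miner_damage(blocks):
    """Linear Palmgren-Miner damage sum.

    blocks: iterable of (cycles_applied, cycles_to_failure) pairs, one per
    constant-amplitude load block. Failure is predicted when the sum reaches 1.
    """
    damage = 0.0
    for cycles_applied, cycles_to_failure in blocks:
        if cycles_to_failure <= 0:
            raise ValueError("cycles_to_failure must be positive")
        damage += cycles_applied / cycles_to_failure
    return damage


# Hypothetical two-block loading: a high-stress block followed by a low-stress block.
high_low = [(20_000, 100_000), (300_000, 1_000_000)]
# The same two blocks applied in the opposite order.
low_high = list(reversed(high_low))

print(miner_damage(high_low), miner_damage(low_high))
```

Both orders give the same damage sum, which is exactly the property that block-sequence experiments are designed to test.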


Stability of 1D-ordered beams  

CERN Document Server

Beam and storage ring parameters have to meet a number of stringent conditions before circulating three-dimensional crystalline beams can be obtained. Existing storage rings violate at least one of these conditions. The remarkable exception is the very low-energy ring PALLAS with 'radio frequency focusing', which apparently abides by the full set of conditions, and where 3D-ordering has indeed been realized. However, 1D-ordering (particles lining up in a 'string of non-overtaking links') has been observed in several storage rings with electron cooling, even though the above conditions on the classical plasma parameter as well as on the lattice periodicity are largely violated. In this paper, we discuss conditions for 1D-ordering, including space-charge limits, instabilities and cooling requirements. We conclude that densities of some 10^4 ions/m may be possible, about 100 times more than those obtained in present experiments. Such dense strings are of practical interest for precision measurements of ion p...

Katayama, T



Phthalocyanine based 1D nanowires for device applications (United States)

1D nanowires (NWs) of the Cu(II) 1,4,8,11,15,18,22,25-octabutoxy-29H,31H-phthalocyanine (CuPc(OBu)8) molecule have been grown on different substrates by a cost-effective solution processing technique. The density of NWs is found to be strongly dependent on the concentration of the solution. The possible formation mechanism of these structures is π-π interaction between phthalocyanine molecules. The improved conductivity of these NWs as compared to spin-coated film indicates their potential for molecular device applications.

Saini, Rajan; Mahajan, Aman; Bedi, R. K.



Feedback stabilization of a simplified 1d fluid-particle system  

Digital Repository Infrastructure Vision for European Research (DRIVER)

We consider the feedback stabilization of a simplified 1d model for a fluid-structure interaction system. The fluid equation is the viscous Burgers equation, whereas the motion of the particle is given by Newton's laws. We stabilize this system around a stationary state by using feedbacks located at the exterior boundary of the fluid domain. With one input, we obtain local stabilizability of the system with an exponential decay rate of order $\sigma<\sigma_0$. An arbitrary order for the ...

Badra, Mehdi; Takahashi, Takéo



2D separated-local-field spectra from projections of 1D experiments (United States)

A novel procedure for reconstruction of 2D separated-local-field (SLF) NMR spectra from projections of 1D NMR data is presented. The technique, dubbed SLF projection reconstruction from one-dimensional spectra (SLF-PRODI), is particularly useful for uniaxially oriented membrane protein samples and represents a fast and robust alternative to the popular PISEMA experiment, which correlates 1H-15N dipole-dipole couplings with 15N chemical shifts. The different 1D projections in the SLF-PRODI experiment are obtained from 1D spectra recorded under the influence of homonuclear decoupling sequences with different scaling factors for the heteronuclear dipolar couplings. We demonstrate experimentally and numerically that as few as 2-4 1D projections will normally be sufficient to reconstruct a 2D SLF-PRODI spectrum with a quality resembling typical PISEMA spectra, leading to a significant reduction of the acquisition time.

Bertelsen, Kresten; Pedersen, Jan M.; Nielsen, Niels Chr.; Vosegaard, Thomas



Sequence homology and structural analysis of plasmepsin 4 isolated from Indian Plasmodium vivax isolates. (United States)

Plasmodium vivax malaria is a globally widespread disease responsible for 50% of human malaria cases in Central and South America, South East Asia and the Indian subcontinent. The rising severity of the disease and the emerging resistance of the parasite have emphasized the need to search for novel therapeutic targets to combat P. vivax malaria. Plasmepsin 4 (PM4), a food vacuole aspartic protease, is essential for parasite functions and viability, such as initiating hemoglobin digestion and processing proteins, and is regarded as a potential drug target. Although the plasmepsins of Plasmodium falciparum have been extensively studied, the plasmepsins of P. vivax are not well characterized. This is the first report detailing complete PM4 gene analysis from Indian P. vivax isolates. BLAST results for sequences of P. vivax plasmepsin 4 (PvPM4) show 100% homology among isolates of P. vivax collected from different geographical regions of India. None of the seven Indian isolates contained an intron within the coding region. Interestingly, PvPM4 sequence analysis showed a very high degree of homology with all other sequences of Plasmodium species available in GenBank. Our results strongly suggest that PvPM4 is highly conserved, except for a small number of amino acid substitutions that did not modify key motifs at the active site required for the function or the structure of the enzyme. Furthermore, our study shows that PvPM4 occupies a unique phylogenetic position within the Plasmodium group and differs sufficiently from the most closely related human aspartic protease, cathepsin D. Analysis of the 3D model of PM4 showed a typical aspartic protease structure with a bi-lobed, compact and distinct peptide-binding cleft in both P. vivax and P. falciparum. To validate the use of PM4 as a potential anti-malarial drug target, studies of genetic and structural variation among P. vivax plasmepsins (PvPMs) from different geographical regions are of utmost importance for drug and vaccine design in anti-malarial strategies. PMID:21382523

Rawat, Manmeet; Vijay, Sonam; Gupta, Yash; Dixit, Rajnikant; Tiwari, P K; Sharma, Arun



A rostro-caudal gradient of structured sequence processing in the left inferior frontal gyrus. (United States)

In this paper, we present two novel perspectives on the function of the left inferior frontal gyrus (LIFG). First, a structured sequence processing perspective facilitates the search for functional segregation within the LIFG and provides a way to express common aspects across cognitive domains including language, music and action. Converging evidence from functional magnetic resonance imaging and transcranial magnetic stimulation studies suggests that the LIFG is engaged in sequential processing in artificial grammar learning, independently of particular stimulus features of the elements (whether letters, syllables or shapes are used to build up sequences). The LIFG has been repeatedly linked to processing of artificial grammars across all different grammars tested, whether they include non-adjacent dependencies or mere adjacent dependencies. Second, we apply the sequence processing perspective to understand how the functional segregation of semantics, syntax and phonology in the LIFG can be integrated in the general organization of the lateral prefrontal cortex (PFC). Recently, it was proposed that the functional organization of the lateral PFC follows a rostro-caudal gradient, such that more abstract processing in cognitive control is subserved by more rostral regions of the lateral PFC. We explore the literature from the viewpoint that functional segregation within the LIFG can be embedded in a general rostro-caudal abstraction gradient in the lateral PFC. If the lateral PFC follows a rostro-caudal abstraction gradient, then this predicts that the LIFG follows the same principles, but this prediction has not yet been tested or explored in the LIFG literature. Integration might provide further insights into the functional architecture of the LIFG and the lateral PFC. PMID:22688637

Uddén, Julia; Bahlmann, Jörg



Genome3D: a UK collaborative project to annotate genomic sequences with predicted 3D structures based on SCOP and CATH domains  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Genome3D is a new collaborative project that integrates UK-based structural resources to provide a unique perspective on sequence-structure-function relationships. Leading structure prediction resources (DomSerf, FUGUE, Gene3D, pDomTHREADER, Phyre and SUPERFAMILY) provide annotations for UniProt sequences to indicate the locations of structural domains (structural annotations) and their 3D structures (structural models). Structural annotations and 3D mode...

Lewis, T. E.; Sillitoe, I.; Andreeva, A.; Blundell, T. L.; Buchan, D. W.; Chothia, C.; Cuff, A.; Dana, J. M.; Filippis, I.; Gough, J.; Hunter, S.; Jones, D. T.; Kelley, L. A.; Kleywegt, G. J.; Minneci, F.



The transmembrane domain sequence affects the structure and function of the Newcastle disease virus fusion protein. (United States)

The role of specific sequences in the transmembrane (TM) domain of Newcastle disease virus (NDV) fusion (F) protein in the structure and function of this protein was assessed by replacing this domain with the F protein TM domains from two other paramyxoviruses, Sendai virus (SV) and measles virus (MV), or the TM domain of the unrelated glycoprotein (G) of vesicular stomatitis virus (VSV). Mutant proteins with the SV or MV F protein TM domains were expressed, transported to cell surfaces, and proteolytically cleaved at levels comparable to that of the wild-type protein, while mutant proteins with the VSV G protein TM domain were less efficiently expressed on cell surfaces and proteolytically cleaved. All mutant proteins were defective in all steps of membrane fusion, including hemifusion. In contrast to the wild-type protein, the mutant proteins did not form detectable complexes with the NDV hemagglutinin-neuraminidase (HN) protein. As determined by binding of conformation-sensitive antibodies, the conformations of the ectodomains of the mutant proteins were altered. These results show that the specific sequence of the TM domain of the NDV F protein is important for the conformation of the preactivation form of the ectodomain, the interactions of the protein with HN protein, and fusion activity. PMID:21270151

Gravel, Kathryn A; McGinnes, Lori W; Reitter, Julie; Morrison, Trudy G



The interplay of peptide sequence and local structure in TiO2 biomineralization. (United States)

Using the cyclic constrained TiO2-binding peptides STB1 (CHKKPSKSC) and RSTB1 (CHRRPSRSC) and the linear peptide LSTB1 (AHKKPSKSA), it was shown that while affinity of the peptide to TiO2 is essential to enable TiO2 biomineralization, other factors such as biomineralization kinetics and peptide local structure need to be considered to predict biomineralization efficacy. Cyclic and linear TiO2-binding peptides show significantly different biomineralization activities. Cyclic STB1 and RSTB1 could induce TiO2 precipitation in the presence of titanium(IV)-bis-ammonium-lactato-dihydroxide (TiBALDH) precursor in water or tris buffer at pH 8. In contrast, linear LSTB1 was unable to mineralize TiO2 under the same experimental conditions despite its high affinity to TiO2, comparable with that of STB1 and/or RSTB1. LSTB1, being a flexible molecule, could not bring about the stable condensation of TiBALDH precursor to form TiO2 particles. However, in the presence of phosphate buffer ions, the structure of LSTB1 is stabilized, leading to efficient condensation of TiBALDH and TiO2 particle formation. This study demonstrates that peptide-mediated TiO2 mineralization is governed by a complicated interplay of peptide sequence, local structure, kinetics and the presence of mineralizing aids such as phosphate ions. PMID:22922289

Choi, Noori; Tan, Lihan; Jang, Ji-ryang; Um, Yu Mi; Yoo, Pil J; Choe, Woo-Seok



TBC1D24 Mutation Causes Autosomal-Dominant Nonsyndromic Hearing Loss. (United States)

Hereditary hearing loss is extremely heterogeneous. Over 70 genes have been identified to date, and with the advent of massively parallel sequencing, the pace of novel gene discovery has accelerated. In a family segregating progressive autosomal-dominant nonsyndromic hearing loss (NSHL), we used OtoSCOPE® to exclude mutations in known deafness genes and then performed segregation mapping and whole-exome sequencing to identify a unique variant, p.Ser178Leu, in TBC1D24 that segregates with the hearing loss phenotype. TBC1D24 encodes a GTPase-activating protein expressed in the cochlea. Ser178 is highly conserved across vertebrates and its change is predicted to be damaging. Other variants in TBC1D24 have been associated with a panoply of clinical symptoms including autosomal recessive NSHL, syndromic hearing impairment associated with onychodystrophy, osteodystrophy, mental retardation, and seizures (DOORS syndrome), and a wide range of epileptic disorders. PMID:24729539

Azaiez, Hela; Booth, Kevin T; Bu, Fengxiao; Huygen, Patrick; Shibata, Seiji B; Shearer, A Eliot; Kolbe, Diana; Meyer, Nicole; Black-Ziegelbein, E Ann; Smith, Richard J H



Creation and structure determination of an artificial protein with three complete sequence repeats. (United States)

Symfoil-4P is a de novo protein exhibiting the threefold symmetrical β-trefoil fold, designed based on the human acidic fibroblast growth factor. The first three asparagine-glycine sequences of Symfoil-4P were replaced with glutamine-glycine (Symfoil-QG) or serine-glycine (Symfoil-SG) sequences to protect against deamidation, and His-Symfoil-II was prepared by introducing a protease digestion site into Symfoil-QG so that Symfoil-II has three complete repeats after removal of the N-terminal histidine tag. The Symfoil-QG and SG and His-Symfoil-II proteins were expressed in Escherichia coli as soluble protein and purified by nickel affinity chromatography. Symfoil-II was further purified by anion-exchange chromatography after removing the HisTag by proteolysis. Both Symfoil-QG and Symfoil-II were crystallized in 0.1 M Tris-HCl buffer (pH 7.0) containing 1.8 M ammonium sulfate as precipitant at 293 K; several crystal forms were observed for Symfoil-QG and II. The maximum diffraction resolutions of the Symfoil-QG and II crystals were 1.5 and 1.1 Å, respectively. Symfoil-II without the histidine tag diffracted better than Symfoil-QG with the N-terminal histidine tag. Although the crystal packing of Symfoil-II is slightly different from that of Symfoil-QG and other crystals of Symfoil derivatives having the N-terminal histidine tag, the refined crystal structure of Symfoil-II showed pseudo-threefold symmetry, as expected from other Symfoils. Since the removal of the unstructured N-terminal histidine tag did not affect the threefold structure of Symfoil, the improvement in the diffraction quality of Symfoil-II may be caused by molecular characteristics of Symfoil-II such as molecular stability. PMID:24121347

Adachi, Motoyasu; Shimizu, Rumi; Kuroki, Ryota; Blaber, Michael



Mitogenomic sequences better resolve stock structure of southern Greater Caribbean green turtle rookeries. (United States)

Analyses of mitochondrial control region polymorphisms have supported the presence of several demographically independent green turtle (Chelonia mydas) rookeries in the Greater Caribbean region. However, extensive sharing of common haplotypes based on 490-bp control region sequences confounds assessment of the scale of natal homing and population structure among regional rookeries. We screened the majority of the mitochondrial genomes of 20 green turtles carrying the common haplotype CM-A5 and representing the rookeries of Buck Island, St. Croix, United States Virgin Islands (USVI); Aves Island, Venezuela; Galibi, Suriname; and Tortuguero, Costa Rica. Five single-nucleotide polymorphisms (SNPs) were identified that subdivided CM-A5 among regions. Mitogenomic pairwise Φ(ST) values of eastern Caribbean rookery comparisons were markedly lower than the respective pairwise F(ST) values. This discrepancy results from the presence of haplotypes representing two divergent lineages in each rookery, highlighting the importance of choosing the appropriate test statistic for addressing the study question. Haplotype frequency differentiation supports demographic independence of Aves Island and Suriname, emphasizing the need to recognize the smaller Aves rookery as a distinct management unit. Aves Island and Buck Island rookeries shared mitogenomic haplotypes; however, frequency divergence suggests that the Buck Island rookery is sufficiently demographically isolated to warrant management unit status for the USVI rookeries. Given that haplotype sharing among rookeries is common in marine turtles with cosmopolitan distributions, mitogenomic sequencing may enhance inferences of population structure and phylogeography, as well as improve the resolution of mixed stock analyses aimed at estimating natal origins of foraging turtles. PMID:22432442

Shamblin, Brian M; Bjorndal, Karen A; Bolten, Alan B; Hillis-Starr, Zandy M; Lundgren, Ian; Naro-Maciel, Eugenia; Nairn, Campbell J



Recombining population structure of Plesiomonas shigelloides (Enterobacteriaceae) revealed by multilocus sequence typing. (United States)

Plesiomonas shigelloides is an emerging pathogen that is widespread in the aquatic environment and is responsible for intestinal diseases and extraintestinal infections in humans and other animals. Virtually nothing is known about its genetic diversity, population structure, and evolution, which severely limits epidemiological control. We addressed these questions by developing a multilocus sequence typing (MLST) system based on five genes (fusA, leuS, pyrG, recG, and rpoB) and analyzing 77 epidemiologically unrelated strains from several countries and several ecological sources. The phylogenetic position of P. shigelloides within family Enterobacteriaceae was precisely defined by phylogenetic analysis of the same gene portions in other family members. Within P. shigelloides, high levels of nucleotide diversity (average percentage of nucleotide differences between strains, 1.49%) and genotypic diversity (64 distinct sequence types; Simpson's index, 99.7%) were found, with no salient internal phylogenetic structure. We estimated that homologous recombination in housekeeping genes affects P. shigelloides alleles and nucleotides 7 and 77 times more frequently than mutation, respectively. These ratios are similar to those observed in the naturally transformable species Streptococcus pneumoniae with a high rate of recombination. In contrast, recombination within Salmonella enterica, Escherichia coli, and Yersinia enterocolitica was much less frequent. P. shigelloides thus stands out among members of the Enterobacteriaceae. Its high rate of recombination results in a lack of association between genomic background and O and H antigenic factors, as observed for the 51 serotypes found in our sample. Given its robustness and discriminatory power, we recommend MLST as a reference method for population biology studies and epidemiological tracking of P. shigelloides strains. PMID:17693512

Salerno, Anna; Delétoile, Alexis; Lefevre, Martine; Ciznar, Ivan; Krovacek, Karel; Grimont, Patrick; Brisse, Sylvain
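The genotypic diversity quoted in the abstract above is a Simpson's index. As a minimal sketch, assuming the common form used in typing studies (the probability that two isolates drawn without replacement have different types), it could be computed as follows. The function name and the example labels are hypothetical and are not data from the study.

```python
from collections import Counter


def simpson_diversity(type_labels):
    """Simpson's index of diversity for a list of type labels.

    Returns the probability that two isolates drawn at random,
    without replacement, belong to different types.
    """
    counts = Counter(type_labels).values()
    n = sum(counts)
    if n < 2:
        raise ValueError("need at least two isolates")
    # Number of ordered pairs of isolates that share the same type.
    same_type_pairs = sum(c * (c - 1) for c in counts)
    return 1.0 - same_type_pairs / (n * (n - 1))


# Hypothetical sequence-type labels, for illustration only.
example_labels = ["ST1", "ST1", "ST2", "ST3", "ST4"]
example_index = simpson_diversity(example_labels)
```

An index close to 1 means that almost every isolate has its own sequence type.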



MMT. 1-D Radionuclide Ground Water Transport  

Energy Technology Data Exchange (ETDEWEB)

MMT solves the one-dimensional equation for transport of radionuclides in a ground-water system with uniform velocity and transport properties. The purpose of the code is to evaluate radionuclide release rates from the site subsystem. MMT treats convective transport, sorption-desorption effects, and hydrodynamic dispersion. Sorption and desorption of radionuclides are taken into account by application of retardation factors which are spatially uniform and derived from bulk rock properties and average geochemical data. Dispersion along the direction of flow (forward and backward) is also taken into account. Because the code solves only the one-dimensional transport equation, dispersion transverse to the direction of flow is not evaluated. A discrete parcel random walk (DPRW) approach is used to solve the coupled equations. This numerical technique is inherently stable and minimizes computational errors that lead to apparent mass nonconservation. MMT is a set of three programs which are run sequentially. The first program, MMT1D, utilizes flow and radionuclide input to generate the raw radionuclide release rate output in terms of the frequency at which parcels pass the specified release point and the calculated weights of these parcels. The second program, PLTCVT, is a postprocessor which filters and smooths the MMT1D output to reduce the statistical fluctuation. PLTCVT uses a variable-window, moving-average filter algorithm to smooth the raw results. The smoothed output from PLTCVT consists of separate, smoothed output files for each nuclide under consideration. The third program, MERGE, merges the separate release files generated by PLTCVT into a single output file. This merged output file may be used directly by a biosphere transport or dose assessment code as radionuclide release rate input data.

Golis, M.J. [Battelle Memorial Institute, Columbus, OH (United States)]
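The random-walk idea in the MMT abstract above can be illustrated with a generic particle-tracking sketch. It is not the MMT1D code, whose internals the abstract does not give: it assumes each parcel advances by a retarded advective step plus a Gaussian dispersive step, and it records when the parcel first passes the release point. All names and parameters are hypothetical.

```python
import math
import random


def parcel_arrival_times(n_parcels, velocity, dispersion, retardation,
                         release_x, dt, max_steps=100_000, seed=0):
    """Track parcels along one dimension and return their arrival times.

    Assumed model (not taken from MMT): each time step moves a parcel by
    (velocity / retardation) * dt plus a Gaussian displacement with
    standard deviation sqrt(2 * (dispersion / retardation) * dt).
    """
    rng = random.Random(seed)
    v_eff = velocity / retardation      # retarded advective velocity
    d_eff = dispersion / retardation    # retarded dispersion coefficient
    sigma = math.sqrt(2.0 * d_eff * dt)
    arrivals = []
    for _ in range(n_parcels):
        x, step = 0.0, 0
        while x < release_x and step < max_steps:
            # Forward or backward dispersion comes from the sign of the draw.
            x += v_eff * dt + sigma * rng.gauss(0.0, 1.0)
            step += 1
        if x >= release_x:
            arrivals.append(step * dt)
    return arrivals
```

Binning the returned arrival times gives a raw release-rate curve with statistical fluctuation, which is the kind of output a smoothing postprocessor would then filter.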



[Nucleotide sequence of the genome region of the tick-borne encephalitis virus coding for structural virion proteins]. (United States)

RNA of a flavivirus-tick-borne encephalitis virus (strain Sofjin) was subjected to reverse transcription and the DNA copy was transformed into double-stranded DNA by action of E. coli DNA-polymerase I (Klenow's fragment). This DNA was annealed with pBR322 plasmid. The recombinant plasmids were cloned in E. coli K802. The nucleotide sequence of the inserts of the clones coding for region of structural proteins C, pre-M, E and nonstructural protein ns1 was determined by the Maxam-Gilbert method. The nucleotide sequence of these regions is translatable into an amino acid sequence of proteins without interruption. The amino acid sequences of proteins and nucleotide sequence of genome of the tick-borne encephalitis virus are extensively homologous to that found for the flaviviruses Yellow Fever and West Nile. PMID:3022757

Pletnev, A G; Iamshchikov, V F; Blinov, V M



Descriptive analysis of the crystal structure of the 1-D semiconducting TCNQ salt : TEA(TCNQ)2, as a function of temperature. - I. Intermolecular distortions of the conducting TCNQ columns  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Within reasonable approximations, the structure of a conducting TCNQ column is characterized by six parameters. Two parameters define the equivalent regular column, and four parameters refer to the intermolecular distortions. Two of the distortional parameters are associated with a dimerization in the column, while the other two parameters are associated with a tetramerization. It is then shown, as an experimental fact, that only two of these six parameters are significantly temperature depen...

Farges, J. P.



Synthesis and structural characterization of two cobalt phosphites: 1-D (H3NC6H4NH3)Co(HPO3)2 and 2-D (NH4)2Co2(HPO3)3  

International Nuclear Information System (INIS)

Two new cobalt phosphites, (H3NC6H4NH3)Co(HPO3)2 (1) and (NH4)2Co2(HPO3)3 (2), have been synthesized and characterized by single-crystal X-ray diffraction. All the cobalt atoms of 1 are in tetrahedral CoO4 coordination. The structure of 1 comprises twisted square chains of four-rings, which contain alternating vertex-shared CoO4 tetrahedra and HPO3 groups. These chains are interlinked with trans-1,4-diaminocyclohexane cations by hydrogen bonds. The 2-D structure of 2 comprises anionic complex sheets with ammonium cations present between them. An anionic complex sheet contains three-deck phosphite units, which are interconnected by Co2O9 to form complex layers. Magnetic susceptibility measurements of 1 and 2 showed that they have a weak antiferromagnetic interaction. - Graphical abstract: The 2-D structure of (NH4)2Co2(HPO3)3 comprises anionic complex sheets with ammonium cations present between them. An anionic complex sheet contains three-deck phosphite units, which are interconnected by dimeric Co2O9 to form complex layers.



The D1-D2 region of the large subunit ribosomal DNA as barcode for ciliates. (United States)

Ciliates are a major evolutionary lineage within the alveolates, which are distributed in nearly all habitats on our planet and are an essential component for ecosystem function, processes and stability. Accurate identification of these unicellular eukaryotes through, for example, microscopy or mating type reactions is reserved to few specialists. To satisfy the demand for a DNA barcode for ciliates, which meets the standard criteria for DNA barcodes defined by the Consortium for the Barcode of Life (CBOL), we here evaluated the D1-D2 region of the ribosomal DNA large subunit (LSU-rDNA). Primer universality for the phylum Ciliophora was tested in silico with available database sequences as well as in the laboratory with 73 ciliate species, which represented nine of 12 ciliate classes. Primers tested in this study were successful for all tested classes. To test the ability of the D1-D2 region to resolve conspecific and congeneric sequence divergence, 63 Paramecium strains were sampled from 24 mating species. The average conspecific D1-D2 variation was 0.18%, whereas congeneric sequence divergence averaged 4.83%. In pairwise genetic distance analyses, we identified a D1-D2 sequence divergence of barcode for ciliated protists. PMID:24165195

Stoeck, T; Przybos, E; Dunthorn, M
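The conspecific and congeneric divergence percentages in the abstract above are pairwise genetic distances. The abstract does not say which distance measure was used, so the sketch below assumes the simplest one, the uncorrected p-distance between two aligned sequences. The function name and gap handling are illustrative choices.

```python
def p_distance(seq_a, seq_b, gap="-"):
    """Uncorrected p-distance between two aligned sequences.

    Sites where either sequence has a gap are skipped; the result is the
    fraction of the remaining sites at which the two sequences differ.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return mismatches / compared
```

Averaging this value over all pairs within a species, and separately over pairs from different species of one genus, gives the two kinds of averages that a barcode threshold has to separate.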



Distinct DNA sequence and structure requirements for the two steps of V(D)J recombination signal cleavage  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Cleavage of V(D)J recombination signals by purified RAG1 and RAG2 proteins permits the dissection of DNA structure and sequence requirements. The two recognition elements of a signal (nonamer and heptamer) are used differently, and their cooperation depends on correct helical phasing. The nonamer is most important for initial binding, while efficient nicking and hairpin formation require the heptamer sequence. Both nicking and hairpin form...

Ramsden, D. A.; Mcblane, J. F.; Gent, D. C.; Gellert, M.



Structural characterization of genomes by large scale sequence-structure threading: application of reliability analysis in structural genomics  

Directory of Open Access Journals (Sweden)

Full Text Available Abstract Background We establish that the occurrence of protein folds among genomes can be accurately described with a Weibull function. Systems which exhibit Weibull character can be interpreted with reliability theory commonly used in engineering analysis. For instance, Weibull distributions are widely used in reliability, maintainability and safety work to model time-to-failure of mechanical devices, mechanisms, building constructions and equipment. Results We have found that the Weibull function describes protein fold distribution within and among genomes more accurately than conventional power functions which have been used in a number of structural genomic studies reported to date. It has also been found that the Weibull reliability parameter β for protein fold distributions varies between genomes and may reflect differences in rates of gene duplication in evolutionary history of organisms. Conclusions The results of this work demonstrate that reliability analysis can provide useful insights and testable predictions in the fields of comparative and structural genomics.

Brunham Robert C
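For readers unfamiliar with the Weibull function mentioned in the abstract above, the following sketch evaluates the standard two-parameter form. The parameterization by a shape and a scale parameter is the textbook one and is an assumption here, since the abstract does not state which form the authors fitted.

```python
import math


def weibull_reliability(x, shape, scale):
    """Reliability (survival) function R(x) = exp(-(x / scale) ** shape)."""
    if x < 0 or shape <= 0 or scale <= 0:
        raise ValueError("x must be >= 0; shape and scale must be > 0")
    return math.exp(-((x / scale) ** shape))


def weibull_pdf(x, shape, scale):
    """Probability density for x > 0 with the same parameters."""
    if x <= 0 or shape <= 0 or scale <= 0:
        raise ValueError("x, shape and scale must all be > 0")
    z = x / scale
    return (shape / scale) * z ** (shape - 1) * math.exp(-(z ** shape))
```

With shape equal to 1 the function reduces to an exponential decay; other shape values bend the curve, which is what lets a fitted shape parameter differ between data sets.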



Molecular cloning, sequencing, and overexpression of the structural gene encoding the delta subunit of Escherichia coli DNA polymerase III holoenzyme.  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Using an oligonucleotide hybridization probe, we have mapped the structural gene for the delta subunit of Escherichia coli DNA polymerase III holoenzyme to 14.6 centisomes of the chromosome. This gene, designated holA, was cloned and sequenced. The sequence of holA matches precisely four amino acid sequences obtained for the amino terminus of delta and three internal tryptic peptides. A holA-overproducing plasmid that directs the expression of delta up to 4% of the soluble protein was constru...

Carter, J. R.; Franden, M. A.; Aebersold, R.; Mchenry, C. S.



Structural dynamics of cereal mitochondrial genomes as revealed by complete nucleotide sequencing of the wheat mitochondrial genome  

Digital Repository Infrastructure Vision for European Research (DRIVER)

The application of a new gene-based strategy for sequencing the wheat mitochondrial genome shows its structure to be a 452,528 bp circular molecule, and provides nucleotide-level evidence of intra-molecular recombination. Single, reciprocal and double recombinant products, and the nucleotide sequences of the repeats that mediate their formation have been identified. The genome has 55 genes with exons, including 35 protein-coding, 3 rRNA and 17 tRNA genes. Nucleotide sequences of seven wheat...

Ogihara, Yasunari; Yamazaki, Yukiko; Murai, Koji; Kanno, Akira; Terachi, Toru; Shiina, Takashi; Miyashita, Naohiko; Nasuda, Shuhei; Nakamura, Chiharu; Mori, Naoki; Takumi, Shigeo; Murata, Minoru; Futo, Satoshi; Tsunewaki, Koichiro



Novel sequence variations in LAMA2 and SGCG genes modulating cis-acting regulatory elements and RNA secondary structure  

Digital Repository Infrastructure Vision for European Research (DRIVER)

In this study, we detected new sequence variations in LAMA2 and SGCG genes in 5 ethnic populations, and analysed their effect on enhancer composition and mRNA structure. PCR amplification and DNA sequencing were performed and followed by bioinformatics analyses using ESEfinder as well as MFOLD software. We found 3 novel sequence variations in the LAMA2 (c.3174+22_23insAT and c.6085 +12delA) and SGCG (c.*102A/C) genes. These variations were present in 210 tested healthy controls from Tunisian,...

Olfa Siala; Ikhlass Hadj Salem; Abdelaziz Tlili; Imen Ammar; Hanen Belguith; Faiza Fakhfakh



Novel sequence variations in LAMA2 and SGCG genes modulating cis-acting regulatory elements and RNA secondary structure  

Digital Repository Infrastructure Vision for European Research (DRIVER)

In this study, we detected new sequence variations in LAMA2 and SGCG genes in 5 ethnic populations, and analysed their effect on enhancer composition and mRNA structure. PCR amplification and DNA sequencing were performed and followed by bioinformatics analyses using ESEfinder as well as MFOLD software. We found 3 novel sequence variations in the LAMA2 (c.3174+22_23insAT and c.6085 +12delA) and SGCG (c.*102A/C) genes. These variations were present in 210 tested healthy controls from Tunisia...

Siala, Olfa; Salem, Ikhlass Hadj; Tlili, Abdelaziz; Ammar, Imen; Belguith, Hanen; Fakhfakh, Faiza



Signatures of DNA flexibility, interactions and sequence-related structural variations in classical X-ray diffraction patterns  

Digital Repository Infrastructure Vision for European Research (DRIVER)

The theory of X-ray diffraction from ideal, rigid helices allowed Watson and Crick to unravel the DNA structure, thereby elucidating functions encoded in it. Yet, as we know now, the DNA double helix is neither ideal nor rigid. Its structure varies with the base pair sequence. Its flexibility leads to thermal fluctuations and allows molecules to adapt their structure to optimize their intermolecular interactions. In addition to the double helix symmetry revealed by Watson and Crick, classical...



A statistical learning approach to the modeling of chromatographic retention of oligonucleotides incorporating sequence and secondary structure data  

Digital Repository Infrastructure Vision for European Research (DRIVER)

We propose a new model for predicting the retention time of oligonucleotides. The model is based on ν-support vector regression using features derived from base sequence and predicted secondary structure of oligonucleotides. Because of the secondary structure information, the model is applicable even at relatively low temperatures where the secondary structure is not suppressed by thermal denaturing. This makes the prediction of oligonucleotide retention time for arbitrary temperatures possi...

Sturm, Marc; Quinten, Sascha; Huber, Christian G.; Kohlbacher, Oliver



Spatio-temporal structure extraction and denoising of geophysical fluid image sequences using 3D curvelet transforms  

Digital Repository Infrastructure Vision for European Research (DRIVER)

For several decades, many satellites have been launched to observe the Earth and improve knowledge of the atmosphere and the ocean. The image sequences that such satellites provide show the evolution of large-scale structures such as vortices and fronts. The dynamics of these structures may have strong predictive potential. Extracting these structures and tracking their evolution automatically is therefore essential for future forecast systems. In this...



A sequence-based survey of the complex structural organization of tumor genomes  

Digital Repository Infrastructure Vision for European Research (DRIVER)

Tumors and cancer cell lines were surveyed with end-sequencing profiling, yielding the largest available collection of sequence-ready tumor genome breakpoints and providing evidence that some rearrangements may be recurrent.



Structural Basis and Sequence Rules for Substrate Recognition by Tankyrase Explain the Basis for Cherubism Disease  

Energy Technology Data Exchange (ETDEWEB)

The poly(ADP-ribose)polymerases Tankyrase 1/2 (TNKS/TNKS2) catalyze the covalent linkage of ADP-ribose polymer chains onto target proteins, regulating their ubiquitylation, stability, and function. Dysregulation of substrate recognition by Tankyrases underlies the human disease cherubism. Tankyrases recruit specific motifs (often called RxxPDG hexapeptides) in their substrates via an N-terminal region of ankyrin repeats. These ankyrin repeats form five domains termed ankyrin repeat clusters (ARCs), each predicted to bind substrate. Here we report crystal structures of a representative ARC of TNKS2 bound to targeting peptides from six substrates. Using a solution-based peptide library screen, we derive a rule-based consensus for Tankyrase substrates common to four functionally conserved ARCs. This 8-residue consensus allows us to rationalize all known Tankyrase substrates and explains the basis for cherubism-causing mutations in the Tankyrase substrate 3BP2. Structural and sequence information allows us to also predict and validate other Tankyrase targets, including Disc1, Striatin, Fat4, RAD54, BCR, and MERIT40.

Guettler, Sebastian; LaRose, Jose; Petsalaki, Evangelia; Gish, Gerald; Scotter, Andy; Pawson, Tony; Rottapel, Robert; Sicheri, Frank (Mount Sinai Hospital); (OCI)



Sequence-specific size, structure, and stability of tight protein knots  

CERN Document Server

Approximately 1% of the known protein structures display knotted configurations in their native fold but their function is not understood. It has been speculated that the entanglement may inhibit mechanical protein unfolding or transport, e.g., as in cellular threading or translocation processes through narrow biological pores. Here we investigate tight peptide knot (TPK) characteristics in detail by pulling selected 3_1 and 4_1-knotted peptides using all-atom molecular dynamics computer simulations. We find that the 3_1 and 4_1-TPK lengths are typically Delta l~4.7 nm and 6.9 nm, respectively, for a wide range of tensions (F < 1.5 nN), pointing to a pore diameter of ~2 nm below which a translocated knotted protein might get stuck. The 4_1-knot length is in agreement with recent AFM pulling experiments. Detailed TPK characteristics, however, may be sequence-specific: we find a different size and structural behavior in polyglycines, and, strikingly, a strong hydrogen bonding and water molecule trapping capabi...

Dzubiella, Joachim



Deep RNA sequencing improved the structural annotation of the Tuber melanosporum transcriptome  

Digital Repository Infrastructure Vision for European Research (DRIVER)

The functional complexity of the Tuber melanosporum transcriptome has not yet been fully elucidated. Here, we applied high-throughput Illumina RNA-sequencing (RNA-Seq) to the transcriptome of T. melanosporum at different major developmental stages, that is free-living mycelium, fruiting body and ectomycorrhiza. Sequencing of cDNA libraries generated a total of c. 24 million sequence reads representing > 882 Mb of sequence data. To construct a coverage signal profile across the genome, all r...



The Interior Structure Constants as an Age Diagnostic for Low-Mass, Pre-Main Sequence Detached Eclipsing Binary Stars  

CERN Multimedia

We propose a novel method for determining the ages of low-mass, pre-main sequence stellar systems using the apsidal motion of low-mass detached eclipsing binaries. The apsidal motion of a binary system with an eccentric orbit provides information regarding the interior structure constants of the individual stars. These constants are related to the normalized stellar interior density distribution and can be extracted from the predictions of stellar evolution models. We demonstrate that low-mass, pre-main sequence stars undergoing radiative core contraction display rapidly changing interior structure constants (greater than 5% per 10 Myr) that, when combined with observational determinations of the interior structure constants (with 5 -- 10% precision), allow for a robust age estimate. This age estimate, unlike those based on surface quantities, is largely insensitive to the surface layer where effects of magnetic activity are likely to be most pronounced. On the main sequence, where age sensitivity is minimal,...

Feiden, Gregory A



Mycobacterial phosphatidylinositol mannoside is a natural antigen for CD1d-restricted T cells  

Digital Repository Infrastructure Vision for European Research (DRIVER)

A group of T cells recognizes glycolipids presented by molecules of the CD1 family. The CD1d-restricted natural killer T cells (NKT cells) are primarily considered to be self-reactive. By employing CD1d-binding and T cell assays, the following structural parameters for presentation by CD1d were defined for a number of mycobacterial and mammalian lipids: two acyl chains facilitated binding, and a polar head group was essential for T cell recognition. Of the mycobacterial lipids tested, only a ...

Fischer, Karsten; Scotet, Emmanuel; Niemeyer, Marcus; Koebernick, Heidrun; Zerrahn, Jens; Maillet, Sophie; Hurwitz, Robert; Kursar, Mischo; Bonneville, Marc; Kaufmann, Stefan H. E.; Schaible, Ulrich E.



Compression-based classification of biological sequences and structures via the Universal Similarity Metric: experimental assessment  

Directory of Open Access Journals (Sweden)

Full Text Available Abstract Background Similarity of sequences is a key mathematical notion for Classification and Phylogenetic studies in Biology. It is currently primarily handled using alignments. However, the alignment methods seem inadequate for post-genomic studies since they do not scale well with data set size and they seem to be confined only to genomic and proteomic sequences. Therefore, alignment-free similarity measures are actively pursued. Among those, USM (Universal Similarity Metric) has gained prominence. It is based on the deep theory of Kolmogorov Complexity and universality is its most novel striking feature. Since it can only be approximated via data compression, USM is a methodology rather than a formula quantifying the similarity of two strings. Three approximations of USM are available, namely UCD (Universal Compression Dissimilarity), NCD (Normalized Compression Dissimilarity) and CD (Compression Dissimilarity). Their applicability and robustness are tested on various data sets yielding a first massive quantitative estimate that the USM methodology and its approximations are of value. Despite the rich theory developed around USM, its experimental assessment has limitations: only a few data compressors have been tested in conjunction with USM and mostly at a qualitative level, no comparison among UCD, NCD and CD is available and no comparison of USM with existing methods, both based on alignments and not, seems to be available. Results We experimentally test the USM methodology by using 25 compressors, all three of its known approximations and six data sets of relevance to Molecular Biology. This offers the first systematic and quantitative experimental assessment of this methodology, that naturally complements the many theoretical and the preliminary experimental results available. Moreover, we compare the USM methodology both with methods based on alignments and not. We may group our experiments into two sets. 
The first one, performed via ROC (Receiver Operating Curve) analysis, aims at assessing the intrinsic ability of the methodology to discriminate and classify biological sequences and structures. A second set of experiments aims at assessing how well two commonly available classification algorithms, UPGMA (Unweighted Pair Group Method with Arithmetic Mean) and NJ (Neighbor Joining), can use the methodology to perform their task, their performance being evaluated against gold standards and with the use of well known statistical indexes, i.e., the F-measure and the partition distance. Based on the experiments, several conclusions can be drawn and, from them, novel valuable guidelines for the use of USM on biological data. The main ones are reported next. Conclusion UCD and NCD are indistinguishable, i.e., they yield nearly the same values of the statistical indexes we have used, across experiments and data sets, while CD is almost always worse than both. UPGMA seems to yield better classification results with respect to NJ, i.e., better values of the statistical indexes (10% difference or above), on a substantial fraction of experiments, compressors and USM approximation choices. The compression program PPMd, based on PPM (Prediction by Partial Matching), for generic data and Gencompress for DNA, are the best performers among the compression algorithms we have used, although the difference in performance, as measured by statistical indexes, between them and the other algorithms depends critically on the data set and may not be as large as expected. PPMd used with UCD or NCD and UPGMA, on sequence data is very close, although worse, in performance with the alignment methods (less than 2% difference on the F-measure). Yet, it scales well with data set size and it can work on data other than sequences. 
In summary, our quantitative analysis naturally complements the rich theory behind USM and supports the conclusion that the methodology is worth using because of its robustness, flexibility, scalability, and competitiveness with existing techniques. In particular, the methodology applies to all biological

Manzini Giovanni
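The compression-based measures discussed in the abstract above can be illustrated with the widely used normalized compression formula, NCD(x, y) = (C(xy) - min(C(x), C(y))) / max(C(x), C(y)), where C is the compressed size. The sketch assumes that this is the NCD variant meant, and it uses zlib from the Python standard library purely as a convenient stand-in compressor; it is not one of the compressors the abstract singles out. The example sequences are synthetic.

```python
import random
import zlib


def compressed_size(data: bytes) -> int:
    """Size in bytes of the zlib-compressed data (stand-in for C)."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression dissimilarity of two byte strings."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


# Synthetic sequences for illustration: a repetitive one and a random one.
rng = random.Random(1)
repetitive = b"ACGTTGCA" * 100
shuffled = bytes(rng.choice(b"ACGT") for _ in range(800))

self_score = ncd(repetitive, repetitive)
cross_score = ncd(repetitive, shuffled)
```

A pairwise matrix of such scores is what a clustering method such as UPGMA or NJ would take as input; swapping in a stronger compressor only requires changing compressed_size.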



Using Chou's pseudo amino acid composition to predict protein quaternary structure: a sequence-segmented PseAAC approach. (United States)

In the protein universe, many proteins are composed of two or more polypeptide chains, generally referred to as subunits, which associate through noncovalent interactions and, occasionally, disulfide bonds to form protein quaternary structures. It has long been known that the functions of proteins are closely related to their quaternary structures; some examples include enzymes, hemoglobin, DNA polymerase, and ion channels. However, it is extremely labor-expensive and even impossible to quickly determine the structures of hundreds of thousands of protein sequences solely from experiments. Since the number of protein sequences entering databanks is increasing rapidly, it is highly desirable to develop computational methods for classifying the quaternary structures of proteins from their primary sequences. Since the concept of Chou's pseudo amino acid composition (PseAAC) was introduced, a variety of approaches, such as residue conservation scores, von Neumann entropy, multiscale energy, autocorrelation function, moment descriptors, and cellular automata, have been utilized to formulate the PseAAC for predicting different attributes of proteins. Here, in a different approach, a sequence-segmented PseAAC is introduced to represent protein samples. Meanwhile, multiclass SVM classifier modules were adopted to classify protein quaternary structures. As a demonstration, the dataset constructed by Chou and Cai [(2003) Proteins 53:282-289] was adopted as a benchmark dataset. The overall jackknife success rates thus obtained were 88.2-89.1%, indicating that the new approach is quite promising for predicting protein quaternary structure. PMID:18427713

Zhang, Shao-Wu; Chen, Wei; Yang, Feng; Pan, Quan



The investigation of the secondary structures of various peptide sequences of β-casein by the multicanonical simulation method (United States)

The structural properties of Arginine-Glutamic acid-Leucine-Glutamic acid-Glutamic acid-Leucine-Asparagine-Valine-Proline-Glycine (RELEELNVPG, in one letter code), Glutamic acid-Glutamic acid-Glutamine-Glutamine-Glutamine-Threonine-Glutamic acid (EEQQQTE) and Glutamic acid-Aspartic acid-Glutamic acid-Leucine-Glutamine-Aspartic acid-Lysine-Isoleucine (EDELQDKI) peptide sequences of β-casein were studied by three-dimensional molecular modeling. In this work, the three-dimensional conformations of each peptide were obtained from their primary sequences by multicanonical simulations. Taking advantage of this simulation technique, Ramachandran plots were prepared and analysed to predict the relative occurrence probabilities of β-turn, γ-turn and helical structures. Structural predictions of these sequences of the β-casein molecule indicate the presence of a high level of helical structures and βIII-turns. The occurrence probabilities of inverse and classical γ-turns were low. The probability of helical structure of each sequence significantly decreased when the temperature increased. Our results show that these peptides have a highly helical structure, in better agreement with the results of spectroscopic techniques and other prediction methods.

Yaşar, F.; Çelik, S.; Köksel, H.



How Does Sequence Structure Affect the Judgment of Time? Exploring a Weighted Sum of Segments Model (United States)

This paper examines the judgment of segmented temporal intervals, using short tone sequences as a convenient test case. In four experiments, we investigate how the relative lengths, arrangement, and pitches of the tones in a sequence affect judgments of sequence duration, and ask whether the data can be described by a simple weighted sum of…

Matthews, William J.



Novel sequence variations in LAMA2 and SGCG genes modulating cis-acting regulatory elements and RNA secondary structure  

Directory of Open Access Journals (Sweden)

Full Text Available In this study, we detected new sequence variations in LAMA2 and SGCG genes in 5 ethnic populations, and analysed their effect on enhancer composition and mRNA structure. PCR amplification and DNA sequencing were performed and followed by bioinformatics analyses using ESEfinder as well as MFOLD software. We found 3 novel sequence variations in the LAMA2 (c.3174+22_23insAT and c.6085 +12delA) and SGCG (c.*102A/C) genes. These variations were present in 210 tested healthy controls from Tunisian, Moroccan, Algerian, Lebanese and French populations suggesting that they represent novel polymorphisms within LAMA2 and SGCG genes sequences. ESEfinder showed that the c.*102A/C substitution created a new exon splicing enhancer in the 3'UTR of SGCG genes, whereas the c.6085 +12delA deletion was situated in the base pairing region between LAMA2 mRNA and the U1snRNA spliceosomal components. The RNA structure analyses showed that both variations modulated RNA secondary structure. Our results are suggestive of correlations between mRNA folding and the recruitment of spliceosomal components mediating splicing, including SR proteins. The contribution of common sequence variations to mRNA structural and functional diversity will contribute to a better study of gene expression.

Olfa Siala



Novel sequence variations in LAMA2 and SGCG genes modulating cis-acting regulatory elements and RNA secondary structure  

Scientific Electronic Library Online (English)

Full Text Available In this study, we detected new sequence variations in LAMA2 and SGCG genes in 5 ethnic popu