Maximum Likelihood, Consistency and Data Envelopment Analysis: A Statistical Foundation
Rajiv D. Banker
1993-01-01
This paper provides a formal statistical basis for the efficiency evaluation techniques of data envelopment analysis (DEA). DEA estimators of the best practice monotone increasing and concave production function are shown to be also maximum likelihood estimators if the deviation of actual output from the efficient output is regarded as a stochastic variable with a monotone decreasing probability density function. While the best practice frontier estimator is biased below the theoretical front...
THESEUS: maximum likelihood superpositioning and analysis of macromolecular structures.
Theobald, Douglas L; Wuttke, Deborah S
2006-09-01
THESEUS is a command line program for performing maximum likelihood (ML) superpositions and analysis of macromolecular structures. While conventional superpositioning methods use ordinary least-squares (LS) as the optimization criterion, ML superpositions provide substantially improved accuracy by down-weighting variable structural regions and by correcting for correlations among atoms. ML superpositioning is robust and insensitive to the specific atoms included in the analysis, and thus it does not require subjective pruning of selected variable atomic coordinates. Output includes both likelihood-based and frequentist statistics for accurate evaluation of the adequacy of a superposition and for reliable analysis of structural similarities and differences. THESEUS performs principal components analysis for analyzing the complex correlations found among atoms within a structural ensemble. ANSI C source code and selected binaries for various computing platforms are available under the GNU open source license from http://monkshood.colorado.edu/theseus/ or http://www.theseus3d.org.
A short proof that phylogenetic tree reconstruction by maximum likelihood is hard.
Roch, Sebastien
2006-01-01
Maximum likelihood is one of the most widely used techniques to infer evolutionary histories. Although it is thought to be intractable, a proof of its hardness has been lacking. Here, we give a short proof that computing the maximum likelihood tree is NP-hard by exploiting a connection between likelihood and parsimony observed by Tuffley and Steel.
A Short Proof that Phylogenetic Tree Reconstruction by Maximum Likelihood is Hard
Roch, S.
2005-01-01
Maximum likelihood is one of the most widely used techniques to infer evolutionary histories. Although it is thought to be intractable, a proof of its hardness has been lacking. Here, we give a short proof that computing the maximum likelihood tree is NP-hard by exploiting a connection between likelihood and parsimony observed by Tuffley and Steel.
Zhou, Xiaofan; Shen, Xing-Xing; Hittinger, Chris Todd
2018-01-01
Abstract The sizes of the data matrices assembled to resolve branches of the tree of life have increased dramatically, motivating the development of programs for fast, yet accurate, inference. For example, several different fast programs have been developed in the very popular maximum likelihood framework, including RAxML/ExaML, PhyML, IQ-TREE, and FastTree. Although these programs are widely used, a systematic evaluation and comparison of their performance using empirical genome-scale data matrices has so far been lacking. To address this question, we evaluated these four programs on 19 empirical phylogenomic data sets with hundreds to thousands of genes and up to 200 taxa with respect to likelihood maximization, tree topology, and computational speed. For single-gene tree inference, we found that the more exhaustive and slower strategies (ten searches per alignment) outperformed faster strategies (one tree search per alignment) using RAxML, PhyML, or IQ-TREE. Interestingly, single-gene trees inferred by the three programs yielded comparable coalescent-based species tree estimations. For concatenation-based species tree inference, IQ-TREE consistently achieved the best-observed likelihoods for all data sets, and RAxML/ExaML was a close second. In contrast, PhyML often failed to complete concatenation-based analyses, whereas FastTree was the fastest but generated lower likelihood values and more dissimilar tree topologies in both types of analyses. Finally, data matrix properties, such as the number of taxa and the strength of phylogenetic signal, sometimes substantially influenced the programs’ relative performance. Our results provide real-world gene and species tree phylogenetic inference benchmarks to inform the design and execution of large-scale phylogenomic data analyses. PMID:29177474
Lin, Yu; Hu, Fei; Tang, Jijun; Moret, Bernard M E
2013-01-01
The rapid accumulation of whole-genome data has renewed interest in the study of the evolution of genomic architecture, under such events as rearrangements, duplications, losses. Comparative genomics, evolutionary biology, and cancer research all require tools to elucidate the mechanisms, history, and consequences of those evolutionary events, while phylogenetics could use whole-genome data to enhance its picture of the Tree of Life. Current approaches in the area of phylogenetic analysis are limited to very small collections of closely related genomes using low-resolution data (typically a few hundred syntenic blocks); moreover, these approaches typically do not include duplication and loss events. We describe a maximum likelihood (ML) approach for phylogenetic analysis that takes into account genome rearrangements as well as duplications, insertions, and losses. Our approach can handle high-resolution genomes (with 40,000 or more markers) and can use in the same analysis genomes with very different numbers of markers. Because our approach uses a standard ML reconstruction program (RAxML), it scales up to large trees. We present the results of extensive testing on both simulated and real data showing that our approach returns very accurate results very quickly. In particular, we analyze a dataset of 68 high-resolution eukaryotic genomes, with from 3,000 to 42,000 genes, from the eGOB database; the analysis, including bootstrapping, takes just 3 hours on a desktop system and returns a tree in agreement with all well supported branches, while also suggesting resolutions for some disputed placements.
Tamura, Koichiro; Peterson, Daniel; Peterson, Nicholas; Stecher, Glen; Nei, Masatoshi; Kumar, Sudhir
2011-01-01
Comparative analysis of molecular sequence data is essential for reconstructing the evolutionary histories of species and inferring the nature and extent of selective forces shaping the evolution of genes and species. Here, we announce the release of Molecular Evolutionary Genetics Analysis version 5 (MEGA5), which is a user-friendly software for mining online databases, building sequence alignments and phylogenetic trees, and using methods of evolutionary bioinformatics in basic biology, biomedicine, and evolution. The newest addition in MEGA5 is a collection of maximum likelihood (ML) analyses for inferring evolutionary trees, selecting best-fit substitution models (nucleotide or amino acid), inferring ancestral states and sequences (along with probabilities), and estimating evolutionary rates site-by-site. In computer simulation analyses, ML tree inference algorithms in MEGA5 compared favorably with other software packages in terms of computational efficiency and the accuracy of the estimates of phylogenetic trees, substitution parameters, and rate variation among sites. The MEGA user interface has now been enhanced to be activity driven to make it easier for the use of both beginners and experienced scientists. This version of MEGA is intended for the Windows platform, and it has been configured for effective use on Mac OS X and Linux desktops. It is available free of charge from http://www.megasoftware.net. PMID:21546353
Evolutionary analysis of apolipoprotein E by Maximum Likelihood and complex network methods
Directory of Open Access Journals (Sweden)
Leandro de Jesus Benevides
Full Text Available Abstract Apolipoprotein E (apo E is a human glycoprotein with 299 amino acids, and it is a major component of very low density lipoproteins (VLDL and a group of high-density lipoproteins (HDL. Phylogenetic studies are important to clarify how various apo E proteins are related in groups of organisms and whether they evolved from a common ancestor. Here, we aimed at performing a phylogenetic study on apo E carrying organisms. We employed a classical and robust method, such as Maximum Likelihood (ML, and compared the results using a more recent approach based on complex networks. Thirty-two apo E amino acid sequences were downloaded from NCBI. A clear separation could be observed among three major groups: mammals, fish and amphibians. The results obtained from ML method, as well as from the constructed networks showed two different groups: one with mammals only (C1 and another with fish (C2, and a single node with the single sequence available for an amphibian. The accordance in results from the different methods shows that the complex networks approach is effective in phylogenetic studies. Furthermore, our results revealed the conservation of apo E among animal groups.
Maximum likelihood-based analysis of photon arrival trajectories in single-molecule FRET
Energy Technology Data Exchange (ETDEWEB)
Waligorska, Marta [Adam Mickiewicz University, Faculty of Chemistry, Grunwaldzka 6, 60-780 Poznan (Poland); Molski, Andrzej, E-mail: amolski@amu.edu.pl [Adam Mickiewicz University, Faculty of Chemistry, Grunwaldzka 6, 60-780 Poznan (Poland)
2012-07-25
Highlights: Black-Right-Pointing-Pointer We study model selection and parameter recovery from single-molecule FRET experiments. Black-Right-Pointing-Pointer We examine the maximum likelihood-based analysis of two-color photon trajectories. Black-Right-Pointing-Pointer The number of observed photons determines the performance of the method. Black-Right-Pointing-Pointer For long trajectories, one can extract mean dwell times that are comparable to inter-photon times. -- Abstract: When two fluorophores (donor and acceptor) are attached to an immobilized biomolecule, anti-correlated fluctuations of the donor and acceptor fluorescence caused by Foerster resonance energy transfer (FRET) report on the conformational kinetics of the molecule. Here we assess the maximum likelihood-based analysis of donor and acceptor photon arrival trajectories as a method for extracting the conformational kinetics. Using computer generated data we quantify the accuracy and precision of parameter estimates and the efficiency of the Akaike information criterion (AIC) and the Bayesian information criterion (BIC) in selecting the true kinetic model. We find that the number of observed photons is the key parameter determining parameter estimation and model selection. For long trajectories, one can extract mean dwell times that are comparable to inter-photon times.
Dermoune, Azzouz; Simon, Eric Pierre
2017-12-01
This paper is a theoretical analysis of the maximum likelihood (ML) channel estimator for orthogonal frequency-division multiplexing (OFDM) systems in the presence of unknown interference. The following theoretical results are presented. Firstly, the uniqueness of the ML solution for practical applications, i.e., when thermal noise is present, is analytically demonstrated when the number of transmitted OFDM symbols is strictly greater than one. The ML solution is then derived from the iterative conditional ML (CML) algorithm. Secondly, it is shown that the channel estimate can be described as an algebraic function whose inputs are the initial value and the means and variances of the received samples. Thirdly, it is theoretically demonstrated that the channel estimator is not biased. The second and the third results are obtained by employing oblique projection theory. Furthermore, these results are confirmed by numerical results.
McGuire, Jimmy A; Witt, Christopher C; Altshuler, Douglas L; Remsen, J V
2007-10-01
Hummingbirds are an important model system in avian biology, but to date the group has been the subject of remarkably few phylogenetic investigations. Here we present partitioned Bayesian and maximum likelihood phylogenetic analyses for 151 of approximately 330 species of hummingbirds and 12 outgroup taxa based on two protein-coding mitochondrial genes (ND2 and ND4), flanking tRNAs, and two nuclear introns (AK1 and BFib). We analyzed these data under several partitioning strategies ranging between unpartitioned and a maximum of nine partitions. In order to select a statistically justified partitioning strategy following partitioned Bayesian analysis, we considered four alternative criteria including Bayes factors, modified versions of the Akaike information criterion for small sample sizes (AIC(c)), Bayesian information criterion (BIC), and a decision-theoretic methodology (DT). Following partitioned maximum likelihood analyses, we selected a best-fitting strategy using hierarchical likelihood ratio tests (hLRTS), the conventional AICc, BIC, and DT, concluding that the most stringent criterion, the performance-based DT, was the most appropriate methodology for selecting amongst partitioning strategies. In the context of our well-resolved and well-supported phylogenetic estimate, we consider the historical biogeography of hummingbirds using ancestral state reconstructions of (1) primary geographic region of occurrence (i.e., South America, Central America, North America, Greater Antilles, Lesser Antilles), (2) Andean or non-Andean geographic distribution, and (3) minimum elevational occurrence. These analyses indicate that the basal hummingbird assemblages originated in the lowlands of South America, that most of the principle clades of hummingbirds (all but Mountain Gems and possibly Bees) originated on this continent, and that there have been many (at least 30) independent invasions of other primary landmasses, especially Central America.
International Nuclear Information System (INIS)
Vanhaelewyn, G.; Callens, F.; Gruen, R.
2000-01-01
In order to determine the components which give rise to the EPR spectrum around g = 2 we have applied Maximum Likelihood Common Factor Analysis (MLCFA) on the EPR spectra of enamel sample 1126 which has previously been analysed by continuous wave and pulsed EPR as well as EPR microscopy. MLCFA yielded agreeing results on three sets of X-band spectra and the following components were identified: an orthorhombic component attributed to CO - 2 , an axial component CO 3- 3 , as well as four isotropic components, three of which could be attributed to SO - 2 , a tumbling CO - 2 and a central line of a dimethyl radical. The X-band results were confirmed by analysis of Q-band spectra where three additional isotropic lines were found, however, these three components could not be attributed to known radicals. The orthorhombic component was used to establish dose response curves for the assessment of the past radiation dose, D E . The results appear to be more reliable than those based on conventional peak-to-peak EPR intensity measurements or simple Gaussian deconvolution methods
Neandertal admixture in Eurasia confirmed by maximum-likelihood analysis of three genomes.
Lohse, Konrad; Frantz, Laurent A F
2014-04-01
Although there has been much interest in estimating histories of divergence and admixture from genomic data, it has proved difficult to distinguish recent admixture from long-term structure in the ancestral population. Thus, recent genome-wide analyses based on summary statistics have sparked controversy about the possibility of interbreeding between Neandertals and modern humans in Eurasia. Here we derive the probability of full mutational configurations in nonrecombining sequence blocks under both admixture and ancestral structure scenarios. Dividing the genome into short blocks gives an efficient way to compute maximum-likelihood estimates of parameters. We apply this likelihood scheme to triplets of human and Neandertal genomes and compare the relative support for a model of admixture from Neandertals into Eurasian populations after their expansion out of Africa against a history of persistent structure in their common ancestral population in Africa. Our analysis allows us to conclusively reject a model of ancestral structure in Africa and instead reveals strong support for Neandertal admixture in Eurasia at a higher rate (3.4-7.3%) than suggested previously. Using analysis and simulations we show that our inference is more powerful than previous summary statistics and robust to realistic levels of recombination.
Maximum likelihood-based analysis of single-molecule photon arrival trajectories
Hajdziona, Marta; Molski, Andrzej
2011-02-01
In this work we explore the statistical properties of the maximum likelihood-based analysis of one-color photon arrival trajectories. This approach does not involve binning and, therefore, all of the information contained in an observed photon strajectory is used. We study the accuracy and precision of parameter estimates and the efficiency of the Akaike information criterion and the Bayesian information criterion (BIC) in selecting the true kinetic model. We focus on the low excitation regime where photon trajectories can be modeled as realizations of Markov modulated Poisson processes. The number of observed photons is the key parameter in determining model selection and parameter estimation. For example, the BIC can select the true three-state model from competing two-, three-, and four-state kinetic models even for relatively short trajectories made up of 2 × 103 photons. When the intensity levels are well-separated and 104 photons are observed, the two-state model parameters can be estimated with about 10% precision and those for a three-state model with about 20% precision.
Maximum likelihood-based analysis of single-molecule photon arrival trajectories.
Hajdziona, Marta; Molski, Andrzej
2011-02-07
In this work we explore the statistical properties of the maximum likelihood-based analysis of one-color photon arrival trajectories. This approach does not involve binning and, therefore, all of the information contained in an observed photon strajectory is used. We study the accuracy and precision of parameter estimates and the efficiency of the Akaike information criterion and the Bayesian information criterion (BIC) in selecting the true kinetic model. We focus on the low excitation regime where photon trajectories can be modeled as realizations of Markov modulated Poisson processes. The number of observed photons is the key parameter in determining model selection and parameter estimation. For example, the BIC can select the true three-state model from competing two-, three-, and four-state kinetic models even for relatively short trajectories made up of 2 × 10(3) photons. When the intensity levels are well-separated and 10(4) photons are observed, the two-state model parameters can be estimated with about 10% precision and those for a three-state model with about 20% precision.
Maximum likelihood fitting of FROC curves under an initial-detection-and-candidate-analysis model
International Nuclear Information System (INIS)
Edwards, Darrin C.; Kupinski, Matthew A.; Metz, Charles E.; Nishikawa, Robert M.
2002-01-01
We have developed a model for FROC curve fitting that relates the observer's FROC performance not to the ROC performance that would be obtained if the observer's responses were scored on a per image basis, but rather to a hypothesized ROC performance that the observer would obtain in the task of classifying a set of 'candidate detections' as positive or negative. We adopt the assumptions of the Bunch FROC model, namely that the observer's detections are all mutually independent, as well as assumptions qualitatively similar to, but different in nature from, those made by Chakraborty in his AFROC scoring methodology. Under the assumptions of our model, we show that the observer's FROC performance is a linearly scaled version of the candidate analysis ROC curve, where the scaling factors are just given by the FROC operating point coordinates for detecting initial candidates. Further, we show that the likelihood function of the model parameters given observational data takes on a simple form, and we develop a maximum likelihood method for fitting a FROC curve to this data. FROC and AFROC curves are produced for computer vision observer datasets and compared with the results of the AFROC scoring method. Although developed primarily with computer vision schemes in mind, we hope that the methodology presented here will prove worthy of further study in other applications as well
Extended maximum likelihood analysis of apparent flattenings of S0 and spiral galaxies
International Nuclear Information System (INIS)
Okamura, Sadanori; Takase, Bunshiro; Hamabe, Masaru; Nakada, Yoshikazu; Kodaira, Keiichi.
1981-01-01
Apparent flattenings of S0 and spiral galaxies compiled by Sandage et al. (1970) and van den Bergh (1977), and those listed in the Second Reference Catalogue (RC2) are analyzed by means of the extended maximum likelihood method which was recently developed in the information theory for statistical model identification. Emphasis is put on the possible difference in the distribution of intrinsic flattenings between S0's and spirals as a group, and on the apparent disagreements present in the previous results. The present analysis shows that (1) One cannot conclude on the basis of the data in the Reference Catalogue of Bright Galaxies (RCBG) that the distribution of intrinsic flattenings of spirals is almost identical to that of S0's; spirals have wider dispersion than S0's, and there are more round systems in spirals than in S0's. (2) The distribution of intrinsic flattenings of S0's and spirals derived from the data in RC2 again indicates a significant difference from each other. (3) The distribution of intrinsic flattenings of S0's exhibits different characteristics depending upon the surface-brightness level; the distribution with one component is obtained from the data at RCBG level (--23.5 mag arcsec -2 ) and that with two components at RC2 level (25 mag arcsec -2 ). (author)
Analysis of Pairwise Interactions in a Maximum Likelihood Sense to Identify Leaders in a Group
Directory of Open Access Journals (Sweden)
Violet Mwaffo
2017-07-01
Full Text Available Collective motion in animal groups manifests itself in the form of highly coordinated maneuvers determined by local interactions among individuals. A particularly critical question in understanding the mechanisms behind such interactions is to detect and classify leader–follower relationships within the group. In the technical literature of coupled dynamical systems, several methods have been proposed to reconstruct interaction networks, including linear correlation analysis, transfer entropy, and event synchronization. While these analyses have been helpful in reconstructing network models from neuroscience to public health, rules on the most appropriate method to use for a specific dataset are lacking. Here, we demonstrate the possibility of detecting leaders in a group from raw positional data in a model-free approach that combines multiple methods in a maximum likelihood sense. We test our framework on synthetic data of groups of self-propelled Vicsek particles, where a single agent acts as a leader and both the size of the interaction region and the level of inherent noise are systematically varied. To assess the feasibility of detecting leaders in real-world applications, we study a synthetic dataset of fish shoaling, generated by using a recent data-driven model for social behavior, and an experimental dataset of pharmacologically treated zebrafish. Not only does our approach offer a robust strategy to detect leaders in synthetic data but it also allows for exploring the role of psychoactive compounds on leader–follower relationships.
Analysis of Minute Features in Speckled Imagery with Maximum Likelihood Estimation
Directory of Open Access Journals (Sweden)
Alejandro C. Frery
2004-12-01
Full Text Available This paper deals with numerical problems arising when performing maximum likelihood parameter estimation in speckled imagery using small samples. The noise that appears in images obtained with coherent illumination, as is the case of sonar, laser, ultrasound-B, and synthetic aperture radar, is called speckle, and it can neither be assumed Gaussian nor additive. The properties of speckle noise are well described by the multiplicative model, a statistical framework from which stem several important distributions. Amongst these distributions, one is regarded as the universal model for speckled data, namely, the Ã°ÂÂ’Â¢0 law. This paper deals with amplitude data, so the Ã°ÂÂ’Â¢A0 distribution will be used. The literature reports that techniques for obtaining estimates (maximum likelihood, based on moments and on order statistics of the parameters of the Ã°ÂÂ’Â¢A0 distribution require samples of hundreds, even thousands, of observations in order to obtain sensible values. This is verified for maximum likelihood estimation, and a proposal based on alternate optimization is made to alleviate this situation. The proposal is assessed with real and simulated data, showing that the convergence problems are no longer present. A Monte Carlo experiment is devised to estimate the quality of maximum likelihood estimators in small samples, and real data is successfully analyzed with the proposed alternated procedure. Stylized empirical influence functions are computed and used to choose a strategy for computing maximum likelihood estimates that is resistant to outliers.
Directory of Open Access Journals (Sweden)
Wang Huai-Chun
2009-09-01
Full Text Available Abstract Background The covarion hypothesis of molecular evolution holds that selective pressures on a given amino acid or nucleotide site are dependent on the identity of other sites in the molecule that change throughout time, resulting in changes of evolutionary rates of sites along the branches of a phylogenetic tree. At the sequence level, covarion-like evolution at a site manifests as conservation of nucleotide or amino acid states among some homologs where the states are not conserved in other homologs (or groups of homologs. Covarion-like evolution has been shown to relate to changes in functions at sites in different clades, and, if ignored, can adversely affect the accuracy of phylogenetic inference. Results PROCOV (protein covarion analysis is a software tool that implements a number of previously proposed covarion models of protein evolution for phylogenetic inference in a maximum likelihood framework. Several algorithmic and implementation improvements in this tool over previous versions make computationally expensive tree searches with covarion models more efficient and analyses of large phylogenomic data sets tractable. PROCOV can be used to identify covarion sites by comparing the site likelihoods under the covarion process to the corresponding site likelihoods under a rates-across-sites (RAS process. Those sites with the greatest log-likelihood difference between a 'covarion' and an RAS process were found to be of functional or structural significance in a dataset of bacterial and eukaryotic elongation factors. Conclusion Covarion models implemented in PROCOV may be especially useful for phylogenetic estimation when ancient divergences between sequences have occurred and rates of evolution at sites are likely to have changed over the tree. It can also be used to study lineage-specific functional shifts in protein families that result in changes in the patterns of site variability among subtrees.
Directory of Open Access Journals (Sweden)
Kodner Robin B
2010-10-01
Full Text Available Abstract Background Likelihood-based phylogenetic inference is generally considered to be the most reliable classification method for unknown sequences. However, traditional likelihood-based phylogenetic methods cannot be applied to large volumes of short reads from next-generation sequencing due to computational complexity issues and lack of phylogenetic signal. "Phylogenetic placement," where a reference tree is fixed and the unknown query sequences are placed onto the tree via a reference alignment, is a way to bring the inferential power offered by likelihood-based approaches to large data sets. Results This paper introduces pplacer, a software package for phylogenetic placement and subsequent visualization. The algorithm can place twenty thousand short reads on a reference tree of one thousand taxa per hour per processor, has essentially linear time and memory complexity in the number of reference taxa, and is easy to run in parallel. Pplacer features calculation of the posterior probability of a placement on an edge, which is a statistically rigorous way of quantifying uncertainty on an edge-by-edge basis. It also can inform the user of the positional uncertainty for query sequences by calculating expected distance between placement locations, which is crucial in the estimation of uncertainty with a well-sampled reference tree. The software provides visualizations using branch thickness and color to represent number of placements and their uncertainty. A simulation study using reads generated from 631 COG alignments shows a high level of accuracy for phylogenetic placement over a wide range of alignment diversity, and the power of edge uncertainty estimates to measure placement confidence. Conclusions Pplacer enables efficient phylogenetic placement and subsequent visualization, making likelihood-based phylogenetics methodology practical for large collections of reads; it is freely available as source code, binaries, and a web service.
Statistical analysis of maximum likelihood estimator images of human brain FDG PET studies
International Nuclear Information System (INIS)
Llacer, J.; Veklerov, E.; Hoffman, E.J.; Nunez, J.; Coakley, K.J.
1993-01-01
The work presented in this paper evaluates the statistical characteristics of regional bias and expected error in reconstructions of real PET data of human brain fluorodeoxiglucose (FDG) studies carried out by the maximum likelihood estimator (MLE) method with a robust stopping rule, and compares them with the results of filtered backprojection (FBP) reconstructions and with the method of sieves. The task that the authors have investigated is that of quantifying radioisotope uptake in regions-of-interest (ROI's). They first describe a robust methodology for the use of the MLE method with clinical data which contains only one adjustable parameter: the kernel size for a Gaussian filtering operation that determines final resolution and expected regional error. Simulation results are used to establish the fundamental characteristics of the reconstructions obtained by out methodology, corresponding to the case in which the transition matrix is perfectly known. Then, data from 72 independent human brain FDG scans from four patients are used to show that the results obtained from real data are consistent with the simulation, although the quality of the data and of the transition matrix have an effect on the final outcome
Improved efficiency of maximum likelihood analysis of time series with temporally correlated errors
Langbein, John
2017-08-01
Most time series of geophysical phenomena have temporally correlated errors. From these measurements, various parameters are estimated. For instance, from geodetic measurements of positions, the rates and changes in rates are often estimated and are used to model tectonic processes. Along with the estimates of the size of the parameters, the error in these parameters needs to be assessed. If temporal correlations are not taken into account, or each observation is assumed to be independent, it is likely that any estimate of the error of these parameters will be too low and the estimated value of the parameter will be biased. Inclusion of better estimates of uncertainties is limited by several factors, including selection of the correct model for the background noise and the computational requirements to estimate the parameters of the selected noise model for cases where there are numerous observations. Here, I address the second problem of computational efficiency using maximum likelihood estimates (MLE). Most geophysical time series have background noise processes that can be represented as a combination of white and power-law noise, 1/f^{α } with frequency, f. With missing data, standard spectral techniques involving FFTs are not appropriate. Instead, time domain techniques involving construction and inversion of large data covariance matrices are employed. Bos et al. (J Geod, 2013. doi: 10.1007/s00190-012-0605-0) demonstrate one technique that substantially increases the efficiency of the MLE methods, yet is only an approximate solution for power-law indices >1.0 since they require the data covariance matrix to be Toeplitz. That restriction can be removed by simply forming a data filter that adds noise processes rather than combining them in quadrature. Consequently, the inversion of the data covariance matrix is simplified yet provides robust results for a wider range of power-law indices.
Weigmann, U; Petersen, J
1996-08-01
Visual acuity determination according to DIN 58,220 does not make full use of the information received about the patient, in contrast to the staircase method. Thus, testing the same number of optotypes, the staircase method should yield more reproducible acuity results. On the other hand, the staircase method gives systematically higher acuity values because it converges on the 48% point of the psychometric function (for Landolt rings in eight positions) and not on the 65% probability, as DIN 58,220 with criterion 3/5 does. This bias can be avoided by means of a modified evaluation. Using the staircase data we performed a maximum likelihood estimate of the psychometric function as a whole and computed the acuity value for 65% probability of correct answers. We determined monocular visual acuity in 102 persons with widely differing visual performance. Each subject underwent four tests in random order, two according to DIN 58,220 and two using the modified staircase method (Landolt rings in eight positions scaled by a factor 1.26; PC monitor with 1024 x 768 pixels; distance 4.5 m). Each test was performed with 25 optotypes. The two procedures provide the same mean visual acuity values (difference less than 0.02 acuity steps). The test-retest results match in 30.4% of DIN repetitions but in 50% of the staircases. The standard deviation of the test-retest difference is 1.41 (DIN) and 1.06 (modified staircase) acuity steps. Thus the standard deviation of the single test is 1.0 (DIN) and 0.75 (modified staircase) acuity steps. The new method provides visual acuity values identical to DIN 58,220 but is superior with respect to reproducibility.
International Nuclear Information System (INIS)
Williams, O. R.; Bennett, K.; Much, R.; Schoenfelder, V.; Blom, J. J.; Ryan, J.
1997-01-01
The maximum likelihood-ratio method is frequently used in COMPTEL analysis to determine the significance of a point source at a given location. In this paper we do not consider whether the likelihood-ratio at a particular location indicates a detection, but rather whether distributions of likelihood-ratios derived from many locations depart from that expected for source free data. We have constructed distributions of likelihood-ratios by reading values from standard COMPTEL maximum-likelihood ratio maps at positions corresponding to the locations of different categories of AGN. Distributions derived from the locations of Seyfert galaxies are indistinguishable, according to a Kolmogorov-Smirnov test, from those obtained from ''random'' locations, but differ slightly from those obtained from the locations of flat spectrum radio loud quasars, OVVs, and BL Lac objects. This difference is not due to known COMPTEL sources, since regions near these sources are excluded from the analysis. We suggest that it might arise from a number of sources with fluxes below the COMPTEL detection threshold
Maximum likelihood of phylogenetic networks.
Jin, Guohua; Nakhleh, Luay; Snir, Sagi; Tuller, Tamir
2006-11-01
Horizontal gene transfer (HGT) is believed to be ubiquitous among bacteria, and plays a major role in their genome diversification as well as their ability to develop resistance to antibiotics. In light of its evolutionary significance and implications for human health, developing accurate and efficient methods for detecting and reconstructing HGT is imperative. In this article we provide a new HGT-oriented likelihood framework for many problems that involve phylogeny-based HGT detection and reconstruction. Beside the formulation of various likelihood criteria, we show that most of these problems are NP-hard, and offer heuristics for efficient and accurate reconstruction of HGT under these criteria. We implemented our heuristics and used them to analyze biological as well as synthetic data. In both cases, our criteria and heuristics exhibited very good performance with respect to identifying the correct number of HGT events as well as inferring their correct location on the species tree. Implementation of the criteria as well as heuristics and hardness proofs are available from the authors upon request. Hardness proofs can also be downloaded at http://www.cs.tau.ac.il/~tamirtul/MLNET/Supp-ML.pdf
Grimm, Guido W.; Renner, Susanne S.; Stamatakis, Alexandros; Hemleben, Vera
2007-01-01
The multi-copy internal transcribed spacer (ITS) region of nuclear ribosomal DNA is widely used to infer phylogenetic relationships among closely related taxa. Here we use maximum likelihood (ML) and splits graph analyses to extract phylogenetic information from ~ 600 mostly cloned ITS sequences, representing 81 species and subspecies of Acer, and both species of its sister Dipteronia. Additional analyses compared sequence motifs in Acer and several hundred Anacardiaceae, Burseraceae, Meliaceae, Rutaceae, and Sapindaceae ITS sequences in GenBank. We also assessed the effects of using smaller data sets of consensus sequences with ambiguity coding (accounting for within-species variation) instead of the full (partly redundant) original sequences. Neighbor-nets and bipartition networks were used to visualize conflict among character state patterns. Species clusters observed in the trees and networks largely agree with morphology-based classifications; of de Jong’s (1994) 16 sections, nine are supported in neighbor-net and bipartition networks, and ten by sequence motifs and the ML tree; of his 19 series, 14 are supported in networks, motifs, and the ML tree. Most nodes had higher bootstrap support with matrices of 105 or 40 consensus sequences than with the original matrix. Within-taxon ITS divergence did not differ between diploid and polyploid Acer, and there was little evidence of differentiated parental ITS haplotypes, suggesting that concerted evolution in Acer acts rapidly. PMID:19455198
Directory of Open Access Journals (Sweden)
Guido W. Grimm
2006-01-01
Full Text Available The multi-copy internal transcribed spacer (ITS region of nuclear ribosomal DNA is widely used to infer phylogenetic relationships among closely related taxa. Here we use maximum likelihood (ML and splits graph analyses to extract phylogenetic information from ~ 600 mostly cloned ITS sequences, representing 81 species and subspecies of Acer, and both species of its sister Dipteronia. Additional analyses compared sequence motifs in Acer and several hundred Anacardiaceae, Burseraceae, Meliaceae, Rutaceae, and Sapindaceae ITS sequences in GenBank. We also assessed the effects of using smaller data sets of consensus sequences with ambiguity coding (accounting for within-species variation instead of the full (partly redundant original sequences. Neighbor-nets and bipartition networks were used to visualize conflict among character state patterns. Species clusters observed in the trees and networks largely agree with morphology-based classifications; of de Jong’s (1994 16 sections, nine are supported in neighbor-net and bipartition networks, and ten by sequence motifs and the ML tree; of his 19 series, 14 are supported in networks, motifs, and the ML tree. Most nodes had higher bootstrap support with matrices of 105 or 40 consensus sequences than with the original matrix. Within-taxon ITS divergence did not differ between diploid and polyploid Acer, and there was little evidence of differentiated parental ITS haplotypes, suggesting that concerted evolution in Acer acts rapidly.
Can, Seda; van de Schoot, Rens|info:eu-repo/dai/nl/304833207; Hox, Joop|info:eu-repo/dai/nl/073351431
2015-01-01
Because variables may be correlated in the social and behavioral sciences, multicollinearity might be problematic. This study investigates the effect of collinearity manipulated in within and between levels of a two-level confirmatory factor analysis by Monte Carlo simulation. Furthermore, the
Tom, C. H.; Miller, L. D.
1984-01-01
The Bayesian maximum likelihood parametric classifier has been tested against the data-based formulation designated 'linear discrimination analysis', using the 'GLIKE' decision and "CLASSIFY' classification algorithms in the Landsat Mapping System. Identical supervised training sets, USGS land use/land cover classes, and various combinations of Landsat image and ancilliary geodata variables, were used to compare the algorithms' thematic mapping accuracy on a single-date summer subscene, with a cellularized USGS land use map of the same time frame furnishing the ground truth reference. CLASSIFY, which accepts a priori class probabilities, is found to be more accurate than GLIKE, which assumes equal class occurrences, for all three mapping variable sets and both levels of detail. These results may be generalized to direct accuracy, time, cost, and flexibility advantages of linear discriminant analysis over Bayesian methods.
DEFF Research Database (Denmark)
De Carvalho, Elisabeth; Omar, Samir; Slock, Dirk
2013-01-01
We analyze two algorithms that have been introduced previously for Deterministic Maximum Likelihood (DML) blind estimation of multiple FIR channels. The first one is a modification of the Iterative Quadratic ML (IQML) algorithm. IQML gives biased estimates of the channel and performs poorly at low...... to the initialization. Its asymptotic performance does not reach the DML performance though. The second strategy, called Pseudo-Quadratic ML (PQML), is naturally denoised. The denoising in PQML is furthermore more efficient than in DIQML: PQML yields the same asymptotic performance as DML, as opposed to DIQML......, but requires a consistent initialization. We furthermore compare DIQML and PQML to the strategy of alternating minimization w.r.t. symbols and channel for solving DML (AQML). An asymptotic performance analysis, a complexity evaluation and simulation results are also presented. The proposed DIQML and PQML...
International Nuclear Information System (INIS)
Metz, C.E.; Tokars, R.P.; Kronman, H.B.; Griem, M.L.
1982-01-01
A Therapeutic Operating Characteristic (TOC) curve for radiation therapy plots, for all possible treatment doses, the probability of tumor ablation as a function of the probability of radiation-induced complication. Application of this analysis to actual therapeutic situation requires that dose-response curves for ablation and for complication be estimated from clinical data. We describe an approach in which ''maximum likelihood estimates'' of these dose-response curves are made, and we apply this approach to data collected on responses to radiotherapy for carcinoma of the nasopharynx. TOC curves constructed from the estimated dose-response curves are subject to moderately large uncertainties because of the limitations of available data.These TOC curves suggest, however, that treatment doses greater than 1800 rem may substantially increase the probability of tumor ablation with little increase in the risk of radiation-induced cervical myelopathy, especially for T1 and T2 tumors
Yang, Li; Wang, Guobao; Qi, Jinyi
2016-04-01
Detecting cancerous lesions is a major clinical application of emission tomography. In a previous work, we studied penalized maximum-likelihood (PML) image reconstruction for lesion detection in static PET. Here we extend our theoretical analysis of static PET reconstruction to dynamic PET. We study both the conventional indirect reconstruction and direct reconstruction for Patlak parametric image estimation. In indirect reconstruction, Patlak parametric images are generated by first reconstructing a sequence of dynamic PET images, and then performing Patlak analysis on the time activity curves (TACs) pixel-by-pixel. In direct reconstruction, Patlak parametric images are estimated directly from raw sinogram data by incorporating the Patlak model into the image reconstruction procedure. PML reconstruction is used in both the indirect and direct reconstruction methods. We use a channelized Hotelling observer (CHO) to assess lesion detectability in Patlak parametric images. Simplified expressions for evaluating the lesion detectability have been derived and applied to the selection of the regularization parameter value to maximize detection performance. The proposed method is validated using computer-based Monte Carlo simulations. Good agreements between the theoretical predictions and the Monte Carlo results are observed. Both theoretical predictions and Monte Carlo simulation results show the benefit of the indirect and direct methods under optimized regularization parameters in dynamic PET reconstruction for lesion detection, when compared with the conventional static PET reconstruction.
Energy Technology Data Exchange (ETDEWEB)
Price, Oliver R., E-mail: oliver.price@unilever.co [Warwick-HRI, University of Warwick, Wellesbourne, Warwick, CV32 6EF (United Kingdom); University of Reading, Soil Science Department, Whiteknights, Reading, RG6 6UR (United Kingdom); Oliver, Margaret A. [University of Reading, Soil Science Department, Whiteknights, Reading, RG6 6UR (United Kingdom); Walker, Allan [Warwick-HRI, University of Warwick, Wellesbourne, Warwick, CV32 6EF (United Kingdom); Wood, Martin [University of Reading, Soil Science Department, Whiteknights, Reading, RG6 6UR (United Kingdom)
2009-05-15
An unbalanced nested sampling design was used to investigate the spatial scale of soil and herbicide interactions at the field scale. A hierarchical analysis of variance based on residual maximum likelihood (REML) was used to analyse the data and provide a first estimate of the variogram. Soil samples were taken at 108 locations at a range of separating distances in a 9 ha field to explore small and medium scale spatial variation. Soil organic matter content, pH, particle size distribution, microbial biomass and the degradation and sorption of the herbicide, isoproturon, were determined for each soil sample. A large proportion of the spatial variation in isoproturon degradation and sorption occurred at sampling intervals less than 60 m, however, the sampling design did not resolve the variation present at scales greater than this. A sampling interval of 20-25 m should ensure that the main spatial structures are identified for isoproturon degradation rate and sorption without too great a loss of information in this field. - Estimating the spatial scale of herbicide and soil interactions by nested sampling.
International Nuclear Information System (INIS)
Price, Oliver R.; Oliver, Margaret A.; Walker, Allan; Wood, Martin
2009-01-01
An unbalanced nested sampling design was used to investigate the spatial scale of soil and herbicide interactions at the field scale. A hierarchical analysis of variance based on residual maximum likelihood (REML) was used to analyse the data and provide a first estimate of the variogram. Soil samples were taken at 108 locations at a range of separating distances in a 9 ha field to explore small and medium scale spatial variation. Soil organic matter content, pH, particle size distribution, microbial biomass and the degradation and sorption of the herbicide, isoproturon, were determined for each soil sample. A large proportion of the spatial variation in isoproturon degradation and sorption occurred at sampling intervals less than 60 m, however, the sampling design did not resolve the variation present at scales greater than this. A sampling interval of 20-25 m should ensure that the main spatial structures are identified for isoproturon degradation rate and sorption without too great a loss of information in this field. - Estimating the spatial scale of herbicide and soil interactions by nested sampling.
International Nuclear Information System (INIS)
Lowrey, Justin D.; Biegalski, Steven R.F.
2012-01-01
The spectrum deconvolution analysis tool (SDAT) software code was written and tested at The University of Texas at Austin utilizing the standard spectrum technique to determine activity levels of Xe-131m, Xe-133m, Xe-133, and Xe-135 in β–γ coincidence spectra. SDAT was originally written to utilize the method of least-squares to calculate the activity of each radionuclide component in the spectrum. Recently, maximum likelihood estimation was also incorporated into the SDAT tool. This is a robust statistical technique to determine the parameters that maximize the Poisson distribution likelihood function of the sample data. In this case it is used to parameterize the activity level of each of the radioxenon components in the spectra. A new test dataset was constructed utilizing Xe-131m placed on a Xe-133 background to compare the robustness of the least-squares and maximum likelihood estimation methods for low counting statistics data. The Xe-131m spectra were collected independently from the Xe-133 spectra and added to generate the spectra in the test dataset. The true independent counts of Xe-131m and Xe-133 are known, as they were calculated before the spectra were added together. Spectra with both high and low counting statistics are analyzed. Studies are also performed by analyzing only the 30 keV X-ray region of the β–γ coincidence spectra. Results show that maximum likelihood estimation slightly outperforms least-squares for low counting statistics data.
Approximate maximum parsimony and ancestral maximum likelihood.
Alon, Noga; Chor, Benny; Pardi, Fabio; Rapoport, Anat
2010-01-01
We explore the maximum parsimony (MP) and ancestral maximum likelihood (AML) criteria in phylogenetic tree reconstruction. Both problems are NP-hard, so we seek approximate solutions. We formulate the two problems as Steiner tree problems under appropriate distances. The gist of our approach is the succinct characterization of Steiner trees for a small number of leaves for the two distances. This enables the use of known Steiner tree approximation algorithms. The approach leads to a 16/9 approximation ratio for AML and asymptotically to a 1.55 approximation ratio for MP.
Maximum-Likelihood Detection Of Noncoherent CPM
Divsalar, Dariush; Simon, Marvin K.
1993-01-01
Simplified detectors proposed for use in maximum-likelihood-sequence detection of symbols in alphabet of size M transmitted by uncoded, full-response continuous phase modulation over radio channel with additive white Gaussian noise. Structures of receivers derived from particular interpretation of maximum-likelihood metrics. Receivers include front ends, structures of which depends only on M, analogous to those in receivers of coherent CPM. Parts of receivers following front ends have structures, complexity of which would depend on N.
International Nuclear Information System (INIS)
Reinaldo Gonzalez; Scott R. Reeves; Eric Eslinger
2007-01-01
, with high vertical resolution, could be generated for many wells. This procedure permits to populate any well location with core-scale estimates of P and P and rock types facilitating the application of geostatistical characterization methods. The first step procedure was to discriminate rock types of similar depositional environment and/or reservoir quality (RQ) using a specific clustering technique. The approach implemented utilized a model-based, probabilistic clustering analysis procedure called GAMLS1,2,3,4 (Geologic Analysis via Maximum Likelihood System) which is based on maximum likelihood principles. During clustering, samples (data at each digitized depth from each well) are probabilistically assigned to a previously specified number of clusters with a fractional probability that varies between zero and one
phangorn: phylogenetic analysis in R.
Schliep, Klaus Peter
2011-02-15
phangorn is a package for phylogenetic reconstruction and analysis in the R language. Previously it was only possible to estimate phylogenetic trees with distance methods in R. phangorn, now offers the possibility of reconstructing phylogenies with distance based methods, maximum parsimony or maximum likelihood (ML) and performing Hadamard conjugation. Extending the general ML framework, this package provides the possibility of estimating mixture and partition models. Furthermore, phangorn offers several functions for comparing trees, phylogenetic models or splits, simulating character data and performing congruence analyses. phangorn can be obtained through the CRAN homepage http://cran.r-project.org/web/packages/phangorn/index.html. phangorn is licensed under GPL 2.
Maximum likelihood window for time delay estimation
International Nuclear Information System (INIS)
Lee, Young Sup; Yoon, Dong Jin; Kim, Chi Yup
2004-01-01
Time delay estimation for the detection of leak location in underground pipelines is critically important. Because the exact leak location depends upon the precision of the time delay between sensor signals due to leak noise and the speed of elastic waves, the research on the estimation of time delay has been one of the key issues in leak lovating with the time arrival difference method. In this study, an optimal Maximum Likelihood window is considered to obtain a better estimation of the time delay. This method has been proved in experiments, which can provide much clearer and more precise peaks in cross-correlation functions of leak signals. The leak location error has been less than 1 % of the distance between sensors, for example the error was not greater than 3 m for 300 m long underground pipelines. Apart from the experiment, an intensive theoretical analysis in terms of signal processing has been described. The improved leak locating with the suggested method is due to the windowing effect in frequency domain, which offers a weighting in significant frequencies.
MXLKID: a maximum likelihood parameter identifier
International Nuclear Information System (INIS)
Gavel, D.T.
1980-07-01
MXLKID (MaXimum LiKelihood IDentifier) is a computer program designed to identify unknown parameters in a nonlinear dynamic system. Using noisy measurement data from the system, the maximum likelihood identifier computes a likelihood function (LF). Identification of system parameters is accomplished by maximizing the LF with respect to the parameters. The main body of this report briefly summarizes the maximum likelihood technique and gives instructions and examples for running the MXLKID program. MXLKID is implemented LRLTRAN on the CDC7600 computer at LLNL. A detailed mathematical description of the algorithm is given in the appendices. 24 figures, 6 tables
Balzer, Laura B; Zheng, Wenjing; van der Laan, Mark J; Petersen, Maya L
2018-01-01
We often seek to estimate the impact of an exposure naturally occurring or randomly assigned at the cluster-level. For example, the literature on neighborhood determinants of health continues to grow. Likewise, community randomized trials are applied to learn about real-world implementation, sustainability, and population effects of interventions with proven individual-level efficacy. In these settings, individual-level outcomes are correlated due to shared cluster-level factors, including the exposure, as well as social or biological interactions between individuals. To flexibly and efficiently estimate the effect of a cluster-level exposure, we present two targeted maximum likelihood estimators (TMLEs). The first TMLE is developed under a non-parametric causal model, which allows for arbitrary interactions between individuals within a cluster. These interactions include direct transmission of the outcome (i.e. contagion) and influence of one individual's covariates on another's outcome (i.e. covariate interference). The second TMLE is developed under a causal sub-model assuming the cluster-level and individual-specific covariates are sufficient to control for confounding. Simulations compare the alternative estimators and illustrate the potential gains from pairing individual-level risk factors and outcomes during estimation, while avoiding unwarranted assumptions. Our results suggest that estimation under the sub-model can result in bias and misleading inference in an observational setting. Incorporating working assumptions during estimation is more robust than assuming they hold in the underlying causal model. We illustrate our approach with an application to HIV prevention and treatment.
Maximum likelihood estimation for integrated diffusion processes
DEFF Research Database (Denmark)
Baltazar-Larios, Fernando; Sørensen, Michael
We propose a method for obtaining maximum likelihood estimates of parameters in diffusion models when the data is a discrete time sample of the integral of the process, while no direct observations of the process itself are available. The data are, moreover, assumed to be contaminated...... EM-algorithm to obtain maximum likelihood estimates of the parameters in the diffusion model. As part of the algorithm, we use a recent simple method for approximate simulation of diffusion bridges. In simulation studies for the Ornstein-Uhlenbeck process and the CIR process the proposed method works...... by measurement errors. Integrated volatility is an example of this type of observations. Another example is ice-core data on oxygen isotopes used to investigate paleo-temperatures. The data can be viewed as incomplete observations of a model with a tractable likelihood function. Therefore we propose a simulated...
Energy Technology Data Exchange (ETDEWEB)
Driscoll, Donald D [Case Western Reserve Univ., Cleveland, OH (United States)
2004-05-01
of a beta-eliminating cut based on a maximum-likelihood characterization described above.
Avtandilashvili, Maia; Brey, Richard; James, Anthony C
2012-07-01
The U.S. Transuranium and Uranium Registries' tissue donors 0202 and 0407 are the two most highly exposed of the 18 registrants who were involved in the 1965 plutonium fire accident at a defense nuclear facility. Material released during the fire was well characterized as "high fired" refractory plutonium dioxide with 0.32-μm mass median diameter. The extensive bioassay data from long-term follow-up of these two cases were used to evaluate the applicability of the Human Respiratory Tract Model presented by International Commission on Radiological Protection in Publication 66 and its revision proposed by Gregoratto et al. in order to account for the observed long-term retention of insoluble material in the lungs. The maximum likelihood method was used to calculate the point estimates of intake and tissue doses and to examine the effect of different lung clearance, blood absorption, and systemic models on the goodness-of-fit and estimated dose values. With appropriate adjustments, Gregoratto et al. particle transport model coupled with the customized blood absorption parameters yielded a credible fit to the bioassay data for both cases and predicted the Case 0202 liver and skeletal activities measured postmortem. PuO2 particles produced by the plutonium fire are extremely insoluble. About 1% of this material is absorbed from the respiratory tract relatively rapidly, at a rate of about 1 to 2 d (half-time about 8 to 16 h). The remainder (99%) is absorbed extremely slowly, at a rate of about 5 × 10(-6) d (half-time about 400 y). When considering this situation, it appears that doses to other body organs are negligible in comparison to those to tissues of the respiratory tract. About 96% of the total committed weighted dose equivalent is contributed by the lungs. Doses absorbed by these workers' lungs were high: 3.2 Gy to AI and 6.5 Gy to LNTH for Case 0202 (18 y post-intake) and 3.2 Gy to AI and 55.5 Gy to LNTH for Case 0407 (43 y post-intake). This evaluation
Multi-Channel Maximum Likelihood Pitch Estimation
DEFF Research Database (Denmark)
Christensen, Mads Græsbøll
2012-01-01
In this paper, a method for multi-channel pitch estimation is proposed. The method is a maximum likelihood estimator and is based on a parametric model where the signals in the various channels share the same fundamental frequency but can have different amplitudes, phases, and noise characteristics....... This essentially means that the model allows for different conditions in the various channels, like different signal-to-noise ratios, microphone characteristics and reverberation. Moreover, the method does not assume that a certain array structure is used but rather relies on a more general model and is hence...
Maximum Likelihood Reconstruction for Magnetic Resonance Fingerprinting.
Zhao, Bo; Setsompop, Kawin; Ye, Huihui; Cauley, Stephen F; Wald, Lawrence L
2016-08-01
This paper introduces a statistical estimation framework for magnetic resonance (MR) fingerprinting, a recently proposed quantitative imaging paradigm. Within this framework, we present a maximum likelihood (ML) formalism to estimate multiple MR tissue parameter maps directly from highly undersampled, noisy k-space data. A novel algorithm, based on variable splitting, the alternating direction method of multipliers, and the variable projection method, is developed to solve the resulting optimization problem. Representative results from both simulations and in vivo experiments demonstrate that the proposed approach yields significantly improved accuracy in parameter estimation, compared to the conventional MR fingerprinting reconstruction. Moreover, the proposed framework provides new theoretical insights into the conventional approach. We show analytically that the conventional approach is an approximation to the ML reconstruction; more precisely, it is exactly equivalent to the first iteration of the proposed algorithm for the ML reconstruction, provided that a gridding reconstruction is used as an initialization.
Modelling maximum likelihood estimation of availability
International Nuclear Information System (INIS)
Waller, R.A.; Tietjen, G.L.; Rock, G.W.
1975-01-01
Suppose the performance of a nuclear powered electrical generating power plant is continuously monitored to record the sequence of failure and repairs during sustained operation. The purpose of this study is to assess one method of estimating the performance of the power plant when the measure of performance is availability. That is, we determine the probability that the plant is operational at time t. To study the availability of a power plant, we first assume statistical models for the variables, X and Y, which denote the time-to-failure and the time-to-repair variables, respectively. Once those statistical models are specified, the availability, A(t), can be expressed as a function of some or all of their parameters. Usually those parameters are unknown in practice and so A(t) is unknown. This paper discusses the maximum likelihood estimator of A(t) when the time-to-failure model for X is an exponential density with parameter, lambda, and the time-to-repair model for Y is an exponential density with parameter, theta. Under the assumption of exponential models for X and Y, it follows that the instantaneous availability at time t is A(t)=lambda/(lambda+theta)+theta/(lambda+theta)exp[-[(1/lambda)+(1/theta)]t] with t>0. Also, the steady-state availability is A(infinity)=lambda/(lambda+theta). We use the observations from n failure-repair cycles of the power plant, say X 1 , X 2 , ..., Xsub(n), Y 1 , Y 2 , ..., Ysub(n) to present the maximum likelihood estimators of A(t) and A(infinity). The exact sampling distributions for those estimators and some statistical properties are discussed before a simulation model is used to determine 95% simulation intervals for A(t). The methodology is applied to two examples which approximate the operating history of two nuclear power plants. (author)
Lehmann, A; Scheffler, Ch; Hermanussen, M
2010-02-01
Recent progress in modelling individual growth has been achieved by combining the principal component analysis and the maximum likelihood principle. This combination models growth even in incomplete sets of data and in data obtained at irregular intervals. We re-analysed late 18th century longitudinal growth of German boys from the boarding school Carlsschule in Stuttgart. The boys, aged 6-23 years, were measured at irregular 3-12 monthly intervals during the period 1771-1793. At the age of 18 years, mean height was 1652 mm, but height variation was large. The shortest boy reached 1474 mm, the tallest 1826 mm. Measured height closely paralleled modelled height, with mean difference of 4 mm, SD 7 mm. Seasonal height variation was found. Low growth rates occurred in spring and high growth rates in summer and autumn. The present study demonstrates that combining the principal component analysis and the maximum likelihood principle enables growth modelling in historic height data also. Copyright (c) 2009 Elsevier GmbH. All rights reserved.
Maximum likelihood convolutional decoding (MCD) performance due to system losses
Webster, L.
1976-01-01
A model for predicting the computational performance of a maximum likelihood convolutional decoder (MCD) operating in a noisy carrier reference environment is described. This model is used to develop a subroutine that will be utilized by the Telemetry Analysis Program to compute the MCD bit error rate. When this computational model is averaged over noisy reference phase errors using a high-rate interpolation scheme, the results are found to agree quite favorably with experimental measurements.
Cases in which ancestral maximum likelihood will be confusingly misleading.
Handelman, Tomer; Chor, Benny
2017-05-07
Ancestral maximum likelihood (AML) is a phylogenetic tree reconstruction criteria that "lies between" maximum parsimony (MP) and maximum likelihood (ML). ML has long been known to be statistically consistent. On the other hand, Felsenstein (1978) showed that MP is statistically inconsistent, and even positively misleading: There are cases where the parsimony criteria, applied to data generated according to one tree topology, will be optimized on a different tree topology. The question of weather AML is statistically consistent or not has been open for a long time. Mossel et al. (2009) have shown that AML can "shrink" short tree edges, resulting in a star tree with no internal resolution, which yields a better AML score than the original (resolved) model. This result implies that AML is statistically inconsistent, but not that it is positively misleading, because the star tree is compatible with any other topology. We show that AML is confusingly misleading: For some simple, four taxa (resolved) tree, the ancestral likelihood optimization criteria is maximized on an incorrect (resolved) tree topology, as well as on a star tree (both with specific edge lengths), while the tree with the original, correct topology, has strictly lower ancestral likelihood. Interestingly, the two short edges in the incorrect, resolved tree topology are of length zero, and are not adjacent, so this resolved tree is in fact a simple path. While for MP, the underlying phenomenon can be described as long edge attraction, it turns out that here we have long edge repulsion. Copyright © 2017. Published by Elsevier Ltd.
A maximum likelihood framework for protein design
Directory of Open Access Journals (Sweden)
Philippe Hervé
2006-06-01
Full Text Available Abstract Background The aim of protein design is to predict amino-acid sequences compatible with a given target structure. Traditionally envisioned as a purely thermodynamic question, this problem can also be understood in a wider context, where additional constraints are captured by learning the sequence patterns displayed by natural proteins of known conformation. In this latter perspective, however, we still need a theoretical formalization of the question, leading to general and efficient learning methods, and allowing for the selection of fast and accurate objective functions quantifying sequence/structure compatibility. Results We propose a formulation of the protein design problem in terms of model-based statistical inference. Our framework uses the maximum likelihood principle to optimize the unknown parameters of a statistical potential, which we call an inverse potential to contrast with classical potentials used for structure prediction. We propose an implementation based on Markov chain Monte Carlo, in which the likelihood is maximized by gradient descent and is numerically estimated by thermodynamic integration. The fit of the models is evaluated by cross-validation. We apply this to a simple pairwise contact potential, supplemented with a solvent-accessibility term, and show that the resulting models have a better predictive power than currently available pairwise potentials. Furthermore, the model comparison method presented here allows one to measure the relative contribution of each component of the potential, and to choose the optimal number of accessibility classes, which turns out to be much higher than classically considered. Conclusion Altogether, this reformulation makes it possible to test a wide diversity of models, using different forms of potentials, or accounting for other factors than just the constraint of thermodynamic stability. Ultimately, such model-based statistical analyses may help to understand the forces
Parallel implementation of D-Phylo algorithm for maximum likelihood clusters.
Malik, Shamita; Sharma, Dolly; Khatri, Sunil Kumar
2017-03-01
This study explains a newly developed parallel algorithm for phylogenetic analysis of DNA sequences. The newly designed D-Phylo is a more advanced algorithm for phylogenetic analysis using maximum likelihood approach. The D-Phylo while misusing the seeking capacity of k -means keeps away from its real constraint of getting stuck at privately conserved motifs. The authors have tested the behaviour of D-Phylo on Amazon Linux Amazon Machine Image(Hardware Virtual Machine)i2.4xlarge, six central processing unit, 122 GiB memory, 8 × 800 Solid-state drive Elastic Block Store volume, high network performance up to 15 processors for several real-life datasets. Distributing the clusters evenly on all the processors provides us the capacity to accomplish a near direct speed if there should arise an occurrence of huge number of processors.
Maximum likelihood estimation of phase-type distributions
DEFF Research Database (Denmark)
Esparza, Luz Judith R
for both univariate and multivariate cases. Methods like the EM algorithm and Markov chain Monte Carlo are applied for this purpose. Furthermore, this thesis provides explicit formulae for computing the Fisher information matrix for discrete and continuous phase-type distributions, which is needed to find......This work is concerned with the statistical inference of phase-type distributions and the analysis of distributions with rational Laplace transform, known as matrix-exponential distributions. The thesis is focused on the estimation of the maximum likelihood parameters of phase-type distributions...... confidence regions for their estimated parameters. Finally, a new general class of distributions, called bilateral matrix-exponential distributions, is defined. These distributions have the entire real line as domain and can be used, for instance, for modelling. In addition, this class of distributions...
Maximum Likelihood Blood Velocity Estimator Incorporating Properties of Flow Physics
DEFF Research Database (Denmark)
Schlaikjer, Malene; Jensen, Jørgen Arendt
2004-01-01
)-data under investigation. The flow physic properties are exploited in the second term, as the range of velocity values investigated in the cross-correlation analysis are compared to the velocity estimates in the temporal and spatial neighborhood of the signal segment under investigation. The new estimator...... has been compared to the cross-correlation (CC) estimator and the previously developed maximum likelihood estimator (MLE). The results show that the CMLE can handle a larger velocity search range and is capable of estimating even low velocity levels from tissue motion. The CC and the MLE produce...... for the CC and the MLE. When the velocity search range is set to twice the limit of the CC and the MLE, the number of incorrect velocity estimates are 0, 19.1, and 7.2% for the CMLE, CC, and MLE, respectively. The ability to handle a larger search range and estimating low velocity levels was confirmed...
Maximum likelihood estimation of finite mixture model for economic data
Phoong, Seuk-Yen; Ismail, Mohd Tahir
2014-06-01
Finite mixture model is a mixture model with finite-dimension. This models are provides a natural representation of heterogeneity in a finite number of latent classes. In addition, finite mixture models also known as latent class models or unsupervised learning models. Recently, maximum likelihood estimation fitted finite mixture models has greatly drawn statistician's attention. The main reason is because maximum likelihood estimation is a powerful statistical method which provides consistent findings as the sample sizes increases to infinity. Thus, the application of maximum likelihood estimation is used to fit finite mixture model in the present paper in order to explore the relationship between nonlinear economic data. In this paper, a two-component normal mixture model is fitted by maximum likelihood estimation in order to investigate the relationship among stock market price and rubber price for sampled countries. Results described that there is a negative effect among rubber price and stock market price for Malaysia, Thailand, Philippines and Indonesia.
Narrow band interference cancelation in OFDM: Astructured maximum likelihood approach
Sohail, Muhammad Sadiq; Al-Naffouri, Tareq Y.; Al-Ghadhban, Samir N.
2012-01-01
This paper presents a maximum likelihood (ML) approach to mitigate the effect of narrow band interference (NBI) in a zero padded orthogonal frequency division multiplexing (ZP-OFDM) system. The NBI is assumed to be time variant and asynchronous
Maximum likelihood estimation for cytogenetic dose-response curves
International Nuclear Information System (INIS)
Frome, E.L; DuFrain, R.J.
1983-10-01
In vitro dose-response curves are used to describe the relation between the yield of dicentric chromosome aberrations and radiation dose for human lymphocytes. The dicentric yields follow the Poisson distribution, and the expected yield depends on both the magnitude and the temporal distribution of the dose for low LET radiation. A general dose-response model that describes this relation has been obtained by Kellerer and Rossi using the theory of dual radiation action. The yield of elementary lesions is kappa[γd + g(t, tau)d 2 ], where t is the time and d is dose. The coefficient of the d 2 term is determined by the recovery function and the temporal mode of irradiation. Two special cases of practical interest are split-dose and continuous exposure experiments, and the resulting models are intrinsically nonlinear in the parameters. A general purpose maximum likelihood estimation procedure is described and illustrated with numerical examples from both experimental designs. Poisson regression analysis is used for estimation, hypothesis testing, and regression diagnostics. Results are discussed in the context of exposure assessment procedures for both acute and chronic human radiation exposure
Maximum likelihood estimation for cytogenetic dose-response curves
International Nuclear Information System (INIS)
Frome, E.L.; DuFrain, R.J.
1986-01-01
In vitro dose-response curves are used to describe the relation between chromosome aberrations and radiation dose for human lymphocytes. The lymphocytes are exposed to low-LET radiation, and the resulting dicentric chromosome aberrations follow the Poisson distribution. The expected yield depends on both the magnitude and the temporal distribution of the dose. A general dose-response model that describes this relation has been presented by Kellerer and Rossi (1972, Current Topics on Radiation Research Quarterly 8, 85-158; 1978, Radiation Research 75, 471-488) using the theory of dual radiation action. Two special cases of practical interest are split-dose and continuous exposure experiments, and the resulting dose-time-response models are intrinsically nonlinear in the parameters. A general-purpose maximum likelihood estimation procedure is described, and estimation for the nonlinear models is illustrated with numerical examples from both experimental designs. Poisson regression analysis is used for estimation, hypothesis testing, and regression diagnostics. Results are discussed in the context of exposure assessment procedures for both acute and chronic human radiation exposure
Maximum likelihood estimation for cytogenetic dose-response curves
Energy Technology Data Exchange (ETDEWEB)
Frome, E.L; DuFrain, R.J.
1983-10-01
In vitro dose-response curves are used to describe the relation between the yield of dicentric chromosome aberrations and radiation dose for human lymphocytes. The dicentric yields follow the Poisson distribution, and the expected yield depends on both the magnitude and the temporal distribution of the dose for low LET radiation. A general dose-response model that describes this relation has been obtained by Kellerer and Rossi using the theory of dual radiation action. The yield of elementary lesions is kappa(..gamma..d + g(t, tau)d/sup 2/), where t is the time and d is dose. The coefficient of the d/sup 2/ term is determined by the recovery function and the temporal mode of irradiation. Two special cases of practical interest are split-dose and continuous exposure experiments, and the resulting models are intrinsically nonlinear in the parameters. A general purpose maximum likelihood estimation procedure is described and illustrated with numerical examples from both experimental designs. Poisson regression analysis is used for estimation, hypothesis testing, and regression diagnostics. Results are discussed in the context of exposure assessment procedures for both acute and chronic human radiation exposure.
Maximum likelihood estimation of the attenuated ultrasound pulse
DEFF Research Database (Denmark)
Rasmussen, Klaus Bolding
1994-01-01
The attenuated ultrasound pulse is divided into two parts: a stationary basic pulse and a nonstationary attenuation pulse. A standard ARMA model is used for the basic pulse, and a nonstandard ARMA model is derived for the attenuation pulse. The maximum likelihood estimator of the attenuated...
MAXIMUM-LIKELIHOOD-ESTIMATION OF THE ENTROPY OF AN ATTRACTOR
SCHOUTEN, JC; TAKENS, F; VANDENBLEEK, CM
In this paper, a maximum-likelihood estimate of the (Kolmogorov) entropy of an attractor is proposed that can be obtained directly from a time series. Also, the relative standard deviation of the entropy estimate is derived; it is dependent on the entropy and on the number of samples used in the
Adaptive Unscented Kalman Filter using Maximum Likelihood Estimation
DEFF Research Database (Denmark)
Mahmoudi, Zeinab; Poulsen, Niels Kjølstad; Madsen, Henrik
2017-01-01
The purpose of this study is to develop an adaptive unscented Kalman filter (UKF) by tuning the measurement noise covariance. We use the maximum likelihood estimation (MLE) and the covariance matching (CM) method to estimate the noise covariance. The multi-step prediction errors generated...
Multilevel maximum likelihood estimation with application to covariance matrices
Czech Academy of Sciences Publication Activity Database
Turčičová, Marie; Mandel, J.; Eben, Kryštof
Published online: 23 January ( 2018 ) ISSN 0361-0926 R&D Projects: GA ČR GA13-34856S Institutional support: RVO:67985807 Keywords : Fisher information * High dimension * Hierarchical maximum likelihood * Nested parameter spaces * Spectral diagonal covariance model * Sparse inverse covariance model Subject RIV: BB - Applied Statistics, Operational Research Impact factor: 0.311, year: 2016
Maximum Likelihood Blind Channel Estimation for Space-Time Coding Systems
Directory of Open Access Journals (Sweden)
Hakan A. Çırpan
2002-05-01
Full Text Available Sophisticated signal processing techniques have to be developed for capacity enhancement of future wireless communication systems. In recent years, space-time coding is proposed to provide significant capacity gains over the traditional communication systems in fading wireless channels. Space-time codes are obtained by combining channel coding, modulation, transmit diversity, and optional receive diversity in order to provide diversity at the receiver and coding gain without sacrificing the bandwidth. In this paper, we consider the problem of blind estimation of space-time coded signals along with the channel parameters. Both conditional and unconditional maximum likelihood approaches are developed and iterative solutions are proposed. The conditional maximum likelihood algorithm is based on iterative least squares with projection whereas the unconditional maximum likelihood approach is developed by means of finite state Markov process modelling. The performance analysis issues of the proposed methods are studied. Finally, some simulation results are presented.
Design of Simplified Maximum-Likelihood Receivers for Multiuser CPM Systems
Directory of Open Access Journals (Sweden)
Li Bing
2014-01-01
Full Text Available A class of simplified maximum-likelihood receivers designed for continuous phase modulation based multiuser systems is proposed. The presented receiver is built upon a front end employing mismatched filters and a maximum-likelihood detector defined in a low-dimensional signal space. The performance of the proposed receivers is analyzed and compared to some existing receivers. Some schemes are designed to implement the proposed receivers and to reveal the roles of different system parameters. Analysis and numerical results show that the proposed receivers can approach the optimum multiuser receivers with significantly (even exponentially in some cases reduced complexity and marginal performance degradation.
Design of simplified maximum-likelihood receivers for multiuser CPM systems.
Bing, Li; Bai, Baoming
2014-01-01
A class of simplified maximum-likelihood receivers designed for continuous phase modulation based multiuser systems is proposed. The presented receiver is built upon a front end employing mismatched filters and a maximum-likelihood detector defined in a low-dimensional signal space. The performance of the proposed receivers is analyzed and compared to some existing receivers. Some schemes are designed to implement the proposed receivers and to reveal the roles of different system parameters. Analysis and numerical results show that the proposed receivers can approach the optimum multiuser receivers with significantly (even exponentially in some cases) reduced complexity and marginal performance degradation.
Directory of Open Access Journals (Sweden)
Daniel L. Rabosky
2006-01-01
Full Text Available Rates of species origination and extinction can vary over time during evolutionary radiations, and it is possible to reconstruct the history of diversification using molecular phylogenies of extant taxa only. Maximum likelihood methods provide a useful framework for inferring temporal variation in diversification rates. LASER is a package for the R programming environment that implements maximum likelihood methods based on the birth-death process to test whether diversification rates have changed over time. LASER contrasts the likelihood of phylogenetic data under models where diversification rates have changed over time to alternative models where rates have remained constant over time. Major strengths of the package include the ability to detect temporal increases in diversification rates and the inference of diversification parameters under multiple rate-variable models of diversification. The program and associated documentation are freely available from the R package archive at http://cran.r-project.org.
GENERALIZATION OF RAYLEIGH MAXIMUM LIKELIHOOD DESPECKLING FILTER USING QUADRILATERAL KERNELS
Directory of Open Access Journals (Sweden)
S. Sridevi
2013-02-01
Full Text Available Speckle noise is the most prevalent noise in clinical ultrasound images. It visibly looks like light and dark spots and deduce the pixel intensity as murkiest. Gazing at fetal ultrasound images, the impact of edge and local fine details are more palpable for obstetricians and gynecologists to carry out prenatal diagnosis of congenital heart disease. A robust despeckling filter has to be contrived to proficiently suppress speckle noise and simultaneously preserve the features. The proposed filter is the generalization of Rayleigh maximum likelihood filter by the exploitation of statistical tools as tuning parameters and use different shapes of quadrilateral kernels to estimate the noise free pixel from neighborhood. The performance of various filters namely Median, Kuwahura, Frost, Homogenous mask filter and Rayleigh maximum likelihood filter are compared with the proposed filter in terms PSNR and image profile. Comparatively the proposed filters surpass the conventional filters.
Maximum Likelihood Compton Polarimetry with the Compton Spectrometer and Imager
Energy Technology Data Exchange (ETDEWEB)
Lowell, A. W.; Boggs, S. E; Chiu, C. L.; Kierans, C. A.; Sleator, C.; Tomsick, J. A.; Zoglauer, A. C. [Space Sciences Laboratory, University of California, Berkeley (United States); Chang, H.-K.; Tseng, C.-H.; Yang, C.-Y. [Institute of Astronomy, National Tsing Hua University, Taiwan (China); Jean, P.; Ballmoos, P. von [IRAP Toulouse (France); Lin, C.-H. [Institute of Physics, Academia Sinica, Taiwan (China); Amman, M. [Lawrence Berkeley National Laboratory (United States)
2017-10-20
Astrophysical polarization measurements in the soft gamma-ray band are becoming more feasible as detectors with high position and energy resolution are deployed. Previous work has shown that the minimum detectable polarization (MDP) of an ideal Compton polarimeter can be improved by ∼21% when an unbinned, maximum likelihood method (MLM) is used instead of the standard approach of fitting a sinusoid to a histogram of azimuthal scattering angles. Here we outline a procedure for implementing this maximum likelihood approach for real, nonideal polarimeters. As an example, we use the recent observation of GRB 160530A with the Compton Spectrometer and Imager. We find that the MDP for this observation is reduced by 20% when the MLM is used instead of the standard method.
Algorithms of maximum likelihood data clustering with applications
Giada, Lorenzo; Marsili, Matteo
2002-12-01
We address the problem of data clustering by introducing an unsupervised, parameter-free approach based on maximum likelihood principle. Starting from the observation that data sets belonging to the same cluster share a common information, we construct an expression for the likelihood of any possible cluster structure. The likelihood in turn depends only on the Pearson's coefficient of the data. We discuss clustering algorithms that provide a fast and reliable approximation to maximum likelihood configurations. Compared to standard clustering methods, our approach has the advantages that (i) it is parameter free, (ii) the number of clusters need not be fixed in advance and (iii) the interpretation of the results is transparent. In order to test our approach and compare it with standard clustering algorithms, we analyze two very different data sets: time series of financial market returns and gene expression data. We find that different maximization algorithms produce similar cluster structures whereas the outcome of standard algorithms has a much wider variability.
Statistical Bias in Maximum Likelihood Estimators of Item Parameters.
1982-04-01
34 a> E r’r~e r ,C Ie I# ne,..,.rVi rnd Id.,flfv b1 - bindk numb.r) I; ,t-i i-cd I ’ tiie bias in the maximum likelihood ,st i- i;, ’ t iIeiIrs in...NTC, IL 60088 Psychometric Laboratory University of North Carolina I ERIC Facility-Acquisitions Davie Hall 013A 4833 Rugby Avenue Chapel Hill, NC
Maximum likelihood as a common computational framework in tomotherapy
International Nuclear Information System (INIS)
Olivera, G.H.; Shepard, D.M.; Reckwerdt, P.J.; Ruchala, K.; Zachman, J.; Fitchard, E.E.; Mackie, T.R.
1998-01-01
Tomotherapy is a dose delivery technique using helical or axial intensity modulated beams. One of the strengths of the tomotherapy concept is that it can incorporate a number of processes into a single piece of equipment. These processes include treatment optimization planning, dose reconstruction and kilovoltage/megavoltage image reconstruction. A common computational technique that could be used for all of these processes would be very appealing. The maximum likelihood estimator, originally developed for emission tomography, can serve as a useful tool in imaging and radiotherapy. We believe that this approach can play an important role in the processes of optimization planning, dose reconstruction and kilovoltage and/or megavoltage image reconstruction. These processes involve computations that require comparable physical methods. They are also based on equivalent assumptions, and they have similar mathematical solutions. As a result, the maximum likelihood approach is able to provide a common framework for all three of these computational problems. We will demonstrate how maximum likelihood methods can be applied to optimization planning, dose reconstruction and megavoltage image reconstruction in tomotherapy. Results for planning optimization, dose reconstruction and megavoltage image reconstruction will be presented. Strengths and weaknesses of the methodology are analysed. Future directions for this work are also suggested. (author)
John Hogland; Nedret Billor; Nathaniel Anderson
2013-01-01
Discriminant analysis, referred to as maximum likelihood classification within popular remote sensing software packages, is a common supervised technique used by analysts. Polytomous logistic regression (PLR), also referred to as multinomial logistic regression, is an alternative classification approach that is less restrictive, more flexible, and easy to interpret. To...
Maximum-likelihood method for numerical inversion of Mellin transform
International Nuclear Information System (INIS)
Iqbal, M.
1997-01-01
A method is described for inverting the Mellin transform which uses an expansion in Laguerre polynomials and converts the Mellin transform to Laplace transform, then the maximum-likelihood regularization method is used to recover the original function of the Mellin transform. The performance of the method is illustrated by the inversion of the test functions available in the literature (J. Inst. Math. Appl., 20 (1977) 73; Math. Comput., 53 (1989) 589). Effectiveness of the method is shown by results obtained through demonstration by means of tables and diagrams
Penalized Maximum Likelihood Estimation for univariate normal mixture distributions
International Nuclear Information System (INIS)
Ridolfi, A.; Idier, J.
2001-01-01
Due to singularities of the likelihood function, the maximum likelihood approach for the estimation of the parameters of normal mixture models is an acknowledged ill posed optimization problem. Ill posedness is solved by penalizing the likelihood function. In the Bayesian framework, it amounts to incorporating an inverted gamma prior in the likelihood function. A penalized version of the EM algorithm is derived, which is still explicit and which intrinsically assures that the estimates are not singular. Numerical evidence of the latter property is put forward with a test
Maximum Likelihood and Bayes Estimation in Randomly Censored Geometric Distribution
Directory of Open Access Journals (Sweden)
Hare Krishna
2017-01-01
Full Text Available In this article, we study the geometric distribution under randomly censored data. Maximum likelihood estimators and confidence intervals based on Fisher information matrix are derived for the unknown parameters with randomly censored data. Bayes estimators are also developed using beta priors under generalized entropy and LINEX loss functions. Also, Bayesian credible and highest posterior density (HPD credible intervals are obtained for the parameters. Expected time on test and reliability characteristics are also analyzed in this article. To compare various estimates developed in the article, a Monte Carlo simulation study is carried out. Finally, for illustration purpose, a randomly censored real data set is discussed.
Elemental composition of cosmic rays using a maximum likelihood method
International Nuclear Information System (INIS)
Ruddick, K.
1996-01-01
We present a progress report on our attempts to determine the composition of cosmic rays in the knee region of the energy spectrum. We have used three different devices to measure properties of the extensive air showers produced by primary cosmic rays: the Soudan 2 underground detector measures the muon flux deep underground, a proportional tube array samples shower density at the surface of the earth, and a Cherenkov array observes light produced high in the atmosphere. We have begun maximum likelihood fits to these measurements with the hope of determining the nuclear mass number A on an event by event basis. (orig.)
Maximum Likelihood Joint Tracking and Association in Strong Clutter
Directory of Open Access Journals (Sweden)
Leonid I. Perlovsky
2013-01-01
Full Text Available We have developed a maximum likelihood formulation for a joint detection, tracking and association problem. An efficient non-combinatorial algorithm for this problem is developed in case of strong clutter for radar data. By using an iterative procedure of the dynamic logic process “from vague-to-crisp” explained in the paper, the new tracker overcomes the combinatorial complexity of tracking in highly-cluttered scenarios and results in an orders-of-magnitude improvement in signal-to-clutter ratio.
Superfast maximum-likelihood reconstruction for quantum tomography
Shang, Jiangwei; Zhang, Zhengyun; Ng, Hui Khoon
2017-06-01
Conventional methods for computing maximum-likelihood estimators (MLE) often converge slowly in practical situations, leading to a search for simplifying methods that rely on additional assumptions for their validity. In this work, we provide a fast and reliable algorithm for maximum-likelihood reconstruction that avoids this slow convergence. Our method utilizes the state-of-the-art convex optimization scheme, an accelerated projected-gradient method, that allows one to accommodate the quantum nature of the problem in a different way than in the standard methods. We demonstrate the power of our approach by comparing its performance with other algorithms for n -qubit state tomography. In particular, an eight-qubit situation that purportedly took weeks of computation time in 2005 can now be completed in under a minute for a single set of data, with far higher accuracy than previously possible. This refutes the common claim that MLE reconstruction is slow and reduces the need for alternative methods that often come with difficult-to-verify assumptions. In fact, recent methods assuming Gaussian statistics or relying on compressed sensing ideas are demonstrably inapplicable for the situation under consideration here. Our algorithm can be applied to general optimization problems over the quantum state space; the philosophy of projected gradients can further be utilized for optimization contexts with general constraints.
Neutron spectra unfolding with maximum entropy and maximum likelihood
International Nuclear Information System (INIS)
Itoh, Shikoh; Tsunoda, Toshiharu
1989-01-01
A new unfolding theory has been established on the basis of the maximum entropy principle and the maximum likelihood method. This theory correctly embodies the Poisson statistics of neutron detection, and always brings a positive solution over the whole energy range. Moreover, the theory unifies both problems of overdetermined and of underdetermined. For the latter, the ambiguity in assigning a prior probability, i.e. the initial guess in the Bayesian sense, has become extinct by virtue of the principle. An approximate expression of the covariance matrix for the resultant spectra is also presented. An efficient algorithm to solve the nonlinear system, which appears in the present study, has been established. Results of computer simulation showed the effectiveness of the present theory. (author)
Narrow band interference cancelation in OFDM: Astructured maximum likelihood approach
Sohail, Muhammad Sadiq
2012-06-01
This paper presents a maximum likelihood (ML) approach to mitigate the effect of narrow band interference (NBI) in a zero padded orthogonal frequency division multiplexing (ZP-OFDM) system. The NBI is assumed to be time variant and asynchronous with the frequency grid of the ZP-OFDM system. The proposed structure based technique uses the fact that the NBI signal is sparse as compared to the ZP-OFDM signal in the frequency domain. The structure is also useful in reducing the computational complexity of the proposed method. The paper also presents a data aided approach for improved NBI estimation. The suitability of the proposed method is demonstrated through simulations. © 2012 IEEE.
Preliminary attempt on maximum likelihood tomosynthesis reconstruction of DEI data
International Nuclear Information System (INIS)
Wang Zhentian; Huang Zhifeng; Zhang Li; Kang Kejun; Chen Zhiqiang; Zhu Peiping
2009-01-01
Tomosynthesis is a three-dimension reconstruction method that can remove the effect of superimposition with limited angle projections. It is especially promising in mammography where radiation dose is concerned. In this paper, we propose a maximum likelihood tomosynthesis reconstruction algorithm (ML-TS) on the apparent absorption data of diffraction enhanced imaging (DEI). The motivation of this contribution is to develop a tomosynthesis algorithm in low-dose or noisy circumstances and make DEI get closer to clinic application. The theoretical statistical models of DEI data in physics are analyzed and the proposed algorithm is validated with the experimental data at the Beijing Synchrotron Radiation Facility (BSRF). The results of ML-TS have better contrast compared with the well known 'shift-and-add' algorithm and FBP algorithm. (authors)
Marginal Maximum Likelihood Estimation of Item Response Models in R
Directory of Open Access Journals (Sweden)
Matthew S. Johnson
2007-02-01
Full Text Available Item response theory (IRT models are a class of statistical models used by researchers to describe the response behaviors of individuals to a set of categorically scored items. The most common IRT models can be classified as generalized linear fixed- and/or mixed-effect models. Although IRT models appear most often in the psychological testing literature, researchers in other fields have successfully utilized IRT-like models in a wide variety of applications. This paper discusses the three major methods of estimation in IRT and develops R functions utilizing the built-in capabilities of the R environment to find the marginal maximum likelihood estimates of the generalized partial credit model. The currently available R packages ltm is also discussed.
Targeted maximum likelihood estimation for a binary treatment: A tutorial.
Luque-Fernandez, Miguel Angel; Schomaker, Michael; Rachet, Bernard; Schnitzer, Mireille E
2018-04-23
When estimating the average effect of a binary treatment (or exposure) on an outcome, methods that incorporate propensity scores, the G-formula, or targeted maximum likelihood estimation (TMLE) are preferred over naïve regression approaches, which are biased under misspecification of a parametric outcome model. In contrast propensity score methods require the correct specification of an exposure model. Double-robust methods only require correct specification of either the outcome or the exposure model. Targeted maximum likelihood estimation is a semiparametric double-robust method that improves the chances of correct model specification by allowing for flexible estimation using (nonparametric) machine-learning methods. It therefore requires weaker assumptions than its competitors. We provide a step-by-step guided implementation of TMLE and illustrate it in a realistic scenario based on cancer epidemiology where assumptions about correct model specification and positivity (ie, when a study participant had 0 probability of receiving the treatment) are nearly violated. This article provides a concise and reproducible educational introduction to TMLE for a binary outcome and exposure. The reader should gain sufficient understanding of TMLE from this introductory tutorial to be able to apply the method in practice. Extensive R-code is provided in easy-to-read boxes throughout the article for replicability. Stata users will find a testing implementation of TMLE and additional material in the Appendix S1 and at the following GitHub repository: https://github.com/migariane/SIM-TMLE-tutorial. © 2018 The Authors. Statistics in Medicine published by John Wiley & Sons Ltd.
Determination of point of maximum likelihood in failure domain using genetic algorithms
International Nuclear Information System (INIS)
Obadage, A.S.; Harnpornchai, N.
2006-01-01
The point of maximum likelihood in a failure domain yields the highest value of the probability density function in the failure domain. The maximum-likelihood point thus represents the worst combination of random variables that contribute in the failure event. In this work Genetic Algorithms (GAs) with an adaptive penalty scheme have been proposed as a tool for the determination of the maximum likelihood point. The utilization of only numerical values in the GAs operation makes the algorithms applicable to cases of non-linear and implicit single and multiple limit state function(s). The algorithmic simplicity readily extends its application to higher dimensional problems. When combined with Monte Carlo Simulation, the proposed methodology will reduce the computational complexity and at the same time will enhance the possibility in rare-event analysis under limited computational resources. Since, there is no approximation done in the procedure, the solution obtained is considered accurate. Consequently, GAs can be used as a tool for increasing the computational efficiency in the element and system reliability analyses
Fast maximum likelihood estimation of mutation rates using a birth-death process.
Wu, Xiaowei; Zhu, Hongxiao
2015-02-07
Since fluctuation analysis was first introduced by Luria and Delbrück in 1943, it has been widely used to make inference about spontaneous mutation rates in cultured cells. Under certain model assumptions, the probability distribution of the number of mutants that appear in a fluctuation experiment can be derived explicitly, which provides the basis of mutation rate estimation. It has been shown that, among various existing estimators, the maximum likelihood estimator usually demonstrates some desirable properties such as consistency and lower mean squared error. However, its application in real experimental data is often hindered by slow computation of likelihood due to the recursive form of the mutant-count distribution. We propose a fast maximum likelihood estimator of mutation rates, MLE-BD, based on a birth-death process model with non-differential growth assumption. Simulation studies demonstrate that, compared with the conventional maximum likelihood estimator derived from the Luria-Delbrück distribution, MLE-BD achieves substantial improvement on computational speed and is applicable to arbitrarily large number of mutants. In addition, it still retains good accuracy on point estimation. Published by Elsevier Ltd.
Efficient algorithms for maximum likelihood decoding in the surface code
Bravyi, Sergey; Suchara, Martin; Vargo, Alexander
2014-09-01
We describe two implementations of the optimal error correction algorithm known as the maximum likelihood decoder (MLD) for the two-dimensional surface code with a noiseless syndrome extraction. First, we show how to implement MLD exactly in time O (n2), where n is the number of code qubits. Our implementation uses a reduction from MLD to simulation of matchgate quantum circuits. This reduction however requires a special noise model with independent bit-flip and phase-flip errors. Secondly, we show how to implement MLD approximately for more general noise models using matrix product states (MPS). Our implementation has running time O (nχ3), where χ is a parameter that controls the approximation precision. The key step of our algorithm, borrowed from the density matrix renormalization-group method, is a subroutine for contracting a tensor network on the two-dimensional grid. The subroutine uses MPS with a bond dimension χ to approximate the sequence of tensors arising in the course of contraction. We benchmark the MPS-based decoder against the standard minimum weight matching decoder observing a significant reduction of the logical error probability for χ ≥4.
Approximate maximum likelihood estimation for population genetic inference.
Bertl, Johanna; Ewing, Gregory; Kosiol, Carolin; Futschik, Andreas
2017-11-27
In many population genetic problems, parameter estimation is obstructed by an intractable likelihood function. Therefore, approximate estimation methods have been developed, and with growing computational power, sampling-based methods became popular. However, these methods such as Approximate Bayesian Computation (ABC) can be inefficient in high-dimensional problems. This led to the development of more sophisticated iterative estimation methods like particle filters. Here, we propose an alternative approach that is based on stochastic approximation. By moving along a simulated gradient or ascent direction, the algorithm produces a sequence of estimates that eventually converges to the maximum likelihood estimate, given a set of observed summary statistics. This strategy does not sample much from low-likelihood regions of the parameter space, and is fast, even when many summary statistics are involved. We put considerable efforts into providing tuning guidelines that improve the robustness and lead to good performance on problems with high-dimensional summary statistics and a low signal-to-noise ratio. We then investigate the performance of our resulting approach and study its properties in simulations. Finally, we re-estimate parameters describing the demographic history of Bornean and Sumatran orang-utans.
Maximum likelihood sequence estimation for optical complex direct modulation.
Che, Di; Yuan, Feng; Shieh, William
2017-04-17
Semiconductor lasers are versatile optical transmitters in nature. Through the direct modulation (DM), the intensity modulation is realized by the linear mapping between the injection current and the light power, while various angle modulations are enabled by the frequency chirp. Limited by the direct detection, DM lasers used to be exploited only as 1-D (intensity or angle) transmitters by suppressing or simply ignoring the other modulation. Nevertheless, through the digital coherent detection, simultaneous intensity and angle modulations (namely, 2-D complex DM, CDM) can be realized by a single laser diode. The crucial technique of CDM is the joint demodulation of intensity and differential phase with the maximum likelihood sequence estimation (MLSE), supported by a closed-form discrete signal approximation of frequency chirp to characterize the MLSE transition probability. This paper proposes a statistical method for the transition probability to significantly enhance the accuracy of the chirp model. Using the statistical estimation, we demonstrate the first single-channel 100-Gb/s PAM-4 transmission over 1600-km fiber with only 10G-class DM lasers.
Maximum-likelihood estimation of recent shared ancestry (ERSA).
Huff, Chad D; Witherspoon, David J; Simonson, Tatum S; Xing, Jinchuan; Watkins, W Scott; Zhang, Yuhua; Tuohy, Therese M; Neklason, Deborah W; Burt, Randall W; Guthery, Stephen L; Woodward, Scott R; Jorde, Lynn B
2011-05-01
Accurate estimation of recent shared ancestry is important for genetics, evolution, medicine, conservation biology, and forensics. Established methods estimate kinship accurately for first-degree through third-degree relatives. We demonstrate that chromosomal segments shared by two individuals due to identity by descent (IBD) provide much additional information about shared ancestry. We developed a maximum-likelihood method for the estimation of recent shared ancestry (ERSA) from the number and lengths of IBD segments derived from high-density SNP or whole-genome sequence data. We used ERSA to estimate relationships from SNP genotypes in 169 individuals from three large, well-defined human pedigrees. ERSA is accurate to within one degree of relationship for 97% of first-degree through fifth-degree relatives and 80% of sixth-degree and seventh-degree relatives. We demonstrate that ERSA's statistical power approaches the maximum theoretical limit imposed by the fact that distant relatives frequently share no DNA through a common ancestor. ERSA greatly expands the range of relationships that can be estimated from genetic data and is implemented in a freely available software package.
Maximum likelihood pedigree reconstruction using integer linear programming.
Cussens, James; Bartlett, Mark; Jones, Elinor M; Sheehan, Nuala A
2013-01-01
Large population biobanks of unrelated individuals have been highly successful in detecting common genetic variants affecting diseases of public health concern. However, they lack the statistical power to detect more modest gene-gene and gene-environment interaction effects or the effects of rare variants for which related individuals are ideally required. In reality, most large population studies will undoubtedly contain sets of undeclared relatives, or pedigrees. Although a crude measure of relatedness might sometimes suffice, having a good estimate of the true pedigree would be much more informative if this could be obtained efficiently. Relatives are more likely to share longer haplotypes around disease susceptibility loci and are hence biologically more informative for rare variants than unrelated cases and controls. Distant relatives are arguably more useful for detecting variants with small effects because they are less likely to share masking environmental effects. Moreover, the identification of relatives enables appropriate adjustments of statistical analyses that typically assume unrelatedness. We propose to exploit an integer linear programming optimisation approach to pedigree learning, which is adapted to find valid pedigrees by imposing appropriate constraints. Our method is not restricted to small pedigrees and is guaranteed to return a maximum likelihood pedigree. With additional constraints, we can also search for multiple high-probability pedigrees and thus account for the inherent uncertainty in any particular pedigree reconstruction. The true pedigree is found very quickly by comparison with other methods when all individuals are observed. Extensions to more complex problems seem feasible. © 2012 Wiley Periodicals, Inc.
Maximum likelihood approach for several stochastic volatility models
International Nuclear Information System (INIS)
Camprodon, Jordi; Perelló, Josep
2012-01-01
Volatility measures the amplitude of price fluctuations. Despite it being one of the most important quantities in finance, volatility is not directly observable. Here we apply a maximum likelihood method which assumes that price and volatility follow a two-dimensional diffusion process where volatility is the stochastic diffusion coefficient of the log-price dynamics. We apply this method to the simplest versions of the expOU, the OU and the Heston stochastic volatility models and we study their performance in terms of the log-price probability, the volatility probability, and its Mean First-Passage Time. The approach has some predictive power on the future returns amplitude by only knowing the current volatility. The assumed models do not consider long-range volatility autocorrelation and the asymmetric return-volatility cross-correlation but the method still yields very naturally these two important stylized facts. We apply the method to different market indices and with a good performance in all cases. (paper)
Molenaar, P.C.M.; Nesselroade, J.R.
1998-01-01
The study of intraindividual variability pervades empirical inquiry in virtually all subdisciplines of psychology. The statistical analysis of multivariate time-series data - a central product of intraindividual investigations - requires special modeling techniques. The dynamic factor model (DFM),
CSIR Research Space (South Africa)
Kok, S
2012-07-01
Full Text Available continuously as the correlation function hyper-parameters approach zero. Since the global minimizer of the maximum likelihood function is an asymptote in this case, it is unclear if maximum likelihood estimation (MLE) remains valid. Numerical ill...
Accelerated maximum likelihood parameter estimation for stochastic biochemical systems
Directory of Open Access Journals (Sweden)
Daigle Bernie J
2012-05-01
Full Text Available Abstract Background A prerequisite for the mechanistic simulation of a biochemical system is detailed knowledge of its kinetic parameters. Despite recent experimental advances, the estimation of unknown parameter values from observed data is still a bottleneck for obtaining accurate simulation results. Many methods exist for parameter estimation in deterministic biochemical systems; methods for discrete stochastic systems are less well developed. Given the probabilistic nature of stochastic biochemical models, a natural approach is to choose parameter values that maximize the probability of the observed data with respect to the unknown parameters, a.k.a. the maximum likelihood parameter estimates (MLEs. MLE computation for all but the simplest models requires the simulation of many system trajectories that are consistent with experimental data. For models with unknown parameters, this presents a computational challenge, as the generation of consistent trajectories can be an extremely rare occurrence. Results We have developed Monte Carlo Expectation-Maximization with Modified Cross-Entropy Method (MCEM2: an accelerated method for calculating MLEs that combines advances in rare event simulation with a computationally efficient version of the Monte Carlo expectation-maximization (MCEM algorithm. Our method requires no prior knowledge regarding parameter values, and it automatically provides a multivariate parameter uncertainty estimate. We applied the method to five stochastic systems of increasing complexity, progressing from an analytically tractable pure-birth model to a computationally demanding model of yeast-polarization. Our results demonstrate that MCEM2 substantially accelerates MLE computation on all tested models when compared to a stand-alone version of MCEM. Additionally, we show how our method identifies parameter values for certain classes of models more accurately than two recently proposed computationally efficient methods
A gateway for phylogenetic analysis powered by grid computing featuring GARLI 2.0.
Bazinet, Adam L; Zwickl, Derrick J; Cummings, Michael P
2014-09-01
We introduce molecularevolution.org, a publicly available gateway for high-throughput, maximum-likelihood phylogenetic analysis powered by grid computing. The gateway features a garli 2.0 web service that enables a user to quickly and easily submit thousands of maximum likelihood tree searches or bootstrap searches that are executed in parallel on distributed computing resources. The garli web service allows one to easily specify partitioned substitution models using a graphical interface, and it performs sophisticated post-processing of phylogenetic results. Although the garli web service has been used by the research community for over three years, here we formally announce the availability of the service, describe its capabilities, highlight new features and recent improvements, and provide details about how the grid system efficiently delivers high-quality phylogenetic results. © The Author(s) 2014. Published by Oxford University Press, on behalf of the Society of Systematic Biologists.
Fast and accurate estimation of the covariance between pairwise maximum likelihood distances
Directory of Open Access Journals (Sweden)
Manuel Gil
2014-09-01
Full Text Available Pairwise evolutionary distances are a model-based summary statistic for a set of molecular sequences. They represent the leaf-to-leaf path lengths of the underlying phylogenetic tree. Estimates of pairwise distances with overlapping paths covary because of shared mutation events. It is desirable to take these covariance structure into account to increase precision in any process that compares or combines distances. This paper introduces a fast estimator for the covariance of two pairwise maximum likelihood distances, estimated under general Markov models. The estimator is based on a conjecture (going back to Nei & Jin, 1989 which links the covariance to path lengths. It is proven here under a simple symmetric substitution model. A simulation shows that the estimator outperforms previously published ones in terms of the mean squared error.
Fast and accurate estimation of the covariance between pairwise maximum likelihood distances.
Gil, Manuel
2014-01-01
Pairwise evolutionary distances are a model-based summary statistic for a set of molecular sequences. They represent the leaf-to-leaf path lengths of the underlying phylogenetic tree. Estimates of pairwise distances with overlapping paths covary because of shared mutation events. It is desirable to take these covariance structure into account to increase precision in any process that compares or combines distances. This paper introduces a fast estimator for the covariance of two pairwise maximum likelihood distances, estimated under general Markov models. The estimator is based on a conjecture (going back to Nei & Jin, 1989) which links the covariance to path lengths. It is proven here under a simple symmetric substitution model. A simulation shows that the estimator outperforms previously published ones in terms of the mean squared error.
Maximum likelihood positioning algorithm for high-resolution PET scanners
International Nuclear Information System (INIS)
Gross-Weege, Nicolas; Schug, David; Hallen, Patrick; Schulz, Volkmar
2016-01-01
Purpose: In high-resolution positron emission tomography (PET), lightsharing elements are incorporated into typical detector stacks to read out scintillator arrays in which one scintillator element (crystal) is smaller than the size of the readout channel. In order to identify the hit crystal by means of the measured light distribution, a positioning algorithm is required. One commonly applied positioning algorithm uses the center of gravity (COG) of the measured light distribution. The COG algorithm is limited in spatial resolution by noise and intercrystal Compton scatter. The purpose of this work is to develop a positioning algorithm which overcomes this limitation. Methods: The authors present a maximum likelihood (ML) algorithm which compares a set of expected light distributions given by probability density functions (PDFs) with the measured light distribution. Instead of modeling the PDFs by using an analytical model, the PDFs of the proposed ML algorithm are generated assuming a single-gamma-interaction model from measured data. The algorithm was evaluated with a hot-rod phantom measurement acquired with the preclinical HYPERION II D PET scanner. In order to assess the performance with respect to sensitivity, energy resolution, and image quality, the ML algorithm was compared to a COG algorithm which calculates the COG from a restricted set of channels. The authors studied the energy resolution of the ML and the COG algorithm regarding incomplete light distributions (missing channel information caused by detector dead time). Furthermore, the authors investigated the effects of using a filter based on the likelihood values on sensitivity, energy resolution, and image quality. Results: A sensitivity gain of up to 19% was demonstrated in comparison to the COG algorithm for the selected operation parameters. Energy resolution and image quality were on a similar level for both algorithms. Additionally, the authors demonstrated that the performance of the ML
A Distance Measure for Genome Phylogenetic Analysis
Cao, Minh Duc; Allison, Lloyd; Dix, Trevor
Phylogenetic analyses of species based on single genes or parts of the genomes are often inconsistent because of factors such as variable rates of evolution and horizontal gene transfer. The availability of more and more sequenced genomes allows phylogeny construction from complete genomes that is less sensitive to such inconsistency. For such long sequences, construction methods like maximum parsimony and maximum likelihood are often not possible due to their intensive computational requirement. Another class of tree construction methods, namely distance-based methods, require a measure of distances between any two genomes. Some measures such as evolutionary edit distance of gene order and gene content are computational expensive or do not perform well when the gene content of the organisms are similar. This study presents an information theoretic measure of genetic distances between genomes based on the biological compression algorithm expert model. We demonstrate that our distance measure can be applied to reconstruct the consensus phylogenetic tree of a number of Plasmodium parasites from their genomes, the statistical bias of which would mislead conventional analysis methods. Our approach is also used to successfully construct a plausible evolutionary tree for the γ-Proteobacteria group whose genomes are known to contain many horizontally transferred genes.
A Fast Algorithm for Maximum Likelihood Estimation of Harmonic Chirp Parameters
DEFF Research Database (Denmark)
Jensen, Tobias Lindstrøm; Nielsen, Jesper Kjær; Jensen, Jesper Rindom
2017-01-01
. A statistically efficient estimator for extracting the parameters of the harmonic chirp model in additive white Gaussian noise is the maximum likelihood (ML) estimator which recently has been demonstrated to be robust to noise and accurate --- even when the model order is unknown. The main drawback of the ML......The analysis of (approximately) periodic signals is an important element in numerous applications. One generalization of standard periodic signals often occurring in practice are harmonic chirp signals where the instantaneous frequency increases/decreases linearly as a function of time...
Two-Stage Maximum Likelihood Estimation (TSMLE for MT-CDMA Signals in the Indoor Environment
Directory of Open Access Journals (Sweden)
Sesay Abu B
2004-01-01
Full Text Available This paper proposes a two-stage maximum likelihood estimation (TSMLE technique suited for multitone code division multiple access (MT-CDMA system. Here, an analytical framework is presented in the indoor environment for determining the average bit error rate (BER of the system, over Rayleigh and Ricean fading channels. The analytical model is derived for quadrature phase shift keying (QPSK modulation technique by taking into account the number of tones, signal bandwidth (BW, bit rate, and transmission power. Numerical results are presented to validate the analysis, and to justify the approximations made therein. Moreover, these results are shown to agree completely with those obtained by simulation.
Wu, Yufeng
2012-03-01
Incomplete lineage sorting can cause incongruence between the phylogenetic history of genes (the gene tree) and that of the species (the species tree), which can complicate the inference of phylogenies. In this article, I present a new coalescent-based algorithm for species tree inference with maximum likelihood. I first describe an improved method for computing the probability of a gene tree topology given a species tree, which is much faster than an existing algorithm by Degnan and Salter (2005). Based on this method, I develop a practical algorithm that takes a set of gene tree topologies and infers species trees with maximum likelihood. This algorithm searches for the best species tree by starting from initial species trees and performing heuristic search to obtain better trees with higher likelihood. This algorithm, called STELLS (which stands for Species Tree InfErence with Likelihood for Lineage Sorting), has been implemented in a program that is downloadable from the author's web page. The simulation results show that the STELLS algorithm is more accurate than an existing maximum likelihood method for many datasets, especially when there is noise in gene trees. I also show that the STELLS algorithm is efficient and can be applied to real biological datasets. © 2011 The Author. Evolution© 2011 The Society for the Study of Evolution.
Ros, B.P.; Bijma, F.; de Munck, J.C.; de Gunst, M.C.M.
2016-01-01
This paper deals with multivariate Gaussian models for which the covariance matrix is a Kronecker product of two matrices. We consider maximum likelihood estimation of the model parameters, in particular of the covariance matrix. There is no explicit expression for the maximum likelihood estimator
Goolsby, Eric W
2017-04-01
Ancestral state reconstruction is a method used to study the evolutionary trajectories of quantitative characters on phylogenies. Although efficient methods for univariate ancestral state reconstruction under a Brownian motion model have been described for at least 25 years, to date no generalization has been described to allow more complex evolutionary models, such as multivariate trait evolution, non-Brownian models, missing data, and within-species variation. Furthermore, even for simple univariate Brownian motion models, most phylogenetic comparative R packages compute ancestral states via inefficient tree rerooting and full tree traversals at each tree node, making ancestral state reconstruction extremely time-consuming for large phylogenies. Here, a computationally efficient method for fast maximum likelihood ancestral state reconstruction of continuous characters is described. The algorithm has linear complexity relative to the number of species and outperforms the fastest existing R implementations by several orders of magnitude. The described algorithm is capable of performing ancestral state reconstruction on a 1,000,000-species phylogeny in fewer than 2 s using a standard laptop, whereas the next fastest R implementation would take several days to complete. The method is generalizable to more complex evolutionary models, such as phylogenetic regression, within-species variation, non-Brownian evolutionary models, and multivariate trait evolution. Because this method enables fast repeated computations on phylogenies of virtually any size, implementation of the described algorithm can drastically alleviate the computational burden of many otherwise prohibitively time-consuming tasks requiring reconstruction of ancestral states, such as phylogenetic imputation of missing data, bootstrapping procedures, Expectation-Maximization algorithms, and Bayesian estimation. The described ancestral state reconstruction algorithm is implemented in the Rphylopars
Frequency-Domain Maximum-Likelihood Estimation of High-Voltage Pulse Transformer Model Parameters
Aguglia, D; Martins, C.D.A.
2014-01-01
This paper presents an offline frequency-domain nonlinear and stochastic identification method for equivalent model parameter estimation of high-voltage pulse transformers. Such kinds of transformers are widely used in the pulsed-power domain, and the difficulty in deriving pulsed-power converter optimal control strategies is directly linked to the accuracy of the equivalent circuit parameters. These components require models which take into account electric fields energies represented by stray capacitance in the equivalent circuit. These capacitive elements must be accurately identified, since they greatly influence the general converter performances. A nonlinear frequency-based identification method, based on maximum-likelihood estimation, is presented, and a sensitivity analysis of the best experimental test to be considered is carried out. The procedure takes into account magnetic saturation and skin effects occurring in the windings during the frequency tests. The presented method is validated by experim...
Directory of Open Access Journals (Sweden)
López-Valcarce Roberto
2004-01-01
Full Text Available We address the problem of estimating the speed of a road vehicle from its acoustic signature, recorded by a pair of omnidirectional microphones located next to the road. This choice of sensors is motivated by their nonintrusive nature as well as low installation and maintenance costs. A novel estimation technique is proposed, which is based on the maximum likelihood principle. It directly estimates car speed without any assumptions on the acoustic signal emitted by the vehicle. This has the advantages of bypassing troublesome intermediate delay estimation steps as well as eliminating the need for an accurate yet general enough acoustic traffic model. An analysis of the estimate for narrowband and broadband sources is provided and verified with computer simulations. The estimation algorithm uses a bank of modified crosscorrelators and therefore it is well suited to DSP implementation, performing well with preliminary field data.
A new maximum likelihood blood velocity estimator incorporating spatial and temporal correlation
DEFF Research Database (Denmark)
Schlaikjer, Malene; Jensen, Jørgen Arendt
2001-01-01
and space. This paper presents a new estimator (STC-MLE), which incorporates the correlation property. It is an expansion of the maximum likelihood estimator (MLE) developed by Ferrara et al. With the MLE a cross-correlation analysis between consecutive RF-lines on complex form is carried out for a range...... of possible velocities. In the new estimator an additional similarity investigation for each evaluated velocity and the available velocity estimates in a temporal (between frames) and spatial (within frames) neighborhood is performed. An a priori probability density term in the distribution...... of the observations gives a probability measure of the correlation between the velocities. Both the MLE and the STC-MLE have been evaluated on simulated and in-vivo RF-data obtained from the carotid artery. Using the MLE 4.1% of the estimates deviate significantly from the true velocities, when the performance...
Implementation of non-linear filters for iterative penalized maximum likelihood image reconstruction
International Nuclear Information System (INIS)
Liang, Z.; Gilland, D.; Jaszczak, R.; Coleman, R.
1990-01-01
In this paper, the authors report on the implementation of six edge-preserving, noise-smoothing, non-linear filters applied in image space for iterative penalized maximum-likelihood (ML) SPECT image reconstruction. The non-linear smoothing filters implemented were the median filter, the E 6 filter, the sigma filter, the edge-line filter, the gradient-inverse filter, and the 3-point edge filter with gradient-inverse filter, and the 3-point edge filter with gradient-inverse weight. A 3 x 3 window was used for all these filters. The best image obtained, by viewing the profiles through the image in terms of noise-smoothing, edge-sharpening, and contrast, was the one smoothed with the 3-point edge filter. The computation time for the smoothing was less than 1% of one iteration, and the memory space for the smoothing was negligible. These images were compared with the results obtained using Bayesian analysis
Phylogenetic analysis using parsimony and likelihood methods.
Yang, Z
1996-02-01
The assumptions underlying the maximum-parsimony (MP) method of phylogenetic tree reconstruction were intuitively examined by studying the way the method works. Computer simulations were performed to corroborate the intuitive examination. Parsimony appears to involve very stringent assumptions concerning the process of sequence evolution, such as constancy of substitution rates between nucleotides, constancy of rates across nucleotide sites, and equal branch lengths in the tree. For practical data analysis, the requirement of equal branch lengths means similar substitution rates among lineages (the existence of an approximate molecular clock), relatively long interior branches, and also few species in the data. However, a small amount of evolution is neither a necessary nor a sufficient requirement of the method. The difficulties involved in the application of current statistical estimation theory to tree reconstruction were discussed, and it was suggested that the approach proposed by Felsenstein (1981, J. Mol. Evol. 17: 368-376) for topology estimation, as well as its many variations and extensions, differs fundamentally from the maximum likelihood estimation of a conventional statistical parameter. Evidence was presented showing that the Felsenstein approach does not share the asymptotic efficiency of the maximum likelihood estimator of a statistical parameter. Computer simulations were performed to study the probability that MP recovers the true tree under a hierarchy of models of nucleotide substitution; its performance relative to the likelihood method was especially noted. The results appeared to support the intuitive examination of the assumptions underlying MP. When a simple model of nucleotide substitution was assumed to generate data, the probability that MP recovers the true topology could be as high as, or even higher than, that for the likelihood method. When the assumed model became more complex and realistic, e.g., when substitution rates were
MADmap: A Massively Parallel Maximum-Likelihood Cosmic Microwave Background Map-Maker
Energy Technology Data Exchange (ETDEWEB)
Cantalupo, Christopher; Borrill, Julian; Jaffe, Andrew; Kisner, Theodore; Stompor, Radoslaw
2009-06-09
MADmap is a software application used to produce maximum-likelihood images of the sky from time-ordered data which include correlated noise, such as those gathered by Cosmic Microwave Background (CMB) experiments. It works efficiently on platforms ranging from small workstations to the most massively parallel supercomputers. Map-making is a critical step in the analysis of all CMB data sets, and the maximum-likelihood approach is the most accurate and widely applicable algorithm; however, it is a computationally challenging task. This challenge will only increase with the next generation of ground-based, balloon-borne and satellite CMB polarization experiments. The faintness of the B-mode signal that these experiments seek to measure requires them to gather enormous data sets. MADmap is already being run on up to O(1011) time samples, O(108) pixels and O(104) cores, with ongoing work to scale to the next generation of data sets and supercomputers. We describe MADmap's algorithm based around a preconditioned conjugate gradient solver, fast Fourier transforms and sparse matrix operations. We highlight MADmap's ability to address problems typically encountered in the analysis of realistic CMB data sets and describe its application to simulations of the Planck and EBEX experiments. The massively parallel and distributed implementation is detailed and scaling complexities are given for the resources required. MADmap is capable of analysing the largest data sets now being collected on computing resources currently available, and we argue that, given Moore's Law, MADmap will be capable of reducing the most massive projected data sets.
Maximum likelihood inference of small trees in the presence of long branches.
Parks, Sarah L; Goldman, Nick
2014-09-01
The statistical basis of maximum likelihood (ML), its robustness, and the fact that it appears to suffer less from biases lead to it being one of the most popular methods for tree reconstruction. Despite its popularity, very few analytical solutions for ML exist, so biases suffered by ML are not well understood. One possible bias is long branch attraction (LBA), a regularly cited term generally used to describe a propensity for long branches to be joined together in estimated trees. Although initially mentioned in connection with inconsistency of parsimony, LBA has been claimed to affect all major phylogenetic reconstruction methods, including ML. Despite the widespread use of this term in the literature, exactly what LBA is and what may be causing it is poorly understood, even for simple evolutionary models and small model trees. Studies looking at LBA have focused on the effect of two long branches on tree reconstruction. However, to understand the effect of two long branches it is also important to understand the effect of just one long branch. If ML struggles to reconstruct one long branch, then this may have an impact on LBA. In this study, we look at the effect of one long branch on three-taxon tree reconstruction. We show that, counterintuitively, long branches are preferentially placed at the tips of the tree. This can be understood through the use of analytical solutions to the ML equation and distance matrix methods. We go on to look at the placement of two long branches on four-taxon trees, showing that there is no attraction between long branches, but that for extreme branch lengths long branches are joined together disproportionally often. These results illustrate that even small model trees are still interesting to help understand how ML phylogenetic reconstruction works, and that LBA is a complicated phenomenon that deserves further study. © The Author(s) 2014. Published by Oxford University Press, on behalf of the Society of Systematic Biologists.
Joint maximum-likelihood magnitudes of presumed underground nuclear test explosions
Peacock, Sheila; Douglas, Alan; Bowers, David
2017-08-01
Body-wave magnitudes (mb) of 606 seismic disturbances caused by presumed underground nuclear test explosions at specific test sites between 1964 and 1996 have been derived from station amplitudes collected by the International Seismological Centre (ISC), by a joint inversion for mb and station-specific magnitude corrections. A maximum-likelihood method was used to reduce the upward bias of network mean magnitudes caused by data censoring, where arrivals at stations that do not report arrivals are assumed to be hidden by the ambient noise at the time. Threshold noise levels at each station were derived from the ISC amplitudes using the method of Kelly and Lacoss, which fits to the observed magnitude-frequency distribution a Gutenberg-Richter exponential decay truncated at low magnitudes by an error function representing the low-magnitude threshold of the station. The joint maximum-likelihood inversion is applied to arrivals from the sites: Semipalatinsk (Kazakhstan) and Novaya Zemlya, former Soviet Union; Singer (Lop Nor), China; Mururoa and Fangataufa, French Polynesia; and Nevada, USA. At sites where eight or more arrivals could be used to derive magnitudes and station terms for 25 or more explosions (Nevada, Semipalatinsk and Mururoa), the resulting magnitudes and station terms were fixed and a second inversion carried out to derive magnitudes for additional explosions with three or more arrivals. 93 more magnitudes were thus derived. During processing for station thresholds, many stations were rejected for sparsity of data, obvious errors in reported amplitude, or great departure of the reported amplitude-frequency distribution from the expected left-truncated exponential decay. Abrupt changes in monthly mean amplitude at a station apparently coincide with changes in recording equipment and/or analysis method at the station.
Finite mixture model: A maximum likelihood estimation approach on time series data
Yen, Phoong Seuk; Ismail, Mohd Tahir; Hamzah, Firdaus Mohamad
2014-09-01
Recently, statistician emphasized on the fitting of finite mixture model by using maximum likelihood estimation as it provides asymptotic properties. In addition, it shows consistency properties as the sample sizes increases to infinity. This illustrated that maximum likelihood estimation is an unbiased estimator. Moreover, the estimate parameters obtained from the application of maximum likelihood estimation have smallest variance as compared to others statistical method as the sample sizes increases. Thus, maximum likelihood estimation is adopted in this paper to fit the two-component mixture model in order to explore the relationship between rubber price and exchange rate for Malaysia, Thailand, Philippines and Indonesia. Results described that there is a negative effect among rubber price and exchange rate for all selected countries.
Maximum Likelihood Approach for RFID Tag Set Cardinality Estimation with Detection Errors
DEFF Research Database (Denmark)
Nguyen, Chuyen T.; Hayashi, Kazunori; Kaneko, Megumi
2013-01-01
Abstract Estimation schemes of Radio Frequency IDentification (RFID) tag set cardinality are studied in this paper using Maximum Likelihood (ML) approach. We consider the estimation problem under the model of multiple independent reader sessions with detection errors due to unreliable radio...... is evaluated under dierent system parameters and compared with that of the conventional method via computer simulations assuming flat Rayleigh fading environments and framed-slotted ALOHA based protocol. Keywords RFID tag cardinality estimation maximum likelihood detection error...
Directory of Open Access Journals (Sweden)
Azam Zaka
2014-10-01
Full Text Available This paper is concerned with the modifications of maximum likelihood, moments and percentile estimators of the two parameter Power function distribution. Sampling behavior of the estimators is indicated by Monte Carlo simulation. For some combinations of parameter values, some of the modified estimators appear better than the traditional maximum likelihood, moments and percentile estimators with respect to bias, mean square error and total deviation.
International Nuclear Information System (INIS)
Eisler, P.; Youl, S.; Lwin, T.; Nelson, G.
1983-01-01
Simultaneous fitting of peaks and background functions from gamma-ray spectrometry using multichannel pulse height analysis is considered. The specific case of Gaussian peak and exponential background is treated in detail with respect to simultaneous estimation of both functions by using a technique which incorporates maximum likelihood method as well as a graphical method. Theoretical expressions for the standard errors of the estimates are also obtained. The technique is demonstrated for two experimental data sets. (orig.)
Maximum likelihood estimation of semiparametric mixture component models for competing risks data.
Choi, Sangbum; Huang, Xuelin
2014-09-01
In the analysis of competing risks data, the cumulative incidence function is a useful quantity to characterize the crude risk of failure from a specific event type. In this article, we consider an efficient semiparametric analysis of mixture component models on cumulative incidence functions. Under the proposed mixture model, latency survival regressions given the event type are performed through a class of semiparametric models that encompasses the proportional hazards model and the proportional odds model, allowing for time-dependent covariates. The marginal proportions of the occurrences of cause-specific events are assessed by a multinomial logistic model. Our mixture modeling approach is advantageous in that it makes a joint estimation of model parameters associated with all competing risks under consideration, satisfying the constraint that the cumulative probability of failing from any cause adds up to one given any covariates. We develop a novel maximum likelihood scheme based on semiparametric regression analysis that facilitates efficient and reliable estimation. Statistical inferences can be conveniently made from the inverse of the observed information matrix. We establish the consistency and asymptotic normality of the proposed estimators. We validate small sample properties with simulations and demonstrate the methodology with a data set from a study of follicular lymphoma. © 2014, The International Biometric Society.
Dang, Cuong Cao; Lefort, Vincent; Le, Vinh Sy; Le, Quang Si; Gascuel, Olivier
2011-10-01
Amino acid replacement rate matrices are an essential basis of protein studies (e.g. in phylogenetics and alignment). A number of general purpose matrices have been proposed (e.g. JTT, WAG, LG) since the seminal work of Margaret Dayhoff and co-workers. However, it has been shown that matrices specific to certain protein groups (e.g. mitochondrial) or life domains (e.g. viruses) differ significantly from general average matrices, and thus perform better when applied to the data to which they are dedicated. This Web server implements the maximum-likelihood estimation procedure that was used to estimate LG, and provides a number of tools and facilities. Users upload a set of multiple protein alignments from their domain of interest and receive the resulting matrix by email, along with statistics and comparisons with other matrices. A non-parametric bootstrap is performed optionally to assess the variability of replacement rate estimates. Maximum-likelihood trees, inferred using the estimated rate matrix, are also computed optionally for each input alignment. Finely tuned procedures and up-to-date ML software (PhyML 3.0, XRATE) are combined to perform all these heavy calculations on our clusters. http://www.atgc-montpellier.fr/ReplacementMatrix/ olivier.gascuel@lirmm.fr Supplementary data are available at http://www.atgc-montpellier.fr/ReplacementMatrix/
Directory of Open Access Journals (Sweden)
Andrey eStepanyuk
2014-10-01
Full Text Available Dendritic integration and neuronal firing patterns strongly depend on biophysical properties of synaptic ligand-gated channels. However, precise estimation of biophysical parameters of these channels in their intrinsic environment is complicated and still unresolved problem. Here we describe a novel method based on a maximum likelihood approach that allows to estimate not only the unitary current of synaptic receptor channels but also their multiple conductance levels, kinetic constants, the number of receptors bound with a neurotransmitter and the peak open probability from experimentally feasible number of postsynaptic currents. The new method also improves the accuracy of evaluation of unitary current as compared to the peak-scaled non-stationary fluctuation analysis, leading to a possibility to precisely estimate this important parameter from a few postsynaptic currents recorded in steady-state conditions. Estimation of unitary current with this method is robust even if postsynaptic currents are generated by receptors having different kinetic parameters, the case when peak-scaled non-stationary fluctuation analysis is not applicable. Thus, with the new method, routinely recorded postsynaptic currents could be used to study the properties of synaptic receptors in their native biochemical environment.
Maximum likelihood positioning for gamma-ray imaging detectors with depth of interaction measurement
International Nuclear Information System (INIS)
Lerche, Ch.W.; Ros, A.; Monzo, J.M.; Aliaga, R.J.; Ferrando, N.; Martinez, J.D.; Herrero, V.; Esteve, R.; Gadea, R.; Colom, R.J.; Toledo, J.; Mateo, F.; Sebastia, A.; Sanchez, F.; Benlloch, J.M.
2009-01-01
The center of gravity algorithm leads to strong artifacts for gamma-ray imaging detectors that are based on monolithic scintillation crystals and position sensitive photo-detectors. This is a consequence of using the centroids as position estimates. The fact that charge division circuits can also be used to compute the standard deviation of the scintillation light distribution opens a way out of this drawback. We studied the feasibility of maximum likelihood estimation for computing the true gamma-ray photo-conversion position from the centroids and the standard deviation of the light distribution. The method was evaluated on a test detector that consists of the position sensitive photomultiplier tube H8500 and a monolithic LSO crystal (42mmx42mmx10mm). Spatial resolution was measured for the centroids and the maximum likelihood estimates. The results suggest that the maximum likelihood positioning is feasible and partially removes the strong artifacts of the center of gravity algorithm.
Maximum likelihood positioning for gamma-ray imaging detectors with depth of interaction measurement
Energy Technology Data Exchange (ETDEWEB)
Lerche, Ch.W. [Grupo de Sistemas Digitales, ITACA, Universidad Politecnica de Valencia, 46022 Valencia (Spain)], E-mail: lerche@ific.uv.es; Ros, A. [Grupo de Fisica Medica Nuclear, IFIC, Universidad de Valencia-Consejo Superior de Investigaciones Cientificas, 46980 Paterna (Spain); Monzo, J.M.; Aliaga, R.J.; Ferrando, N.; Martinez, J.D.; Herrero, V.; Esteve, R.; Gadea, R.; Colom, R.J.; Toledo, J.; Mateo, F.; Sebastia, A. [Grupo de Sistemas Digitales, ITACA, Universidad Politecnica de Valencia, 46022 Valencia (Spain); Sanchez, F.; Benlloch, J.M. [Grupo de Fisica Medica Nuclear, IFIC, Universidad de Valencia-Consejo Superior de Investigaciones Cientificas, 46980 Paterna (Spain)
2009-06-01
The center of gravity algorithm leads to strong artifacts for gamma-ray imaging detectors that are based on monolithic scintillation crystals and position sensitive photo-detectors. This is a consequence of using the centroids as position estimates. The fact that charge division circuits can also be used to compute the standard deviation of the scintillation light distribution opens a way out of this drawback. We studied the feasibility of maximum likelihood estimation for computing the true gamma-ray photo-conversion position from the centroids and the standard deviation of the light distribution. The method was evaluated on a test detector that consists of the position sensitive photomultiplier tube H8500 and a monolithic LSO crystal (42mmx42mmx10mm). Spatial resolution was measured for the centroids and the maximum likelihood estimates. The results suggest that the maximum likelihood positioning is feasible and partially removes the strong artifacts of the center of gravity algorithm.
Maximum Likelihood and Restricted Likelihood Solutions in Multiple-Method Studies.
Rukhin, Andrew L
2011-01-01
A formulation of the problem of combining data from several sources is discussed in terms of random effects models. The unknown measurement precision is assumed not to be the same for all methods. We investigate maximum likelihood solutions in this model. By representing the likelihood equations as simultaneous polynomial equations, the exact form of the Groebner basis for their stationary points is derived when there are two methods. A parametrization of these solutions which allows their comparison is suggested. A numerical method for solving likelihood equations is outlined, and an alternative to the maximum likelihood method, the restricted maximum likelihood, is studied. In the situation when methods variances are considered to be known an upper bound on the between-method variance is obtained. The relationship between likelihood equations and moment-type equations is also discussed.
Stability of maximum-likelihood-based clustering methods: exploring the backbone of classifications
International Nuclear Information System (INIS)
Mungan, Muhittin; Ramasco, José J
2010-01-01
Components of complex systems are often classified according to the way they interact with each other. In graph theory such groups are known as clusters or communities. Many different techniques have been recently proposed to detect them, some of which involve inference methods using either Bayesian or maximum likelihood approaches. In this paper, we study a statistical model designed for detecting clusters based on connection similarity. The basic assumption of the model is that the graph was generated by a certain grouping of the nodes and an expectation maximization algorithm is employed to infer that grouping. We show that the method admits further development to yield a stability analysis of the groupings that quantifies the extent to which each node influences its neighbors' group membership. Our approach naturally allows for the identification of the key elements responsible for the grouping and their resilience to changes in the network. Given the generality of the assumptions underlying the statistical model, such nodes are likely to play special roles in the original system. We illustrate this point by analyzing several empirical networks for which further information about the properties of the nodes is available. The search and identification of stabilizing nodes constitutes thus a novel technique to characterize the relevance of nodes in complex networks
Directory of Open Access Journals (Sweden)
Dansereau Richard M
2007-01-01
Full Text Available We present a new technique for separating two speech signals from a single recording. The proposed method bridges the gap between underdetermined blind source separation techniques and those techniques that model the human auditory system, that is, computational auditory scene analysis (CASA. For this purpose, we decompose the speech signal into the excitation signal and the vocal-tract-related filter and then estimate the components from the mixed speech using a hybrid model. We first express the probability density function (PDF of the mixed speech's log spectral vectors in terms of the PDFs of the underlying speech signal's vocal-tract-related filters. Then, the mean vectors of PDFs of the vocal-tract-related filters are obtained using a maximum likelihood estimator given the mixed signal. Finally, the estimated vocal-tract-related filters along with the extracted fundamental frequencies are used to reconstruct estimates of the individual speech signals. The proposed technique effectively adds vocal-tract-related filter characteristics as a new cue to CASA models using a new grouping technique based on an underdetermined blind source separation. We compare our model with both an underdetermined blind source separation and a CASA method. The experimental results show that our model outperforms both techniques in terms of SNR improvement and the percentage of crosstalk suppression.
Directory of Open Access Journals (Sweden)
Mohammad H. Radfar
2006-11-01
Full Text Available We present a new technique for separating two speech signals from a single recording. The proposed method bridges the gap between underdetermined blind source separation techniques and those techniques that model the human auditory system, that is, computational auditory scene analysis (CASA. For this purpose, we decompose the speech signal into the excitation signal and the vocal-tract-related filter and then estimate the components from the mixed speech using a hybrid model. We first express the probability density function (PDF of the mixed speech's log spectral vectors in terms of the PDFs of the underlying speech signal's vocal-tract-related filters. Then, the mean vectors of PDFs of the vocal-tract-related filters are obtained using a maximum likelihood estimator given the mixed signal. Finally, the estimated vocal-tract-related filters along with the extracted fundamental frequencies are used to reconstruct estimates of the individual speech signals. The proposed technique effectively adds vocal-tract-related filter characteristics as a new cue to CASA models using a new grouping technique based on an underdetermined blind source separation. We compare our model with both an underdetermined blind source separation and a CASA method. The experimental results show that our model outperforms both techniques in terms of SNR improvement and the percentage of crosstalk suppression.
MAXIMUM LIKELIHOOD CLASSIFICATION OF HIGH-RESOLUTION SAR IMAGES IN URBAN AREA
Directory of Open Access Journals (Sweden)
M. Soheili Majd
2012-09-01
Full Text Available In this work, we propose a state-of-the-art on statistical analysis of polarimetric synthetic aperture radar (SAR data, through the modeling of several indices. We concentrate on eight ground classes which have been carried out from amplitudes, co-polarisation ratio, depolarization ratios, and other polarimetric descriptors. To study their different statistical behaviours, we consider Gauss, log- normal, Beta I, Weibull, Gamma, and Fisher statistical models and estimate their parameters using three methods: method of moments (MoM, maximum-likelihood (ML methodology, and log-cumulants method (MoML. Then, we study the opportunity of introducing this information in an adapted supervised classification scheme based on Maximum–Likelihood and Fisher pdf. Our work relies on an image of a suburban area, acquired by the airborne RAMSES SAR sensor of ONERA. The results prove the potential of such data to discriminate urban surfaces and show the usefulness of adapting any classical classification algorithm however classification maps present a persistant class confusion between flat gravelled or concrete roofs and trees.
Maximum Likelihood Estimation and Inference With Examples in R, SAS and ADMB
Millar, Russell B
2011-01-01
This book takes a fresh look at the popular and well-established method of maximum likelihood for statistical estimation and inference. It begins with an intuitive introduction to the concepts and background of likelihood, and moves through to the latest developments in maximum likelihood methodology, including general latent variable models and new material for the practical implementation of integrated likelihood using the free ADMB software. Fundamental issues of statistical inference are also examined, with a presentation of some of the philosophical debates underlying the choice of statis
Maximum likelihood estimation of the parameters of nonminimum phase and noncausal ARMA models
DEFF Research Database (Denmark)
Rasmussen, Klaus Bolding
1994-01-01
The well-known prediction-error-based maximum likelihood (PEML) method can only handle minimum phase ARMA models. This paper presents a new method known as the back-filtering-based maximum likelihood (BFML) method, which can handle nonminimum phase and noncausal ARMA models. The BFML method...... is identical to the PEML method in the case of a minimum phase ARMA model, and it turns out that the BFML method incorporates a noncausal ARMA filter with poles outside the unit circle for estimation of the parameters of a causal, nonminimum phase ARMA model...
Maximum likelihood estimation of the position of a radiating source in a waveguide
International Nuclear Information System (INIS)
Hinich, M.J.
1979-01-01
An array of sensors is receiving radiation from a source of interest. The source and the array are in a one- or two-dimensional waveguide. The maximum-likelihood estimators of the coordinates of the source are analyzed under the assumptions that the noise field is Gaussian. The Cramer-Rao lower bound is of the order of the number of modes which define the source excitation function. The results show that the accuracy of the maximum likelihood estimator of source depth using a vertical array in a infinite horizontal waveguide (such as the ocean) is limited by the number of modes detected by the array regardless of the array size
Detection and phylogenetic analysis of bacteriophage WO in spiders (Araneae).
Yan, Qian; Qiao, Huping; Gao, Jin; Yun, Yueli; Liu, Fengxiang; Peng, Yu
2015-11-01
Phage WO is a bacteriophage found in Wolbachia. Herein, we represent the first phylogenetic study of WOs that infect spiders (Araneae). Seven species of spiders (Araneus alternidens, Nephila clavata, Hylyphantes graminicola, Prosoponoides sinensis, Pholcus crypticolens, Coleosoma octomaculatum, and Nurscia albofasciata) from six families were infected by Wolbachia and WO, followed by comprehensive sequence analysis. Interestingly, WO could be only detected Wolbachia-infected spiders. The relative infection rates of those seven species of spiders were 75, 100, 88.9, 100, 62.5, 72.7, and 100 %, respectively. Our results indicated that both Wolbachia and WO were found in three different body parts of N. clavata, and WO could be passed to the next generation of H. graminicola by vertical transmission. There were three different sequences for WO infected in A. alternidens and two different WO sequences from C. octomaculatum. Only one sequence of WO was found for the other five species of spiders. The discovered sequence of WO ranged from 239 to 311 bp. Phylogenetic tree was generated using maximum likelihood (ML) based on the orf7 gene sequences. According to the phylogenetic tree, WOs in N. clavata and H. graminicola were clustered in the same group. WOs from A. alternidens (WAlt1) and C. octomaculatum (WOct2) were closely related to another clade, whereas WO in P. sinensis was classified as a sole cluster.
Bias Correction for the Maximum Likelihood Estimate of Ability. Research Report. ETS RR-05-15
Zhang, Jinming
2005-01-01
Lord's bias function and the weighted likelihood estimation method are effective in reducing the bias of the maximum likelihood estimate of an examinee's ability under the assumption that the true item parameters are known. This paper presents simulation studies to determine the effectiveness of these two methods in reducing the bias when the item…
Maximum likelihood estimation of ancestral codon usage bias parameters in Drosophila
DEFF Research Database (Denmark)
Nielsen, Rasmus; Bauer DuMont, Vanessa L; Hubisz, Melissa J
2007-01-01
: the selection coefficient for optimal codon usage (S), allowing joint maximum likelihood estimation of S and the dN/dS ratio. We apply the method to previously published data from Drosophila melanogaster, Drosophila simulans, and Drosophila yakuba and show, in accordance with previous results, that the D...
Application of the Method of Maximum Likelihood to Identification of Bipedal Walking Robots
Czech Academy of Sciences Publication Activity Database
Dolinský, Kamil; Čelikovský, Sergej
(2017) ISSN 1063-6536 R&D Projects: GA ČR(CZ) GA17-04682S Institutional support: RVO:67985556 Keywords : Control * identification * maximum likelihood (ML) * walking robots Subject RIV: BC - Control Systems Theory Impact factor: 3.882, year: 2016 http://ieeexplore.ieee.org/document/7954032/
Maximum-likelihood estimation of the hyperbolic parameters from grouped observations
DEFF Research Database (Denmark)
Jensen, Jens Ledet
1988-01-01
a least-squares problem. The second procedure Hypesti first approaches the maximum-likelihood estimate by iterating in the profile-log likelihood function for the scale parameter. Close to the maximum of the likelihood function, the estimation is brought to an end by iteration, using all four parameters...
Kuhnt, S.
2004-01-01
Observed cell counts in contingency tables are perceived as outliers if they have low probability under an anticipated loglinear Poisson model. New procedures for the identification of such outliers are derived using the classical maximum likelihood estimator and an estimator based on the L1 norm.
DEFF Research Database (Denmark)
Borkowski, Robert; Johannisson, Pontus; Wymeersch, Henk
2014-01-01
We perform an experimental investigation of a maximum likelihood-based (ML-based) algorithm for bulk chromatic dispersion estimation for digital coherent receivers operating in uncompensated optical networks. We demonstrate the robustness of the method at low optical signal-to-noise ratio (OSNR...
Casabianca, Jodi M.; Lewis, Charles
2015-01-01
Loglinear smoothing (LLS) estimates the latent trait distribution while making fewer assumptions about its form and maintaining parsimony, thus leading to more precise item response theory (IRT) item parameter estimates than standard marginal maximum likelihood (MML). This article provides the expectation-maximization algorithm for MML estimation…
A simple route to maximum-likelihood estimates of two-locus
Indian Academy of Sciences (India)
Home; Journals; Journal of Genetics; Volume 94; Issue 3. A simple route to maximum-likelihood estimates of two-locus recombination fractions under inequality restrictions. Iain L. Macdonald Philasande Nkalashe. Research Note Volume 94 Issue 3 September 2015 pp 479-481 ...
Monte Carlo Maximum Likelihood Estimation for Generalized Long-Memory Time Series Models
Mesters, G.; Koopman, S.J.; Ooms, M.
2016-01-01
An exact maximum likelihood method is developed for the estimation of parameters in a non-Gaussian nonlinear density function that depends on a latent Gaussian dynamic process with long-memory properties. Our method relies on the method of importance sampling and on a linear Gaussian approximating
Maximum likelihood estimation for Cox's regression model under nested case-control sampling
DEFF Research Database (Denmark)
Scheike, Thomas; Juul, Anders
2004-01-01
Nested case-control sampling is designed to reduce the costs of large cohort studies. It is important to estimate the parameters of interest as efficiently as possible. We present a new maximum likelihood estimator (MLE) for nested case-control sampling in the context of Cox's proportional hazard...
Penfield, Randall D.; Bergeron, Jennifer M.
2005-01-01
This article applies a weighted maximum likelihood (WML) latent trait estimator to the generalized partial credit model (GPCM). The relevant equations required to obtain the WML estimator using the Newton-Raphson algorithm are presented, and a simulation study is described that compared the properties of the WML estimator to those of the maximum…
Maximum Likelihood Dynamic Factor Modeling for Arbitrary "N" and "T" Using SEM
Voelkle, Manuel C.; Oud, Johan H. L.; von Oertzen, Timo; Lindenberger, Ulman
2012-01-01
This article has 3 objectives that build on each other. First, we demonstrate how to obtain maximum likelihood estimates for dynamic factor models (the direct autoregressive factor score model) with arbitrary "T" and "N" by means of structural equation modeling (SEM) and compare the approach to existing methods. Second, we go beyond standard time…
Optimized Large-scale CMB Likelihood and Quadratic Maximum Likelihood Power Spectrum Estimation
Gjerløw, E.; Colombo, L. P. L.; Eriksen, H. K.; Górski, K. M.; Gruppuso, A.; Jewell, J. B.; Plaszczynski, S.; Wehus, I. K.
2015-11-01
We revisit the problem of exact cosmic microwave background (CMB) likelihood and power spectrum estimation with the goal of minimizing computational costs through linear compression. This idea was originally proposed for CMB purposes by Tegmark et al., and here we develop it into a fully functioning computational framework for large-scale polarization analysis, adopting WMAP as a working example. We compare five different linear bases (pixel space, harmonic space, noise covariance eigenvectors, signal-to-noise covariance eigenvectors, and signal-plus-noise covariance eigenvectors) in terms of compression efficiency, and find that the computationally most efficient basis is the signal-to-noise eigenvector basis, which is closely related to the Karhunen-Loeve and Principal Component transforms, in agreement with previous suggestions. For this basis, the information in 6836 unmasked WMAP sky map pixels can be compressed into a smaller set of 3102 modes, with a maximum error increase of any single multipole of 3.8% at ℓ ≤ 32 and a maximum shift in the mean values of a joint distribution of an amplitude-tilt model of 0.006σ. This compression reduces the computational cost of a single likelihood evaluation by a factor of 5, from 38 to 7.5 CPU seconds, and it also results in a more robust likelihood by implicitly regularizing nearly degenerate modes. Finally, we use the same compression framework to formulate a numerically stable and computationally efficient variation of the Quadratic Maximum Likelihood implementation, which requires less than 3 GB of memory and 2 CPU minutes per iteration for ℓ ≤ 32, rendering low-ℓ QML CMB power spectrum analysis fully tractable on a standard laptop.
FlowMax: A Computational Tool for Maximum Likelihood Deconvolution of CFSE Time Courses.
Directory of Open Access Journals (Sweden)
Maxim Nikolaievich Shokhirev
Full Text Available The immune response is a concerted dynamic multi-cellular process. Upon infection, the dynamics of lymphocyte populations are an aggregate of molecular processes that determine the activation, division, and longevity of individual cells. The timing of these single-cell processes is remarkably widely distributed with some cells undergoing their third division while others undergo their first. High cell-to-cell variability and technical noise pose challenges for interpreting popular dye-dilution experiments objectively. It remains an unresolved challenge to avoid under- or over-interpretation of such data when phenotyping gene-targeted mouse models or patient samples. Here we develop and characterize a computational methodology to parameterize a cell population model in the context of noisy dye-dilution data. To enable objective interpretation of model fits, our method estimates fit sensitivity and redundancy by stochastically sampling the solution landscape, calculating parameter sensitivities, and clustering to determine the maximum-likelihood solution ranges. Our methodology accounts for both technical and biological variability by using a cell fluorescence model as an adaptor during population model fitting, resulting in improved fit accuracy without the need for ad hoc objective functions. We have incorporated our methodology into an integrated phenotyping tool, FlowMax, and used it to analyze B cells from two NFκB knockout mice with distinct phenotypes; we not only confirm previously published findings at a fraction of the expended effort and cost, but reveal a novel phenotype of nfkb1/p105/50 in limiting the proliferative capacity of B cells following B-cell receptor stimulation. In addition to complementing experimental work, FlowMax is suitable for high throughput analysis of dye dilution studies within clinical and pharmacological screens with objective and quantitative conclusions.
Maximum likelihood estimation for Cox's regression model under nested case-control sampling
DEFF Research Database (Denmark)
Scheike, Thomas Harder; Juul, Anders
2004-01-01
-like growth factor I was associated with ischemic heart disease. The study was based on a population of 3784 Danes and 231 cases of ischemic heart disease where controls were matched on age and gender. We illustrate the use of the MLE for these data and show how the maximum likelihood framework can be used......Nested case-control sampling is designed to reduce the costs of large cohort studies. It is important to estimate the parameters of interest as efficiently as possible. We present a new maximum likelihood estimator (MLE) for nested case-control sampling in the context of Cox's proportional hazards...... model. The MLE is computed by the EM-algorithm, which is easy to implement in the proportional hazards setting. Standard errors are estimated by a numerical profile likelihood approach based on EM aided differentiation. The work was motivated by a nested case-control study that hypothesized that insulin...
Maximum likelihood pixel labeling using a spatially variant finite mixture model
International Nuclear Information System (INIS)
Gopal, S.S.; Hebert, T.J.
1996-01-01
We propose a spatially-variant mixture model for pixel labeling. Based on this spatially-variant mixture model we derive an expectation maximization algorithm for maximum likelihood estimation of the pixel labels. While most algorithms using mixture models entail the subsequent use of a Bayes classifier for pixel labeling, the proposed algorithm yields maximum likelihood estimates of the labels themselves and results in unambiguous pixel labels. The proposed algorithm is fast, robust, easy to implement, flexible in that it can be applied to any arbitrary image data where the number of classes is known and, most importantly, obviates the need for an explicit labeling rule. The algorithm is evaluated both quantitatively and qualitatively on simulated data and on clinical magnetic resonance images of the human brain
Kelly, D. A.; Fermelia, A.; Lee, G. K. F.
1990-01-01
An adaptive Kalman filter design that utilizes recursive maximum likelihood parameter identification is discussed. At the center of this design is the Kalman filter itself, which has the responsibility for attitude determination. At the same time, the identification algorithm is continually identifying the system parameters. The approach is applicable to nonlinear, as well as linear systems. This adaptive Kalman filter design has much potential for real time implementation, especially considering the fast clock speeds, cache memory and internal RAM available today. The recursive maximum likelihood algorithm is discussed in detail, with special attention directed towards its unique matrix formulation. The procedure for using the algorithm is described along with comments on how this algorithm interacts with the Kalman filter.
International Nuclear Information System (INIS)
Beer, M.
1980-01-01
The maximum likelihood method for the multivariate normal distribution is applied to the case of several individual eigenvalues. Correlated Monte Carlo estimates of the eigenvalue are assumed to follow this prescription and aspects of the assumption are examined. Monte Carlo cell calculations using the SAM-CE and VIM codes for the TRX-1 and TRX-2 benchmark reactors, and SAM-CE full core results are analyzed with this method. Variance reductions of a few percent to a factor of 2 are obtained from maximum likelihood estimation as compared with the simple average and the minimum variance individual eigenvalue. The numerical results verify that the use of sample variances and correlation coefficients in place of the corresponding population statistics still leads to nearly minimum variance estimation for a sufficient number of histories and aggregates
Energy Technology Data Exchange (ETDEWEB)
Singh, Harpreet; Arvind; Dorai, Kavita, E-mail: kavita@iisermohali.ac.in
2016-09-07
Estimation of quantum states is an important step in any quantum information processing experiment. A naive reconstruction of the density matrix from experimental measurements can often give density matrices which are not positive, and hence not physically acceptable. How do we ensure that at all stages of reconstruction, we keep the density matrix positive? Recently a method has been suggested based on maximum likelihood estimation, wherein the density matrix is guaranteed to be positive definite. We experimentally implement this protocol on an NMR quantum information processor. We discuss several examples and compare with the standard method of state estimation. - Highlights: • State estimation using maximum likelihood method was performed on an NMR quantum information processor. • Physically valid density matrices were obtained every time in contrast to standard quantum state tomography. • Density matrices of several different entangled and separable states were reconstructed for two and three qubits.
Estimation of Financial Agent-Based Models with Simulated Maximum Likelihood
Czech Academy of Sciences Publication Activity Database
Kukačka, Jiří; Baruník, Jozef
2017-01-01
Roč. 85, č. 1 (2017), s. 21-45 ISSN 0165-1889 R&D Projects: GA ČR(CZ) GBP402/12/G097 Institutional support: RVO:67985556 Keywords : heterogeneous agent model, * simulated maximum likelihood * switching Subject RIV: AH - Economics OBOR OECD: Finance Impact factor: 1.000, year: 2016 http://library.utia.cas.cz/separaty/2017/E/kukacka-0478481.pdf
Maximum likelihood reconstruction in fully 3D PET via the SAGE algorithm
International Nuclear Information System (INIS)
Ollinger, J.M.; Goggin, A.S.
1996-01-01
The SAGE and ordered subsets algorithms have been proposed as fast methods to compute penalized maximum likelihood estimates in PET. We have implemented both for use in fully 3D PET and completed a preliminary evaluation. The technique used to compute the transition matrix is fully described. The evaluation suggests that the ordered subsets algorithm converges much faster than SAGE, but that it stops short of the optimal solution
Maximum likelihood unit rooting test in the presence GARCH: A new test with increased power
Cook , Steve
2008-01-01
Abstract The literature on testing the unit root hypothesis in the presence of GARCH errors is extended. A new test based upon the combination of local-to-unity detrending and joint maximum likelihood estimation of the autoregressive parameter and GARCH process is presented. The finite sample distribution of the test is derived under alternative decisions regarding the deterministic terms employed. Using Monte Carlo simulation, the newly proposed ML t-test is shown to exhibit incre...
Preliminary application of maximum likelihood method in HL-2A Thomson scattering system
International Nuclear Information System (INIS)
Yao Ke; Huang Yuan; Feng Zhen; Liu Chunhua; Li Enping; Nie Lin
2010-01-01
Maximum likelihood method to process the data of HL-2A Thomson scattering system is presented. Using mathematical statistics, this method maximizes the possibility of the likeness between the theoretical data and the observed data, so that we could get more accurate result. It has been proved to be applicable in comparison with that of the ratios method, and some of the drawbacks in ratios method do not exist in this new one. (authors)
Rahnamaei, Z.; Nematollahi, N.; Farnoosh, R.
2012-01-01
We introduce an alternative skew-slash distribution by using the scale mixture of the exponential power distribution. We derive the properties of this distribution and estimate its parameter by Maximum Likelihood and Bayesian methods. By a simulation study we compute the mentioned estimators and their mean square errors, and we provide an example on real data to demonstrate the modeling strength of the new distribution.
Directory of Open Access Journals (Sweden)
Z. Rahnamaei
2012-01-01
Full Text Available We introduce an alternative skew-slash distribution by using the scale mixture of the exponential power distribution. We derive the properties of this distribution and estimate its parameter by Maximum Likelihood and Bayesian methods. By a simulation study we compute the mentioned estimators and their mean square errors, and we provide an example on real data to demonstrate the modeling strength of the new distribution.
Unbinned maximum likelihood fit for the CP conserving couplings for W + photon production at CDF
International Nuclear Information System (INIS)
Lannon, K.
1994-01-01
We present an unbinned maximum likelihood fit as an alternative to the currently used fit for the CP conserving couplings W plus photon production studied at CDF. We show that a four parameter double exponential fits the E T spectrum of the photon very well. We also show that the fit parameters can be related to and by a second order polynomial. Finally, we discuss various conclusions we have reasoned from our results to the fit so far
Directory of Open Access Journals (Sweden)
Moliere Nguile-Makao
2015-12-01
Full Text Available The analysis of interaction effects involving genetic variants and environmental exposures on the risk of adverse obstetric and early-life outcomes is generally performed using standard logistic regression in the case-mother and control-mother design. However such an analysis is inefficient because it does not take into account the natural family-based constraints present in the parent-child relationship. Recently, a new approach based on semi-parametric maximum likelihood estimation was proposed. The advantage of this approach is that it takes into account the parental relationship between the mother and her child in estimation. But a package implementing this method has not been widely available. In this paper, we present SPmlficmcm, an R package implementing this new method and we propose an extension of the method to handle missing offspring genotype data by maximum likelihood estimation. Our choice to treat missing data of the offspring genotype was motivated by the fact that in genetic association studies where the genetic data of mother and child are available, there are usually more missing data on the genotype of the offspring than that of the mother. The package builds a non-linear system from the data and solves and computes the estimates from the gradient and the Hessian matrix of the log profile semi-parametric likelihood function. Finally, we analyze a simulated dataset to show the usefulness of the package.
Lei, Ning; Chiang, Kwo-Fu; Oudrari, Hassan; Xiong, Xiaoxiong
2011-01-01
Optical sensors aboard Earth orbiting satellites such as the next generation Visible/Infrared Imager/Radiometer Suite (VIIRS) assume that the sensors radiometric response in the Reflective Solar Bands (RSB) is described by a quadratic polynomial, in relating the aperture spectral radiance to the sensor Digital Number (DN) readout. For VIIRS Flight Unit 1, the coefficients are to be determined before launch by an attenuation method, although the linear coefficient will be further determined on-orbit through observing the Solar Diffuser. In determining the quadratic polynomial coefficients by the attenuation method, a Maximum Likelihood approach is applied in carrying out the least-squares procedure. Crucial to the Maximum Likelihood least-squares procedure is the computation of the weight. The weight not only has a contribution from the noise of the sensor s digital count, with an important contribution from digitization error, but also is affected heavily by the mathematical expression used to predict the value of the dependent variable, because both the independent and the dependent variables contain random noise. In addition, model errors have a major impact on the uncertainties of the coefficients. The Maximum Likelihood approach demonstrates the inadequacy of the attenuation method model with a quadratic polynomial for the retrieved spectral radiance. We show that using the inadequate model dramatically increases the uncertainties of the coefficients. We compute the coefficient values and their uncertainties, considering both measurement and model errors.
Estimation Methods for Non-Homogeneous Regression - Minimum CRPS vs Maximum Likelihood
Gebetsberger, Manuel; Messner, Jakob W.; Mayr, Georg J.; Zeileis, Achim
2017-04-01
Non-homogeneous regression models are widely used to statistically post-process numerical weather prediction models. Such regression models correct for errors in mean and variance and are capable to forecast a full probability distribution. In order to estimate the corresponding regression coefficients, CRPS minimization is performed in many meteorological post-processing studies since the last decade. In contrast to maximum likelihood estimation, CRPS minimization is claimed to yield more calibrated forecasts. Theoretically, both scoring rules used as an optimization score should be able to locate a similar and unknown optimum. Discrepancies might result from a wrong distributional assumption of the observed quantity. To address this theoretical concept, this study compares maximum likelihood and minimum CRPS estimation for different distributional assumptions. First, a synthetic case study shows that, for an appropriate distributional assumption, both estimation methods yield to similar regression coefficients. The log-likelihood estimator is slightly more efficient. A real world case study for surface temperature forecasts at different sites in Europe confirms these results but shows that surface temperature does not always follow the classical assumption of a Gaussian distribution. KEYWORDS: ensemble post-processing, maximum likelihood estimation, CRPS minimization, probabilistic temperature forecasting, distributional regression models
Liu, Lihong; Xia, Hongyan; Baule, Claudia; Belák, Sándor
2009-01-01
Bovine viral diarrhoea virus 1 (BVDV-1) and Bovine viral diarrhoea virus 2 (BVDV-2) are two recognised bovine pestivirus species of the genus Pestivirus. Recently, a pestivirus, termed SVA/cont-08, was detected in a batch of contaminated foetal calf serum originating from South America. Comparative sequence analysis showed that the SVA/cont-08 virus shares 15-28% higher sequence identity to pestivirus D32/00_'HoBi' than to members of BVDV-1 and BVDV-2. In order to reveal the phylogenetic relationship of SVA/cont-08 with other pestiviruses, a molecular dataset of 30 pestiviruses and 1,896 characters, comprising the 5'UTR, N(pro) and E2 gene regions, was analysed by two methods: maximum likelihood and Bayesian approach. An identical, well-supported tree topology was observed, where four pestiviruses (SVA/cont-08, D32/00_'HoBi', CH-KaHo/cont, and Th/04_KhonKaen) formed a monophyletic clade that is closely related to the BVDV-1 and BVDV-2 clades. The strategy applied in this study is useful for classifying novel pestiviruses in the future.
Peters, B. C., Jr.; Walker, H. F.
1975-01-01
New results and insights concerning a previously published iterative procedure for obtaining maximum-likelihood estimates of the parameters for a mixture of normal distributions were discussed. It was shown that the procedure converges locally to the consistent maximum likelihood estimate as long as a specified parameter is bounded between two limits. Bound values were given to yield optimal local convergence.
Iliff, K. W.; Maine, R. E.
1976-01-01
A maximum likelihood estimation method was applied to flight data and procedures to facilitate the routine analysis of a large amount of flight data were described. Techniques that can be used to obtain stability and control derivatives from aircraft maneuvers that are less than ideal for this purpose are described. The techniques involve detecting and correcting the effects of dependent or nearly dependent variables, structural vibration, data drift, inadequate instrumentation, and difficulties with the data acquisition system and the mathematical model. The use of uncertainty levels and multiple maneuver analysis also proved to be useful in improving the quality of the estimated coefficients. The procedures used for editing the data and for overall analysis are also discussed.
Cosmic shear measurement with maximum likelihood and maximum a posteriori inference
Hall, Alex; Taylor, Andy
2017-06-01
We investigate the problem of noise bias in maximum likelihood and maximum a posteriori estimators for cosmic shear. We derive the leading and next-to-leading order biases and compute them in the context of galaxy ellipticity measurements, extending previous work on maximum likelihood inference for weak lensing. We show that a large part of the bias on these point estimators can be removed using information already contained in the likelihood when a galaxy model is specified, without the need for external calibration. We test these bias-corrected estimators on simulated galaxy images similar to those expected from planned space-based weak lensing surveys, with promising results. We find that the introduction of an intrinsic shape prior can help with mitigation of noise bias, such that the maximum a posteriori estimate can be made less biased than the maximum likelihood estimate. Second-order terms offer a check on the convergence of the estimators, but are largely subdominant. We show how biases propagate to shear estimates, demonstrating in our simple set-up that shear biases can be reduced by orders of magnitude and potentially to within the requirements of planned space-based surveys at mild signal-to-noise ratio. We find that second-order terms can exhibit significant cancellations at low signal-to-noise ratio when Gaussian noise is assumed, which has implications for inferring the performance of shear-measurement algorithms from simplified simulations. We discuss the viability of our point estimators as tools for lensing inference, arguing that they allow for the robust measurement of ellipticity and shear.
A New Maximum-Likelihood Change Estimator for Two-Pass SAR Coherent Change Detection.
Energy Technology Data Exchange (ETDEWEB)
Wahl, Daniel E.; Yocky, David A.; Jakowatz, Charles V,
2014-09-01
In this paper, we derive a new optimal change metric to be used in synthetic aperture RADAR (SAR) coherent change detection (CCD). Previous CCD methods tend to produce false alarm states (showing change when there is none) in areas of the image that have a low clutter-to-noise power ratio (CNR). The new estimator does not suffer from this shortcoming. It is a surprisingly simple expression, easy to implement, and is optimal in the maximum-likelihood (ML) sense. The estimator produces very impressive results on the CCD collects that we have tested.
Maximum likelihood based multi-channel isotropic reverberation reduction for hearing aids
DEFF Research Database (Denmark)
Kuklasiński, Adam; Doclo, Simon; Jensen, Søren Holdt
2014-01-01
We propose a multi-channel Wiener filter for speech dereverberation in hearing aids. The proposed algorithm uses joint maximum likelihood estimation of the speech and late reverberation spectral variances, under the assumption that the late reverberant sound field is cylindrically isotropic....... The dereverberation performance of the algorithm is evaluated using computer simulations with realistic hearing aid microphone signals including head-related effects. The algorithm is shown to work well with signals reverberated both by synthetic and by measured room impulse responses, achieving improvements...
Directory of Open Access Journals (Sweden)
Behrooz Attaran
2015-01-01
Full Text Available Rotating machinery is the most common machinery in industry. The root of the faults in rotating machinery is often faulty rolling element bearings. This paper presents a technique using optimized artificial neural network by the Bees Algorithm for automated diagnosis of localized faults in rolling element bearings. The inputs of this technique are a number of features (maximum likelihood estimation values, which are derived from the vibration signals of test data. The results shows that the performance of the proposed optimized system is better than most previous studies, even though it uses only two features. Effectiveness of the above method is illustrated using obtained bearing vibration data.
A theory of timing in scintillation counters based on maximum likelihood estimation
International Nuclear Information System (INIS)
Tomitani, Takehiro
1982-01-01
A theory of timing in scintillation counters based on the maximum likelihood estimation is presented. An optimum filter that minimizes the variance of timing is described. A simple formula to estimate the variance of timing is presented as a function of photoelectron number, scintillation decay constant and the single electron transit time spread in the photomultiplier. The present method was compared with the theory by E. Gatti and V. Svelto. The proposed method was applied to two simple models and rough estimations of potential time resolution of several scintillators are given. The proposed method is applicable to the timing in Cerenkov counters and semiconductor detectors as well. (author)
DEFF Research Database (Denmark)
Silver, Jeremy D; Ritchie, Matthew E; Smyth, Gordon K
2009-01-01
exponentially distributed, representing background noise and signal, respectively. Using a saddle-point approximation, Ritchie and others (2007) found normexp to be the best background correction method for 2-color microarray data. This article develops the normexp method further by improving the estimation...... is developed for exact maximum likelihood estimation (MLE) using high-quality optimization software and using the saddle-point estimates as starting values. "MLE" is shown to outperform heuristic estimators proposed by other authors, both in terms of estimation accuracy and in terms of performance on real data...
A maximum-likelihood reconstruction algorithm for tomographic gamma-ray nondestructive assay
International Nuclear Information System (INIS)
Prettyman, T.H.; Estep, R.J.; Cole, R.A.; Sheppard, G.A.
1994-01-01
A new tomographic reconstruction algorithm for nondestructive assay with high resolution gamma-ray spectroscopy (HRGS) is presented. The reconstruction problem is formulated using a maximum-likelihood approach in which the statistical structure of both the gross and continuum measurements used to determine the full-energy response in HRGS is precisely modeled. An accelerated expectation-maximization algorithm is used to determine the optimal solution. The algorithm is applied to safeguards and environmental assays of large samples (for example, 55-gal. drums) in which high continuum levels caused by Compton scattering are routinely encountered. Details of the implementation of the algorithm and a comparative study of the algorithm's performance are presented
Application of the method of maximum likelihood to the determination of cepheid radii
International Nuclear Information System (INIS)
Balona, L.A.
1977-01-01
A method is described whereby the radius of any pulsating star can be obtained by applying the Principle of Maximum Likelihood. The relative merits of this method and of the usual Baade-Wesselink method are discussed in an Appendix. The new method is applied to 54 well-observed cepheids which include a number of spectroscopic binaries and two W Vir stars. An empirical period-radius relation is constructed and discussed in terms of two recent period-luminosity-colour calibrations. It is shown that the new method gives radii with an error of no more than 10 per cent. (author)
International Nuclear Information System (INIS)
Croce, R P; Demma, Th; Longo, M; Marano, S; Matta, V; Pierro, V; Pinto, I M
2003-01-01
The cumulative distribution of the supremum of a set (bank) of correlators is investigated in the context of maximum likelihood detection of gravitational wave chirps from coalescing binaries with unknown parameters. Accurate (lower-bound) approximants are introduced based on a suitable generalization of previous results by Mohanty. Asymptotic properties (in the limit where the number of correlators goes to infinity) are highlighted. The validity of numerical simulations made on small-size banks is extended to banks of any size, via a Gaussian correlation inequality
Directory of Open Access Journals (Sweden)
Lester L. Yuan
2007-06-01
Full Text Available This paper provides a brief introduction to the R package bio.infer, a set of scripts that facilitates the use of maximum likelihood (ML methods for predicting environmental conditions from assemblage composition. Environmental conditions can often be inferred from only biological data, and these inferences are useful when other sources of data are unavailable. ML prediction methods are statistically rigorous and applicable to a broader set of problems than more commonly used weighted averaging techniques. However, ML methods require a substantially greater investment of time to program algorithms and to perform computations. This package is designed to reduce the effort required to apply ML prediction methods.
Regularization parameter selection methods for ill-posed Poisson maximum likelihood estimation
International Nuclear Information System (INIS)
Bardsley, Johnathan M; Goldes, John
2009-01-01
In image processing applications, image intensity is often measured via the counting of incident photons emitted by the object of interest. In such cases, image data noise is accurately modeled by a Poisson distribution. This motivates the use of Poisson maximum likelihood estimation for image reconstruction. However, when the underlying model equation is ill-posed, regularization is needed. Regularized Poisson likelihood estimation has been studied extensively by the authors, though a problem of high importance remains: the choice of the regularization parameter. We will present three statistically motivated methods for choosing the regularization parameter, and numerical examples will be presented to illustrate their effectiveness
The unfolding of NaI(Tl) γ-ray spectrum based on maximum likelihood method
International Nuclear Information System (INIS)
Zhang Qingxian; Ge Liangquan; Gu Yi; Zeng Guoqiang; Lin Yanchang; Wang Guangxi
2011-01-01
NaI(Tl) detectors, having a good detection efficiency, are used to detect gamma rays in field surveys. But the poor energy resolution hinders their applications, despite the use of traditional methods to resolve the overlapped gamma-ray peaks. In this paper, the maximum likelihood (ML) solution is used to resolve the spectrum. The ML method,which is capable of decomposing the peaks in energy difference of over 2/3 FWHM, is applied to scale NaI(Tl) the spectrometer. The result shows that the net area is in proportion to the content of isotopes and the precision of scaling is better than the stripping ration method. (authors)
Directory of Open Access Journals (Sweden)
Maja Olsbjerg
2015-10-01
Full Text Available Item response theory models are often applied when a number items are used to measure a unidimensional latent variable. Originally proposed and used within educational research, they are also used when focus is on physical functioning or psychological wellbeing. Modern applications often need more general models, typically models for multidimensional latent variables or longitudinal models for repeated measurements. This paper describes a SAS macro that fits two-dimensional polytomous Rasch models using a specification of the model that is sufficiently flexible to accommodate longitudinal Rasch models. The macro estimates item parameters using marginal maximum likelihood estimation. A graphical presentation of item characteristic curves is included.
Maximum likelihood reconstruction for pinhole SPECT with a displaced center-of-rotation
International Nuclear Information System (INIS)
Li, J.; Jaszczak, R.J.; Coleman, R.E.
1995-01-01
In this paper, the authors describe the implementation of a maximum likelihood (ML), algorithm using expectation maximization (EM) for pin-hole SPECT with a displaced center-of-rotation. A ray-tracing technique is used in implementing the ML-EM algorithm. The proposed ML-EM algorithm is able to correct the center of rotation displacement which can be characterized by two orthogonal components. The algorithm is tested using experimentally acquired data, and the results demonstrate that the pinhole ML-EM algorithm is able to correct artifacts associated with the center-of-rotation displacement
Maximum likelihood approach to “informed” Sound Source Localization for Hearing Aid applications
DEFF Research Database (Denmark)
Farmani, Mojtaba; Pedersen, Michael Syskind; Tan, Zheng-Hua
2015-01-01
Most state-of-the-art Sound Source Localization (SSL) algorithms have been proposed for applications which are "uninformed'' about the target sound content; however, utilizing a wireless microphone worn by a target talker, enables recent Hearing Aid Systems (HASs) to access to an almost noise......-free sound signal of the target talker at the HAS via the wireless connection. Therefore, in this paper, we propose a maximum likelihood (ML) approach, which we call MLSSL, to estimate the Direction of Arrival (DoA) of the target signal given access to the target signal content. Compared with other "informed...
Directory of Open Access Journals (Sweden)
Dongming Li
2017-04-01
Full Text Available An adaptive optics (AO system provides real-time compensation for atmospheric turbulence. However, an AO image is usually of poor contrast because of the nature of the imaging process, meaning that the image contains information coming from both out-of-focus and in-focus planes of the object, which also brings about a loss in quality. In this paper, we present a robust multi-frame adaptive optics image restoration algorithm via maximum likelihood estimation. Our proposed algorithm uses a maximum likelihood method with image regularization as the basic principle, and constructs the joint log likelihood function for multi-frame AO images based on a Poisson distribution model. To begin with, a frame selection method based on image variance is applied to the observed multi-frame AO images to select images with better quality to improve the convergence of a blind deconvolution algorithm. Then, by combining the imaging conditions and the AO system properties, a point spread function estimation model is built. Finally, we develop our iterative solutions for AO image restoration addressing the joint deconvolution issue. We conduct a number of experiments to evaluate the performances of our proposed algorithm. Experimental results show that our algorithm produces accurate AO image restoration results and outperforms the current state-of-the-art blind deconvolution methods.
Maximum-likelihood fitting of data dominated by Poisson statistical uncertainties
International Nuclear Information System (INIS)
Stoneking, M.R.; Den Hartog, D.J.
1996-06-01
The fitting of data by χ 2 -minimization is valid only when the uncertainties in the data are normally distributed. When analyzing spectroscopic or particle counting data at very low signal level (e.g., a Thomson scattering diagnostic), the uncertainties are distributed with a Poisson distribution. The authors have developed a maximum-likelihood method for fitting data that correctly treats the Poisson statistical character of the uncertainties. This method maximizes the total probability that the observed data are drawn from the assumed fit function using the Poisson probability function to determine the probability for each data point. The algorithm also returns uncertainty estimates for the fit parameters. They compare this method with a χ 2 -minimization routine applied to both simulated and real data. Differences in the returned fits are greater at low signal level (less than ∼20 counts per measurement). the maximum-likelihood method is found to be more accurate and robust, returning a narrower distribution of values for the fit parameters with fewer outliers
Directory of Open Access Journals (Sweden)
Hassibi Babak
2002-01-01
Full Text Available Multiple antenna systems are capable of providing high data rate transmissions over wireless channels. When the channels are dispersive, the signal at each receive antenna is a combination of both the current and past symbols sent from all transmit antennas corrupted by noise. The optimal receiver is a maximum-likelihood sequence detector and is often considered to be practically infeasible due to high computational complexity (exponential in number of antennas and channel memory. Therefore, in practice, one often settles for a less complex suboptimal receiver structure, typically with an equalizer meant to suppress both the intersymbol and interuser interference, followed by the decoder. We propose a sphere decoding for the sequence detection in multiple antenna communication systems over dispersive channels. The sphere decoding provides the maximum-likelihood estimate with computational complexity comparable to the standard space-time decision-feedback equalizing (DFE algorithms. The performance and complexity of the sphere decoding are compared with the DFE algorithm by means of simulations.
Maximum-likelihood methods for array processing based on time-frequency distributions
Zhang, Yimin; Mu, Weifeng; Amin, Moeness G.
1999-11-01
This paper proposes a novel time-frequency maximum likelihood (t-f ML) method for direction-of-arrival (DOA) estimation for non- stationary signals, and compares this method with conventional maximum likelihood DOA estimation techniques. Time-frequency distributions localize the signal power in the time-frequency domain, and as such enhance the effective SNR, leading to improved DOA estimation. The localization of signals with different t-f signatures permits the division of the time-frequency domain into smaller regions, each contains fewer signals than those incident on the array. The reduction of the number of signals within different time-frequency regions not only reduces the required number of sensors, but also decreases the computational load in multi- dimensional optimizations. Compared to the recently proposed time- frequency MUSIC (t-f MUSIC), the proposed t-f ML method can be applied in coherent environments, without the need to perform any type of preprocessing that is subject to both array geometry and array aperture.
Guindon, Stéphane; Dufayard, Jean-François; Lefort, Vincent; Anisimova, Maria; Hordijk, Wim; Gascuel, Olivier
2010-05-01
PhyML is a phylogeny software based on the maximum-likelihood principle. Early PhyML versions used a fast algorithm performing nearest neighbor interchanges to improve a reasonable starting tree topology. Since the original publication (Guindon S., Gascuel O. 2003. A simple, fast and accurate algorithm to estimate large phylogenies by maximum likelihood. Syst. Biol. 52:696-704), PhyML has been widely used (>2500 citations in ISI Web of Science) because of its simplicity and a fair compromise between accuracy and speed. In the meantime, research around PhyML has continued, and this article describes the new algorithms and methods implemented in the program. First, we introduce a new algorithm to search the tree space with user-defined intensity using subtree pruning and regrafting topological moves. The parsimony criterion is used here to filter out the least promising topology modifications with respect to the likelihood function. The analysis of a large collection of real nucleotide and amino acid data sets of various sizes demonstrates the good performance of this method. Second, we describe a new test to assess the support of the data for internal branches of a phylogeny. This approach extends the recently proposed approximate likelihood-ratio test and relies on a nonparametric, Shimodaira-Hasegawa-like procedure. A detailed analysis of real alignments sheds light on the links between this new approach and the more classical nonparametric bootstrap method. Overall, our tests show that the last version (3.0) of PhyML is fast, accurate, stable, and ready to use. A Web server and binary files are available from http://www.atgc-montpellier.fr/phyml/.
Parallelization of maximum likelihood fits with OpenMP and CUDA
Jarp, S; Leduc, J; Nowak, A; Pantaleo, F
2011-01-01
Data analyses based on maximum likelihood fits are commonly used in the high energy physics community for fitting statistical models to data samples. This technique requires the numerical minimization of the negative log-likelihood function. MINUIT is the most common package used for this purpose in the high energy physics community. The main algorithm in this package, MIGRAD, searches the minimum by using the gradient information. The procedure requires several evaluations of the function, depending on the number of free parameters and their initial values. The whole procedure can be very CPU-time consuming in case of complex functions, with several free parameters, many independent variables and large data samples. Therefore, it becomes particularly important to speed-up the evaluation of the negative log-likelihood function. In this paper we present an algorithm and its implementation which benefits from data vectorization and parallelization (based on OpenMP) and which was also ported to Graphics Processi...
A Sum-of-Squares and Semidefinite Programming Approach for Maximum Likelihood DOA Estimation
Directory of Open Access Journals (Sweden)
Shu Cai
2016-12-01
Full Text Available Direction of arrival (DOA estimation using a uniform linear array (ULA is a classical problem in array signal processing. In this paper, we focus on DOA estimation based on the maximum likelihood (ML criterion, transform the estimation problem into a novel formulation, named as sum-of-squares (SOS, and then solve it using semidefinite programming (SDP. We first derive the SOS and SDP method for DOA estimation in the scenario of a single source and then extend it under the framework of alternating projection for multiple DOA estimation. The simulations demonstrate that the SOS- and SDP-based algorithms can provide stable and accurate DOA estimation when the number of snapshots is small and the signal-to-noise ratio (SNR is low. Moveover, it has a higher spatial resolution compared to existing methods based on the ML criterion.
Kojima, Yohei; Takeda, Kazuaki; Adachi, Fumiyuki
Frequency-domain equalization (FDE) based on the minimum mean square error (MMSE) criterion can provide better downlink bit error rate (BER) performance of direct sequence code division multiple access (DS-CDMA) than the conventional rake combining in a frequency-selective fading channel. FDE requires accurate channel estimation. In this paper, we propose a new 2-step maximum likelihood channel estimation (MLCE) for DS-CDMA with FDE in a very slow frequency-selective fading environment. The 1st step uses the conventional pilot-assisted MMSE-CE and the 2nd step carries out the MLCE using decision feedback from the 1st step. The BER performance improvement achieved by 2-step MLCE over pilot assisted MMSE-CE is confirmed by computer simulation.
Maximum Likelihood PSD Estimation for Speech Enhancement in Reverberation and Noise
DEFF Research Database (Denmark)
Kuklasinski, Adam; Doclo, Simon; Jensen, Søren Holdt
2016-01-01
In this contribution we focus on the problem of power spectral density (PSD) estimation from multiple microphone signals in reverberant and noisy environments. The PSD estimation method proposed in this paper is based on the maximum likelihood (ML) methodology. In particular, we derive a novel ML...... instrumental measures and is shown to be higher than when the competing estimator is used. Moreover, we perform a speech intelligibility test where we demonstrate that both the proposed and the competing PSD estimators lead to similar intelligibility improvements......., it is shown numerically that the mean squared estimation error achieved by the proposed method is near the limit set by the corresponding Cram´er-Rao lower bound. The speech dereverberation performance of a multi-channel Wiener filter (MWF) based on the proposed PSD estimators is measured using several...
Smolin, John A; Gambetta, Jay M; Smith, Graeme
2012-02-17
We provide an efficient method for computing the maximum-likelihood mixed quantum state (with density matrix ρ) given a set of measurement outcomes in a complete orthonormal operator basis subject to Gaussian noise. Our method works by first changing basis yielding a candidate density matrix μ which may have nonphysical (negative) eigenvalues, and then finding the nearest physical state under the 2-norm. Our algorithm takes at worst O(d(4)) for the basis change plus O(d(3)) for finding ρ where d is the dimension of the quantum state. In the special case where the measurement basis is strings of Pauli operators, the basis change takes only O(d(3)) as well. The workhorse of the algorithm is a new linear-time method for finding the closest probability distribution (in Euclidean distance) to a set of real numbers summing to one.
Maximum Likelihood-Based Methods for Target Velocity Estimation with Distributed MIMO Radar
Directory of Open Access Journals (Sweden)
Zhenxin Cao
2018-02-01
Full Text Available The estimation problem for target velocity is addressed in this in the scenario with a distributed multi-input multi-out (MIMO radar system. A maximum likelihood (ML-based estimation method is derived with the knowledge of target position. Then, in the scenario without the knowledge of target position, an iterative method is proposed to estimate the target velocity by updating the position information iteratively. Moreover, the Carmér-Rao Lower Bounds (CRLBs for both scenarios are derived, and the performance degradation of velocity estimation without the position information is also expressed. Simulation results show that the proposed estimation methods can approach the CRLBs, and the velocity estimation performance can be further improved by increasing either the number of radar antennas or the information accuracy of the target position. Furthermore, compared with the existing methods, a better estimation performance can be achieved.
On Maximum Likelihood Estimation for Left Censored Burr Type III Distribution
Directory of Open Access Journals (Sweden)
Navid Feroze
2015-12-01
Full Text Available Burr type III is an important distribution used to model the failure time data. The paper addresses the problem of estimation of parameters of the Burr type III distribution based on maximum likelihood estimation (MLE when the samples are left censored. As the closed form expression for the MLEs of the parameters cannot be derived, the approximate solutions have been obtained through iterative procedures. An extensive simulation study has been carried out to investigate the performance of the estimators with respect to sample size, censoring rate and true parametric values. A real life example has also been presented. The study revealed that the proposed estimators are consistent and capable of providing efficient results under small to moderate samples.
Targeted search for continuous gravitational waves: Bayesian versus maximum-likelihood statistics
International Nuclear Information System (INIS)
Prix, Reinhard; Krishnan, Badri
2009-01-01
We investigate the Bayesian framework for detection of continuous gravitational waves (GWs) in the context of targeted searches, where the phase evolution of the GW signal is assumed to be known, while the four amplitude parameters are unknown. We show that the orthodox maximum-likelihood statistic (known as F-statistic) can be rediscovered as a Bayes factor with an unphysical prior in amplitude parameter space. We introduce an alternative detection statistic ('B-statistic') using the Bayes factor with a more natural amplitude prior, namely an isotropic probability distribution for the orientation of GW sources. Monte Carlo simulations of targeted searches show that the resulting Bayesian B-statistic is more powerful in the Neyman-Pearson sense (i.e., has a higher expected detection probability at equal false-alarm probability) than the frequentist F-statistic.
Pritikin, Joshua N; Brick, Timothy R; Neale, Michael C
2018-04-01
A novel method for the maximum likelihood estimation of structural equation models (SEM) with both ordinal and continuous indicators is introduced using a flexible multivariate probit model for the ordinal indicators. A full information approach ensures unbiased estimates for data missing at random. Exceeding the capability of prior methods, up to 13 ordinal variables can be included before integration time increases beyond 1 s per row. The method relies on the axiom of conditional probability to split apart the distribution of continuous and ordinal variables. Due to the symmetry of the axiom, two similar methods are available. A simulation study provides evidence that the two similar approaches offer equal accuracy. A further simulation is used to develop a heuristic to automatically select the most computationally efficient approach. Joint ordinal continuous SEM is implemented in OpenMx, free and open-source software.
Li, Xinya; Deng, Z. Daniel; Sun, Yannan; Martinez, Jayson J.; Fu, Tao; McMichael, Geoffrey A.; Carlson, Thomas J.
2014-11-01
Better understanding of fish behavior is vital for recovery of many endangered species including salmon. The Juvenile Salmon Acoustic Telemetry System (JSATS) was developed to observe the out-migratory behavior of juvenile salmonids tagged by surgical implantation of acoustic micro-transmitters and to estimate the survival when passing through dams on the Snake and Columbia Rivers. A robust three-dimensional solver was needed to accurately and efficiently estimate the time sequence of locations of fish tagged with JSATS acoustic transmitters, to describe in sufficient detail the information needed to assess the function of dam-passage design alternatives. An approximate maximum likelihood solver was developed using measurements of time difference of arrival from all hydrophones in receiving arrays on which a transmission was detected. Field experiments demonstrated that the developed solver performed significantly better in tracking efficiency and accuracy than other solvers described in the literature.
BOREAS TE-18 Landsat TM Maximum Likelihood Classification Image of the NSA
Hall, Forrest G. (Editor); Knapp, David
2000-01-01
The BOREAS TE-18 team focused its efforts on using remotely sensed data to characterize the successional and disturbance dynamics of the boreal forest for use in carbon modeling. The objective of this classification is to provide the BOREAS investigators with a data product that characterizes the land cover of the NSA. A Landsat-5 TM image from 20-Aug-1988 was used to derive this classification. A standard supervised maximum likelihood classification approach was used to produce this classification. The data are provided in a binary image format file. The data files are available on a CD-ROM (see document number 20010000884), or from the Oak Ridge National Laboratory (ORNL) Distributed Activity Archive Center (DAAC).
Supervised maximum-likelihood weighting of composite protein networks for complex prediction
Directory of Open Access Journals (Sweden)
Yong Chern Han
2012-12-01
Full Text Available Abstract Background Protein complexes participate in many important cellular functions, so finding the set of existent complexes is essential for understanding the organization and regulation of processes in the cell. With the availability of large amounts of high-throughput protein-protein interaction (PPI data, many algorithms have been proposed to discover protein complexes from PPI networks. However, such approaches are hindered by the high rate of noise in high-throughput PPI data, including spurious and missing interactions. Furthermore, many transient interactions are detected between proteins that are not from the same complex, while not all proteins from the same complex may actually interact. As a result, predicted complexes often do not match true complexes well, and many true complexes go undetected. Results We address these challenges by integrating PPI data with other heterogeneous data sources to construct a composite protein network, and using a supervised maximum-likelihood approach to weight each edge based on its posterior probability of belonging to a complex. We then use six different clustering algorithms, and an aggregative clustering strategy, to discover complexes in the weighted network. We test our method on Saccharomyces cerevisiae and Homo sapiens, and show that complex discovery is improved: compared to previously proposed supervised and unsupervised weighting approaches, our method recalls more known complexes, achieves higher precision at all recall levels, and generates novel complexes of greater functional similarity. Furthermore, our maximum-likelihood approach allows learned parameters to be used to visualize and evaluate the evidence of novel predictions, aiding human judgment of their credibility. Conclusions Our approach integrates multiple data sources with supervised learning to create a weighted composite protein network, and uses six clustering algorithms with an aggregative clustering strategy to
Directory of Open Access Journals (Sweden)
César da Silva Chagas
2013-04-01
Full Text Available Soil surveys are the main source of spatial information on soils and have a range of different applications, mainly in agriculture. The continuity of this activity has however been severely compromised, mainly due to a lack of governmental funding. The purpose of this study was to evaluate the feasibility of two different classifiers (artificial neural networks and a maximum likelihood algorithm in the prediction of soil classes in the northwest of the state of Rio de Janeiro. Terrain attributes such as elevation, slope, aspect, plan curvature and compound topographic index (CTI and indices of clay minerals, iron oxide and Normalized Difference Vegetation Index (NDVI, derived from Landsat 7 ETM+ sensor imagery, were used as discriminating variables. The two classifiers were trained and validated for each soil class using 300 and 150 samples respectively, representing the characteristics of these classes in terms of the discriminating variables. According to the statistical tests, the accuracy of the classifier based on artificial neural networks (ANNs was greater than of the classic Maximum Likelihood Classifier (MLC. Comparing the results with 126 points of reference showed that the resulting ANN map (73.81 % was superior to the MLC map (57.94 %. The main errors when using the two classifiers were caused by: a the geological heterogeneity of the area coupled with problems related to the geological map; b the depth of lithic contact and/or rock exposure, and c problems with the environmental correlation model used due to the polygenetic nature of the soils. This study confirms that the use of terrain attributes together with remote sensing data by an ANN approach can be a tool to facilitate soil mapping in Brazil, primarily due to the availability of low-cost remote sensing data and the ease by which terrain attributes can be obtained.
Performance of penalized maximum likelihood in estimation of genetic covariances matrices
Directory of Open Access Journals (Sweden)
Meyer Karin
2011-11-01
Full Text Available Abstract Background Estimation of genetic covariance matrices for multivariate problems comprising more than a few traits is inherently problematic, since sampling variation increases dramatically with the number of traits. This paper investigates the efficacy of regularized estimation of covariance components in a maximum likelihood framework, imposing a penalty on the likelihood designed to reduce sampling variation. In particular, penalties that "borrow strength" from the phenotypic covariance matrix are considered. Methods An extensive simulation study was carried out to investigate the reduction in average 'loss', i.e. the deviation in estimated matrices from the population values, and the accompanying bias for a range of parameter values and sample sizes. A number of penalties are examined, penalizing either the canonical eigenvalues or the genetic covariance or correlation matrices. In addition, several strategies to determine the amount of penalization to be applied, i.e. to estimate the appropriate tuning factor, are explored. Results It is shown that substantial reductions in loss for estimates of genetic covariance can be achieved for small to moderate sample sizes. While no penalty performed best overall, penalizing the variance among the estimated canonical eigenvalues on the logarithmic scale or shrinking the genetic towards the phenotypic correlation matrix appeared most advantageous. Estimating the tuning factor using cross-validation resulted in a loss reduction 10 to 15% less than that obtained if population values were known. Applying a mild penalty, chosen so that the deviation in likelihood from the maximum was non-significant, performed as well if not better than cross-validation and can be recommended as a pragmatic strategy. Conclusions Penalized maximum likelihood estimation provides the means to 'make the most' of limited and precious data and facilitates more stable estimation for multi-dimensional analyses. It should
Castrillon, Julio; Genton, Marc G.; Yokota, Rio
2015-01-01
We develop a multi-level restricted Gaussian maximum likelihood method for estimating the covariance function parameters and computing the best unbiased predictor. Our approach produces a new set of multi-level contrasts where the deterministic
Model uncertainty estimation and risk assessment is essential to environmental management and informed decision making on pollution mitigation strategies. In this study, we apply a probabilistic methodology, which combines Bayesian Monte Carlo simulation and Maximum Likelihood e...
Godolphin, E. J.
1980-01-01
It is shown that the estimation procedure of Walker leads to estimates of the parameters of a Gaussian moving average process which are asymptotically equivalent to the maximum likelihood estimates proposed by Whittle and represented by Godolphin.
THE GENERALIZED MAXIMUM LIKELIHOOD METHOD APPLIED TO HIGH PRESSURE PHASE EQUILIBRIUM
Directory of Open Access Journals (Sweden)
Lúcio CARDOZO-FILHO
1997-12-01
Full Text Available The generalized maximum likelihood method was used to determine binary interaction parameters between carbon dioxide and components of orange essential oil. Vapor-liquid equilibrium was modeled with Peng-Robinson and Soave-Redlich-Kwong equations, using a methodology proposed in 1979 by Asselineau, Bogdanic and Vidal. Experimental vapor-liquid equilibrium data on binary mixtures formed with carbon dioxide and compounds usually found in orange essential oil were used to test the model. These systems were chosen to demonstrate that the maximum likelihood method produces binary interaction parameters for cubic equations of state capable of satisfactorily describing phase equilibrium, even for a binary such as ethanol/CO2. Results corroborate that the Peng-Robinson, as well as the Soave-Redlich-Kwong, equation can be used to describe phase equilibrium for the following systems: components of essential oil of orange/CO2.Foi empregado o método da máxima verossimilhança generalizado para determinação de parâmetros de interação binária entre os componentes do óleo essencial de laranja e dióxido de carbono. Foram usados dados experimentais de equilíbrio líquido-vapor de misturas binárias de dióxido de carbono e componentes do óleo essencial de laranja. O equilíbrio líquido-vapor foi modelado com as equações de Peng-Robinson e de Soave-Redlich-Kwong usando a metodologia proposta em 1979 por Asselineau, Bogdanic e Vidal. A escolha destes sistemas teve como objetivo demonstrar que o método da máxima verosimilhança produz parâmetros de interação binária, para equações cúbicas de estado capazes de descrever satisfatoriamente até mesmo o equilíbrio para o binário etanol/CO2. Os resultados comprovam que tanto a equação de Peng-Robinson quanto a de Soave-Redlich-Kwong podem ser empregadas para descrever o equilíbrio de fases para o sistemas: componentes do óleo essencial de laranja/CO2.
Bias correction for estimated QTL effects using the penalized maximum likelihood method.
Zhang, J; Yue, C; Zhang, Y-M
2012-04-01
A penalized maximum likelihood method has been proposed as an important approach to the detection of epistatic quantitative trait loci (QTL). However, this approach is not optimal in two special situations: (1) closely linked QTL with effects in opposite directions and (2) small-effect QTL, because the method produces downwardly biased estimates of QTL effects. The present study aims to correct the bias by using correction coefficients and shifting from the use of a uniform prior on the variance parameter of a QTL effect to that of a scaled inverse chi-square prior. The results of Monte Carlo simulation experiments show that the improved method increases the power from 25 to 88% in the detection of two closely linked QTL of equal size in opposite directions and from 60 to 80% in the identification of QTL with small effects (0.5% of the total phenotypic variance). We used the improved method to detect QTL responsible for the barley kernel weight trait using 145 doubled haploid lines developed in the North American Barley Genome Mapping Project. Application of the proposed method to other shrinkage estimation of QTL effects is discussed.
Tsou, Haiping; Yan, Tsun-Yee
1999-04-01
This paper describes an extended-source spatial acquisition and tracking scheme for planetary optical communications. This scheme uses the Sun-lit Earth image as the beacon signal, which can be computed according to the current Sun-Earth-Probe angle from a pre-stored Earth image or a received snapshot taken by other Earth-orbiting satellite. Onboard the spacecraft, the reference image is correlated in the transform domain with the received image obtained from a detector array, which is assumed to have each of its pixels corrupted by an independent additive white Gaussian noise. The coordinate of the ground station is acquired and tracked, respectively, by an open-loop acquisition algorithm and a closed-loop tracking algorithm derived from the maximum likelihood criterion. As shown in the paper, the optimal spatial acquisition requires solving two nonlinear equations, or iteratively solving their linearized variants, to estimate the coordinate when translation in the relative positions of onboard and ground transceivers is considered. Similar assumption of linearization leads to the closed-loop spatial tracking algorithm in which the loop feedback signals can be derived from the weighted transform-domain correlation. Numerical results using a sample Sun-lit Earth image demonstrate that sub-pixel resolutions can be achieved by this scheme in a high disturbance environment.
Directory of Open Access Journals (Sweden)
Zhang Zhang
2009-06-01
Full Text Available A major analytical challenge in computational biology is the detection and description of clusters of specified site types, such as polymorphic or substituted sites within DNA or protein sequences. Progress has been stymied by a lack of suitable methods to detect clusters and to estimate the extent of clustering in discrete linear sequences, particularly when there is no a priori specification of cluster size or cluster count. Here we derive and demonstrate a maximum likelihood method of hierarchical clustering. Our method incorporates a tripartite divide-and-conquer strategy that models sequence heterogeneity, delineates clusters, and yields a profile of the level of clustering associated with each site. The clustering model may be evaluated via model selection using the Akaike Information Criterion, the corrected Akaike Information Criterion, and the Bayesian Information Criterion. Furthermore, model averaging using weighted model likelihoods may be applied to incorporate model uncertainty into the profile of heterogeneity across sites. We evaluated our method by examining its performance on a number of simulated datasets as well as on empirical polymorphism data from diverse natural alleles of the Drosophila alcohol dehydrogenase gene. Our method yielded greater power for the detection of clustered sites across a breadth of parameter ranges, and achieved better accuracy and precision of estimation of clusters, than did the existing empirical cumulative distribution function statistics.
Efficient Levenberg-Marquardt minimization of the maximum likelihood estimator for Poisson deviates
International Nuclear Information System (INIS)
Laurence, T.; Chromy, B.
2010-01-01
Histograms of counted events are Poisson distributed, but are typically fitted without justification using nonlinear least squares fitting. The more appropriate maximum likelihood estimator (MLE) for Poisson distributed data is seldom used. We extend the use of the Levenberg-Marquardt algorithm commonly used for nonlinear least squares minimization for use with the MLE for Poisson distributed data. In so doing, we remove any excuse for not using this more appropriate MLE. We demonstrate the use of the algorithm and the superior performance of the MLE using simulations and experiments in the context of fluorescence lifetime imaging. Scientists commonly form histograms of counted events from their data, and extract parameters by fitting to a specified model. Assuming that the probability of occurrence for each bin is small, event counts in the histogram bins will be distributed according to the Poisson distribution. We develop here an efficient algorithm for fitting event counting histograms using the maximum likelihood estimator (MLE) for Poisson distributed data, rather than the non-linear least squares measure. This algorithm is a simple extension of the common Levenberg-Marquardt (L-M) algorithm, is simple to implement, quick and robust. Fitting using a least squares measure is most common, but it is the maximum likelihood estimator only for Gaussian-distributed data. Non-linear least squares methods may be applied to event counting histograms in cases where the number of events is very large, so that the Poisson distribution is well approximated by a Gaussian. However, it is not easy to satisfy this criterion in practice - which requires a large number of events. It has been well-known for years that least squares procedures lead to biased results when applied to Poisson-distributed data; a recent paper providing extensive characterization of these biases in exponential fitting is given. The more appropriate measure based on the maximum likelihood estimator (MLE
Penalized maximum-likelihood sinogram restoration for dual focal spot computed tomography
International Nuclear Information System (INIS)
Forthmann, P; Koehler, T; Begemann, P G C; Defrise, M
2007-01-01
Due to various system non-idealities, the raw data generated by a computed tomography (CT) machine are not readily usable for reconstruction. Although the deterministic nature of corruption effects such as crosstalk and afterglow permits correction by deconvolution, there is a drawback because deconvolution usually amplifies noise. Methods that perform raw data correction combined with noise suppression are commonly termed sinogram restoration methods. The need for sinogram restoration arises, for example, when photon counts are low and non-statistical reconstruction algorithms such as filtered backprojection are used. Many modern CT machines offer a dual focal spot (DFS) mode, which serves the goal of increased radial sampling by alternating the focal spot between two positions on the anode plate during the scan. Although the focal spot mode does not play a role with respect to how the data are affected by the above-mentioned corruption effects, it needs to be taken into account if regularized sinogram restoration is to be applied to the data. This work points out the subtle difference in processing that sinogram restoration for DFS requires, how it is correctly employed within the penalized maximum-likelihood sinogram restoration algorithm and what impact it has on image quality
Directory of Open Access Journals (Sweden)
K. Yao
2007-12-01
Full Text Available We investigate the maximum likelihood (ML direction-of-arrival (DOA estimation of multiple wideband sources in the presence of unknown nonuniform sensor noise. New closed-form expression for the direction estimation CramÃƒÂ©r-Rao-Bound (CRB has been derived. The performance of the conventional wideband uniform ML estimator under nonuniform noise has been studied. In order to mitigate the performance degradation caused by the nonuniformity of the noise, a new deterministic wideband nonuniform ML DOA estimator is derived and two associated processing algorithms are proposed. The first algorithm is based on an iterative procedure which stepwise concentrates the log-likelihood function with respect to the DOAs and the noise nuisance parameters, while the second is a noniterative algorithm that maximizes the derived approximately concentrated log-likelihood function. The performance of the proposed algorithms is tested through extensive computer simulations. Simulation results show the stepwise-concentrated ML algorithm (SC-ML requires only a few iterations to converge and both the SC-ML and the approximately-concentrated ML algorithm (AC-ML attain a solution close to the derived CRB at high signal-to-noise ratio.
Maximum Likelihood Time-of-Arrival Estimation of Optical Pulses via Photon-Counting Photodetectors
Erkmen, Baris I.; Moision, Bruce E.
2010-01-01
Many optical imaging, ranging, and communications systems rely on the estimation of the arrival time of an optical pulse. Recently, such systems have been increasingly employing photon-counting photodetector technology, which changes the statistics of the observed photocurrent. This requires time-of-arrival estimators to be developed and their performances characterized. The statistics of the output of an ideal photodetector, which are well modeled as a Poisson point process, were considered. An analytical model was developed for the mean-square error of the maximum likelihood (ML) estimator, demonstrating two phenomena that cause deviations from the minimum achievable error at low signal power. An approximation was derived to the threshold at which the ML estimator essentially fails to provide better than a random guess of the pulse arrival time. Comparing the analytic model performance predictions to those obtained via simulations, it was verified that the model accurately predicts the ML performance over all regimes considered. There is little prior art that attempts to understand the fundamental limitations to time-of-arrival estimation from Poisson statistics. This work establishes both a simple mathematical description of the error behavior, and the associated physical processes that yield this behavior. Previous work on mean-square error characterization for ML estimators has predominantly focused on additive Gaussian noise. This work demonstrates that the discrete nature of the Poisson noise process leads to a distinctly different error behavior.
Implementation of linear filters for iterative penalized maximum likelihood SPECT reconstruction
International Nuclear Information System (INIS)
Liang, Z.
1991-01-01
This paper reports on six low-pass linear filters applied in frequency space implemented for iterative penalized maximum-likelihood (ML) SPECT image reconstruction. The filters implemented were the Shepp-Logan filter, the Butterworth filer, the Gaussian filter, the Hann filter, the Parzen filer, and the Lagrange filter. The low-pass filtering was applied in frequency space to projection data for the initial estimate and to the difference of projection data and reprojected data for higher order approximations. The projection data were acquired experimentally from a chest phantom consisting of non-uniform attenuating media. All the filters could effectively remove the noise and edge artifacts associated with ML approach if the frequency cutoff was properly chosen. The improved performance of the Parzen and Lagrange filters relative to the others was observed. The best image, by viewing its profiles in terms of noise-smoothing, edge-sharpening, and contrast, was the one obtained with the Parzen filter. However, the Lagrange filter has the potential to consider the characteristics of detector response function
Maximum likelihood versus likelihood-free quantum system identification in the atom maser
International Nuclear Information System (INIS)
Catana, Catalin; Kypraios, Theodore; Guţă, Mădălin
2014-01-01
We consider the problem of estimating a dynamical parameter of a Markovian quantum open system (the atom maser), by performing continuous time measurements in the system's output (outgoing atoms). Two estimation methods are investigated and compared. Firstly, the maximum likelihood estimator (MLE) takes into account the full measurement data and is asymptotically optimal in terms of its mean square error. Secondly, the ‘likelihood-free’ method of approximate Bayesian computation (ABC) produces an approximation of the posterior distribution for a given set of summary statistics, by sampling trajectories at different parameter values and comparing them with the measurement data via chosen statistics. Building on previous results which showed that atom counts are poor statistics for certain values of the Rabi angle, we apply MLE to the full measurement data and estimate its Fisher information. We then select several correlation statistics such as waiting times, distribution of successive identical detections, and use them as input of the ABC algorithm. The resulting posterior distribution follows closely the data likelihood, showing that the selected statistics capture ‘most’ statistical information about the Rabi angle. (paper)
Fast Maximum-Likelihood Decoder for Quasi-Orthogonal Space-Time Block Code
Directory of Open Access Journals (Sweden)
Adel Ahmadi
2015-01-01
Full Text Available Motivated by the decompositions of sphere and QR-based methods, in this paper we present an extremely fast maximum-likelihood (ML detection approach for quasi-orthogonal space-time block code (QOSTBC. The proposed algorithm with a relatively simple design exploits structure of quadrature amplitude modulation (QAM constellations to achieve its goal and can be extended to any arbitrary constellation. Our decoder utilizes a new decomposition technique for ML metric which divides the metric into independent positive parts and a positive interference part. Search spaces of symbols are substantially reduced by employing the independent parts and statistics of noise. Symbols within the search spaces are successively evaluated until the metric is minimized. Simulation results confirm that the proposed decoder’s performance is superior to many of the recently published state-of-the-art solutions in terms of complexity level. More specifically, it was possible to verify that application of the new algorithms with 1024-QAM would decrease the computational complexity compared to state-of-the-art solution with 16-QAM.
Directory of Open Access Journals (Sweden)
Sonali Sachin Sankpal
2016-01-01
Full Text Available Scattering and absorption of light is main reason for limited visibility in water. The suspended particles and dissolved chemical compounds in water are also responsible for scattering and absorption of light in water. The limited visibility in water results in degradation of underwater images. The visibility can be increased by using artificial light source in underwater imaging system. But the artificial light illuminates the scene in a nonuniform fashion. It produces bright spot at the center with the dark region at surroundings. In some cases imaging system itself creates dark region in the image by producing shadow on the objects. The problem of nonuniform illumination is neglected by the researchers in most of the image enhancement techniques of underwater images. Also very few methods are discussed showing the results on color images. This paper suggests a method for nonuniform illumination correction for underwater images. The method assumes that natural underwater images are Rayleigh distributed. This paper used maximum likelihood estimation of scale parameter to map distribution of image to Rayleigh distribution. The method is compared with traditional methods for nonuniform illumination correction using no-reference image quality metrics like average luminance, average information entropy, normalized neighborhood function, average contrast, and comprehensive assessment function.
Chernick, Julian A.; Perlovsky, Leonid I.; Tye, David M.
1994-06-01
This paper describes applications of maximum likelihood adaptive neural system (MLANS) to the characterization of clutter in IR images and to the identification of targets. The characterization of image clutter is needed to improve target detection and to enhance the ability to compare performance of different algorithms using diverse imagery data. Enhanced unambiguous IFF is important for fratricide reduction while automatic cueing and targeting is becoming an ever increasing part of operations. We utilized MLANS which is a parametric neural network that combines optimal statistical techniques with a model-based approach. This paper shows that MLANS outperforms classical classifiers, the quadratic classifier and the nearest neighbor classifier, because on the one hand it is not limited to the usual Gaussian distribution assumption and can adapt in real time to the image clutter distribution; on the other hand MLANS learns from fewer samples and is more robust than the nearest neighbor classifiers. Future research will address uncooperative IFF using fused IR and MMW data.
International Nuclear Information System (INIS)
Rajan, Jeny; Jeurissen, Ben; Sijbers, Jan; Verhoye, Marleen; Van Audekerke, Johan
2011-01-01
In this paper, we propose a method to denoise magnitude magnetic resonance (MR) images, which are Rician distributed. Conventionally, maximum likelihood methods incorporate the Rice distribution to estimate the true, underlying signal from a local neighborhood within which the signal is assumed to be constant. However, if this assumption is not met, such filtering will lead to blurred edges and loss of fine structures. As a solution to this problem, we put forward the concept of restricted local neighborhoods where the true intensity for each noisy pixel is estimated from a set of preselected neighboring pixels. To this end, a reference image is created from the noisy image using a recently proposed nonlocal means algorithm. This reference image is used as a prior for further noise reduction. A scheme is developed to locally select an appropriate subset of pixels from which the underlying signal is estimated. Experimental results based on the peak signal to noise ratio, structural similarity index matrix, Bhattacharyya coefficient and mean absolute difference from synthetic and real MR images demonstrate the superior performance of the proposed method over other state-of-the-art methods.
Eggers, G. L.; Lewis, K. W.; Simons, F. J.; Olhede, S.
2013-12-01
topography and gravity, in which the INITIAL loading by topography retains the Matern form but the FINAL topography and gravity are the result of flexural compensation. In our modeling, we pay explicit attention to finite-field spectral estimation effects (and their remedy via tapering), and to the implementation of statistical tests (for anisotropy, for initial-loading process correlation, to ascertain the proper density contrasts and interface depth in a two-layer model), robustness assessment and uncertainty quantification, as well as to algorithmic intricacies related to low-dimensional but poorly scaled maximum-likelihood inversions. We conclude that Venusian geomorphic terrains are well described by their 2-D topographic and gravity (cross-)power spectra, and the spectral properties of distinct geologic provinces on Venus are worth quantifying via maximum-likelihood-based methods under idealized three-parameter Matern distributions. Analysis of fitted parameters and the fitted-data residuals reveals natural variability in the (sub)surface properties on Venus, as well as some directional anisotropy. Geologic regions tend to cluster according to terrain type in our parameter space, which we analyze to confirm their shared geologic histories and utilize for guidance in ongoing mapping efforts of Venus and other terrestrial bodies.
First phylogenetic analysis of Ehrlichia canis in dogs and ticks from Mexico. Preliminary study
Directory of Open Access Journals (Sweden)
Carolina G. Sosa-Gutiérrez
2016-09-01
Full Text Available Objective. Phylogenetic characterization of Ehrlichia canis in dogs naturally infected and ticks, diagnosed by PCR and sequencing of 16SrRNA gene; compare different isolates found in American countries. Materials and methods. Were collected Blood samples from 139 dogs with suggestive clinical manifestations of this disease and they were infested with ticks; part of 16SrRNA gene was sequenced and aligned, with 17 sequences reported in American countries. Two phylogenetic trees were constructed using the Maximum likelihood method, and Maximum parsimony. Results. They were positive to E. canis 25/139 (18.0% dogs and 29/139 (20.9% ticks. The clinical manifestations presented were fever, fatigue, depression and vomiting. Rhipicephalus sanguineus Dermacentor variabilis and Haemaphysalis leporis-palustris ticks were positive for E. canis. Phylogenetic analysis showed that the sequences of dogs and ticks in Mexico form a third group diverging of sequences from South America and USA. Conclusions. This is the first phylogenetic analysis of E. canis in Mexico. There are differences in the sequences of Mexico with those reported in South America and USA. This research lays the foundation for further study of genetic variability.
Phylogenetic Analysis Using Protein Mass Spectrometry.
Ma, Shiyong; Downard, Kevin M; Wong, Jason W H
2017-01-01
Through advances in molecular biology, comparative analysis of DNA sequences is currently the cornerstone in the study of molecular evolution and phylogenetics. Nevertheless, protein mass spectrometry offers some unique opportunities to enable phylogenetic analyses in organisms where DNA may be difficult or costly to obtain. To date, the methods of phylogenetic analysis using protein mass spectrometry can be classified into three categories: (1) de novo protein sequencing followed by classical phylogenetic reconstruction, (2) direct phylogenetic reconstruction using proteolytic peptide mass maps, and (3) mapping of mass spectral data onto classical phylogenetic trees. In this chapter, we provide a brief description of the three methods and the protocol for each method along with relevant tools and algorithms.
Ellis, Sam; Reader, Andrew J
2018-04-26
Many clinical contexts require the acquisition of multiple positron emission tomography (PET) scans of a single subject, for example to observe and quantify changes in functional behaviour in tumours after treatment in oncology. Typically, the datasets from each of these scans are reconstructed individually, without exploiting the similarities between them. We have recently shown that sharing information between longitudinal PET datasets by penalising voxel-wise differences during image reconstruction can improve reconstructed images by reducing background noise and increasing the contrast-to-noise ratio of high activity lesions. Here we present two additional novel longitudinal difference-image priors and evaluate their performance using 2D simulation studies and a 3D real dataset case study. We have previously proposed a simultaneous difference-image-based penalised maximum likelihood (PML) longitudinal image reconstruction method that encourages sparse difference images (DS-PML), and in this work we propose two further novel prior terms. The priors are designed to encourage longitudinal images with corresponding differences which have i) low entropy (DE-PML), and ii) high sparsity in their spatial gradients (DTV-PML). These two new priors and the originally proposed longitudinal prior were applied to 2D simulated treatment response [ 18 F]fluorodeoxyglucose (FDG) brain tumour datasets and compared to standard maximum likelihood expectation-maximisation (MLEM) reconstructions. These 2D simulation studies explored the effects of penalty strengths, tumour behaviour, and inter-scan coupling on reconstructed images. Finally, a real two-scan longitudinal data series acquired from a head and neck cancer patient was reconstructed with the proposed methods and the results compared to standard reconstruction methods. Using any of the three priors with an appropriate penalty strength produced images with noise levels equivalent to those seen when using standard
Deconvolving the wedge: maximum-likelihood power spectra via spherical-wave visibility modelling
Ghosh, A.; Mertens, F. G.; Koopmans, L. V. E.
2018-03-01
Direct detection of the Epoch of Reionization (EoR) via the red-shifted 21-cm line will have unprecedented implications on the study of structure formation in the infant Universe. To fulfil this promise, current and future 21-cm experiments need to detect this weak EoR signal in the presence of foregrounds that are several orders of magnitude larger. This requires extreme noise control and improved wide-field high dynamic-range imaging techniques. We propose a new imaging method based on a maximum likelihood framework which solves for the interferometric equation directly on the sphere, or equivalently in the uvw-domain. The method uses the one-to-one relation between spherical waves and spherical harmonics (SpH). It consistently handles signals from the entire sky, and does not require a w-term correction. The SpH coefficients represent the sky-brightness distribution and the visibilities in the uvw-domain, and provide a direct estimate of the spatial power spectrum. Using these spectrally smooth SpH coefficients, bright foregrounds can be removed from the signal, including their side-lobe noise, which is one of the limiting factors in high dynamics-range wide-field imaging. Chromatic effects causing the so-called `wedge' are effectively eliminated (i.e. deconvolved) in the cylindrical (k⊥, k∥) power spectrum, compared to a power spectrum computed directly from the images of the foreground visibilities where the wedge is clearly present. We illustrate our method using simulated Low-Frequency Array observations, finding an excellent reconstruction of the input EoR signal with minimal bias.
Penalized maximum likelihood reconstruction for x-ray differential phase-contrast tomography
International Nuclear Information System (INIS)
Brendel, Bernhard; Teuffenbach, Maximilian von; Noël, Peter B.; Pfeiffer, Franz; Koehler, Thomas
2016-01-01
Purpose: The purpose of this work is to propose a cost function with regularization to iteratively reconstruct attenuation, phase, and scatter images simultaneously from differential phase contrast (DPC) acquisitions, without the need of phase retrieval, and examine its properties. Furthermore this reconstruction method is applied to an acquisition pattern that is suitable for a DPC tomographic system with continuously rotating gantry (sliding window acquisition), overcoming the severe smearing in noniterative reconstruction. Methods: We derive a penalized maximum likelihood reconstruction algorithm to directly reconstruct attenuation, phase, and scatter image from the measured detector values of a DPC acquisition. The proposed penalty comprises, for each of the three images, an independent smoothing prior. Image quality of the proposed reconstruction is compared to images generated with FBP and iterative reconstruction after phase retrieval. Furthermore, the influence between the priors is analyzed. Finally, the proposed reconstruction algorithm is applied to experimental sliding window data acquired at a synchrotron and results are compared to reconstructions based on phase retrieval. Results: The results show that the proposed algorithm significantly increases image quality in comparison to reconstructions based on phase retrieval. No significant mutual influence between the proposed independent priors could be observed. Further it could be illustrated that the iterative reconstruction of a sliding window acquisition results in images with substantially reduced smearing artifacts. Conclusions: Although the proposed cost function is inherently nonconvex, it can be used to reconstruct images with less aliasing artifacts and less streak artifacts than reconstruction methods based on phase retrieval. Furthermore, the proposed method can be used to reconstruct images of sliding window acquisitions with negligible smearing artifacts
L.U.St: a tool for approximated maximum likelihood supertree reconstruction.
Akanni, Wasiu A; Creevey, Christopher J; Wilkinson, Mark; Pisani, Davide
2014-06-12
Supertrees combine disparate, partially overlapping trees to generate a synthesis that provides a high level perspective that cannot be attained from the inspection of individual phylogenies. Supertrees can be seen as meta-analytical tools that can be used to make inferences based on results of previous scientific studies. Their meta-analytical application has increased in popularity since it was realised that the power of statistical tests for the study of evolutionary trends critically depends on the use of taxon-dense phylogenies. Further to that, supertrees have found applications in phylogenomics where they are used to combine gene trees and recover species phylogenies based on genome-scale data sets. Here, we present the L.U.St package, a python tool for approximate maximum likelihood supertree inference and illustrate its application using a genomic data set for the placental mammals. L.U.St allows the calculation of the approximate likelihood of a supertree, given a set of input trees, performs heuristic searches to look for the supertree of highest likelihood, and performs statistical tests of two or more supertrees. To this end, L.U.St implements a winning sites test allowing ranking of a collection of a-priori selected hypotheses, given as a collection of input supertree topologies. It also outputs a file of input-tree-wise likelihood scores that can be used as input to CONSEL for calculation of standard tests of two trees (e.g. Kishino-Hasegawa, Shimidoara-Hasegawa and Approximately Unbiased tests). This is the first fully parametric implementation of a supertree method, it has clearly understood properties, and provides several advantages over currently available supertree approaches. It is easy to implement and works on any platform that has python installed. bitBucket page - https://afro-juju@bitbucket.org/afro-juju/l.u.st.git. Davide.Pisani@bristol.ac.uk.
Accuracy of maximum likelihood estimates of a two-state model in single-molecule FRET
Energy Technology Data Exchange (ETDEWEB)
Gopich, Irina V. [Laboratory of Chemical Physics, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, Maryland 20892 (United States)
2015-01-21
Photon sequences from single-molecule Förster resonance energy transfer (FRET) experiments can be analyzed using a maximum likelihood method. Parameters of the underlying kinetic model (FRET efficiencies of the states and transition rates between conformational states) are obtained by maximizing the appropriate likelihood function. In addition, the errors (uncertainties) of the extracted parameters can be obtained from the curvature of the likelihood function at the maximum. We study the standard deviations of the parameters of a two-state model obtained from photon sequences with recorded colors and arrival times. The standard deviations can be obtained analytically in a special case when the FRET efficiencies of the states are 0 and 1 and in the limiting cases of fast and slow conformational dynamics. These results are compared with the results of numerical simulations. The accuracy and, therefore, the ability to predict model parameters depend on how fast the transition rates are compared to the photon count rate. In the limit of slow transitions, the key parameters that determine the accuracy are the number of transitions between the states and the number of independent photon sequences. In the fast transition limit, the accuracy is determined by the small fraction of photons that are correlated with their neighbors. The relative standard deviation of the relaxation rate has a “chevron” shape as a function of the transition rate in the log-log scale. The location of the minimum of this function dramatically depends on how well the FRET efficiencies of the states are separated.
International Nuclear Information System (INIS)
Stsepankou, D; Arns, A; Hesser, J; Ng, S K; Zygmanski, P
2012-01-01
The objective of this paper is to evaluate an iterative maximum likelihood (ML) cone–beam computed tomography (CBCT) reconstruction with total variation (TV) regularization with respect to the robustness of the algorithm due to data inconsistencies. Three different and (for clinical application) typical classes of errors are considered for simulated phantom and measured projection data: quantum noise, defect detector pixels and projection matrix errors. To quantify those errors we apply error measures like mean square error, signal-to-noise ratio, contrast-to-noise ratio and streak indicator. These measures are derived from linear signal theory and generalized and applied for nonlinear signal reconstruction. For quality check, we focus on resolution and CT-number linearity based on a Catphan phantom. All comparisons are made versus the clinical standard, the filtered backprojection algorithm (FBP). In our results, we confirm and substantially extend previous results on iterative reconstruction such as massive undersampling of the number of projections. Errors of projection matrix parameters of up to 1° projection angle deviations are still in the tolerance level. Single defect pixels exhibit ring artifacts for each method. However using defect pixel compensation, allows up to 40% of defect pixels for passing the standard clinical quality check. Further, the iterative algorithm is extraordinarily robust in the low photon regime (down to 0.05 mAs) when compared to FPB, allowing for extremely low-dose image acquisitions, a substantial issue when considering daily CBCT imaging for position correction in radiotherapy. We conclude that the ML method studied herein is robust under clinical quality assurance conditions. Consequently, low-dose regime imaging, especially for daily patient localization in radiation therapy is possible without change of the current hardware of the imaging system. (paper)
Wobbling and LSF-based maximum likelihood expectation maximization reconstruction for wobbling PET
International Nuclear Information System (INIS)
Kim, Hang-Keun; Son, Young-Don; Kwon, Dae-Hyuk; Joo, Yohan; Cho, Zang-Hee
2016-01-01
Positron emission tomography (PET) is a widely used imaging modality; however, the PET spatial resolution is not yet satisfactory for precise anatomical localization of molecular activities. Detector size is the most important factor because it determines the intrinsic resolution, which is approximately half of the detector size and determines the ultimate PET resolution. Detector size, however, cannot be made too small because both the decreased detection efficiency and the increased septal penetration effect degrade the image quality. A wobbling and line spread function (LSF)-based maximum likelihood expectation maximization (WL-MLEM) algorithm, which combined the MLEM iterative reconstruction algorithm with wobbled sampling and LSF-based deconvolution using the system matrix, was proposed for improving the spatial resolution of PET without reducing the scintillator or detector size. The new algorithm was evaluated using a simulation, and its performance was compared with that of the existing algorithms, such as conventional MLEM and LSF-based MLEM. Simulations demonstrated that the WL-MLEM algorithm yielded higher spatial resolution and image quality than the existing algorithms. The WL-MLEM algorithm with wobbling PET yielded substantially improved resolution compared with conventional algorithms with stationary PET. The algorithm can be easily extended to other iterative reconstruction algorithms, such as maximum a priori (MAP) and ordered subset expectation maximization (OSEM). The WL-MLEM algorithm with wobbling PET may offer improvements in both sensitivity and resolution, the two most sought-after features in PET design. - Highlights: • This paper proposed WL-MLEM algorithm for PET and demonstrated its performance. • WL-MLEM algorithm effectively combined wobbling and line spread function based MLEM. • WL-MLEM provided improvements in the spatial resolution and the PET image quality. • WL-MLEM can be easily extended to the other iterative
Parameter-free bearing fault detection based on maximum likelihood estimation and differentiation
International Nuclear Information System (INIS)
Bozchalooi, I Soltani; Liang, Ming
2009-01-01
Bearing faults can lead to malfunction and ultimately complete stall of many machines. The conventional high-frequency resonance (HFR) method has been commonly used for bearing fault detection. However, it is often very difficult to obtain and calibrate bandpass filter parameters, i.e. the center frequency and bandwidth, the key to the success of the HFR method. This inevitably undermines the usefulness of the conventional HFR technique. To avoid such difficulties, we propose parameter-free, versatile yet straightforward techniques to detect bearing faults. We focus on two types of measured signals frequently encountered in practice: (1) a mixture of impulsive faulty bearing vibrations and intrinsic background noise and (2) impulsive faulty bearing vibrations blended with intrinsic background noise and vibration interferences. To design a proper signal processing technique for each case, we analyze the effects of intrinsic background noise and vibration interferences on amplitude demodulation. For the first case, a maximum likelihood-based fault detection method is proposed to accommodate the Rician distribution of the amplitude-demodulated signal mixture. For the second case, we first illustrate that the high-amplitude low-frequency vibration interferences can make the amplitude demodulation ineffective. Then we propose a differentiation method to enhance the fault detectability. It is shown that the iterative application of a differentiation step can boost the relative strength of the impulsive faulty bearing signal component with respect to the vibration interferences. This preserves the effectiveness of amplitude demodulation and hence leads to more accurate fault detection. The proposed approaches are evaluated on simulated signals and experimental data acquired from faulty bearings
Kim, Kyungsoo; Lim, Sung-Ho; Lee, Jaeseok; Kang, Won-Seok; Moon, Cheil; Choi, Ji-Woong
2016-01-01
Electroencephalograms (EEGs) measure a brain signal that contains abundant information about the human brain function and health. For this reason, recent clinical brain research and brain computer interface (BCI) studies use EEG signals in many applications. Due to the significant noise in EEG traces, signal processing to enhance the signal to noise power ratio (SNR) is necessary for EEG analysis, especially for non-invasive EEG. A typical method to improve the SNR is averaging many trials of event related potential (ERP) signal that represents a brain’s response to a particular stimulus or a task. The averaging, however, is very sensitive to variable delays. In this study, we propose two time delay estimation (TDE) schemes based on a joint maximum likelihood (ML) criterion to compensate the uncertain delays which may be different in each trial. We evaluate the performance for different types of signals such as random, deterministic, and real EEG signals. The results show that the proposed schemes provide better performance than other conventional schemes employing averaged signal as a reference, e.g., up to 4 dB gain at the expected delay error of 10°. PMID:27322267
Directory of Open Access Journals (Sweden)
Kyungsoo Kim
2016-06-01
Full Text Available Electroencephalograms (EEGs measure a brain signal that contains abundant information about the human brain function and health. For this reason, recent clinical brain research and brain computer interface (BCI studies use EEG signals in many applications. Due to the significant noise in EEG traces, signal processing to enhance the signal to noise power ratio (SNR is necessary for EEG analysis, especially for non-invasive EEG. A typical method to improve the SNR is averaging many trials of event related potential (ERP signal that represents a brain’s response to a particular stimulus or a task. The averaging, however, is very sensitive to variable delays. In this study, we propose two time delay estimation (TDE schemes based on a joint maximum likelihood (ML criterion to compensate the uncertain delays which may be different in each trial. We evaluate the performance for different types of signals such as random, deterministic, and real EEG signals. The results show that the proposed schemes provide better performance than other conventional schemes employing averaged signal as a reference, e.g., up to 4 dB gain at the expected delay error of 10°.
DEFF Research Database (Denmark)
Boiroux, Dimitri; Juhl, Rune; Madsen, Henrik
2016-01-01
This paper addresses maximum likelihood parameter estimation of continuous-time nonlinear systems with discrete-time measurements. We derive an efficient algorithm for the computation of the log-likelihood function and its gradient, which can be used in gradient-based optimization algorithms....... This algorithm uses UD decomposition of symmetric matrices and the array algorithm for covariance update and gradient computation. We test our algorithm on the Lotka-Volterra equations. Compared to the maximum likelihood estimation based on finite difference gradient computation, we get a significant speedup...
International Nuclear Information System (INIS)
Song, N; Frey, E C; He, B; Wahl, R L
2011-01-01
Optimizing targeted radionuclide therapy requires patient-specific estimation of organ doses. The organ doses are estimated from quantitative nuclear medicine imaging studies, many of which involve planar whole body scans. We have previously developed the quantitative planar (QPlanar) processing method and demonstrated its ability to provide more accurate activity estimates than conventional geometric-mean-based planar (CPlanar) processing methods using physical phantom and simulation studies. The QPlanar method uses the maximum likelihood-expectation maximization algorithm, 3D organ volume of interests (VOIs), and rigorous models of physical image degrading factors to estimate organ activities. However, the QPlanar method requires alignment between the 3D organ VOIs and the 2D planar projections and assumes uniform activity distribution in each VOI. This makes application to patients challenging. As a result, in this paper we propose an extended QPlanar (EQPlanar) method that provides independent-organ rigid registration and includes multiple background regions. We have validated this method using both Monte Carlo simulation and patient data. In the simulation study, we evaluated the precision and accuracy of the method in comparison to the original QPlanar method. For the patient studies, we compared organ activity estimates at 24 h after injection with those from conventional geometric mean-based planar quantification using a 24 h post-injection quantitative SPECT reconstruction as the gold standard. We also compared the goodness of fit of the measured and estimated projections obtained from the EQPlanar method to those from the original method at four other time points where gold standard data were not available. In the simulation study, more accurate activity estimates were provided by the EQPlanar method for all the organs at all the time points compared with the QPlanar method. Based on the patient data, we concluded that the EQPlanar method provided a
Kelderman, Henk
1991-01-01
In this paper, algorithms are described for obtaining the maximum likelihood estimates of the parameters in log-linear models. Modified versions of the iterative proportional fitting and Newton-Raphson algorithms are described that work on the minimal sufficient statistics rather than on the usual
Kelderman, Henk
1992-01-01
In this paper algorithms are described for obtaining the maximum likelihood estimates of the parameters in loglinear models. Modified versions of the iterative proportional fitting and Newton-Raphson algorithms are described that work on the minimal sufficient statistics rather than on the usual
Kieftenbeld, Vincent; Natesan, Prathiba
2012-01-01
Markov chain Monte Carlo (MCMC) methods enable a fully Bayesian approach to parameter estimation of item response models. In this simulation study, the authors compared the recovery of graded response model parameters using marginal maximum likelihood (MML) and Gibbs sampling (MCMC) under various latent trait distributions, test lengths, and…
Beauducel, Andre; Herzberg, Philipp Yorck
2006-01-01
This simulation study compared maximum likelihood (ML) estimation with weighted least squares means and variance adjusted (WLSMV) estimation. The study was based on confirmatory factor analyses with 1, 2, 4, and 8 factors, based on 250, 500, 750, and 1,000 cases, and on 5, 10, 20, and 40 variables with 2, 3, 4, 5, and 6 categories. There was no…
Bergboer, N.H.; Verdult, V.; Verhaegen, M.H.G.
2002-01-01
We present a numerically efficient implementation of the nonlinear least squares and maximum likelihood identification of multivariable linear time-invariant (LTI) state-space models. This implementation is based on a local parameterization of the system and a gradient search in the resulting
Ari, Eszter; Ittzés, Péter; Podani, János; Thi, Quynh Chi Le; Jakó, Eena
2012-04-01
Boolean analysis (or BOOL-AN; Jakó et al., 2009. BOOL-AN: A method for comparative sequence analysis and phylogenetic reconstruction. Mol. Phylogenet. Evol. 52, 887-97.), a recently developed method for sequence comparison uses the Iterative Canonical Form of Boolean functions. It considers sequence information in a way entirely different from standard phylogenetic methods (i.e. Maximum Parsimony, Maximum-Likelihood, Neighbor-Joining, and Bayesian analysis). The performance and reliability of Boolean analysis were tested and compared with the standard phylogenetic methods, using artificially evolved - simulated - nucleotide sequences and the 22 mitochondrial tRNA genes of the great apes. At the outset, we assumed that the phylogeny of Hominidae is generally well established, and the guide tree of artificial sequence evolution can also be used as a benchmark. These offer a possibility to compare and test the performance of different phylogenetic methods. Trees were reconstructed by each method from 2500 simulated sequences and 22 mitochondrial tRNA sequences. We also introduced a special re-sampling method for Boolean analysis on permuted sequence sites, the P-BOOL-AN procedure. Considering the reliability values (branch support values of consensus trees and Robinson-Foulds distances) we used for simulated sequence trees produced by different phylogenetic methods, BOOL-AN appeared as the most reliable method. Although the mitochondrial tRNA sequences of great apes are relatively short (59-75 bases long) and the ratio of their constant characters is about 75%, BOOL-AN, P-BOOL-AN and the Bayesian approach produced the same tree-topology as the established phylogeny, while the outcomes of Maximum Parsimony, Maximum-Likelihood and Neighbor-Joining methods were equivocal. We conclude that Boolean analysis is a promising alternative to existing methods of sequence comparison for phylogenetic reconstruction and congruence analysis. Copyright Â© 2012 Elsevier Inc. All
Performance analysis of linear codes under maximum-likelihood decoding: a tutorial
National Research Council Canada - National Science Library
Sason, Igal; Shamai, Shlomo
2006-01-01
..., upper and lower bounds on the error probability of linear codes under ML decoding are surveyed and applied to codes and ensembles of codes on graphs. For upper bounds, we discuss various bounds where focus is put on Gallager bounding techniques and their relation to a variety of other reported bounds. Within the class of lower bounds, we ad...
Lirio, R B; Dondériz, I C; Pérez Abalo, M C
1992-08-01
The methodology of Receiver Operating Characteristic curves based on the signal detection model is extended to evaluate the accuracy of two-stage diagnostic strategies. A computer program is developed for the maximum likelihood estimation of parameters that characterize the sensitivity and specificity of two-stage classifiers according to this extended methodology. Its use is briefly illustrated with data collected in a two-stage screening for auditory defects.
International Nuclear Information System (INIS)
Manglos, S.H.
1992-01-01
Transverse image truncation can be a serious problem for human imaging using cone-beam transmission CT (CB-CT) implemented on a conventional rotating gamma camera. This paper presents a reconstruction method to reduce or eliminate the artifacts resulting from the truncation. The method uses a previously published transmission maximum likelihood EM algorithm, adapted to the cone-beam geometry. The reconstruction method is evaluated qualitatively using three human subjects of various dimensions and various degrees of truncation. (author)
Peters, B. C., Jr.; Walker, H. F.
1976-01-01
The problem of obtaining numerically maximum likelihood estimates of the parameters for a mixture of normal distributions is addressed. In recent literature, a certain successive approximations procedure, based on the likelihood equations, is shown empirically to be effective in numerically approximating such maximum-likelihood estimates; however, the reliability of this procedure was not established theoretically. Here, a general iterative procedure is introduced, of the generalized steepest-ascent (deflected-gradient) type, which is just the procedure known in the literature when the step-size is taken to be 1. With probability 1 as the sample size grows large, it is shown that this procedure converges locally to the strongly consistent maximum-likelihood estimate whenever the step-size lies between 0 and 2. The step-size which yields optimal local convergence rates for large samples is determined in a sense by the separation of the component normal densities and is bounded below by a number between 1 and 2.
Peters, B. C., Jr.; Walker, H. F.
1978-01-01
This paper addresses the problem of obtaining numerically maximum-likelihood estimates of the parameters for a mixture of normal distributions. In recent literature, a certain successive-approximations procedure, based on the likelihood equations, was shown empirically to be effective in numerically approximating such maximum-likelihood estimates; however, the reliability of this procedure was not established theoretically. Here, we introduce a general iterative procedure, of the generalized steepest-ascent (deflected-gradient) type, which is just the procedure known in the literature when the step-size is taken to be 1. We show that, with probability 1 as the sample size grows large, this procedure converges locally to the strongly consistent maximum-likelihood estimate whenever the step-size lies between 0 and 2. We also show that the step-size which yields optimal local convergence rates for large samples is determined in a sense by the 'separation' of the component normal densities and is bounded below by a number between 1 and 2.
Damkliang, Kasikrit; Tandayya, Pichaya; Sangket, Unitsa; Pasomsub, Ekawat
2016-11-28
At the present, coding sequence (CDS) has been discovered and larger CDS is being revealed frequently. Approaches and related tools have also been developed and upgraded concurrently, especially for phylogenetic tree analysis. This paper proposes an integrated automatic Taverna workflow for the phylogenetic tree inferring analysis using public access web services at European Bioinformatics Institute (EMBL-EBI) and Swiss Institute of Bioinformatics (SIB), and our own deployed local web services. The workflow input is a set of CDS in the Fasta format. The workflow supports 1,000 to 20,000 numbers in bootstrapping replication. The workflow performs the tree inferring such as Parsimony (PARS), Distance Matrix - Neighbor Joining (DIST-NJ), and Maximum Likelihood (ML) algorithms of EMBOSS PHYLIPNEW package based on our proposed Multiple Sequence Alignment (MSA) similarity score. The local web services are implemented and deployed into two types using the Soaplab2 and Apache Axis2 deployment. There are SOAP and Java Web Service (JWS) providing WSDL endpoints to Taverna Workbench, a workflow manager. The workflow has been validated, the performance has been measured, and its results have been verified. Our workflow's execution time is less than ten minutes for inferring a tree with 10,000 replicates of the bootstrapping numbers. This paper proposes a new integrated automatic workflow which will be beneficial to the bioinformaticians with an intermediate level of knowledge and experiences. All local services have been deployed at our portal http://bioservices.sci.psu.ac.th.
230Th and 234Th as coupled tracers of particle cycling in the ocean: A maximum likelihood approach
Wang, Wei-Lei; Armstrong, Robert A.; Cochran, J. Kirk; Heilbrun, Christina
2016-05-01
We applied maximum likelihood estimation to measurements of Th isotopes (234,230Th) in Mediterranean Sea sediment traps that separated particles according to settling velocity. This study contains two unique aspects. First, it relies on settling velocities that were measured using sediment traps, rather than on measured particle sizes and an assumed relationship between particle size and sinking velocity. Second, because of the labor and expense involved in obtaining these data, they were obtained at only a few depths, and their analysis required constructing a new type of box-like model, which we refer to as a "two-layer" model, that we then analyzed using likelihood techniques. Likelihood techniques were developed in the 1930s by statisticians, and form the computational core of both Bayesian and non-Bayesian statistics. Their use has recently become very popular in ecology, but they are relatively unknown in geochemistry. Our model was formulated by assuming steady state and first-order reaction kinetics for thorium adsorption and desorption, and for particle aggregation, disaggregation, and remineralization. We adopted a cutoff settling velocity (49 m/d) from Armstrong et al. (2009) to separate particles into fast- and slow-sinking classes. A unique set of parameters with no dependence on prior values was obtained. Adsorption rate constants for both slow- and fast-sinking particles are slightly higher in the upper layer than in the lower layer. Slow-sinking particles have higher adsorption rate constants than fast-sinking particles. Desorption rate constants are higher in the lower layer (slow-sinking particles: 13.17 ± 1.61, fast-sinking particles: 13.96 ± 0.48) than in the upper layer (slow-sinking particles: 7.87 ± 0.60 y-1, fast-sinking particles: 1.81 ± 0.44 y-1). Aggregation rate constants were higher, 1.88 ± 0.04, in the upper layer and just 0.07 ± 0.01 y-1 in the lower layer. Disaggregation rate constants were just 0.30 ± 0.10 y-1 in the upper
Naushad, Sohail; Barkema, Herman W.; Luby, Christopher; Condas, Larissa A. Z.; Nobrega, Diego B.; Carson, Domonique A.; De Buck, Jeroen
2016-01-01
Non-aureus staphylococci (NAS), a heterogeneous group of a large number of species and subspecies, are the most frequently isolated pathogens from intramammary infections in dairy cattle. Phylogenetic relationships among bovine NAS species are controversial and have mostly been determined based on single-gene trees. Herein, we analyzed phylogeny of bovine NAS species using whole-genome sequencing (WGS) of 441 distinct isolates. In addition, evolutionary relationships among bovine NAS were estimated from multilocus data of 16S rRNA, hsp60, rpoB, sodA, and tuf genes and sequences from these and numerous other single genes/proteins. All phylogenies were created with FastTree, Maximum-Likelihood, Maximum-Parsimony, and Neighbor-Joining methods. Regardless of methodology, WGS-trees clearly separated bovine NAS species into five monophyletic coherent clades. Furthermore, there were consistent interspecies relationships within clades in all WGS phylogenetic reconstructions. Except for the Maximum-Parsimony tree, multilocus data analysis similarly produced five clades. There were large variations in determining clades and interspecies relationships in single gene/protein trees, under different methods of tree constructions, highlighting limitations of using single genes for determining bovine NAS phylogeny. However, based on WGS data, we established a robust phylogeny of bovine NAS species, unaffected by method or model of evolutionary reconstructions. Therefore, it is now possible to determine associations between phylogeny and many biological traits, such as virulence, antimicrobial resistance, environmental niche, geographical distribution, and host specificity. PMID:28066335
Directory of Open Access Journals (Sweden)
L. STRAGEVITCH
1997-03-01
Full Text Available The equations of the method based on the maximum likelihood principle have been rewritten in a suitable generalized form to allow the use of any number of implicit constraints in the determination of model parameters from experimental data and from the associated experimental uncertainties. In addition to the use of any number of constraints, this method also allows data, with different numbers of constraints, to be reduced simultaneously. Application of the method is illustrated in the reduction of liquid-liquid equilibrium data of binary, ternary and quaternary systems simultaneously
Directory of Open Access Journals (Sweden)
Jianlong Li
Full Text Available Anemonefishes (Pomacentridae Amphiprioninae are a group of 30 valid coral reef fish species with their phylogenetic relationships still under debate. The eight available mitogenomes of anemonefishes were used to reconstruct the molecular phylogenetic tree; six were obtained from this study (Amphiprion clarkii, A. frenatus, A. percula, A. perideraion, A. polymnus and Premnas biaculeatus and two from GenBank (A. bicinctus and A. ocellaris. The seven Amphiprion species represent all four subgenera and P. biaculeatus is the only species from Premnas. The eight mitogenomes of anemonefishes encoded 13 protein-coding genes, two rRNA genes, 22 tRNA genes and two main non-coding regions, with the gene arrangement and translation direction basically identical to other typical vertebrate mitogenomes. Among the 13 protein-coding genes, A. ocellaris (AP006017 and A. percula (KJ174497 had the same length in ND5 with 1,866 bp, which were three nucleotides less than the other six anemonefishes. Both structures of ND5, however, could translate to amino acid successfully. Only four mitogenomes had the tandem repeats in D-loop; the tandem repeats were located in downstream after Conserved Sequence Block rather than the upstream and repeated in a simply way. The phylogenetic utility was tested with Bayesian and Maximum Likelihood methods using all 13 protein-coding genes. The results strongly supported that the subfamily Amphiprioninae was monophyletic and P. biaculeatus should be assigned to the genus Amphiprion. Premnas biaculeatus with the percula complex were revealed to be the ancient anemonefish species. The tree forms of ND1, COIII, ND4, Cytb, Cytb+12S rRNA, Cytb+COI and Cytb+COI+12S rRNA were similar to that 13 protein-coding genes, therefore, we suggested that the suitable single mitochondrial gene for phylogenetic analysis of anemonefishes maybe Cytb. Additional mitogenomes of anemonefishes with a combination of nuclear markers will be useful to
Phylogenetic reconstruction methods: an overview.
De Bruyn, Alexandre; Martin, Darren P; Lefeuvre, Pierre
2014-01-01
Initially designed to infer evolutionary relationships based on morphological and physiological characters, phylogenetic reconstruction methods have greatly benefited from recent developments in molecular biology and sequencing technologies with a number of powerful methods having been developed specifically to infer phylogenies from macromolecular data. This chapter, while presenting an overview of basic concepts and methods used in phylogenetic reconstruction, is primarily intended as a simplified step-by-step guide to the construction of phylogenetic trees from nucleotide sequences using fairly up-to-date maximum likelihood methods implemented in freely available computer programs. While the analysis of chloroplast sequences from various Vanilla species is used as an illustrative example, the techniques covered here are relevant to the comparative analysis of homologous sequences datasets sampled from any group of organisms.
Mavrodiev, Evgeny V; Laktionov, Alexy P; Cellinese, Nico
2012-01-01
The evolution of the diverse flora in the Lower Volga Valley (LVV) (southwest Russia) is complex due to the composite geomorphology and tectonic history of the Caspian Sea and adjacent areas. In the absence of phylogenetic studies and temporal information, we implemented a maximum likelihood (ML) approach and stochastic character mapping reconstruction aiming at recovering historical signals from species occurrence data. A taxon-area matrix of 13 floristic areas and 1018 extant species was constructed and analyzed with RAxML and Mesquite. Additionally, we simulated scenarios with numbers of hypothetical extinct taxa from an unknown palaeoflora that occupied the areas before the dramatic transgression and regression events that have occurred from the Pleistocene to the present day. The flora occurring strictly along the river valley and delta appear to be younger than that of adjacent steppes and desert-like regions, regardless of the chronology of transgression and regression events that led to the geomorphological formation of the LVV. This result is also supported when hypothetical extinct taxa are included in the analyses. The history of each species was inferred by using a stochastic character mapping reconstruction method as implemented in Mesquite. Individual histories appear to be independent from one another and have been shaped by repeated dispersal and extinction events. These reconstructions provide testable hypotheses for more in-depth investigations of their population structure and dynamics. PMID:22957179
Ning, Jing; Chen, Yong; Piao, Jin
2017-07-01
Publication bias occurs when the published research results are systematically unrepresentative of the population of studies that have been conducted, and is a potential threat to meaningful meta-analysis. The Copas selection model provides a flexible framework for correcting estimates and offers considerable insight into the publication bias. However, maximizing the observed likelihood under the Copas selection model is challenging because the observed data contain very little information on the latent variable. In this article, we study a Copas-like selection model and propose an expectation-maximization (EM) algorithm for estimation based on the full likelihood. Empirical simulation studies show that the EM algorithm and its associated inferential procedure performs well and avoids the non-convergence problem when maximizing the observed likelihood. © The Author 2017. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.
Galili, Tal; Meilijson, Isaac
2016-01-02
The Rao-Blackwell theorem offers a procedure for converting a crude unbiased estimator of a parameter θ into a "better" one, in fact unique and optimal if the improvement is based on a minimal sufficient statistic that is complete. In contrast, behind every minimal sufficient statistic that is not complete, there is an improvable Rao-Blackwell improvement. This is illustrated via a simple example based on the uniform distribution, in which a rather natural Rao-Blackwell improvement is uniformly improvable. Furthermore, in this example the maximum likelihood estimator is inefficient, and an unbiased generalized Bayes estimator performs exceptionally well. Counterexamples of this sort can be useful didactic tools for explaining the true nature of a methodology and possible consequences when some of the assumptions are violated. [Received December 2014. Revised September 2015.].
Castrillon, Julio
2015-11-10
We develop a multi-level restricted Gaussian maximum likelihood method for estimating the covariance function parameters and computing the best unbiased predictor. Our approach produces a new set of multi-level contrasts where the deterministic parameters of the model are filtered out thus enabling the estimation of the covariance parameters to be decoupled from the deterministic component. Moreover, the multi-level covariance matrix of the contrasts exhibit fast decay that is dependent on the smoothness of the covariance function. Due to the fast decay of the multi-level covariance matrix coefficients only a small set is computed with a level dependent criterion. We demonstrate our approach on problems of up to 512,000 observations with a Matérn covariance function and highly irregular placements of the observations. In addition, these problems are numerically unstable and hard to solve with traditional methods.
International Nuclear Information System (INIS)
He, Yi; Scheraga, Harold A.; Liwo, Adam
2015-01-01
Coarse-grained models are useful tools to investigate the structural and thermodynamic properties of biomolecules. They are obtained by merging several atoms into one interaction site. Such simplified models try to capture as much as possible information of the original biomolecular system in all-atom representation but the resulting parameters of these coarse-grained force fields still need further optimization. In this paper, a force field optimization method, which is based on maximum-likelihood fitting of the simulated to the experimental conformational ensembles and least-squares fitting of the simulated to the experimental heat-capacity curves, is applied to optimize the Nucleic Acid united-RESidue 2-point (NARES-2P) model for coarse-grained simulations of nucleic acids recently developed in our laboratory. The optimized NARES-2P force field reproduces the structural and thermodynamic data of small DNA molecules much better than the original force field
Directory of Open Access Journals (Sweden)
Min Chen
Full Text Available To investigate the linkage of HIV transmission from a man to a woman through unprotected sexual contact without disclosing his HIV-positive status.Combined with epidemiological information and serological tests, phylogenetic analysis was used to test the a priori hypothesis of HIV transmission from the man to the woman. Control subjects, infected with HIV through heterosexual intercourse, from the same location were also sampled. Phylogenetic analyses were performed using the consensus gag, pol and env sequences obtained from blood samples of the man, the woman and the local control subjects. The env quasispecies of the man, the woman, and two controls were also obtained using single genome amplification and sequencing (SGA/S to explore the paraphyletic relationship by phylogenetic analysis.Epidemiological information and serological tests indicated that the man was infected with HIV-1 earlier than the woman. Phylogenetic analyses of the consensus sequences showed a monophyletic cluster for the man and woman in all three genomic regions. Furthermore, gag sequences of the man and woman shared a unique recombination pattern from subtype B and C, which was different from those of CRF07_BC or CRF08_BC observed in the local samples. These indicated that the viral sequences from the two subjects display a high level of similarity. Further, viral quasispecies from the man exhibited a paraphyletic relationship with those from the woman in the Bayesian and maximum-likelihood (ML phylogenetic trees of the env region, which supported the transmission direction from the man to the woman.In the context of epidemiological and serological evidence, the results of phylogenetic analyses support the transmission from the man to the woman.
Li, Xiao-xu; Liu, Cheng; Li, Wei; Zhang, Zeng-lin; Gao, Xiao-ming; Zhou, Hui; Guo, Yong-feng
2016-05-01
Members of the plant-specific WOX transcription factor family have been reported to play important roles in cell to cell communication as well as other physiological and developmental processes. In this study, ten members of the WOX transcription factor family were identified in Solanum lycopersicum with HMMER. Neighbor-joining phylogenetic tree, maximum-likelihood tree and Bayesian-inference tree were constructed and similar topologies were shown using the protein sequences of the homeodomain. Phylogenetic study revealed that the 25 WOX family members from Arabidopsis and tomato fall into three clades and nine subfamilies. The patterns of exon-intron structures and organization of conserved domains in Arabidopsis and tomato were consistent based on the phylogenetic results. Transcriptome analysis showed that the expression patterns of SlWOXs were different in different tissue types. Gene Ontology (GO) analysis suggested that, as transcription factors, the SlWOX family members could be involved in a number of biological processes including cell to cell communication and tissue development. Our results are useful for future studies on WOX family members in tomato and other plant species.
Hamaker, Ellen L.; Dolan, Conor V.; Molenaar, Peter C. M.
2003-01-01
Demonstrated, through simulation, that stationary autoregressive moving average (ARMA) models may be fitted readily when T>N, using normal theory raw maximum likelihood structural equation modeling. Also provides some illustrations based on real data. (SLD)
Phylogenetic and biogeographic analysis of sphaerexochine trilobites.
Directory of Open Access Journals (Sweden)
Curtis R Congreve
Full Text Available BACKGROUND: Sphaerexochinae is a speciose and widely distributed group of cheirurid trilobites. Their temporal range extends from the earliest Ordovician through the Silurian, and they survived the end Ordovician mass extinction event (the second largest mass extinction in Earth history. Prior to this study, the individual evolutionary relationships within the group had yet to be determined utilizing rigorous phylogenetic methods. Understanding these evolutionary relationships is important for producing a stable classification of the group, and will be useful in elucidating the effects the end Ordovician mass extinction had on the evolutionary and biogeographic history of the group. METHODOLOGY/PRINCIPAL FINDINGS: Cladistic parsimony analysis of cheirurid trilobites assigned to the subfamily Sphaerexochinae was conducted to evaluate phylogenetic patterns and produce a hypothesis of relationship for the group. This study utilized the program TNT, and the analysis included thirty-one taxa and thirty-nine characters. The results of this analysis were then used in a Lieberman-modified Brooks Parsimony Analysis to analyze biogeographic patterns during the Ordovician-Silurian. CONCLUSIONS/SIGNIFICANCE: The genus Sphaerexochus was found to be monophyletic, consisting of two smaller clades (one composed entirely of Ordovician species and another composed of Silurian and Ordovician species. By contrast, the genus Kawina was found to be paraphyletic. It is a basal grade that also contains taxa formerly assigned to Cydonocephalus. Phylogenetic patterns suggest Sphaerexochinae is a relatively distinctive trilobite clade because it appears to have been largely unaffected by the end Ordovician mass extinction. Finally, the biogeographic analysis yields two major conclusions about Sphaerexochus biogeography: Bohemia and Avalonia were close enough during the Silurian to exchange taxa; and during the Ordovician there was dispersal between Eastern Laurentia and
Sequence comparison and phylogenetic analysis of core gene of ...
African Journals Online (AJOL)
Phylogenetic analysis suggests that our sequences are clustered with sequences reported from Japan. This is the first phylogenetic analysis of HCV core gene from Pakistani population. Our sequences and sequences from Japan are grouped into same cluster in the phylogenetic tree. Sequence comparison and ...
Analysis of Acorus calamus chloroplast genome and its phylogenetic implications.
Goremykin, Vadim V; Holland, Barbara; Hirsch-Ernst, Karen I; Hellwig, Frank H
2005-09-01
Determining the phylogenetic relationships among the major lines of angiosperms is a long-standing problem, yet the uncertainty as to the phylogenetic affinity of these lines persists. While a number of studies have suggested that the ANITA (Amborella-Nymphaeales-Illiciales-Trimeniales-Aristolochiales) grade is basal within angiosperms, studies of complete chloroplast genome sequences also suggested an alternative tree, wherein the line leading to the grasses branches first among the angiosperms. To improve taxon sampling in the existing chloroplast genome data, we sequenced the chloroplast genome of the monocot Acorus calamus. We generated a concatenated alignment (89,436 positions for 15 taxa), encompassing almost all sequences usable for phylogeny reconstruction within spermatophytes. The data still contain support for both the ANITA-basal and grasses-basal hypotheses. Using simulations we can show that were the ANITA-basal hypothesis true, parsimony (and distance-based methods with many models) would be expected to fail to recover it. The self-evident explanation for this failure appears to be a long-branch attraction (LBA) between the clade of grasses and the out-group. However, this LBA cannot explain the discrepancies observed between tree topology recovered using the maximum likelihood (ML) method and the topologies recovered using the parsimony and distance-based methods when grasses are deleted. Furthermore, the fact that neither maximum parsimony nor distance methods consistently recover the ML tree, when according to the simulations they would be expected to, when the out-group (Pinus) is deleted, suggests that either the generating tree is not correct or the best symmetric model is misspecified (or both). We demonstrate that the tree recovered under ML is extremely sensitive to model specification and that the best symmetric model is misspecified. Hence, we remain agnostic regarding phylogenetic relationships among basal angiosperm lineages.
Directory of Open Access Journals (Sweden)
Salces Judit
2011-08-01
Full Text Available Abstract Background Reference genes with stable expression are required to normalize expression differences of target genes in qPCR experiments. Several procedures and companion software have been proposed to find the most stable genes. Model based procedures are attractive because they provide a solid statistical framework. NormFinder, a widely used software, uses a model based method. The pairwise comparison procedure implemented in GeNorm is a simpler procedure but one of the most extensively used. In the present work a statistical approach based in Maximum Likelihood estimation under mixed models was tested and compared with NormFinder and geNorm softwares. Sixteen candidate genes were tested in whole blood samples from control and heat stressed sheep. Results A model including gene and treatment as fixed effects, sample (animal, gene by treatment, gene by sample and treatment by sample interactions as random effects with heteroskedastic residual variance in gene by treatment levels was selected using goodness of fit and predictive ability criteria among a variety of models. Mean Square Error obtained under the selected model was used as indicator of gene expression stability. Genes top and bottom ranked by the three approaches were similar; however, notable differences for the best pair of genes selected for each method and the remaining genes of the rankings were shown. Differences among the expression values of normalized targets for each statistical approach were also found. Conclusions Optimal statistical properties of Maximum Likelihood estimation joined to mixed model flexibility allow for more accurate estimation of expression stability of genes under many different situations. Accurate selection of reference genes has a direct impact over the normalized expression values of a given target gene. This may be critical when the aim of the study is to compare expression rate differences among samples under different environmental
International Nuclear Information System (INIS)
Veklerov, E.; Llacer, J.; Hoffman, E.J.
1987-10-01
In order to study properties of the Maximum Likelihood Estimator (MLE) algorithm for image reconstruction in Positron Emission Tomographyy (PET), the algorithm is applied to data obtained by the ECAT-III tomograph from a brain phantom. The procedure for subtracting accidental coincidences from the data stream generated by this physical phantom is such that he resultant data are not Poisson distributed. This makes the present investigation different from other investigations based on computer-simulated phantoms. It is shown that the MLE algorithm is robust enough to yield comparatively good images, especially when the phantom is in the periphery of the field of view, even though the underlying assumption of the algorithm is violated. Two transition matrices are utilized. The first uses geometric considerations only. The second is derived by a Monte Carlo simulation which takes into account Compton scattering in the detectors, positron range, etc. in the detectors. It is demonstrated that the images obtained from the Monte Carlo matrix are superior in some specific ways. A stopping rule derived earlier and allowing the user to stop the iterative process before the images begin to deteriorate is tested. Since the rule is based on the Poisson assumption, it does not work well with the presently available data, although it is successful wit computer-simulated Poisson data
Lin, Jen-Jen; Cheng, Jung-Yu; Huang, Li-Fei; Lin, Ying-Hsiu; Wan, Yung-Liang; Tsui, Po-Hsiang
2017-05-01
The Nakagami distribution is an approximation useful to the statistics of ultrasound backscattered signals for tissue characterization. Various estimators may affect the Nakagami parameter in the detection of changes in backscattered statistics. In particular, the moment-based estimator (MBE) and maximum likelihood estimator (MLE) are two primary methods used to estimate the Nakagami parameters of ultrasound signals. This study explored the effects of the MBE and different MLE approximations on Nakagami parameter estimations. Ultrasound backscattered signals of different scatterer number densities were generated using a simulation model, and phantom experiments and measurements of human liver tissues were also conducted to acquire real backscattered echoes. Envelope signals were employed to estimate the Nakagami parameters by using the MBE, first- and second-order approximations of MLE (MLE 1 and MLE 2 , respectively), and Greenwood approximation (MLE gw ) for comparisons. The simulation results demonstrated that, compared with the MBE and MLE 1 , the MLE 2 and MLE gw enabled more stable parameter estimations with small sample sizes. Notably, the required data length of the envelope signal was 3.6 times the pulse length. The phantom and tissue measurement results also showed that the Nakagami parameters estimated using the MLE 2 and MLE gw could simultaneously differentiate various scatterer concentrations with lower standard deviations and reliably reflect physical meanings associated with the backscattered statistics. Therefore, the MLE 2 and MLE gw are suggested as estimators for the development of Nakagami-based methodologies for ultrasound tissue characterization. Copyright © 2017 Elsevier B.V. All rights reserved.
Craciunescu, Teddy; Peluso, Emmanuele; Murari, Andrea; Gelfusa, Michela; JET Contributors
2018-05-01
The total emission of radiation is a crucial quantity to calculate the power balances and to understand the physics of any Tokamak. Bolometric systems are the main tool to measure this important physical quantity through quite sophisticated tomographic inversion methods. On the Joint European Torus, the coverage of the bolometric diagnostic, due to the availability of basically only two projection angles, is quite limited, rendering the inversion a very ill-posed mathematical problem. A new approach, based on the maximum likelihood, has therefore been developed and implemented to alleviate one of the major weaknesses of traditional tomographic techniques: the difficulty to determine routinely the confidence intervals in the results. The method has been validated by numerical simulations with phantoms to assess the quality of the results and to optimise the configuration of the parameters for the main types of emissivity encountered experimentally. The typical levels of statistical errors, which may significantly influence the quality of the reconstructions, have been identified. The systematic tests with phantoms indicate that the errors in the reconstructions are quite limited and their effect on the total radiated power remains well below 10%. A comparison with other approaches to the inversion and to the regularization has also been performed.
Wang, Kezhi
2014-10-01
Bit error rate (BER) and outage probability for amplify-and-forward (AF) relaying systems with two different channel estimation methods, disintegrated channel estimation and cascaded channel estimation, using pilot-aided maximum likelihood method in slowly fading Rayleigh channels are derived. Based on the BERs, the optimal values of pilot power under the total transmitting power constraints at the source and the optimal values of pilot power under the total transmitting power constraints at the relay are obtained, separately. Moreover, the optimal power allocation between the pilot power at the source, the pilot power at the relay, the data power at the source and the data power at the relay are obtained when their total transmitting power is fixed. Numerical results show that the derived BER expressions match with the simulation results. They also show that the proposed systems with optimal power allocation outperform the conventional systems without power allocation under the same other conditions. In some cases, the gain could be as large as several dB\\'s in effective signal-to-noise ratio.
Energy Technology Data Exchange (ETDEWEB)
Raghunathan, Srinivasan; Patil, Sanjaykumar; Bianchini, Federico; Reichardt, Christian L. [School of Physics, University of Melbourne, 313 David Caro building, Swanston St and Tin Alley, Parkville VIC 3010 (Australia); Baxter, Eric J. [Department of Physics and Astronomy, University of Pennsylvania, 209 S. 33rd Street, Philadelphia, PA 19104 (United States); Bleem, Lindsey E. [Argonne National Laboratory, High-Energy Physics Division, 9700 S. Cass Avenue, Argonne, IL 60439 (United States); Crawford, Thomas M. [Kavli Institute for Cosmological Physics, University of Chicago, 5640 South Ellis Avenue, Chicago, IL 60637 (United States); Holder, Gilbert P. [Department of Astronomy and Department of Physics, University of Illinois, 1002 West Green St., Urbana, IL 61801 (United States); Manzotti, Alessandro, E-mail: srinivasan.raghunathan@unimelb.edu.au, E-mail: s.patil2@student.unimelb.edu.au, E-mail: ebax@sas.upenn.edu, E-mail: federico.bianchini@unimelb.edu.au, E-mail: bleeml@uchicago.edu, E-mail: tcrawfor@kicp.uchicago.edu, E-mail: gholder@illinois.edu, E-mail: manzotti@uchicago.edu, E-mail: christian.reichardt@unimelb.edu.au [Department of Astronomy and Astrophysics, University of Chicago, 5640 South Ellis Avenue, Chicago, IL 60637 (United States)
2017-08-01
We develop a Maximum Likelihood estimator (MLE) to measure the masses of galaxy clusters through the impact of gravitational lensing on the temperature and polarization anisotropies of the cosmic microwave background (CMB). We show that, at low noise levels in temperature, this optimal estimator outperforms the standard quadratic estimator by a factor of two. For polarization, we show that the Stokes Q/U maps can be used instead of the traditional E- and B-mode maps without losing information. We test and quantify the bias in the recovered lensing mass for a comprehensive list of potential systematic errors. Using realistic simulations, we examine the cluster mass uncertainties from CMB-cluster lensing as a function of an experiment's beam size and noise level. We predict the cluster mass uncertainties will be 3 - 6% for SPT-3G, AdvACT, and Simons Array experiments with 10,000 clusters and less than 1% for the CMB-S4 experiment with a sample containing 100,000 clusters. The mass constraints from CMB polarization are very sensitive to the experimental beam size and map noise level: for a factor of three reduction in either the beam size or noise level, the lensing signal-to-noise improves by roughly a factor of two.
Lusiana, Evellin Dewi
2017-12-01
The parameters of binary probit regression model are commonly estimated by using Maximum Likelihood Estimation (MLE) method. However, MLE method has limitation if the binary data contains separation. Separation is the condition where there are one or several independent variables that exactly grouped the categories in binary response. It will result the estimators of MLE method become non-convergent, so that they cannot be used in modeling. One of the effort to resolve the separation is using Firths approach instead. This research has two aims. First, to identify the chance of separation occurrence in binary probit regression model between MLE method and Firths approach. Second, to compare the performance of binary probit regression model estimator that obtained by MLE method and Firths approach using RMSE criteria. Those are performed using simulation method and under different sample size. The results showed that the chance of separation occurrence in MLE method for small sample size is higher than Firths approach. On the other hand, for larger sample size, the probability decreased and relatively identic between MLE method and Firths approach. Meanwhile, Firths estimators have smaller RMSE than MLEs especially for smaller sample sizes. But for larger sample sizes, the RMSEs are not much different. It means that Firths estimators outperformed MLE estimator.
International Nuclear Information System (INIS)
Llacer, J.; Veklerov, E.; Nolan, D.; Grafton, S.T.; Mazziotta, J.C.; Hawkins, R.A.; Hoh, C.K.; Hoffman, E.J.
1990-10-01
This paper will report on the progress to date in carrying out Receiver Operating Characteristics (ROC) studies comparing Maximum Likelihood Estimator (MLE) and Filtered Backprojection (FBP) reconstructions of normal and abnormal human brain PET data in a clinical setting. A previous statistical study of reconstructions of the Hoffman brain phantom with real data indicated that the pixel-to-pixel standard deviation in feasible MLE images is approximately proportional to the square root of the number of counts in a region, as opposed to a standard deviation which is high and largely independent of the number of counts in FBP. A preliminary ROC study carried out with 10 non-medical observers performing a relatively simple detectability task indicates that, for the majority of observers, lower standard deviation translates itself into a statistically significant detectability advantage in MLE reconstructions. The initial results of ongoing tests with four experienced neurologists/nuclear medicine physicians are presented. Normal cases of 18 F -- fluorodeoxyglucose (FDG) cerebral metabolism studies and abnormal cases in which a variety of lesions have been introduced into normal data sets have been evaluated. We report on the results of reading the reconstructions of 90 data sets, each corresponding to a single brain slice. It has become apparent that the design of the study based on reading single brain slices is too insensitive and we propose a variation based on reading three consecutive slices at a time, rating only the center slice. 9 refs., 2 figs., 1 tab
Wang, Kezhi; Chen, Yunfei; Alouini, Mohamed-Slim; Xu, Feng
2014-01-01
Bit error rate (BER) and outage probability for amplify-and-forward (AF) relaying systems with two different channel estimation methods, disintegrated channel estimation and cascaded channel estimation, using pilot-aided maximum likelihood method in slowly fading Rayleigh channels are derived. Based on the BERs, the optimal values of pilot power under the total transmitting power constraints at the source and the optimal values of pilot power under the total transmitting power constraints at the relay are obtained, separately. Moreover, the optimal power allocation between the pilot power at the source, the pilot power at the relay, the data power at the source and the data power at the relay are obtained when their total transmitting power is fixed. Numerical results show that the derived BER expressions match with the simulation results. They also show that the proposed systems with optimal power allocation outperform the conventional systems without power allocation under the same other conditions. In some cases, the gain could be as large as several dB's in effective signal-to-noise ratio.
Subbotin, S A; Vierstraete, A; De Ley, P; Rowe, J; Waeyenberge, L; Moens, M; Vanfleteren, J R
2001-10-01
The ITS1, ITS2, and 5.8S gene sequences of nuclear ribosomal DNA from 40 taxa of the family Heteroderidae (including the genera Afenestrata, Cactodera, Heterodera, Globodera, Punctodera, Meloidodera, Cryphodera, and Thecavermiculatus) were sequenced and analyzed. The ITS regions displayed high levels of sequence divergence within Heteroderinae and compared to outgroup taxa. Unlike recent findings in root knot nematodes, ITS sequence polymorphism does not appear to complicate phylogenetic analysis of cyst nematodes. Phylogenetic analyses with maximum-parsimony, minimum-evolution, and maximum-likelihood methods were performed with a range of computer alignments, including elision and culled alignments. All multiple alignments and phylogenetic methods yielded similar basic structure for phylogenetic relationships of Heteroderidae. The cyst-forming nematodes are represented by six main clades corresponding to morphological characters and host specialization, with certain clades assuming different positions depending on alignment procedure and/or method of phylogenetic inference. Hypotheses of monophyly of Punctoderinae and Heteroderinae are, respectively, strongly and moderately supported by the ITS data across most alignments. Close relationships were revealed between the Avenae and the Sacchari groups and between the Humuli group and the species H. salixophila within Heteroderinae. The Goettingiana group occupies a basal position within this subfamily. The validity of the genera Afenestrata and Bidera was tested and is discussed based on molecular data. We conclude that ITS sequence data are appropriate for studies of relationships within the different species groups and less so for recovery of more ancient speciations within Heteroderidae. Copyright 2001 Academic Press.
Murray, Aja Louise; Booth, Tom; Eisner, Manuel; Obsuth, Ingrid; Ribeaud, Denis
2018-05-22
Whether or not importance should be placed on an all-encompassing general factor of psychopathology (or p factor) in classifying, researching, diagnosing, and treating psychiatric disorders depends (among other issues) on the extent to which comorbidity is symptom-general rather than staying largely within the confines of narrower transdiagnostic factors such as internalizing and externalizing. In this study, we compared three methods of estimating p factor strength. We compared omega hierarchical and explained common variance calculated from confirmatory factor analysis (CFA) bifactor models with maximum likelihood (ML) estimation, from exploratory structural equation modeling/exploratory factor analysis models with a bifactor rotation, and from Bayesian structural equation modeling (BSEM) bifactor models. Our simulation results suggested that BSEM with small variance priors on secondary loadings might be the preferred option. However, CFA with ML also performed well provided secondary loadings were modeled. We provide two empirical examples of applying the three methodologies using a normative sample of youth (z-proso, n = 1,286) and a university counseling sample (n = 359).
International Nuclear Information System (INIS)
Arab, M.N.; Ayaz, M.
2004-01-01
The performance of transmission line insulator is greatly affected by dust, fumes from industrial areas and saline deposit near the coast. Such pollutants in the presence of moisture form a coating on the surface of the insulator, which in turn allows the passage of leakage current. This leakage builds up to a point where flashover develops. The flashover is often followed by permanent failure of insulation resulting in prolong outages. With the increase in system voltage owing to the greater demand of electrical energy over the past few decades, the importance of flashover due to pollution has received special attention. The objective of the present work was to study the performance of overhead line insulators in the presence of contaminants such as induced salts. A detailed review of the literature and the mechanisms of insulator flashover due to the pollution are presented. Experimental investigations on the behavior of overhead line insulators under industrial salt contamination are carried out. A special fog chamber was designed in which the contamination testing of insulators was carried out. Flashover behavior under various degrees of contamination of insulators with the most common industrial fume components such as Nitrate and Sulphate compounds was studied. Substituting the normal distribution parameter in the probability distribution function based on maximum likelihood develops a statistical method. The method gives a high accuracy in the estimation of the 50% flashover voltage, which is then used to evaluate the critical flashover index at various contamination levels. The critical flashover index is a valuable parameter in insulation design for numerous applications. (author)
Energy Technology Data Exchange (ETDEWEB)
Hogden, J.
1996-11-05
The goal of the proposed research is to test a statistical model of speech recognition that incorporates the knowledge that speech is produced by relatively slow motions of the tongue, lips, and other speech articulators. This model is called Maximum Likelihood Continuity Mapping (Malcom). Many speech researchers believe that by using constraints imposed by articulator motions, we can improve or replace the current hidden Markov model based speech recognition algorithms. Unfortunately, previous efforts to incorporate information about articulation into speech recognition algorithms have suffered because (1) slight inaccuracies in our knowledge or the formulation of our knowledge about articulation may decrease recognition performance, (2) small changes in the assumptions underlying models of speech production can lead to large changes in the speech derived from the models, and (3) collecting measurements of human articulator positions in sufficient quantity for training a speech recognition algorithm is still impractical. The most interesting (and in fact, unique) quality of Malcom is that, even though Malcom makes use of a mapping between acoustics and articulation, Malcom can be trained to recognize speech using only acoustic data. By learning the mapping between acoustics and articulation using only acoustic data, Malcom avoids the difficulties involved in collecting articulator position measurements and does not require an articulatory synthesizer model to estimate the mapping between vocal tract shapes and speech acoustics. Preliminary experiments that demonstrate that Malcom can learn the mapping between acoustics and articulation are discussed. Potential applications of Malcom aside from speech recognition are also discussed. Finally, specific deliverables resulting from the proposed research are described.
Directory of Open Access Journals (Sweden)
Kaarina Matilainen
Full Text Available Estimation of variance components by Monte Carlo (MC expectation maximization (EM restricted maximum likelihood (REML is computationally efficient for large data sets and complex linear mixed effects models. However, efficiency may be lost due to the need for a large number of iterations of the EM algorithm. To decrease the computing time we explored the use of faster converging Newton-type algorithms within MC REML implementations. The implemented algorithms were: MC Newton-Raphson (NR, where the information matrix was generated via sampling; MC average information(AI, where the information was computed as an average of observed and expected information; and MC Broyden's method, where the zero of the gradient was searched using a quasi-Newton-type algorithm. Performance of these algorithms was evaluated using simulated data. The final estimates were in good agreement with corresponding analytical ones. MC NR REML and MC AI REML enhanced convergence compared to MC EM REML and gave standard errors for the estimates as a by-product. MC NR REML required a larger number of MC samples, while each MC AI REML iteration demanded extra solving of mixed model equations by the number of parameters to be estimated. MC Broyden's method required the largest number of MC samples with our small data and did not give standard errors for the parameters directly. We studied the performance of three different convergence criteria for the MC AI REML algorithm. Our results indicate the importance of defining a suitable convergence criterion and critical value in order to obtain an efficient Newton-type method utilizing a MC algorithm. Overall, use of a MC algorithm with Newton-type methods proved feasible and the results encourage testing of these methods with different kinds of large-scale problem settings.
Sivaguru, Mayandi; Kabir, Mohammad M.; Gartia, Manas Ranjan; Biggs, David S. C.; Sivaguru, Barghav S.; Sivaguru, Vignesh A.; Berent, Zachary T.; Wagoner Johnson, Amy J.; Fried, Glenn A.; Liu, Gang Logan; Sadayappan, Sakthivel; Toussaint, Kimani C.
2017-02-01
Second-harmonic generation (SHG) microscopy is a label-free imaging technique to study collagenous materials in extracellular matrix environment with high resolution and contrast. However, like many other microscopy techniques, the actual spatial resolution achievable by SHG microscopy is reduced by out-of-focus blur and optical aberrations that degrade particularly the amplitude of the detectable higher spatial frequencies. Being a two-photon scattering process, it is challenging to define a point spread function (PSF) for the SHG imaging modality. As a result, in comparison with other two-photon imaging systems like two-photon fluorescence, it is difficult to apply any PSF-engineering techniques to enhance the experimental spatial resolution closer to the diffraction limit. Here, we present a method to improve the spatial resolution in SHG microscopy using an advanced maximum likelihood estimation (AdvMLE) algorithm to recover the otherwise degraded higher spatial frequencies in an SHG image. Through adaptation and iteration, the AdvMLE algorithm calculates an improved PSF for an SHG image and enhances the spatial resolution by decreasing the full-width-at-halfmaximum (FWHM) by 20%. Similar results are consistently observed for biological tissues with varying SHG sources, such as gold nanoparticles and collagen in porcine feet tendons. By obtaining an experimental transverse spatial resolution of 400 nm, we show that the AdvMLE algorithm brings the practical spatial resolution closer to the theoretical diffraction limit. Our approach is suitable for adaptation in micro-nano CT and MRI imaging, which has the potential to impact diagnosis and treatment of human diseases.
Wang, Chaolong; Schroeder, Kari B.; Rosenberg, Noah A.
2012-01-01
Allelic dropout is a commonly observed source of missing data in microsatellite genotypes, in which one or both allelic copies at a locus fail to be amplified by the polymerase chain reaction. Especially for samples with poor DNA quality, this problem causes a downward bias in estimates of observed heterozygosity and an upward bias in estimates of inbreeding, owing to mistaken classifications of heterozygotes as homozygotes when one of the two copies drops out. One general approach for avoiding allelic dropout involves repeated genotyping of homozygous loci to minimize the effects of experimental error. Existing computational alternatives often require replicate genotyping as well. These approaches, however, are costly and are suitable only when enough DNA is available for repeated genotyping. In this study, we propose a maximum-likelihood approach together with an expectation-maximization algorithm to jointly estimate allelic dropout rates and allele frequencies when only one set of nonreplicated genotypes is available. Our method considers estimates of allelic dropout caused by both sample-specific factors and locus-specific factors, and it allows for deviation from Hardy–Weinberg equilibrium owing to inbreeding. Using the estimated parameters, we correct the bias in the estimation of observed heterozygosity through the use of multiple imputations of alleles in cases where dropout might have occurred. With simulated data, we show that our method can (1) effectively reproduce patterns of missing data and heterozygosity observed in real data; (2) correctly estimate model parameters, including sample-specific dropout rates, locus-specific dropout rates, and the inbreeding coefficient; and (3) successfully correct the downward bias in estimating the observed heterozygosity. We find that our method is fairly robust to violations of model assumptions caused by population structure and by genotyping errors from sources other than allelic dropout. Because the data sets
Papaconstadopoulos, P; Levesque, I R; Maglieri, R; Seuntjens, J
2016-02-07
Direct determination of the source intensity distribution of clinical linear accelerators is still a challenging problem for small field beam modeling. Current techniques most often involve special equipment and are difficult to implement in the clinic. In this work we present a maximum-likelihood expectation-maximization (MLEM) approach to the source reconstruction problem utilizing small fields and a simple experimental set-up. The MLEM algorithm iteratively ray-traces photons from the source plane to the exit plane and extracts corrections based on photon fluence profile measurements. The photon fluence profiles were determined by dose profile film measurements in air using a high density thin foil as build-up material and an appropriate point spread function (PSF). The effect of other beam parameters and scatter sources was minimized by using the smallest field size ([Formula: see text] cm(2)). The source occlusion effect was reproduced by estimating the position of the collimating jaws during this process. The method was first benchmarked against simulations for a range of typical accelerator source sizes. The sources were reconstructed with an accuracy better than 0.12 mm in the full width at half maximum (FWHM) to the respective electron sources incident on the target. The estimated jaw positions agreed within 0.2 mm with the expected values. The reconstruction technique was also tested against measurements on a Varian Novalis Tx linear accelerator and compared to a previously commissioned Monte Carlo model. The reconstructed FWHM of the source agreed within 0.03 mm and 0.11 mm to the commissioned electron source in the crossplane and inplane orientations respectively. The impact of the jaw positioning, experimental and PSF uncertainties on the reconstructed source distribution was evaluated with the former presenting the dominant effect.
Walker, H. F.
1976-01-01
Likelihood equations determined by the two types of samples which are necessary conditions for a maximum-likelihood estimate are considered. These equations, suggest certain successive-approximations iterative procedures for obtaining maximum-likelihood estimates. These are generalized steepest ascent (deflected gradient) procedures. It is shown that, with probability 1 as N sub 0 approaches infinity (regardless of the relative sizes of N sub 0 and N sub 1, i=1,...,m), these procedures converge locally to the strongly consistent maximum-likelihood estimates whenever the step size is between 0 and 2. Furthermore, the value of the step size which yields optimal local convergence rates is bounded from below by a number which always lies between 1 and 2.
Zhou, Tai-Cheng; Sha, Tao; Irwin, David M; Zhang, Ya-Ping
2015-01-01
Pavo cristatus, known as the Indian peafowl, is endemic to India and Sri Lanka and has been domesticated for its ornamental and food value. However, its phylogenetic status is still debated. Here, to clarify the phylogenetic status of P. cristatus within Phasianidae, we analyzed its mitochondrial genome (mtDNA). The complete mitochondrial DNA (mtDNA) genome was determined using 34 pairs of primers. Our data show that the mtDNA genome of P. cristatus is 16,686 bp in length. Molecular phylogenetic analyses of P. cristatus was performed along with 22 complete mtDNA genomes belonging to other species in Phasianidae using Bayesian and maximum likelihood methods, where Aythya americana and Anas platyrhynchos were used as outgroups. Our results show that P. critatus has its closest genetic affinity with Pavo muticus and belongs to clade that contains Gallus, Bambusicola and Francolinus.
Chen, Qingxia; Ibrahim, Joseph G
2014-07-01
Multiple Imputation, Maximum Likelihood and Fully Bayesian methods are the three most commonly used model-based approaches in missing data problems. Although it is easy to show that when the responses are missing at random (MAR), the complete case analysis is unbiased and efficient, the aforementioned methods are still commonly used in practice for this setting. To examine the performance of and relationships between these three methods in this setting, we derive and investigate small sample and asymptotic expressions of the estimates and standard errors, and fully examine how these estimates are related for the three approaches in the linear regression model when the responses are MAR. We show that when the responses are MAR in the linear model, the estimates of the regression coefficients using these three methods are asymptotically equivalent to the complete case estimates under general conditions. One simulation and a real data set from a liver cancer clinical trial are given to compare the properties of these methods when the responses are MAR.
Phylogenetic analysis of fungal ABC transporters.
Kovalchuk, Andriy; Driessen, Arnold J M
2010-03-16
The superfamily of ABC proteins is among the largest known in nature. Its members are mainly, but not exclusively, involved in the transport of a broad range of substrates across biological membranes. Many contribute to multidrug resistance in microbial pathogens and cancer cells. The diversity of ABC proteins in fungi is comparable with those in multicellular animals, but so far fungal ABC proteins have barely been studied. We performed a phylogenetic analysis of the ABC proteins extracted from the genomes of 27 fungal species from 18 orders representing 5 fungal phyla thereby covering the most important groups. Our analysis demonstrated that some of the subfamilies of ABC proteins remained highly conserved in fungi, while others have undergone a remarkable group-specific diversification. Members of the various fungal phyla also differed significantly in the number of ABC proteins found in their genomes, which is especially reduced in the yeast S. cerevisiae and S. pombe. Data obtained during our analysis should contribute to a better understanding of the diversity of the fungal ABC proteins and provide important clues about their possible biological functions.
Helaers, Raphaël; Milinkovitch, Michel C
2010-07-15
The development, in the last decade, of stochastic heuristics implemented in robust application softwares has made large phylogeny inference a key step in most comparative studies involving molecular sequences. Still, the choice of a phylogeny inference software is often dictated by a combination of parameters not related to the raw performance of the implemented algorithm(s) but rather by practical issues such as ergonomics and/or the availability of specific functionalities. Here, we present MetaPIGA v2.0, a robust implementation of several stochastic heuristics for large phylogeny inference (under maximum likelihood), including a Simulated Annealing algorithm, a classical Genetic Algorithm, and the Metapopulation Genetic Algorithm (metaGA) together with complex substitution models, discrete Gamma rate heterogeneity, and the possibility to partition data. MetaPIGA v2.0 also implements the Likelihood Ratio Test, the Akaike Information Criterion, and the Bayesian Information Criterion for automated selection of substitution models that best fit the data. Heuristics and substitution models are highly customizable through manual batch files and command line processing. However, MetaPIGA v2.0 also offers an extensive graphical user interface for parameters setting, generating and running batch files, following run progress, and manipulating result trees. MetaPIGA v2.0 uses standard formats for data sets and trees, is platform independent, runs in 32 and 64-bits systems, and takes advantage of multiprocessor and multicore computers. The metaGA resolves the major problem inherent to classical Genetic Algorithms by maintaining high inter-population variation even under strong intra-population selection. Implementation of the metaGA together with additional stochastic heuristics into a single software will allow rigorous optimization of each heuristic as well as a meaningful comparison of performances among these algorithms. MetaPIGA v2.0 gives access both to high
Directory of Open Access Journals (Sweden)
Milinkovitch Michel C
2010-07-01
Full Text Available Abstract Background The development, in the last decade, of stochastic heuristics implemented in robust application softwares has made large phylogeny inference a key step in most comparative studies involving molecular sequences. Still, the choice of a phylogeny inference software is often dictated by a combination of parameters not related to the raw performance of the implemented algorithm(s but rather by practical issues such as ergonomics and/or the availability of specific functionalities. Results Here, we present MetaPIGA v2.0, a robust implementation of several stochastic heuristics for large phylogeny inference (under maximum likelihood, including a Simulated Annealing algorithm, a classical Genetic Algorithm, and the Metapopulation Genetic Algorithm (metaGA together with complex substitution models, discrete Gamma rate heterogeneity, and the possibility to partition data. MetaPIGA v2.0 also implements the Likelihood Ratio Test, the Akaike Information Criterion, and the Bayesian Information Criterion for automated selection of substitution models that best fit the data. Heuristics and substitution models are highly customizable through manual batch files and command line processing. However, MetaPIGA v2.0 also offers an extensive graphical user interface for parameters setting, generating and running batch files, following run progress, and manipulating result trees. MetaPIGA v2.0 uses standard formats for data sets and trees, is platform independent, runs in 32 and 64-bits systems, and takes advantage of multiprocessor and multicore computers. Conclusions The metaGA resolves the major problem inherent to classical Genetic Algorithms by maintaining high inter-population variation even under strong intra-population selection. Implementation of the metaGA together with additional stochastic heuristics into a single software will allow rigorous optimization of each heuristic as well as a meaningful comparison of performances among these
Comparative phylogenetic analysis of intergenic spacers and small ...
African Journals Online (AJOL)
The phylogenetic analysis of test isolates included assessment of variation in sequences and length of IGS and SSU-rRNA genes with reference to 16 different microsporidian sequences. The results proved that IGS sequences have more variation than SSU-rRNA gene sequences. Analysis of phylogenetic trees reveal that ...
Open Reading Frame Phylogenetic Analysis on the Cloud
Directory of Open Access Journals (Sweden)
Che-Lun Hung
2013-01-01
Full Text Available Phylogenetic analysis has become essential in researching the evolutionary relationships between viruses. These relationships are depicted on phylogenetic trees, in which viruses are grouped based on sequence similarity. Viral evolutionary relationships are identified from open reading frames rather than from complete sequences. Recently, cloud computing has become popular for developing internet-based bioinformatics tools. Biocloud is an efficient, scalable, and robust bioinformatics computing service. In this paper, we propose a cloud-based open reading frame phylogenetic analysis service. The proposed service integrates the Hadoop framework, virtualization technology, and phylogenetic analysis methods to provide a high-availability, large-scale bioservice. In a case study, we analyze the phylogenetic relationships among Norovirus. Evolutionary relationships are elucidated by aligning different open reading frame sequences. The proposed platform correctly identifies the evolutionary relationships between members of Norovirus.
Consequences of recombination on traditional phylogenetic analysis
DEFF Research Database (Denmark)
Schierup, M H; Hein, J
2000-01-01
We investigate the shape of a phylogenetic tree reconstructed from sequences evolving under the coalescent with recombination. The motivation is that evolutionary inferences are often made from phylogenetic trees reconstructed from population data even though recombination may well occur (mt......DNA or viral sequences) or does occur (nuclear sequences). We investigate the size and direction of biases when a single tree is reconstructed ignoring recombination. Standard software (PHYLIP) was used to construct the best phylogenetic tree from sequences simulated under the coalescent with recombination....... With recombination present, the length of terminal branches and the total branch length are larger, and the time to the most recent common ancestor smaller, than for a tree reconstructed from sequences evolving with no recombination. The effects are pronounced even for small levels of recombination that may...
A phylogenetic analysis of rissooidean and cingulopsoidean families (Gastropoda: Caenogastropoda).
Criscione, Francesco; Ponder, Winston Frank
2013-03-01
The Rissooidea is one of the largest and most diverse molluscan superfamilies, with 23 recognized Recent families including marine, freshwater and terrestrial members. The Cingulopsoidea are a group of three marine families previously included within the Rissooidea. A previous molecular analysis including two rissooideans and one cingulopsoidean, indicated the possibility that the Rissooidea is at least diphyletic. We use new molecular data to investigate the polyphyly of Rissooidea and test the monophyly of Cingulopsoidea with a greatly increased taxon set. This study includes the greatest sampling to date with 43 species of 14 families of Rissooidea and all families of Cingulopsoidea. Bayesian and maximum likelihood analyses of 16S and 28S show that there are two major clades encompassing taxa previously included in Rissooidea. These are the Rissooidea s.s. containing Rissoidae and Barleeiidae and the Truncatelloidea containing Anabathridae, Assimineidae, Falsicingulidae, Truncatellidae, Pomatiopsidae, Hydrobiidae s.l., Hydrococcidae, Stenothyridae, Calopiidae, Clenchiellidae, Caecidae, Tornidae, and Iravadiidae. Rissoidae is not monophyletic, with Lironoba grouping with Emblanda (Emblandidae) and Rissoina forming a separate clade with Barleeiidae. Iravadiidae is not monophyletic, with Nozeba being sister to the Tornidae. Tatea, usually included within Hydrobiidae, is distinct from that family and Nodulus, previously included in Anabathridae, groups with the hydrobiids. Copyright © 2012 Elsevier Inc. All rights reserved.
Deville, Yannick; Deville, Alain
2009-01-01
In a previous paper [1], we investigated the Blind Source Separation (BSS) problem, for the nonlinear mixing model that we introduced in that paper. We proposed to solve this problem by using a maximum likelihood (ML) approach. When applying the ML approach to BSS problems, one usually determines the analytical expressions of the derivatives of the log-likelihood with respect to the parameters of the considered mixing model. In the literature, these calculations were mainly considered for lin...
BioMatriX: Sequence analysis, structure visualization, phylogenetics ...
African Journals Online (AJOL)
bmx-biomatrix.blogspot.com) developed for biological science community to augment scientific research regarding genomics, proteomics, phylogenetics and linkage analysis in one platform. BioMatriX offers multi-functional services to perform ...
Phylogenetic analysis of anemone fishes of the Persian Gulf using ...
African Journals Online (AJOL)
STORAGESEVER
2008-06-17
Jun 17, 2008 ... genetic diversity among samples was investigated by phylogenetic analysis. Results show that there is ... more about the living organisms found in this region. Many marine ... Kish (modified from Pous et al., 2004). Table 2.
Phylogenetic Trees From Sequences
Ryvkin, Paul; Wang, Li-San
In this chapter, we review important concepts and approaches for phylogeny reconstruction from sequence data.We first cover some basic definitions and properties of phylogenetics, and briefly explain how scientists model sequence evolution and measure sequence divergence. We then discuss three major approaches for phylogenetic reconstruction: distance-based phylogenetic reconstruction, maximum parsimony, and maximum likelihood. In the third part of the chapter, we review how multiple phylogenies are compared by consensus methods and how to assess confidence using bootstrapping. At the end of the chapter are two sections that list popular software packages and additional reading.
Chicas, Mauricio; Caviedes, Mario; Hammond, Rosemarie; Madriz, Kenneth; Albertazzi, Federico; Villalobos, Heydi; Ramírez, Pilar
2007-06-01
Maize rayado fino virus (MRFV) infects maize and appears to be restricted to, yet widespread in, the Americas. MRFV was previously unreported from Ecuador. Maize plants exhibiting symptoms of MRFV infection were collected at the Santa Catalina experiment station in Quito, Ecuador. RT-PCR reactions were performed on total RNA extracted from the symptomatic leaves using primers specific for the capsid protein (CP) gene and 3' non-translated region of MRFV and first strand cDNA as a template. Nucleotide sequence comparisons to previously sequenced MRFV isolates from other geographic regions revealed 88-91% sequence identity. Phylogenetic trees constructed using Maximum Likelihood, UPGMA, Minimal Evolution, Neighbor Joining, and Maximum Parsimony methods separated the MRFV isolates into four groups. These groups may represent geographic isolation generated by the mountainous chains of the American continent. Analysis of the sequences and the genetic distances among the different isolates suggests that MRFV may have originated in Mexico and/or Guatemala and from there it dispersed to the rest of the Americas.
Jones, Christopher M; Stres, Blaz; Rosenquist, Magnus; Hallin, Sara
2008-09-01
Denitrification is a facultative respiratory pathway in which nitrite (NO2(-)), nitric oxide (NO), and nitrous oxide (N2O) are successively reduced to nitrogen gas (N(2)), effectively closing the nitrogen cycle. The ability to denitrify is widely dispersed among prokaryotes, and this polyphyletic distribution has raised the possibility of horizontal gene transfer (HGT) having a substantial role in the evolution of denitrification. Comparisons of 16S rRNA and denitrification gene phylogenies in recent studies support this possibility; however, these results remain speculative as they are based on visual comparisons of phylogenies from partial sequences. We reanalyzed publicly available nirS, nirK, norB, and nosZ partial sequences using Bayesian and maximum likelihood phylogenetic inference. Concomitant analysis of denitrification genes with 16S rRNA sequences from the same organisms showed substantial differences between the trees, which were supported by examining the posterior probability of monophyletic constraints at different taxonomic levels. Although these differences suggest HGT of denitrification genes, the presence of structural variants for nirK, norB, and nosZ makes it difficult to determine HGT from other evolutionary events. Additional analysis using phylogenetic networks and likelihood ratio tests of phylogenies based on full-length sequences retrieved from genomes also revealed significant differences in tree topologies among denitrification and 16S rRNA gene phylogenies, with the exception of the nosZ gene phylogeny within the data set of the nirK-harboring genomes. However, inspection of codon usage and G + C content plots from complete genomes gave no evidence for recent HGT. Instead, the close proximity of denitrification gene copies in the genomes of several denitrifying bacteria suggests duplication. Although HGT cannot be ruled out as a factor in the evolution of denitrification genes, our analysis suggests that other phenomena, such gene
Meerow, Alan W.; Noblick, Larry; Borrone, James W.; Couvreur, Thomas L. P.; Mauro-Herrera, Margarita; Hahn, William J.; Kuhn, David N.; Nakamura, Kyoko; Oleas, Nora H.; Schnell, Raymond J.
2009-01-01
Background The Cocoseae is one of 13 tribes of Arecaceae subfam. Arecoideae, and contains a number of palms with significant economic importance, including the monotypic and pantropical Cocos nucifera L., the coconut, the origins of which have been one of the “abominable mysteries” of palm systematics for decades. Previous studies with predominantly plastid genes weakly supported American ancestry for the coconut but ambiguous sister relationships. In this paper, we use multiple single copy nuclear loci to address the phylogeny of the Cocoseae subtribe Attaleinae, and resolve the closest extant relative of the coconut. Methodology/Principal Findings We present the results of combined analysis of DNA sequences of seven WRKY transcription factor loci across 72 samples of Arecaceae tribe Cocoseae subtribe Attaleinae, representing all genera classified within the subtribe, and three outgroup taxa with maximum parsimony, maximum likelihood, and Bayesian approaches, producing highly congruent and well-resolved trees that robustly identify the genus Syagrus as sister to Cocos and resolve novel and well-supported relationships among the other genera of the Attaleinae. We also address incongruence among the gene trees with gene tree reconciliation analysis, and assign estimated ages to the nodes of our tree. Conclusions/Significance This study represents the as yet most extensive phylogenetic analyses of Cocoseae subtribe Attaleinae. We present a well-resolved and supported phylogeny of the subtribe that robustly indicates a sister relationship between Cocos and Syagrus. This is not only of biogeographic interest, but will also open fruitful avenues of inquiry regarding evolution of functional genes useful for crop improvement. Establishment of two major clades of American Attaleinae occurred in the Oligocene (ca. 37 MYBP) in Eastern Brazil. The divergence of Cocos from Syagrus is estimated at 35 MYBP. The biogeographic and morphological congruence that we see for
Phylogenetic comparative methods complement discriminant function analysis in ecomorphology.
Barr, W Andrew; Scott, Robert S
2014-04-01
In ecomorphology, Discriminant Function Analysis (DFA) has been used as evidence for the presence of functional links between morphometric variables and ecological categories. Here we conduct simulations of characters containing phylogenetic signal to explore the performance of DFA under a variety of conditions. Characters were simulated using a phylogeny of extant antelope species from known habitats. Characters were modeled with no biomechanical relationship to the habitat category; the only sources of variation were body mass, phylogenetic signal, or random "noise." DFA on the discriminability of habitat categories was performed using subsets of the simulated characters, and Phylogenetic Generalized Least Squares (PGLS) was performed for each character. Analyses were repeated with randomized habitat assignments. When simulated characters lacked phylogenetic signal and/or habitat assignments were random, ecomorphology. Copyright © 2013 Wiley Periodicals, Inc.
Bashalkhanov, S I; Konstantinov, Iu M; Verbitskiĭ, D S; Kobzev, V F
2003-10-01
To reconstruct the systematic relationships of larch Larix sukaczewii, we used the chloroplast trnK intron sequences of L. decidua, L. sukaczewii, L. sibirica, L. czekanovskii, and L. gmelinii. Analysis of phylogenetic trees constructed using the maximum parsimony and maximum likelihood methods showed a clear divergence of the trnK intron sequences between L. sukaczewii and L. sibirica. This divergence reaches intraspecific level, which supports a previously published hypothesis on the taxonomic isolation of L. sukaczewii.
TREEFINDER: a powerful graphical analysis environment for molecular phylogenetics
Directory of Open Access Journals (Sweden)
von Haeseler Arndt
2004-06-01
Full Text Available Abstract Background Most analysis programs for inferring molecular phylogenies are difficult to use, in particular for researchers with little programming experience. Results TREEFINDER is an easy-to-use integrative platform-independent analysis environment for molecular phylogenetics. In this paper the main features of TREEFINDER (version of April 2004 are described. TREEFINDER is written in ANSI C and Java and implements powerful statistical approaches for inferring gene tree and related analyzes. In addition, it provides a user-friendly graphical interface and a phylogenetic programming language. Conclusions TREEFINDER is a versatile framework for analyzing phylogenetic data across different platforms that is suited both for exploratory as well as advanced studies.
BLAST-EXPLORER helps you building datasets for phylogenetic analysis
Directory of Open Access Journals (Sweden)
Claverie Jean-Michel
2010-01-01
Full Text Available Abstract Background The right sampling of homologous sequences for phylogenetic or molecular evolution analyses is a crucial step, the quality of which can have a significant impact on the final interpretation of the study. There is no single way for constructing datasets suitable for phylogenetic analysis, because this task intimately depends on the scientific question we want to address, Moreover, database mining softwares such as BLAST which are routinely used for searching homologous sequences are not specifically optimized for this task. Results To fill this gap, we designed BLAST-Explorer, an original and friendly web-based application that combines a BLAST search with a suite of tools that allows interactive, phylogenetic-oriented exploration of the BLAST results and flexible selection of homologous sequences among the BLAST hits. Once the selection of the BLAST hits is done using BLAST-Explorer, the corresponding sequence can be imported locally for external analysis or passed to the phylogenetic tree reconstruction pipelines available on the Phylogeny.fr platform. Conclusions BLAST-Explorer provides a simple, intuitive and interactive graphical representation of the BLAST results and allows selection and retrieving of the BLAST hit sequences based a wide range of criterions. Although BLAST-Explorer primarily aims at helping the construction of sequence datasets for further phylogenetic study, it can also be used as a standard BLAST server with enriched output. BLAST-Explorer is available at http://www.phylogeny.fr
Nishibori, M; Tsudzuki, M; Hayashi, T; Yamamoto, Y; Yasue, H
2002-01-01
Coturnix chinensis (blue-breasted quail) has been classically grouped in Galliformes Phasianidae Coturnix, based on morphologic features and biochemical evidence. Since the blue-breasted quail has the smallest body size among the species of Galliformes, in addition to a short generation time and an excellent reproductive performance, it is a possible model fowl for breeding and physiological studies of the Coturnix japonica (Japanese quail) and Gallus gallus domesticus (chicken), which are classified in the same family as blue-breasted quail. However, since its phylogenetic position in the family Phasianidae has not been determined conclusively, the sequence of the entire blue-breasted quail mitochondria (mt) genome was obtained to provide genetic information for phylogenetic analysis in the present study. The blue-breasted quail mtDNA was found to be a circular DNA of 16,687 base pairs (bp) with the same genomic structure as the mtDNAs of Japanese quail and chicken, though it is smaller than Japanese quail and chicken mtDNAs by 10 bp and 88 bp, respectively. The sequence identity of all mitochondrial genes, including those for 12S and 16S ribosomal RNAs, between blue-breasted quail and Japanese quail ranged from 84.5% to 93.5%; between blue-breasted quail and chicken, sequence identity ranged from 78.0% to 89.6%. In order to obtain information on the phylogenetic position of blue-breasted quail in Galliformes Phasianidae, the 2,184 bp sequence comprising NADH dehydrogenase subunit 2 and cytochrome b genes available for eight species in Galliformes [Japanese quail, chicken, Gallus varius (green junglefowl), Bambusicola thoracica (Chinese bamboo partridge), Pavo cristatus (Indian peafowl), Perdix perdix (gray partridge), Phasianus colchicus (ring-neck pheasant), and Tympanchus phasianellus (sharp-tailed grouse)] together with that of Aythya americana (redhead) were examined using a maximum likelihood (ML) method. The ML analyses on the first/second codon positions
Etzel, C J; Shete, S; Beasley, T M; Fernandez, J R; Allison, D B; Amos, C I
2003-01-01
Non-normality of the phenotypic distribution can affect power to detect quantitative trait loci in sib pair studies. Previously, we observed that Winsorizing the sib pair phenotypes increased the power of quantitative trait locus (QTL) detection for both Haseman-Elston (HE) least-squares tests [Hum Hered 2002;53:59-67] and maximum likelihood-based variance components (MLVC) analysis [Behav Genet (in press)]. Winsorizing the phenotypes led to a slight increase in type 1 error in H-E tests and a slight decrease in type I error for MLVC analysis. Herein, we considered transforming the sib pair phenotypes using the Box-Cox family of transformations. Data were simulated for normal and non-normal (skewed and kurtic) distributions. Phenotypic values were replaced by Box-Cox transformed values. Twenty thousand replications were performed for three H-E tests of linkage and the likelihood ratio test (LRT), the Wald test and other robust versions based on the MLVC method. We calculated the relative nominal inflation rate as the ratio of observed empirical type 1 error divided by the set alpha level (5, 1 and 0.1% alpha levels). MLVC tests applied to non-normal data had inflated type I errors (rate ratio greater than 1.0), which were controlled best by Box-Cox transformation and to a lesser degree by Winsorizing. For example, for non-transformed, skewed phenotypes (derived from a chi2 distribution with 2 degrees of freedom), the rates of empirical type 1 error with respect to set alpha level=0.01 were 0.80, 4.35 and 7.33 for the original H-E test, LRT and Wald test, respectively. For the same alpha level=0.01, these rates were 1.12, 3.095 and 4.088 after Winsorizing and 0.723, 1.195 and 1.905 after Box-Cox transformation. Winsorizing reduced inflated error rates for the leptokurtic distribution (derived from a Laplace distribution with mean 0 and variance 8). Further, power (adjusted for empirical type 1 error) at the 0.01 alpha level ranged from 4.7 to 17.3% across all tests
Elbouchikhi, Elhoussin; Choqueuse, Vincent; Benbouzid, Mohamed
2016-07-01
Condition monitoring of electric drives is of paramount importance since it contributes to enhance the system reliability and availability. Moreover, the knowledge about the fault mode behavior is extremely important in order to improve system protection and fault-tolerant control. Fault detection and diagnosis in squirrel cage induction machines based on motor current signature analysis (MCSA) has been widely investigated. Several high resolution spectral estimation techniques have been developed and used to detect induction machine abnormal operating conditions. This paper focuses on the application of MCSA for the detection of abnormal mechanical conditions that may lead to induction machines failure. In fact, this paper is devoted to the detection of single-point defects in bearings based on parametric spectral estimation. A multi-dimensional MUSIC (MD MUSIC) algorithm has been developed for bearing faults detection based on bearing faults characteristic frequencies. This method has been used to estimate the fundamental frequency and the fault related frequency. Then, an amplitude estimator of the fault characteristic frequencies has been proposed and fault indicator has been derived for fault severity measurement. The proposed bearing faults detection approach is assessed using simulated stator currents data, issued from a coupled electromagnetic circuits approach for air-gap eccentricity emulating bearing faults. Then, experimental data are used for validation purposes. Copyright © 2016 ISA. Published by Elsevier Ltd. All rights reserved.
Buckley, Thomas R; James, Sam; Allwood, Julia; Bartlam, Scott; Howitt, Robyn; Prada, Diana
2011-01-01
We have constructed the first ever phylogeny for the New Zealand earthworm fauna (Megascolecinae and Acanthodrilinae) including representatives from other major continental regions. Bayesian and maximum likelihood phylogenetic trees were constructed from 427 base pairs from the mitochondrial large subunit (16S) rRNA gene and 661 base pairs from the nuclear large subunit (28S) rRNA gene. Within the Acanthodrilinae we were able to identify a number of well-supported clades that were restricted to continental landmasses. Estimates of nodal support for these major clades were generally high, but relationships among clades were poorly resolved. The phylogenetic analyses revealed several independent lineages in New Zealand, some of which had a comparable phylogenetic depth to monophyletic groups sampled from Madagascar, Africa, North America and Australia. These results are consistent with at least some of these clades having inhabited New Zealand since rifting from Gondwana in the Late Cretaceous. Within the New Zealand Acanthodrilinae, major clades tended to be restricted to specific regions of New Zealand, with the central North Island and Cook Strait representing major biogeographic boundaries. Our field surveys of New Zealand and subsequent identification has also revealed extensive cryptic taxonomic diversity with approximately 48 new species sampled in addition to the 199 species recognized by previous authors. Our results indicate that further survey and taxonomic work is required to establish a foundation for future biogeographic and ecological research on this vitally important component of the New Zealand biota. Copyright © 2010 Elsevier Inc. All rights reserved.
Phylogenetic analysis of West Nile virus, Nuevo Leon State, Mexico.
Blitvich, Bradley J; Fernández-Salas, Ildefonso; Contreras-Cordero, Juan F; Loroño-Pino, María A; Marlenee, Nicole L; Díaz, Francisco J; González-Rojas, José I; Obregón-Martínez, Nelson; Chiu-García, Jorge A; Black, William C; Beaty, Barry J
2004-07-01
West Nile virus RNA was detected in brain tissue from a horse that died in June 2003 in Nuevo Leon State, Mexico. Nucleotide sequencing and phylogenetic analysis of the premembrane and envelope genes showed that the virus was most closely related to West Nile virus isolates collected in Texas in 2002.
PhyloSift: phylogenetic analysis of genomes and metagenomes.
Darling, Aaron E; Jospin, Guillaume; Lowe, Eric; Matsen, Frederick A; Bik, Holly M; Eisen, Jonathan A
2014-01-01
Like all organisms on the planet, environmental microbes are subject to the forces of molecular evolution. Metagenomic sequencing provides a means to access the DNA sequence of uncultured microbes. By combining DNA sequencing of microbial communities with evolutionary modeling and phylogenetic analysis we might obtain new insights into microbiology and also provide a basis for practical tools such as forensic pathogen detection. In this work we present an approach to leverage phylogenetic analysis of metagenomic sequence data to conduct several types of analysis. First, we present a method to conduct phylogeny-driven Bayesian hypothesis tests for the presence of an organism in a sample. Second, we present a means to compare community structure across a collection of many samples and develop direct associations between the abundance of certain organisms and sample metadata. Third, we apply new tools to analyze the phylogenetic diversity of microbial communities and again demonstrate how this can be associated to sample metadata. These analyses are implemented in an open source software pipeline called PhyloSift. As a pipeline, PhyloSift incorporates several other programs including LAST, HMMER, and pplacer to automate phylogenetic analysis of protein coding and RNA sequences in metagenomic datasets generated by modern sequencing platforms (e.g., Illumina, 454).
PhyloSift: phylogenetic analysis of genomes and metagenomes
Directory of Open Access Journals (Sweden)
Aaron E. Darling
2014-01-01
Full Text Available Like all organisms on the planet, environmental microbes are subject to the forces of molecular evolution. Metagenomic sequencing provides a means to access the DNA sequence of uncultured microbes. By combining DNA sequencing of microbial communities with evolutionary modeling and phylogenetic analysis we might obtain new insights into microbiology and also provide a basis for practical tools such as forensic pathogen detection.In this work we present an approach to leverage phylogenetic analysis of metagenomic sequence data to conduct several types of analysis. First, we present a method to conduct phylogeny-driven Bayesian hypothesis tests for the presence of an organism in a sample. Second, we present a means to compare community structure across a collection of many samples and develop direct associations between the abundance of certain organisms and sample metadata. Third, we apply new tools to analyze the phylogenetic diversity of microbial communities and again demonstrate how this can be associated to sample metadata.These analyses are implemented in an open source software pipeline called PhyloSift. As a pipeline, PhyloSift incorporates several other programs including LAST, HMMER, and pplacer to automate phylogenetic analysis of protein coding and RNA sequences in metagenomic datasets generated by modern sequencing platforms (e.g., Illumina, 454.
Phylogenetic diversity analysis of Trichoderma species based on ...
African Journals Online (AJOL)
vi-4177/CSAU be assigned as the type strains of a species of genus Trichoderma based on phylogenetic tree analysis together with the 18S rRNA gene sequence search in Ribosomal Database Project, small subunit rRNA and large subunit ...
Genetic and phylogenetic analysis of ten Gobiidae species in China ...
African Journals Online (AJOL)
To study the genetic and phylogenetic relationship of gobioid fishes in China, the representatives of 10 gobioid fishes from 2 subfamilies in China were examined by amplified fragment length polymorphism (AFLP) analysis. We established 220 AFLP bands for 45 individuals from the 10 species, and the percentage of ...
Characterization and phylogenetic analysis of α-gliadin gene ...
Indian Academy of Sciences (India)
Supplementary data: Characterization and phylogenetic analysis of α-gliadin gene sequences reveals significant genomic divergence in Triticeae species. Guang-Rong Li, Tao Lang, En-Nian Yang, Cheng Liu ... The MITE insertion at the 3 UTR is boxed. Figure 2. The secondary structure of MITE insertion in HM452949.
Phylogenetic analysis of the genus Hordeum using repetitive DNA sequences
DEFF Research Database (Denmark)
Svitashev, S.; Bryngelsson, T.; Vershinin, A.
1994-01-01
A set of six cloned barley (Hordeum vulgare) repetitive DNA sequences was used for the analysis of phylogenetic relationships among 31 species (46 taxa) of the genus Hordeum, using molecular hybridization techniques. In situ hybridization experiments showed dispersed organization of the sequences...
Kyle, H. Lee; Hucek, Richard R.; Groveman, Brian; Frey, Richard
1990-01-01
The archived Earth radiation budget (ERB) products produced from the Nimbus-7 ERB narrow field-of-view scanner are described. The principal products are broadband outgoing longwave radiation (4.5 to 50 microns), reflected solar radiation (0.2 to 4.8 microns), and the net radiation. Daily and monthly averages are presented on a fixed global equal area (500 sq km), grid for the period May 1979 to May 1980. Two independent algorithms are used to estimate the outgoing fluxes from the observed radiances. The algorithms are described and the results compared. The products are divided into three subsets: the Scene Radiance Tapes (SRT) contain the calibrated radiances; the Sorting into Angular Bins (SAB) tape contains the SAB produced shortwave, longwave, and net radiation products; and the Maximum Likelihood Cloud Estimation (MLCE) tapes contain the MLCE products. The tape formats are described in detail.
Staggemeier, Vanessa Graziele; Diniz-Filho, José Alexandre Felizola; Forest, Félix; Lucas, Eve
2015-04-01
Myrcia section Aulomyrcia includes ∼120 species that are endemic to the Neotropics and disjunctly distributed in the moist Amazon and Atlantic coastal forests of Brazil. This paper presents the first comprehensive phylogenetic study of this group and this phylogeny is used as a basis to evaluate recent classification systems and to test alternative hypotheses associated with the history of this clade. Fifty-three taxa were sampled out of the 120 species currently recognized, plus 40 outgroup taxa, for one nuclear marker (ribosomal internal transcribed spacer) and four plastid markers (psbA-trnH, trnL-trnF, trnQ-rpS16 and ndhF). The relationships were reconstructed based on Bayesian and maximum likelihood analyses. Additionally, a likelihood approach, 'geographic state speciation and extinction', was used to estimate region- dependent rates of speciation, extinction and dispersal, comparing historically climatic stable areas (refugia) and unstable areas. Maximum likelihood and Bayesian inferences indicate that Myrcia and Marlierea are polyphyletic, and the internal groupings recovered are characterized by combinations of morphological characters. Phylogenetic relationships support a link between Amazonian and north-eastern species and between north-eastern and south-eastern species. Lower extinction rates within glacial refugia suggest that these areas were important in maintaining diversity in the Atlantic forest biodiversity hotspot. This study provides a robust phylogenetic framework to address important ecological questions for Myrcia s.l. within an evolutionary context, and supports the need to unite taxonomically the two traditional genera Myrcia and Marlierea in an expanded Myrcia s.l. Furthermore, this study offers valuable insights into the diversification of plant species in the highly impacted Atlantic forest of South America; evidence is presented that the lowest extinction rates are found inside refugia and that range expansion from unstable areas
Phylogenetic evidence for cladogenetic polyploidization in land plants.
Zhan, Shing H; Drori, Michal; Goldberg, Emma E; Otto, Sarah P; Mayrose, Itay
2016-07-01
Polyploidization is a common and recurring phenomenon in plants and is often thought to be a mechanism of "instant speciation". Whether polyploidization is associated with the formation of new species (cladogenesis) or simply occurs over time within a lineage (anagenesis), however, has never been assessed systematically. We tested this hypothesis using phylogenetic and karyotypic information from 235 plant genera (mostly angiosperms). We first constructed a large database of combined sequence and chromosome number data sets using an automated procedure. We then applied likelihood models (ClaSSE) that estimate the degree of synchronization between polyploidization and speciation events in maximum likelihood and Bayesian frameworks. Our maximum likelihood analysis indicated that 35 genera supported a model that includes cladogenetic transitions over a model with only anagenetic transitions, whereas three genera supported a model that incorporates anagenetic transitions over one with only cladogenetic transitions. Furthermore, the Bayesian analysis supported a preponderance of cladogenetic change in four genera but did not support a preponderance of anagenetic change in any genus. Overall, these phylogenetic analyses provide the first broad confirmation that polyploidization is temporally associated with speciation events, suggesting that it is indeed a major speciation mechanism in plants, at least in some genera. © 2016 Botanical Society of America.
Bamorovat, Mehdi; Sharifi, Iraj; Mohammadi, Mohammad Ali; Eybpoosh, Sana; Nasibi, Saeid; Aflatoonian, Mohammad Reza; Khosravi, Ahmad
2018-03-01
The precise identification of the parasite species causing leishmaniasis is essential for selecting proper treatment modality. The present study aims to compare the nucleotide variations of the ITS1, 7SL RNA, and Hsp70 sequences between non-healed and healed anthroponotic cutaneous leishmaniasis (ACL) patients in major foci in Iran. A case-control study was carried out from September 2015 to October 2016 in the cities of Kerman and Bam, in the southeast of Iran. Randomly selected skin-scraping lesions of 40 patients (20 non-healed and 20 healed) were examined and the organisms were grown in a culture medium. Promastigotes were collected by centrifugation and kept for further molecular examinations. The extracted DNA was amplified and sequenced. After global sequence alignment with BioEdit software, maximum likelihood phylogenetic analysis was performed in PhyML for typing of Leishmania isolates. Nucleotide composition of each genetic region was also compared between non-healed and healed patients. Our results showed that all isolates belonged to the Leishmania tropica complex, with their genetic composition in the ITS1 region being different among non-healed and healed patients. 7SL RNA and Hsp70 regions were genetically identical between both groups. Variability in nucleotide patterns observed between both groups in the ITS1 region may serve to encourage future research on the function of these polymorphisms and may improve our understanding of the role of parasite genome properties on patients' response to Leishmania treatment. Our results also do not support future use of 7SL RNA and Hsp70 regions of the parasite for comparative genomic analyses. Copyright © 2018 Elsevier Ltd. All rights reserved.
Directory of Open Access Journals (Sweden)
Xin-Nan Jia
Full Text Available In this study, the authors first obtained the mitochondrial genome of Somanniathelphusa boyangensis. The results showed that the mitochondrial genome is 17,032bp in length, included 13 protein-coding genes, 2 rRNAs genes, 22 tRNAs genes and 1 putative control region, and it has the characteristics of the metazoan mitochondrial genome A+T bias. All tRNA genes display the typical clover-leaf secondary structure except tRNASer(AGN, which has lost the dihydroxyuridine arm. The GenBank database contains the mitochondrial genomes of representatives of approximately 22 families of Brachyura, comprising 56 species, including 4 species of freshwater crab. The authors established the phylogenetic relationships using the maximum likelihood and Bayesian inference methods. The phylogenetic relationship indicated that the molecular taxonomy of S. boyangensis is consistent with current morphological classification, and Parathelphusidae and Potamidae are derived within the freshwater clade or as part of it. In addition, the authors used the COX1 sequence of Somanniathelphusa in GenBank and the COX1 sequence of S. boyangensis to estimated the divergence time of this genus. The result displayed that the divergence time of Somanniathelphusa qiongshanensis is consistent with the separation of Hainan Island from mainland China in the Beibu Gulf, and the divergence time for Somanniathelphusa taiwanensis and Somanniathelphusa amoyensis is consistent with the separation of Taiwan Province from Mainland China at Fujian Province. These data indicate that geologic events influenced speciation of the genus Somanniathelphusa.
Phylogenetic and recombination analysis of tomato spotted wilt virus.
Directory of Open Access Journals (Sweden)
Sen Lian
Full Text Available Tomato spotted wilt virus (TSWV severely damages and reduces the yield of many economically important plants worldwide. In this study, we determined the whole-genome sequences of 10 TSWV isolates recently identified from various regions and hosts in Korea. Phylogenetic analysis of these 10 isolates as well as the three previously sequenced isolates indicated that the 13 Korean TSWV isolates could be divided into two groups reflecting either two different origins or divergences of Korean TSWV isolates. In addition, the complete nucleotide sequences for the 13 Korean TSWV isolates along with previously sequenced TSWV RNA segments from Korea and other countries were subjected to phylogenetic and recombination analysis. The phylogenetic analysis indicated that both the RNA L and RNA M segments of most Korean isolates might have originated in Western Europe and North America but that the RNA S segments for all Korean isolates might have originated in China and Japan. Recombination analysis identified a total of 12 recombination events among all isolates and segments and five recombination events among the 13 Korea isolates; among the five recombinants from Korea, three contained the whole RNA L segment, suggesting reassortment rather than recombination. Our analyses provide evidence that both recombination and reassortment have contributed to the molecular diversity of TSWV.
Phylogenetic analysis of hepatitis B virus in pakistan
International Nuclear Information System (INIS)
Baig, S.; Hasnain, N.U.
2008-01-01
To identify the distribution pattern of Hepatitis B Virus (HBV) genotype in a group of patients and to study its phylogenetic divergence. Two hundred and one HBV infected patients were genotyped for this study. All HbsAg positive individuals, either healthy carriers or suffering from conditions such as acute or chronic hepatitis, cirrhosis and hepatocellular carcinoma were included. Hepatitis B patients co-infected with other hepatic viruses were excluded. Hepatitis B virus DNA was extracted from serum, and subjected to a nested PCR, using the primers type-specific for genotype detection. Phylogenetic analysis was performed in the pre-S1 through S genes of HBV. The divergence was studied through 15 sequences of 967bp submitted to the DBJ/EMBL/GenBank databases accessible under accession number EF584640 through EF584654. Out of 201 patients tested, 156 were males and 45 were females. Genotype D was the predominant type found in 128 (64%) patients followed by A in 47 (23%) and mixed A/D in 26 (13%). Phylogenetic analysis confirmed the dominance of genotype D and subtype ayw2. There was dominance of genotype D subtype ayw2. It had a close resemblance with HBV strains that circulate in Iran, India and Japan. (author)
galaxieEST: addressing EST identity through automated phylogenetic analysis.
Nilsson, R Henrik; Rajashekar, Balaji; Larsson, Karl-Henrik; Ursing, Björn M
2004-07-05
Research involving expressed sequence tags (ESTs) is intricately coupled to the existence of large, well-annotated sequence repositories. Comparatively complete and satisfactory annotated public sequence libraries are, however, available only for a limited range of organisms, rendering the absence of sequences and gene structure information a tangible problem for those working with taxa lacking an EST or genome sequencing project. Paralogous genes belonging to the same gene family but distinguished by derived characteristics are particularly prone to misidentification and erroneous annotation; high but incomplete levels of sequence similarity are typically difficult to interpret and have formed the basis of many unsubstantiated assumptions of orthology. In these cases, a phylogenetic study of the query sequence together with the most similar sequences in the database may be of great value to the identification process. In order to facilitate this laborious procedure, a project to employ automated phylogenetic analysis in the identification of ESTs was initiated. galaxieEST is an open source Perl-CGI script package designed to complement traditional similarity-based identification of EST sequences through employment of automated phylogenetic analysis. It uses a series of BLAST runs as a sieve to retrieve nucleotide and protein sequences for inclusion in neighbour joining and parsimony analyses; the output includes the BLAST output, the results of the phylogenetic analyses, and the corresponding multiple alignments. galaxieEST is available as an on-line web service for identification of fungal ESTs and for download / local installation for use with any organism group at http://galaxie.cgb.ki.se/galaxieEST.html. By addressing sequence relatedness in addition to similarity, galaxieEST provides an integrative view on EST origin and identity, which may prove particularly useful in cases where similarity searches return one or more pertinent, but not full, matches and
[A phylogenetic analysis of plant communities of Teberda Biosphere Reserve].
Shulakov, A A; Egorov, A V; Onipchenko, V G
2016-01-01
Phylogenetic analysis of communities is based on the comparison of distances on the phylogenetic tree between species of a community under study and those distances in random samples taken out of local flora. It makes it possible to determine to what extent a community composition is formed by more closely related species (i.e., "clustered") or, on the opposite, it is more even and includes species that are less related with each other. The first case is usually interpreted as a result of strong influence caused by abiotic factors, due to which species with similar ecology, a priori more closely related, would remain: In the second case, biotic factors, such as competition, may come to the fore and lead to forming a community out of distant clades due to divergence of their ecological niches: The aim of this' study Was Ad explore the phylogenetic structure in communities of the northwestern Caucasus at two spatial scales - the scale of area from 4 to 100 m2 and the smaller scale within a community. The list of local flora of the alpine belt has been composed using the database of geobotanic descriptions carried out in Teberda Biosphere Reserve at true altitudes exceeding.1800 m. It includes 585 species of flowering plants belonging to 57 families. Basal groups of flowering plants are.not represented in the list. At the scale of communities of three classes, namely Thlaspietea rotundifolii - commumties formed on screes and pebbles, Calluno-Ulicetea - alpine meadow, and Mulgedio-Aconitetea subalpine meadows, have not demonstrated significant distinction of phylogenetic structure. At intra level, for alpine meadows the larger share of closely related species. (clustered community) is detected. Significantly clustered happen to be those communities developing on rocks (class Asplenietea trichomanis) and alpine (class Juncetea trifidi). At the same time, alpine lichen proved to have even phylogenetic structure at the small scale. Alpine (class Salicetea herbaceae) that
Onuk, A. Emre; Akcakaya, Murat; Bardhan, Jaydeep P.; Erdogmus, Deniz; Brooks, Dana H.; Makowski, Lee
2015-01-01
In this paper, we describe a model for maximum likelihood estimation (MLE) of the relative abundances of different conformations of a protein in a heterogeneous mixture from small angle X-ray scattering (SAXS) intensities. To consider cases where the solution includes intermediate or unknown conformations, we develop a subset selection method based on k-means clustering and the Cramér-Rao bound on the mixture coefficient estimation error to find a sparse basis set that represents the space spanned by the measured SAXS intensities of the known conformations of a protein. Then, using the selected basis set and the assumptions on the model for the intensity measurements, we show that the MLE model can be expressed as a constrained convex optimization problem. Employing the adenylate kinase (ADK) protein and its known conformations as an example, and using Monte Carlo simulations, we demonstrate the performance of the proposed estimation scheme. Here, although we use 45 crystallographically determined experimental structures and we could generate many more using, for instance, molecular dynamics calculations, the clustering technique indicates that the data cannot support the determination of relative abundances for more than 5 conformations. The estimation of this maximum number of conformations is intrinsic to the methodology we have used here. PMID:26924916
Dilla, Shintia Ulfa; Andriyana, Yudhie; Sudartianto
2017-03-01
Acid rain causes many bad effects in life. It is formed by two strong acids, sulfuric acid (H2SO4) and nitric acid (HNO3), where sulfuric acid is derived from SO2 and nitric acid from NOx {x=1,2}. The purpose of the research is to find out the influence of So4 and NO3 levels contained in the rain to the acidity (pH) of rainwater. The data are incomplete panel data with two-way error component model. The panel data is a collection of some of the observations that observed from time to time. It is said incomplete if each individual has a different amount of observation. The model used in this research is in the form of random effects model (REM). Minimum variance quadratic unbiased estimation (MIVQUE) is used to estimate the variance error components, while maximum likelihood estimation is used to estimate the parameters. As a result, we obtain the following model: Ŷ* = 0.41276446 - 0.00107302X1 + 0.00215470X2.
Sedayu, A.; Eurlings, M.C.M.; Gravendeel, B.; Hetterscheid, W.L.A.
2010-01-01
Sequences of three different genes in 69 taxa of Amorphophallus were combined to reconstruct the molecular phylogeny of this species-rich Aroid genus. The data set was analyzed by three different methods, Maximum Parsimony, Maximum Likelihood and Bayesian analysis, producing slightly different tree
Multigene analysis of lophophorate and chaetognath phylogenetic relationships.
Helmkampf, Martin; Bruchhaus, Iris; Hausdorf, Bernhard
2008-01-01
Maximum likelihood and Bayesian inference analyses of seven concatenated fragments of nuclear-encoded housekeeping genes indicate that Lophotrochozoa is monophyletic, i.e., the lophophorate groups Bryozoa, Brachiopoda and Phoronida are more closely related to molluscs and annelids than to Deuterostomia or Ecdysozoa. Lophophorates themselves, however, form a polyphyletic assemblage. The hypotheses that they are monophyletic and more closely allied to Deuterostomia than to Protostomia can be ruled out with both the approximately unbiased test and the expected likelihood weights test. The existence of Phoronozoa, a putative clade including Brachiopoda and Phoronida, has also been rejected. According to our analyses, phoronids instead share a more recent common ancestor with bryozoans than with brachiopods. Platyhelminthes is the sister group of Lophotrochozoa. Together these two constitute Spiralia. Although Chaetognatha appears as the sister group of Priapulida within Ecdysozoa in our analyses, alternative hypothesis concerning chaetognath relationships could not be rejected.
Directory of Open Access Journals (Sweden)
Ronald Herrera
2017-12-01
Full Text Available In a town located in a desert area of Northern Chile, gold and copper open-pit mining is carried out involving explosive processes. These processes are associated with increased dust exposure, which might affect children’s respiratory health. Therefore, we aimed to quantify the causal attributable risk of living close to the mines on asthma or allergic rhinoconjunctivitis risk burden in children. Data on the prevalence of respiratory diseases and potential confounders were available from a cross-sectional survey carried out in 2009 among 288 (response: 69 % children living in the community. The proximity of the children’s home addresses to the local gold and copper mine was calculated using geographical positioning systems. We applied targeted maximum likelihood estimation to obtain the causal attributable risk (CAR for asthma, rhinoconjunctivitis and both outcomes combined. Children living more than the first quartile away from the mines were used as the unexposed group. Based on the estimated CAR, a hypothetical intervention in which all children lived at least one quartile away from the copper mine would decrease the risk of rhinoconjunctivitis by 4.7 percentage points (CAR: − 4.7 ; 95 % confidence interval ( 95 % CI: − 8.4 ; − 0.11 ; and 4.2 percentage points (CAR: − 4.2 ; 95 % CI: − 7.9 ; − 0.05 for both outcomes combined. Overall, our results suggest that a hypothetical intervention intended to increase the distance between the place of residence of the highest exposed children would reduce the prevalence of respiratory disease in the community by around four percentage points. This approach could help local policymakers in the development of efficient public health strategies.
Herrera, Ronald; Berger, Ursula; von Ehrenstein, Ondine S; Díaz, Iván; Huber, Stella; Moraga Muñoz, Daniel; Radon, Katja
2017-12-27
In a town located in a desert area of Northern Chile, gold and copper open-pit mining is carried out involving explosive processes. These processes are associated with increased dust exposure, which might affect children's respiratory health. Therefore, we aimed to quantify the causal attributable risk of living close to the mines on asthma or allergic rhinoconjunctivitis risk burden in children. Data on the prevalence of respiratory diseases and potential confounders were available from a cross-sectional survey carried out in 2009 among 288 (response: 69 % ) children living in the community. The proximity of the children's home addresses to the local gold and copper mine was calculated using geographical positioning systems. We applied targeted maximum likelihood estimation to obtain the causal attributable risk (CAR) for asthma, rhinoconjunctivitis and both outcomes combined. Children living more than the first quartile away from the mines were used as the unexposed group. Based on the estimated CAR, a hypothetical intervention in which all children lived at least one quartile away from the copper mine would decrease the risk of rhinoconjunctivitis by 4.7 percentage points (CAR: - 4.7 ; 95 % confidence interval ( 95 % CI): - 8.4 ; - 0.11 ); and 4.2 percentage points (CAR: - 4.2 ; 95 % CI: - 7.9 ; - 0.05 ) for both outcomes combined. Overall, our results suggest that a hypothetical intervention intended to increase the distance between the place of residence of the highest exposed children would reduce the prevalence of respiratory disease in the community by around four percentage points. This approach could help local policymakers in the development of efficient public health strategies.
Moslemi, Vahid; Ashoor, Mansour
2017-10-01
One of the major problems associated with parallel hole collimators (PCs) is the trade-off between their resolution and sensitivity. To solve this problem, a novel PC - namely, extended parallel hole collimator (EPC) - was proposed, in which particular trapezoidal denticles were increased upon septa on the side of the detector. In this study, an EPC was designed and its performance was compared with that of two PCs, PC35 and PC41, with a hole size of 1.5 mm and hole lengths of 35 and 41 mm, respectively. The Monte Carlo method was used to calculate the important parameters such as resolution, sensitivity, scattering, and penetration ratio. A Jaszczak phantom was also simulated to evaluate the resolution and contrast of tomographic images, which were produced by the EPC6, PC35, and PC41 using the Monte Carlo N-particle version 5 code, and tomographic images were reconstructed by using maximum likelihood expectation maximization algorithm. Sensitivity of the EPC6 was increased by 20.3% in comparison with that of the PC41 at the identical spatial resolution and full-width at tenth of maximum here. Moreover, the penetration and scattering ratio of the EPC6 was 1.2% less than that of the PC41. The simulated phantom images show that the EPC6 increases contrast-resolution and contrast-to-noise ratio compared with those of PC41 and PC35. When compared with PC41 and PC35, EPC6 improved trade-off between resolution and sensitivity, reduced penetrating and scattering ratios, and produced images with higher quality. EPC6 can be used to increase detectability of more details in nuclear medicine images.
Penicillium simile sp. nov. revealed by morphological and phylogenetic analysis.
Davolos, Domenico; Pietrangeli, Biancamaria; Persiani, Anna Maria; Maggi, Oriana
2012-02-01
The morphology of three phenetically identical Penicillium isolates, collected from the bioaerosol in a restoration laboratory in Italy, displayed macro- and microscopic characteristics that were similar though not completely ascribable to Penicillium raistrickii. For this reason, a phylogenetic approach based on DNA sequencing analysis was performed to establish both the taxonomic status and the evolutionary relationships of these three peculiar isolates in relation to previously described species of the genus Penicillium. We used four nuclear loci (both rRNA and protein coding genes) that have previously proved useful for the molecular investigation of taxa belonging to the genus Penicillium at various evolutionary levels. The internal transcribed spacer region (ITS1-5.8S-ITS2), domains D1 and D2 of the 28S rDNA, a region of the tubulin beta chain gene (benA) and part of the calmodulin gene (cmd) were amplified by PCR and sequenced. Analysis of the rRNA genes and of the benA and cmd sequence data indicates the presence of three isogenic isolates belonging to a genetically distinct species of the genus Penicillium, here described and named Penicillium simile sp. nov. (ATCC MYA-4591(T) = CBS 129191(T)). This novel species is phylogenetically different from P. raistrickii and other related species of the genus Penicillium (e.g. Penicillium scabrosum), from which it can be distinguished on the basis of morphological trait analysis.
Detecting Network Communities: An Application to Phylogenetic Analysis
Andrade, Roberto F. S.; Rocha-Neto, Ivan C.; Santos, Leonardo B. L.; de Santana, Charles N.; Diniz, Marcelo V. C.; Lobão, Thierry Petit; Goés-Neto, Aristóteles; Pinho, Suani T. R.; El-Hani, Charbel N.
2011-01-01
This paper proposes a new method to identify communities in generally weighted complex networks and apply it to phylogenetic analysis. In this case, weights correspond to the similarity indexes among protein sequences, which can be used for network construction so that the network structure can be analyzed to recover phylogenetically useful information from its properties. The analyses discussed here are mainly based on the modular character of protein similarity networks, explored through the Newman-Girvan algorithm, with the help of the neighborhood matrix . The most relevant networks are found when the network topology changes abruptly revealing distinct modules related to the sets of organisms to which the proteins belong. Sound biological information can be retrieved by the computational routines used in the network approach, without using biological assumptions other than those incorporated by BLAST. Usually, all the main bacterial phyla and, in some cases, also some bacterial classes corresponded totally (100%) or to a great extent (>70%) to the modules. We checked for internal consistency in the obtained results, and we scored close to 84% of matches for community pertinence when comparisons between the results were performed. To illustrate how to use the network-based method, we employed data for enzymes involved in the chitin metabolic pathway that are present in more than 100 organisms from an original data set containing 1,695 organisms, downloaded from GenBank on May 19, 2007. A preliminary comparison between the outcomes of the network-based method and the results of methods based on Bayesian, distance, likelihood, and parsimony criteria suggests that the former is as reliable as these commonly used methods. We conclude that the network-based method can be used as a powerful tool for retrieving modularity information from weighted networks, which is useful for phylogenetic analysis. PMID:21573202
Phylogenetic analysis of the light-harvesting system in Chromera velia.
Pan, Hao; Slapeta, Jan; Carter, Dee; Chen, Min
2012-03-01
Chromera velia is a newly discovered photosynthetic eukaryotic alga that has functional chloroplasts closely related to the apicoplast of apicomplexan parasites. Recently, the chloroplast in C. velia was shown to be derived from the red algal lineage. Light-harvesting protein complexes (LHC), which are a group of proteins involved in photon capture and energy transfer in photosynthesis, are important for photosynthesis efficiency, photo-adaptation/accumulation and photo-protection. Although these proteins are encoded by genes located in the nucleus, LHC peptides migrate and function in the chloroplast, hence the LHC may have a different evolutionary history compared to chloroplast evolution. Here, we compare the phylogenetic relationship of the C. velia LHCs to LHCs from other photosynthetic organisms. Twenty-three LHC homologues retrieved from C. velia EST sequences were aligned according to their conserved regions. The C. velia LHCs are positioned in four separate groups on trees constructed by neighbour-joining, maximum likelihood and Bayesian methods. A major group of seventeen LHCs from C. velia formed a separate cluster that was closest to dinoflagellate LHC, and to LHC and fucoxanthin chlorophyll-binding proteins from diatoms. One C. velia LHC sequence grouped with LI1818/LI818-like proteins, which were recently identified as environmental stress-induced protein complexes. Only three LHC homologues from C. velia grouped with the LHCs from red algae.
Phylogenetic analysis of methanogens from the bovine rumen
Directory of Open Access Journals (Sweden)
Forster Robert J
2001-05-01
Full Text Available Abstract Background Interest in methanogens from ruminants has resulted from the role of methane in global warming and from the fact that cattle typically lose 6 % of ingested energy as methane. Several species of methanogens have been isolated from ruminants. However they are difficult to culture, few have been consistently found in high numbers, and it is likely that major species of rumen methanogens are yet to be identified. Results Total DNA from clarified bovine rumen fluid was amplified using primers specific for Archaeal 16S rRNA gene sequences (rDNA. Phylogenetic analysis of 41 rDNA sequences identified three clusters of methanogens. The largest cluster contained two distinct subclusters with rDNA sequences similar to Methanobrevibacter ruminantium 16S rDNA. A second cluster contained sequences related to 16S rDNA from Methanosphaera stadtmanae, an organism not previously described in the rumen. The third cluster contained rDNA sequences that may form a novel group of rumen methanogens. Conclusions The current set of 16S rRNA hybridization probes targeting methanogenic Archaea does not cover the phylogenetic diversity present in the rumen and possibly other gastro-intestinal tract environments. New probes and quantitative PCR assays are needed to determine the distribution of the newly identified methanogen clusters in rumen microbial communities.
Phylogenetic analysis and taxonomic revision of Physodactylinae (Coleoptera, Elateridae
Directory of Open Access Journals (Sweden)
Simone Policena Rosa
2014-01-01
Full Text Available A phylogeny based on male morphological characters and taxonomic revision of the Physodactylinae genera are presented. The phylogenetic analysis based on 66 male characters resulted in the polyphyly of Physodactylinae which comprises four independent lineages. Oligostethius and Idiotropia from Africa were found to be sister groups. Teslasena from Brazil was corroborated as belonging to Cardiophorinae clade. The South American genera Physodactylus and Dactylophysus were found to be sister groups and phylogenetically related to Heterocrepidius species. The Oriental Toxognathus resulted as sister group of that clade plus (Dicrepidius ramicornis (Lissomus sp, Physorhynus erythrocephalus. Taxonomic revisions include diagnoses and redescriptions of genera and distributional records and illustrations of species. Key to species of Teslasena, Toxognathus, Dactylophysus and Physodactylus are also provided. Teslasena lucasi is synonymized with T. femoralis. A new species of Dactylophysus is described, D. hirtus sp. nov., and lectotypes are designated to non-conspecific D. mendax sensu Fleutiaux and Heterocrepidius mendax Candèze. Physodactylus niger is removed from synonymy under P. oberthuri; P. carreti is synonymized with P. niger; P. obesus and P. testaceus are synonymized with P. sulcatus. Nine new species are described in Physodactylus: P. asper sp. nov., P. brunneus sp. nov., P. chassaini sp. nov., P. flavifrons sp. nov., P. girardi sp. nov., P. gounellei sp. nov., P. latithorax sp. nov., P. patens sp. nov. and P. tuberculatus sp. nov.
A phylogenetic transform enhances analysis of compositional microbiota data.
Silverman, Justin D; Washburne, Alex D; Mukherjee, Sayan; David, Lawrence A
2017-02-15
Surveys of microbial communities (microbiota), typically measured as relative abundance of species, have illustrated the importance of these communities in human health and disease. Yet, statistical artifacts commonly plague the analysis of relative abundance data. Here, we introduce the PhILR transform, which incorporates microbial evolutionary models with the isometric log-ratio transform to allow off-the-shelf statistical tools to be safely applied to microbiota surveys. We demonstrate that analyses of community-level structure can be applied to PhILR transformed data with performance on benchmarks rivaling or surpassing standard tools. Additionally, by decomposing distance in the PhILR transformed space, we identified neighboring clades that may have adapted to distinct human body sites. Decomposing variance revealed that covariation of bacterial clades within human body sites increases with phylogenetic relatedness. Together, these findings illustrate how the PhILR transform combines statistical and phylogenetic models to overcome compositional data challenges and enable evolutionary insights relevant to microbial communities.
Phylogenetic Analysis of Apple scar skin viroid Isolates in Korea
Directory of Open Access Journals (Sweden)
Kang Hee Cho
2015-12-01
Full Text Available To identify genome sequences of Apple scar skin viroid (ASSVd isolates in Korea, the field survey was performed from ‘Hongro’ apple orchards located in eight sites in South Korea (Bongwha, Cheongsong, Dangjin, Gimchoen, Muju, Mungyeong, Suwon, and Yeongwol. ASSVd was detected by RT-PCR and PCR fragments were cloned into cloning vector. Full-length viral genomes of eight ASSVd isolates were sequenced and compared with 21 isolates reported previously from Korea, India, China, Japan and Greece. Eight isolates in this study showed 92.2-99.7% nucleotide sequence identities with those reported previously. Phylogenetic analysis showed that seven isolates reported in this study belong to the same group distinct from other groups.
Phylogenetic analysis of ferlin genes reveals ancient eukaryotic origins
Directory of Open Access Journals (Sweden)
Lek Monkol
2010-07-01
Full Text Available Abstract Background The ferlin gene family possesses a rare and identifying feature consisting of multiple tandem C2 domains and a C-terminal transmembrane domain. Much currently remains unknown about the fundamental function of this gene family, however, mutations in its two most well-characterised members, dysferlin and otoferlin, have been implicated in human disease. The availability of genome sequences from a wide range of species makes it possible to explore the evolution of the ferlin family, providing contextual insight into characteristic features that define the ferlin gene family in its present form in humans. Results Ferlin genes were detected from all species of representative phyla, with two ferlin subgroups partitioned within the ferlin phylogenetic tree based on the presence or absence of a DysF domain. Invertebrates generally possessed two ferlin genes (one with DysF and one without, with six ferlin genes in most vertebrates (three DysF, three non-DysF. Expansion of the ferlin gene family is evident between the divergence of lamprey (jawless vertebrates and shark (cartilaginous fish. Common to almost all ferlins is an N-terminal C2-FerI-C2 sandwich, a FerB motif, and two C-terminal C2 domains (C2E and C2F adjacent to the transmembrane domain. Preservation of these structural elements throughout eukaryotic evolution suggests a fundamental role of these motifs for ferlin function. In contrast, DysF, C2DE, and FerA are optional, giving rise to subtle differences in domain topologies of ferlin genes. Despite conservation of multiple C2 domains in all ferlins, the C-terminal C2 domains (C2E and C2F displayed higher sequence conservation and greater conservation of putative calcium binding residues across paralogs and orthologs. Interestingly, the two most studied non-mammalian ferlins (Fer-1 and Misfire in model organisms C. elegans and D. melanogaster, present as outgroups in the phylogenetic analysis, with results suggesting
Directory of Open Access Journals (Sweden)
Edwin J. Niklitschek
2016-10-01
Full Text Available Background Mixture models (MM can be used to describe mixed stocks considering three sets of parameters: the total number of contributing sources, their chemical baseline signatures and their mixing proportions. When all nursery sources have been previously identified and sampled for juvenile fish to produce baseline nursery-signatures, mixing proportions are the only unknown set of parameters to be estimated from the mixed-stock data. Otherwise, the number of sources, as well as some/all nursery-signatures may need to be also estimated from the mixed-stock data. Our goal was to assess bias and uncertainty in these MM parameters when estimated using unconditional maximum likelihood approaches (ML-MM, under several incomplete sampling and nursery-signature separation scenarios. Methods We used a comprehensive dataset containing otolith elemental signatures of 301 juvenile Sparus aurata, sampled in three contrasting years (2008, 2010, 2011, from four distinct nursery habitats. (Mediterranean lagoons Artificial nursery-source and mixed-stock datasets were produced considering: five different sampling scenarios where 0–4 lagoons were excluded from the nursery-source dataset and six nursery-signature separation scenarios that simulated data separated 0.5, 1.5, 2.5, 3.5, 4.5 and 5.5 standard deviations among nursery-signature centroids. Bias (BI and uncertainty (SE were computed to assess reliability for each of the three sets of MM parameters. Results Both bias and uncertainty in mixing proportion estimates were low (BI ≤ 0.14, SE ≤ 0.06 when all nursery-sources were sampled but exhibited large variability among cohorts and increased with the number of non-sampled sources up to BI = 0.24 and SE = 0.11. Bias and variability in baseline signature estimates also increased with the number of non-sampled sources, but tended to be less biased, and more uncertain than mixing proportion ones, across all sampling scenarios (BI < 0.13, SE < 0
Darnaude, Audrey M.
2016-01-01
Background Mixture models (MM) can be used to describe mixed stocks considering three sets of parameters: the total number of contributing sources, their chemical baseline signatures and their mixing proportions. When all nursery sources have been previously identified and sampled for juvenile fish to produce baseline nursery-signatures, mixing proportions are the only unknown set of parameters to be estimated from the mixed-stock data. Otherwise, the number of sources, as well as some/all nursery-signatures may need to be also estimated from the mixed-stock data. Our goal was to assess bias and uncertainty in these MM parameters when estimated using unconditional maximum likelihood approaches (ML-MM), under several incomplete sampling and nursery-signature separation scenarios. Methods We used a comprehensive dataset containing otolith elemental signatures of 301 juvenile Sparus aurata, sampled in three contrasting years (2008, 2010, 2011), from four distinct nursery habitats. (Mediterranean lagoons) Artificial nursery-source and mixed-stock datasets were produced considering: five different sampling scenarios where 0–4 lagoons were excluded from the nursery-source dataset and six nursery-signature separation scenarios that simulated data separated 0.5, 1.5, 2.5, 3.5, 4.5 and 5.5 standard deviations among nursery-signature centroids. Bias (BI) and uncertainty (SE) were computed to assess reliability for each of the three sets of MM parameters. Results Both bias and uncertainty in mixing proportion estimates were low (BI ≤ 0.14, SE ≤ 0.06) when all nursery-sources were sampled but exhibited large variability among cohorts and increased with the number of non-sampled sources up to BI = 0.24 and SE = 0.11. Bias and variability in baseline signature estimates also increased with the number of non-sampled sources, but tended to be less biased, and more uncertain than mixing proportion ones, across all sampling scenarios (BI nursery signatures improved reliability
Conus pennaceus : a phylogenetic analysis of the Mozambican ...
African Journals Online (AJOL)
The genus Conus has over 500 species and is the most species-rich taxon of marine invertebrates. Based on mitochondrial DNA, this study focuses on the phylogenetics of Conus, particularly the pennaceus complex collected along the Mozambican coast. Phylogenetic trees based on both the 16S and the 12S ribosomal ...
Orthology prediction at scalable resolution by phylogenetic tree analysis
Heijden, R.T.J.M. van der; Snel, B.; Noort, V. van; Huynen, M.A.
2007-01-01
BACKGROUND: Orthology is one of the cornerstones of gene function prediction. Dividing the phylogenetic relations between genes into either orthologs or paralogs is however an oversimplification. Already in two-species gene-phylogenies, the complicated, non-transitive nature of phylogenetic
Balzer, Laura; Staples, Patrick; Onnela, Jukka-Pekka; DeGruttola, Victor
2017-04-01
Several cluster-randomized trials are underway to investigate the implementation and effectiveness of a universal test-and-treat strategy on the HIV epidemic in sub-Saharan Africa. We consider nesting studies of pre-exposure prophylaxis within these trials. Pre-exposure prophylaxis is a general strategy where high-risk HIV- persons take antiretrovirals daily to reduce their risk of infection from exposure to HIV. We address how to target pre-exposure prophylaxis to high-risk groups and how to maximize power to detect the individual and combined effects of universal test-and-treat and pre-exposure prophylaxis strategies. We simulated 1000 trials, each consisting of 32 villages with 200 individuals per village. At baseline, we randomized the universal test-and-treat strategy. Then, after 3 years of follow-up, we considered four strategies for targeting pre-exposure prophylaxis: (1) all HIV- individuals who self-identify as high risk, (2) all HIV- individuals who are identified by their HIV+ partner (serodiscordant couples), (3) highly connected HIV- individuals, and (4) the HIV- contacts of a newly diagnosed HIV+ individual (a ring-based strategy). We explored two possible trial designs, and all villages were followed for a total of 7 years. For each village in a trial, we used a stochastic block model to generate bipartite (male-female) networks and simulated an agent-based epidemic process on these networks. We estimated the individual and combined intervention effects with a novel targeted maximum likelihood estimator, which used cross-validation to data-adaptively select from a pre-specified library the candidate estimator that maximized the efficiency of the analysis. The universal test-and-treat strategy reduced the 3-year cumulative HIV incidence by 4.0% on average. The impact of each pre-exposure prophylaxis strategy on the 4-year cumulative HIV incidence varied by the coverage of the universal test-and-treat strategy with lower coverage resulting in a larger
Bayesian models for comparative analysis integrating phylogenetic uncertainty
Directory of Open Access Journals (Sweden)
Villemereuil Pierre de
2012-06-01
Full Text Available Abstract Background Uncertainty in comparative analyses can come from at least two sources: a phylogenetic uncertainty in the tree topology or branch lengths, and b uncertainty due to intraspecific variation in trait values, either due to measurement error or natural individual variation. Most phylogenetic comparative methods do not account for such uncertainties. Not accounting for these sources of uncertainty leads to false perceptions of precision (confidence intervals will be too narrow and inflated significance in hypothesis testing (e.g. p-values will be too small. Although there is some application-specific software for fitting Bayesian models accounting for phylogenetic error, more general and flexible software is desirable. Methods We developed models to directly incorporate phylogenetic uncertainty into a range of analyses that biologists commonly perform, using a Bayesian framework and Markov Chain Monte Carlo analyses. Results We demonstrate applications in linear regression, quantification of phylogenetic signal, and measurement error models. Phylogenetic uncertainty was incorporated by applying a prior distribution for the phylogeny, where this distribution consisted of the posterior tree sets from Bayesian phylogenetic tree estimation programs. The models were analysed using simulated data sets, and applied to a real data set on plant traits, from rainforest plant species in Northern Australia. Analyses were performed using the free and open source software OpenBUGS and JAGS. Conclusions Incorporating phylogenetic uncertainty through an empirical prior distribution of trees leads to more precise estimation of regression model parameters than using a single consensus tree and enables a more realistic estimation of confidence intervals. In addition, models incorporating measurement errors and/or individual variation, in one or both variables, are easily formulated in the Bayesian framework. We show that BUGS is a useful, flexible
Bayesian models for comparative analysis integrating phylogenetic uncertainty
2012-01-01
Background Uncertainty in comparative analyses can come from at least two sources: a) phylogenetic uncertainty in the tree topology or branch lengths, and b) uncertainty due to intraspecific variation in trait values, either due to measurement error or natural individual variation. Most phylogenetic comparative methods do not account for such uncertainties. Not accounting for these sources of uncertainty leads to false perceptions of precision (confidence intervals will be too narrow) and inflated significance in hypothesis testing (e.g. p-values will be too small). Although there is some application-specific software for fitting Bayesian models accounting for phylogenetic error, more general and flexible software is desirable. Methods We developed models to directly incorporate phylogenetic uncertainty into a range of analyses that biologists commonly perform, using a Bayesian framework and Markov Chain Monte Carlo analyses. Results We demonstrate applications in linear regression, quantification of phylogenetic signal, and measurement error models. Phylogenetic uncertainty was incorporated by applying a prior distribution for the phylogeny, where this distribution consisted of the posterior tree sets from Bayesian phylogenetic tree estimation programs. The models were analysed using simulated data sets, and applied to a real data set on plant traits, from rainforest plant species in Northern Australia. Analyses were performed using the free and open source software OpenBUGS and JAGS. Conclusions Incorporating phylogenetic uncertainty through an empirical prior distribution of trees leads to more precise estimation of regression model parameters than using a single consensus tree and enables a more realistic estimation of confidence intervals. In addition, models incorporating measurement errors and/or individual variation, in one or both variables, are easily formulated in the Bayesian framework. We show that BUGS is a useful, flexible general purpose tool for
Directory of Open Access Journals (Sweden)
Anna Claudia Baumel Mongruel
2017-08-01
Full Text Available Abstract Wild animals play an important role in carrying vectors that may potentially transmit pathogens. Several reports highlighted the participation of wild animals on the Anaplasma phagocytophilum cycle, including as hosts of the agent. The aim of this study was to report the molecular detection of an agent phylogenetically related to A. phagocytophilum isolated from a wild bird in the Midwest of the state of Paraná, Brazil. Fifteen blood samples were collected from eleven different bird species in the Guarapuava region. One sample collected from a Penelope obscura bird was positive in nested PCR targeting the 16S rRNA gene of Anaplasma spp. The phylogenetic tree based on the Maximum Likelihood analysis showed that the sequence obtained was placed in the same clade with A. phagocytophilum isolated from domestic cats in Brazil. The present study reports the first molecular detection of a phylogenetically related A. phagocytophilum bacterium in a bird from Paraná State.
Molecular Phylogenetics: Mathematical Framework and Unsolved Problems
Xia, Xuhua
Phylogenetic relationship is essential in dating evolutionary events, reconstructing ancestral genes, predicting sites that are important to natural selection, and, ultimately, understanding genomic evolution. Three categories of phylogenetic methods are currently used: the distance-based, the maximum parsimony, and the maximum likelihood method. Here, I present the mathematical framework of these methods and their rationales, provide computational details for each of them, illustrate analytically and numerically the potential biases inherent in these methods, and outline computational challenges and unresolved problems. This is followed by a brief discussion of the Bayesian approach that has been recently used in molecular phylogenetics.
PAL: an object-oriented programming library for molecular evolution and phylogenetics.
Drummond, A; Strimmer, K
2001-07-01
Phylogenetic Analysis Library (PAL) is a collection of Java classes for use in molecular evolution and phylogenetics. PAL provides a modular environment for the rapid construction of both special-purpose and general analysis programs. PAL version 1.1 consists of 145 public classes or interfaces in 13 packages, including classes for models of character evolution, maximum-likelihood estimation, and the coalescent, with a total of more than 27000 lines of code. The PAL project is set up as a collaborative project to facilitate contributions from other researchers. AVAILIABILTY: The program is free and is available at http://www.pal-project.org. It requires Java 1.1 or later. PAL is licensed under the GNU General Public License.
A molecular phylogenetic analysis of the Scarabaeinae (dung beetles).
Monaghan, Michael T; Inward, Daegan J G; Hunt, Toby; Vogler, Alfried P
2007-11-01
The dung beetles (Scarabaeinae) include ca. 5000 species and exhibit a diverse array of morphologies and behaviors. This variation presumably reflects the adaptation to a diversity of food types and the different strategies used to avoid competition for vertebrate dung, which is the primary breeding environment for most species. The current classification gives great weight to the major behavioral types, separating the ball rollers and the tunnelers, but existing phylogenetic studies have been based on limited taxonomic or biogeographic sampling and have been contradictory. Here, we present a molecular phylogenetic analysis of 214 species of Scarabaeinae, representing all 12 traditionally recognized tribes and six biogeographical regions, using partial gene sequences from one nuclear (28S) and two mitochondrial (cox1, rrnL) genes. Length variation in 28S (588-621 bp) and rrnL (514-523 bp) was subjected to a thorough evaluation of alternative alignments, gap-coding methods, and tree searches using model-based (Bayesian and likelihood), maximum parsimony, and direct optimization analyses. The small-bodied, non-dung-feeding Sarophorus+Coptorhina were basal in all reconstructions. These were closely related to rolling Odontoloma+Dicranocara, suggesting an early acquisition of rolling behavior. Smaller tribes and most genera were monophyletic, while Canthonini and Dichotomiini each consisted of multiple paraphyletic lineages at hierarchical levels equivalent to the smaller tribes. Plasticity of rolling and tunneling was evidenced by a lack of monophyly (S-H test, p > 0.05) and several reversals within clades. The majority of previously unrecognized clades were geographical, including the well-supported Neotropical Phanaeini+Eucraniini, and a large Australian clade of rollers as well as tunneling Coptodactyla and Demarziella. Only three lineages, Gymnopleurini, Copris+Microcopris and Onthophagus, were widespread and therefore appear to be dispersive at a global scale. A
Krajewski, C; Fain, M G; Buckley, L; King, D G
1999-11-01
ki ctes over whether molecular sequence data should be partitioned for phylogenetic analysis often confound two types of heterogeneity among partitions. We distinguish historical heterogeneity (i.e., different partitions have different evolutionary relationships) from dynamic heterogeneity (i.e., different partitions show different patterns of sequence evolution) and explore the impact of the latter on phylogenetic accuracy and precision with a two-gene, mitochondrial data set for cranes. The well-established phylogeny of cranes allows us to contrast tree-based estimates of relevant parameter values with estimates based on pairwise comparisons and to ascertain the effects of incorporating different amounts of process information into phylogenetic estimates. We show that codon positions in the cytochrome b and NADH dehydrogenase subunit 6 genes are dynamically heterogenous under both Poisson and invariable-sites + gamma-rates versions of the F84 model and that heterogeneity includes variation in base composition and transition bias as well as substitution rate. Estimates of transition-bias and relative-rate parameters from pairwise sequence comparisons were comparable to those obtained as tree-based maximum likelihood estimates. Neither rate-category nor mixed-model partitioning strategies resulted in a loss of phylogenetic precision relative to unpartitioned analyses. We suggest that weighted-average distances provide a computationally feasible alternative to direct maximum likelihood estimates of phylogeny for mixed-model analyses of large, dynamically heterogenous data sets. Copyright 1999 Academic Press.
Phylogenetic Analysis of Phytophthora Species Based on Mitochondrial and Nuclear DNA Sequences
Kroon, L.P.N.M.; Bakker, F.T.; Bosch, van den G.B.M.; Bonants, P.J.M.; Flier, W.G.
2004-01-01
A molecular phylogenetic analysis of the genus Phytophthora was performed, 113 isolates from 48 Phytophthora species were included in this analysis. Phylogenetic analyses were performed on regions of mitochondrial (cytochrome c oxidase subunit 1; NADH dehydrogenase subunit 1) and nuclear gene
Prevalence and phylogenetic analysis of HTLV-1 in a segregated population in Iran.
Rafatpanah, Houshang; Torkamani, Mahmood; Valizadeh, Narges; Vakili, Rosita; Meshkani, Baratali; Khademi, Hassan; Gerayli, Sina; Mozhgani, Sayed Hamid Reza; Rezaee, Seyed Abdolrahim
2016-07-01
Human T-lymphotropic virus type 1 (HTLV-1) infection is an important health issue that affects a variety of endemic areas. The Khorasan province, mainly its capital Mashhad in northeastern Iran, was reported to be as one of these endemic regions. Torbat-e Heydarieh, a large city Southwest border to Mashhad with a segregated population was investigated for the prevalence and associated risk factors of HTLV-1 infection in 400 randomly selected individuals. Blood samples were tested for the presence of HTLV-1 antibodies via the ELISA method and then were confirmed by an Immunoblot test. For the presence of HTLV-1 in lymphocytes of infected subjects, PCR was performed on LTR and TAX regions. DNA sequencing of LTR fragment was also carried out to determine the phylogenetic of HTLV-1, using the Maximum likelihood method. HTLV-1 sero-reactivity (sero-prevalence) among the study population was 2% (8/400), of which 1.25% had HTLV-1 provirus in lymphocytes (actual prevalence). HTLV-1 infection was significantly associated with the age, marital status, and history of blood transfusion (P cosmopolitan subtype A. HTLV-1 prevalence in Torbat-e Heydarieh (1.25%) is low comparing to those of both Mashhad (2-3%) and Neishabour (3.5-5%) in the province of Khorasan. Thus, traveling mobility and population mixing such as marriage, bureaucratic affairs, occupation, and economic activities could be the usual routs of HTLV-1 new wave of spreading in this segregated city. © 2015 Wiley Periodicals, Inc.
Sequence comparison and phylogenetic analysis of core gene of ...
African Journals Online (AJOL)
STORAGESEVER
2010-07-19
Jul 19, 2010 ... and antisense primers, a single band of 573 base pairs .... Amino acid sequence alignment of Cluster I and Cluster II of phylogenetic tree. First ten sequences ... sequence weighting, postion-spiecific gap penalties and weight.
SNPhylo: a pipeline to construct a phylogenetic tree from huge SNP data.
Lee, Tae-Ho; Guo, Hui; Wang, Xiyin; Kim, Changsoo; Paterson, Andrew H
2014-02-26
Phylogenetic trees are widely used for genetic and evolutionary studies in various organisms. Advanced sequencing technology has dramatically enriched data available for constructing phylogenetic trees based on single nucleotide polymorphisms (SNPs). However, massive SNP data makes it difficult to perform reliable analysis, and there has been no ready-to-use pipeline to generate phylogenetic trees from these data. We developed a new pipeline, SNPhylo, to construct phylogenetic trees based on large SNP datasets. The pipeline may enable users to construct a phylogenetic tree from three representative SNP data file formats. In addition, in order to increase reliability of a tree, the pipeline has steps such as removing low quality data and considering linkage disequilibrium. A maximum likelihood method for the inference of phylogeny is also adopted in generation of a tree in our pipeline. Using SNPhylo, users can easily produce a reliable phylogenetic tree from a large SNP data file. Thus, this pipeline can help a researcher focus more on interpretation of the results of analysis of voluminous data sets, rather than manipulations necessary to accomplish the analysis.
A phylogenetic analysis of the sugar porters in hemiascomycetous yeasts.
Palma, Margarida; Goffeau, André; Spencer-Martins, Isabel; Baret, Philippe V
2007-01-01
A total of 214 members of the sugar porter (SP) family (TC 2.A.1.1) from eight hemiascomycetous yeasts: Saccharomyces cerevisiae, Candida glabrata, Kluyveromyces lactis, Ashbya (Eremothecium) gossypii, Debaryomyces hansenii, Yarrowia lipolytica, Candida albicans and Pichia stipitis, were identified. The yeast SPs were classified in 13 different phylogenetic clusters. Specific sugar substrates could be allocated to nine phylogenetic clusters, including two novel TC clusters that are specific to fungi, i.e. the glycerol:H(+) symporter (2.A.1.1.38) and the high-affinity glucose transporter (2.A.1.1.39). Four phylogenetic clusters are identified by the preliminary fifth number Z23, Z24, Z25 and Z26 and the substrates of their members remain undetermined. The amplification of the SP clusters across the Hemiascomycetes reflects adaptation to specific carbon and energy sources available in the habitat of each yeast species. (c) 2007 S. Karger AG, Basel.
Orthology prediction at scalable resolution by phylogenetic tree analysis
Directory of Open Access Journals (Sweden)
Huynen Martijn A
2007-03-01
Full Text Available Abstract Background Orthology is one of the cornerstones of gene function prediction. Dividing the phylogenetic relations between genes into either orthologs or paralogs is however an oversimplification. Already in two-species gene-phylogenies, the complicated, non-transitive nature of phylogenetic relations results in inparalogs and outparalogs. For situations with more than two species we lack semantics to specifically describe the phylogenetic relations, let alone to exploit them. Published procedures to extract orthologous groups from phylogenetic trees do not allow identification of orthology at various levels of resolution, nor do they document the relations between the orthologous groups. Results We introduce "levels of orthology" to describe the multi-level nature of gene relations. This is implemented in a program LOFT (Levels of Orthology From Trees that assigns hierarchical orthology numbers to genes based on a phylogenetic tree. To decide upon speciation and gene duplication events in a tree LOFT can be instructed either to perform classical species-tree reconciliation or to use the species overlap between partitions in the tree. The hierarchical orthology numbers assigned by LOFT effectively summarize the phylogenetic relations between genes. The resulting high-resolution orthologous groups are depicted in colour, facilitating visual inspection of (large trees. A benchmark for orthology prediction, that takes into account the varying levels of orthology between genes, shows that the phylogeny-based high-resolution orthology assignments made by LOFT are reliable. Conclusion The "levels of orthology" concept offers high resolution, reliable orthology, while preserving the relations between orthologous groups. A Windows as well as a preliminary Java version of LOFT is available from the LOFT website http://www.cmbi.ru.nl/LOFT.
A Multi-Criterion Evolutionary Approach Applied to Phylogenetic Reconstruction
Cancino, W.; Delbem, A.C.B.
2010-01-01
In this paper, we proposed an MOEA approach, called PhyloMOEA which solves the phylogenetic inference problem using maximum parsimony and maximum likelihood criteria. The PhyloMOEA's development was motivated by several studies in the literature (Huelsenbeck, 1995; Jin & Nei, 1990; Kuhner & Felsenstein, 1994; Tateno et al., 1994), which point out that various phylogenetic inference methods lead to inconsistent solutions. Techniques using parsimony and likelihood criteria yield to different tr...
Applying phylogenetic analysis to viral livestock diseases: moving beyond molecular typing.
Olvera, Alex; Busquets, Núria; Cortey, Marti; de Deus, Nilsa; Ganges, Llilianne; Núñez, José Ignacio; Peralta, Bibiana; Toskano, Jennifer; Dolz, Roser
2010-05-01
Changes in livestock production systems in recent years have altered the presentation of many diseases resulting in the need for more sophisticated control measures. At the same time, new molecular assays have been developed to support the diagnosis of animal viral disease. Nucleotide sequences generated by these diagnostic techniques can be used in phylogenetic analysis to infer phenotypes by sequence homology and to perform molecular epidemiology studies. In this review, some key elements of phylogenetic analysis are highlighted, such as the selection of the appropriate neutral phylogenetic marker, the proper phylogenetic method and different techniques to test the reliability of the resulting tree. Examples are given of current and future applications of phylogenetic reconstructions in viral livestock diseases. Copyright 2009 Elsevier Ltd. All rights reserved.
Global phylogenetic analysis of contemporary aleutian mink disease viruses (AMDVs)
DEFF Research Database (Denmark)
Ryt-Hansen, Pia; Hagberg, E. E.; Chriél, Mariann
2017-01-01
a strain originating from Sweden. In contrast, we did not identify any potential source for the other and more widespread outbreak strain. To the authors knowledge this is the first major global phylogenetic study of contemporary AMDV partial NS1 sequences. The study proved that partial NS1 sequencing can...
Phylogenetic Analysis of the Bee Tribe Anthidiini | Combey | Journal ...
African Journals Online (AJOL)
The phylogenetic relationships among members of long tongue bee tribe Anthidiini (Megachilidae: Megachilinae) were investigated at the Department of Entomology and Wildlife, University of Cape Coast (Ghana) and the Agricultural Research Council, Pretoria (South Af-rica) from July, 2006 to May, 2007. Ten museums ...
A comparative phylogenetic analysis of full-length mariner elements
Indian Academy of Sciences (India)
Mariner like elements (MLEs) are widely distributed type II transposons with an open reading frame (ORF) for transposase. We studied comparative phylogenetic evolution and inverted terminal repeat (ITR) conservation of MLEs from Indian saturniid silkmoth, Antheraea mylitta with other full length MLEs submitted in the ...
A RAD-based phylogenetics for Orestias fishes from Lake Titicaca.
Takahashi, Tetsumi; Moreno, Edmundo
2015-12-01
The fish genus Orestias is endemic to the Andes highlands, and Lake Titicaca is the centre of the species diversity of the genus. Previous phylogenetic studies based on a single locus of mitochondrial and nuclear DNA strongly support the monophyly of a group composed of many of species endemic to the Lake Titicaca basin (the Lake Titicaca radiation), but the relationships among the species in the radiation remain unclear. Recently, restriction site-associated DNA (RAD) sequencing, which can produce a vast number of short sequences from various loci of nuclear DNA, has emerged as a useful way to resolve complex phylogenetic problems. To propose a new phylogenetic hypothesis of Orestias fishes of the Lake Titicaca radiation, we conducted a cluster analysis based on morphological similarities among fish samples and a molecular phylogenetic analysis based on RAD sequencing. From a morphological cluster analysis, we recognised four species groups in the radiation, and three of the four groups were resolved as monophyletic groups in maximum-likelihood trees based on RAD sequencing data. The other morphology-based group was not resolved as a monophyletic group in molecular phylogenies, and some members of the group were diverged from its sister group close to the root of the Lake Titicaca radiation. The evolution of these fishes is discussed from the phylogenetic relationships. Copyright © 2015 Elsevier Inc. All rights reserved.
Inferring Phylogenetic Networks Using PhyloNet.
Wen, Dingqiao; Yu, Yun; Zhu, Jiafan; Nakhleh, Luay
2018-07-01
PhyloNet was released in 2008 as a software package for representing and analyzing phylogenetic networks. At the time of its release, the main functionalities in PhyloNet consisted of measures for comparing network topologies and a single heuristic for reconciling gene trees with a species tree. Since then, PhyloNet has grown significantly. The software package now includes a wide array of methods for inferring phylogenetic networks from data sets of unlinked loci while accounting for both reticulation (e.g., hybridization) and incomplete lineage sorting. In particular, PhyloNet now allows for maximum parsimony, maximum likelihood, and Bayesian inference of phylogenetic networks from gene tree estimates. Furthermore, Bayesian inference directly from sequence data (sequence alignments or biallelic markers) is implemented. Maximum parsimony is based on an extension of the "minimizing deep coalescences" criterion to phylogenetic networks, whereas maximum likelihood and Bayesian inference are based on the multispecies network coalescent. All methods allow for multiple individuals per species. As computing the likelihood of a phylogenetic network is computationally hard, PhyloNet allows for evaluation and inference of networks using a pseudolikelihood measure. PhyloNet summarizes the results of the various analyzes and generates phylogenetic networks in the extended Newick format that is readily viewable by existing visualization software.
Directory of Open Access Journals (Sweden)
Jeffrey Rosenfeld
2016-03-01
Full Text Available Twenty one fully sequenced and well annotated insect genomes were used to construct genome content matrices for phylogenetic analysis and functional annotation of insect genomes. To examine the role of e-value cutoff in ortholog determination we used scaled e-value cutoffs and a single linkage clustering approach.. The present communication includes (1 a list of the genomes used to construct the genome content phylogenetic matrices, (2 a nexus file with the data matrices used in phylogenetic analysis, (3 a nexus file with the Newick trees generated by phylogenetic analysis, (4 an excel file listing the Core (CORE genes and Unique (UNI genes found in five insect groups, and (5 a figure showing a plot of consistency index (CI versus percent of unannotated genes that are apomorphies in the data set for gene losses and gains and bar plots of gains and losses for four consistency index (CI cutoffs.
Multilocus sequence typing and phylogenetic analysis of Propionibacterium acnes
DEFF Research Database (Denmark)
Kilian, Mogens; Scholz, Christian F. P.; Lomholt, Hans B.
2012-01-01
Propionibacterium acnes is a commensal of human skin but is also implicated in the pathogenesis of acne vulgaris, in biofilm-associated infections of medical devices and endophthalmitis, and in infections of bone and dental root canals. Recent studies associate P. acnes with prostate cancer...... schemes were compared with reference to a phylogenetic tree based on 78 P. acnes genomes and their gene contents. Further support for a basically clonal population structure of P. acnes and a scenario of the global spread of epidemic clones of P. acnes was obtained. Compared to the Belfast scheme...
Multilocus Sequence Typing (MLST) and Phylogenetic Analysis of Propionibacterium acnes
DEFF Research Database (Denmark)
Kilian, Mogens; Scholz, Christian; Lomholt, Hans B
2011-01-01
Propionibacterium acnes is a commensal of human skin but is also implicated in the pathogenesis of acne vulgaris and in biofilm-associated infections of medical devices and endophthalmitis, and in infections of bone and dental root canals. Recent studies associate P. acnes with prostate cancer...... with reference to a phylogenetic tree based on 78 P. acnes genomes and their gene contents. Further support for a basically clonal population structure of P. acnes and a scenario of global spread of epidemic clones of P. acnes was obtained. Compared with the Belfast scheme, the Aarhus MLST scheme (http...
Eder, Wolfgang; Ives Torres-Silva, Ana; Hohenegger, Johann
2017-04-01
Phylogenetic analysis and trees based on molecular data are broadly applied and used to infer genetical and biogeographic relationship in recent larger foraminifera. Molecular phylogenetic is intensively used within recent nummulitids, however for fossil representatives these trees are only of minor informational value. Hence, within paleontological studies a phylogenetic approach through morphometric analysis is of much higher value. To tackle phylogenetic relationships within the nummulitid family, a much higher number of morphological character must be measured than are commonly used in biometric studies, where mostly parameters describing embryonic size (e.g., proloculus diameter, deuteroloculus diameter) and/or the marginal spiral (e.g., spiral diagrams, spiral indices) are studied. For this purpose 11 growth-independent and/or growth-invariant characters have been used to describe the morphological variability of equatorial thin sections of seven Carribbean nummulitid taxa (Nummulites striatoreticulatus, N. macgillavry, Palaeonummulites willcoxi, P.floridensis, P. soldadensis, P.trinitatensis and P.ocalanus) and one outgroup taxon (Ranikothalia bermudezi). Using these characters, phylogenetic trees were calculated using a restricted maximum likelihood algorithm (REML), and results are cross-checked by ordination and cluster analysis. Square-change parsimony method has been run to reconstruct ancestral states, as well as to simulate the evolution of the chosen characters along the calculated phylogenetic tree and, independent - contrast analysis was used to estimate confidence intervals. Based on these simulations, phylogenetic tendencies of certain characters proposed for nummulitids (e.g., Cope's rule or nepionic acceleration) can be tested, whether these tendencies are valid for the whole family or only for certain clades. At least, within the Carribean nummulitids, phylogenetic trends along some growth-independent characters of the embryo (e.g., first
REFGEN and TREENAMER: Automated Sequence Data Handling for Phylogenetic Analysis in the Genomic Era
Leonard, Guy; Stevens, Jamie R.; Richards, Thomas A.
2009-01-01
The phylogenetic analysis of nucleotide sequences and increasingly that of amino acid sequences is used to address a number of biological questions. Access to extensive datasets, including numerous genome projects, means that standard phylogenetic analyses can include many hundreds of sequences. Unfortunately, most phylogenetic analysis programs do not tolerate the sequence naming conventions of genome databases. Managing large numbers of sequences and standardizing sequence labels for use in phylogenetic analysis programs can be a time consuming and laborious task. Here we report the availability of an online resource for the management of gene sequences recovered from public access genome databases such as GenBank. These web utilities include the facility for renaming every sequence in a FASTA alignment file, with each sequence label derived from a user-defined combination of the species name and/or database accession number. This facility enables the user to keep track of the branching order of the sequences/taxa during multiple tree calculations and re-optimisations. Post phylogenetic analysis, these webpages can then be used to rename every label in the subsequent tree files (with a user-defined combination of species name and/or database accession number). Together these programs drastically reduce the time required for managing sequence alignments and labelling phylogenetic figures. Additional features of our platform include the automatic removal of identical accession numbers (recorded in the report file) and generation of species and accession number lists for use in supplementary materials or figure legends. PMID:19812722
REFGEN and TREENAMER: Automated Sequence Data Handling for Phylogenetic Analysis in the Genomic Era
Directory of Open Access Journals (Sweden)
Guy Leonard
2009-01-01
Full Text Available The phylogenetic analysis of nucleotide sequences and increasingly that of amino acid sequences is used to address a number of biological questions. Access to extensive datasets, including numerous genome projects, means that standard phylogenetic analyses can include many hundreds of sequences. Unfortunately, most phylogenetic analysis programs do not tolerate the sequence naming conventions of genome databases. Managing large numbers of sequences and standardizing sequence labels for use in phylogenetic analysis programs can be a time consuming and laborious task. Here we report the availability of an online resource for the management of gene sequences recovered from public access genome databases such as GenBank. These web utilities include the facility for renaming every sequence in a FASTA alignment ﬁ le, with each sequence label derived from a user-defined combination of the species name and/or database accession number. This facility enables the user to keep track of the branching order of the sequences/taxa during multiple tree calculations and re-optimisations. Post phylogenetic analysis, these webpages can then be used to rename every label in the subsequent tree ﬁ les (with a user-defined combination of species name and/or database accession number. Together these programs drastically reduce the time required for managing sequence alignments and labelling phylogenetic figures. Additional features of our platform include the automatic removal of identical accession numbers (recorded in the report ﬁle and generation of species and accession number lists for use in supplementary materials or figure legends.
Jansen, Robert K; Kaittanis, Charalambos; Saski, Christopher; Lee, Seung-Bum; Tomkins, Jeffrey; Alverson, Andrew J; Daniell, Henry
2006-04-09
The Vitaceae (grape) is an economically important family of angiosperms whose phylogenetic placement is currently unresolved. Recent phylogenetic analyses based on one to several genes have suggested several alternative placements of this family, including sister to Caryophyllales, asterids, Saxifragales, Dilleniaceae or to rest of rosids, though support for these different results has been weak. There has been a recent interest in using complete chloroplast genome sequences for resolving phylogenetic relationships among angiosperms. These studies have clarified relationships among several major lineages but they have also emphasized the importance of taxon sampling and the effects of different phylogenetic methods for obtaining accurate phylogenies. We sequenced the complete chloroplast genome of Vitis vinifera and used these data to assess relationships among 27 angiosperms, including nine taxa of rosids. The Vitis vinifera chloroplast genome is 160,928 bp in length, including a pair of inverted repeats of 26,358 bp that are separated by small and large single copy regions of 19,065 bp and 89,147 bp, respectively. The gene content and order of Vitis is identical to many other unrearranged angiosperm chloroplast genomes, including tobacco. Phylogenetic analyses using maximum parsimony and maximum likelihood were performed on DNA sequences of 61 protein-coding genes for two datasets with 28 or 29 taxa, including eight or nine taxa from four of the seven currently recognized major clades of rosids. Parsimony and likelihood phylogenies of both data sets provide strong support for the placement of Vitaceae as sister to the remaining rosids. However, the position of the Myrtales and support for the monophyly of the eurosid I clade differs between the two data sets and the two methods of analysis. In parsimony analyses, the inclusion of Gossypium is necessary to obtain trees that support the monophyly of the eurosid I clade. However, maximum likelihood analyses place
Directory of Open Access Journals (Sweden)
Alverson Andrew J
2006-04-01
Full Text Available Abstract Background The Vitaceae (grape is an economically important family of angiosperms whose phylogenetic placement is currently unresolved. Recent phylogenetic analyses based on one to several genes have suggested several alternative placements of this family, including sister to Caryophyllales, asterids, Saxifragales, Dilleniaceae or to rest of rosids, though support for these different results has been weak. There has been a recent interest in using complete chloroplast genome sequences for resolving phylogenetic relationships among angiosperms. These studies have clarified relationships among several major lineages but they have also emphasized the importance of taxon sampling and the effects of different phylogenetic methods for obtaining accurate phylogenies. We sequenced the complete chloroplast genome of Vitis vinifera and used these data to assess relationships among 27 angiosperms, including nine taxa of rosids. Results The Vitis vinifera chloroplast genome is 160,928 bp in length, including a pair of inverted repeats of 26,358 bp that are separated by small and large single copy regions of 19,065 bp and 89,147 bp, respectively. The gene content and order of Vitis is identical to many other unrearranged angiosperm chloroplast genomes, including tobacco. Phylogenetic analyses using maximum parsimony and maximum likelihood were performed on DNA sequences of 61 protein-coding genes for two datasets with 28 or 29 taxa, including eight or nine taxa from four of the seven currently recognized major clades of rosids. Parsimony and likelihood phylogenies of both data sets provide strong support for the placement of Vitaceae as sister to the remaining rosids. However, the position of the Myrtales and support for the monophyly of the eurosid I clade differs between the two data sets and the two methods of analysis. In parsimony analyses, the inclusion of Gossypium is necessary to obtain trees that support the monophyly of the eurosid I clade
Klegarth, A R; Sanders, S A; Gloss, A D; Lane-deGraaf, K E; Jones-Engel, L; Fuentes, A; Hollocher, H
2017-08-01
Cyclical submergence and re-emergence of the Sunda Shelf throughout the Pleistocene served as a dynamic biogeographic landscape, across which long-tailed macaques (Macaca fascicularis) have migrated and evolved. Here, we tested the integrity of the previously reported continental-insular haplotype divide reported among Y and mitochondrial DNA lineages across multiple studies. The continental-insular haplotype divide was tested by heavily sampling wild macaques from two important biogeographic regions within Sundaland: (1) Singapore, the southernmost tip of continental Asia and (2) Bali, Indonesia, the southeastern edge of the Indonesian archipelago, immediately west of Wallace's line. Y DNA was haplotyped for samples from Bali, deep within the Indonesian archipelago. Mitochondrial D-loop from both islands was analyzed against existing data using Maximum Likelihood and Bayesian approaches. We uncovered both "continental" and "insular" Y DNA haplotypes in Bali. Between Singapore and Bali we found 52 unique mitochondrial haplotypes, none of which had been previously described. Phylogenetic analyses confirmed a major haplogroup division within Singapore and identified five new Singapore subclades and two primary subclades in Bali. While we confirmed the continental-insular divide among mtDNA haplotypes, maintenance of both Y DNA haplotypes on Bali, deep within the Indonesian archipelago calls into question the mechanism by which Y DNA diversity has been maintained. It also suggests the continental-insular designation is less appropriate for Y DNA, leading us to propose geographically neutral Y haplotype designations. © 2017 Wiley Periodicals, Inc.
PHYTOCHEMICAL AND PHYLOGENETIC ANALYSIS OF Spondias(Anacardiaceae
Directory of Open Access Journals (Sweden)
Cristiane Pereira
2015-07-01
Full Text Available This paper describes the correlation between the phenolic composition and the molecular phylogenetic reconstruction of five Spondias species (Anacardiaceae. Two of these species (S. venulosa and Spondias sp. occur in rainforest areas and the other three are widely distributed in Brazil (S. dulcis, S.mombin, and S. purpurea. The flavonoid enriched fraction of the S. venulosa leaf extract also underwent a chemical study. The results indicate that the presence of flavonol 3-O-glycosides are a synapomorphic character of the studied American Spondias and the production of rhamnetin 3-O-rutinoside is a synapomorphy of the Atlantic forest species. This is the first report of flavonoids in S. venulosa, an endemic species from the Brazilian Atlantic rainforest.
Evolution of climatic niche specialization: a phylogenetic analysis in amphibians.
Bonetti, Maria Fernanda; Wiens, John J
2014-11-22
The evolution of climatic niche specialization has important implications for many topics in ecology, evolution and conservation. The climatic niche reflects the set of temperature and precipitation conditions where a species can occur. Thus, specialization to a limited set of climatic conditions can be important for understanding patterns of biogeography, species richness, community structure, allopatric speciation, spread of invasive species and responses to climate change. Nevertheless, the factors that determine climatic niche width (level of specialization) remain poorly explored. Here, we test whether species that occur in more extreme climates are more highly specialized for those conditions, and whether there are trade-offs between niche widths on different climatic niche axes (e.g. do species that tolerate a broad range of temperatures tolerate only a limited range of precipitation regimes?). We test these hypotheses in amphibians, using phylogenetic comparative methods and global-scale datasets, including 2712 species with both climatic and phylogenetic data. Our results do not support either hypothesis. Rather than finding narrower niches in more extreme environments, niches tend to be narrower on one end of a climatic gradient but wider on the other. We also find that temperature and precipitation niche breadths are positively related, rather than showing trade-offs. Finally, our results suggest that most amphibian species occur in relatively warm and dry environments and have relatively narrow climatic niche widths on both of these axes. Thus, they may be especially imperilled by anthropogenic climate change. © 2014 The Author(s) Published by the Royal Society. All rights reserved.
Graf, Daniel L; Jones, Hugh; Geneva, Anthony J; Pfeiffer, John M; Klunzinger, Michael W
2015-04-01
The freshwater mussel family Hyriidae (Mollusca: Bivalvia: Unionida) has a disjunct trans-Pacific distribution in Australasia and South America. Previous phylogenetic analyses have estimated the evolutionary relationships of the family and the major infra-familial taxa (Velesunioninae and Hyriinae: Hyridellini in Australia; Hyriinae: Hyriini, Castaliini, and Rhipidodontini in South America), but taxon and character sampling have been too incomplete to support a predictive classification or allow testing of biogeographical hypotheses. We sampled 30 freshwater mussel individuals representing the aforementioned hyriid taxa, as well as outgroup species representing the five other freshwater mussel families and their marine sister group (order Trigoniida). Our ingroup included representatives of all Australian genera. Phylogenetic relationships were estimated from three gene fragments (nuclear 28S, COI and 16S mtDNA) using maximum parsimony, maximum likelihood, and Bayesian inference, and we applied a Bayesian relaxed clock model calibrated with fossil dates to estimate node ages. Our analyses found good support for monophyly of the Hyriidae and the subfamilies and tribes, as well as the paraphyly of the Australasian taxa (Velesunioninae, (Hyridellini, (Rhipidodontini, (Castaliini, Hyriini)))). The Hyriidae was recovered as sister to a clade comprised of all other Recent freshwater mussel families. Our molecular date estimation supported Cretaceous origins of the major hyriid clades, pre-dating the Tertiary isolation of South America from Antarctica/Australia. We hypothesize that early diversification of the Hyriidae was driven by terrestrial barriers on Gondwana rather than marine barriers following disintegration of the super-continent. Copyright © 2015 Elsevier Inc. All rights reserved.
Humphreys-Pereira, Danny A; Elling, Axel A
2014-01-01
Root-knot nematodes (Meloidogyne spp.) are among the most important plant pathogens. In this study, the mitochondrial (mt) genomes of the root-knot nematodes, M. chitwoodi and M. incognita were sequenced. PCR analyses suggest that both mt genomes are circular, with an estimated size of 19.7 and 18.6-19.1kb, respectively. The mt genomes each contain a large non-coding region with tandem repeats and the control region. The mt gene arrangement of M. chitwoodi and M. incognita is unlike that of other nematodes. Sequence alignments of the two Meloidogyne mt genomes showed three translocations; two in transfer RNAs and one in cox2. Compared with other nematode mt genomes, the gene arrangement of M. chitwoodi and M. incognita was most similar to Pratylenchus vulnus. Phylogenetic analyses (Maximum Likelihood and Bayesian inference) were conducted using 78 complete mt genomes of diverse nematode species. Analyses based on nucleotides and amino acids of the 12 protein-coding mt genes showed strong support for the monophyly of class Chromadorea, but only amino acid-based analyses supported the monophyly of class Enoplea. The suborder Spirurina was not monophyletic in any of the phylogenetic analyses, contradicting the Clade III model, which groups Ascaridomorpha, Spiruromorpha and Oxyuridomorpha based on the small subunit ribosomal RNA gene. Importantly, comparisons of mt gene arrangement and tree-based methods placed Meloidogyne as sister taxa of Pratylenchus, a migratory plant endoparasitic nematode, and not with the sedentary endoparasitic Heterodera. Thus, comparative analyses of mt genomes suggest that sedentary endoparasitism in Meloidogyne and Heterodera is based on convergent evolution. Copyright © 2014 Elsevier B.V. All rights reserved.
DendroBLAST: approximate phylogenetic trees in the absence of multiple sequence alignments.
Kelly, Steven; Maini, Philip K
2013-01-01
The rapidly growing availability of genome information has created considerable demand for both fast and accurate phylogenetic inference algorithms. We present a novel method called DendroBLAST for reconstructing phylogenetic dendrograms/trees from protein sequences using BLAST. This method differs from other methods by incorporating a simple model of sequence evolution to test the effect of introducing sequence changes on the reliability of the bipartitions in the inferred tree. Using realistic simulated sequence data we demonstrate that this method produces phylogenetic trees that are more accurate than other commonly-used distance based methods though not as accurate as maximum likelihood methods from good quality multiple sequence alignments. In addition to tests on simulated data, we use DendroBLAST to generate input trees for a supertree reconstruction of the phylogeny of the Archaea. This independent analysis produces an approximate phylogeny of the Archaea that has both high precision and recall when compared to previously published analysis of the same dataset using conventional methods. Taken together these results demonstrate that approximate phylogenetic trees can be produced in the absence of multiple sequence alignments, and we propose that these trees will provide a platform for improving and informing downstream bioinformatic analysis. A web implementation of the DendroBLAST method is freely available for use at http://www.dendroblast.com/.
Phylogenetic Information Content of Copepoda Ribosomal DNA Repeat Units: ITS1 and ITS2 Impact
Zagoskin, Maxim V.; Lazareva, Valentina I.; Grishanin, Andrey K.; Mukha, Dmitry V.
2014-01-01
The utility of various regions of the ribosomal repeat unit for phylogenetic analysis was examined in 16 species representing four families, nine genera, and two orders of the subclass Copepoda (Crustacea). Fragments approximately 2000 bp in length containing the ribosomal DNA (rDNA) 18S and 28S gene fragments, the 5.8S gene, and the internal transcribed spacer regions I and II (ITS1 and ITS2) were amplified and analyzed. The DAMBE (Data Analysis in Molecular Biology and Evolution) software was used to analyze the saturation of nucleotide substitutions; this test revealed the suitability of both the 28S gene fragment and the ITS1/ITS2 rDNA regions for the reconstruction of phylogenetic trees. Distance (minimum evolution) and probabilistic (maximum likelihood, Bayesian) analyses of the data revealed that the 28S rDNA and the ITS1 and ITS2 regions are informative markers for inferring phylogenetic relationships among families of copepods and within the Cyclopidae family and associated genera. Split-graph analysis of concatenated ITS1/ITS2 rDNA regions of cyclopoid copepods suggested that the Mesocyclops, Thermocyclops, and Macrocyclops genera share complex evolutionary relationships. This study revealed that the ITS1 and ITS2 regions potentially represent different phylogenetic signals. PMID:25215300
DendroBLAST: approximate phylogenetic trees in the absence of multiple sequence alignments.
Directory of Open Access Journals (Sweden)
Steven Kelly
Full Text Available The rapidly growing availability of genome information has created considerable demand for both fast and accurate phylogenetic inference algorithms. We present a novel method called DendroBLAST for reconstructing phylogenetic dendrograms/trees from protein sequences using BLAST. This method differs from other methods by incorporating a simple model of sequence evolution to test the effect of introducing sequence changes on the reliability of the bipartitions in the inferred tree. Using realistic simulated sequence data we demonstrate that this method produces phylogenetic trees that are more accurate than other commonly-used distance based methods though not as accurate as maximum likelihood methods from good quality multiple sequence alignments. In addition to tests on simulated data, we use DendroBLAST to generate input trees for a supertree reconstruction of the phylogeny of the Archaea. This independent analysis produces an approximate phylogeny of the Archaea that has both high precision and recall when compared to previously published analysis of the same dataset using conventional methods. Taken together these results demonstrate that approximate phylogenetic trees can be produced in the absence of multiple sequence alignments, and we propose that these trees will provide a platform for improving and informing downstream bioinformatic analysis. A web implementation of the DendroBLAST method is freely available for use at http://www.dendroblast.com/.
Peyre, Hugo; Leplège, Alain; Coste, Joël
2011-03-01
Missing items are common in quality of life (QoL) questionnaires and present a challenge for research in this field. It remains unclear which of the various methods proposed to deal with missing data performs best in this context. We compared personal mean score, full information maximum likelihood, multiple imputation, and hot deck techniques using various realistic simulation scenarios of item missingness in QoL questionnaires constructed within the framework of classical test theory. Samples of 300 and 1,000 subjects were randomly drawn from the 2003 INSEE Decennial Health Survey (of 23,018 subjects representative of the French population and having completed the SF-36) and various patterns of missing data were generated according to three different item non-response rates (3, 6, and 9%) and three types of missing data (Little and Rubin's "missing completely at random," "missing at random," and "missing not at random"). The missing data methods were evaluated in terms of accuracy and precision for the analysis of one descriptive and one association parameter for three different scales of the SF-36. For all item non-response rates and types of missing data, multiple imputation and full information maximum likelihood appeared superior to the personal mean score and especially to hot deck in terms of accuracy and precision; however, the use of personal mean score was associated with insignificant bias (relative bias personal mean score appears nonetheless appropriate for dealing with items missing from completed SF-36 questionnaires in most situations of routine use. These results can reasonably be extended to other questionnaires constructed according to classical test theory.
Maximum Likelihood Learning of Conditional MTE Distributions
DEFF Research Database (Denmark)
Langseth, Helge; Nielsen, Thomas Dyhre; Rumí, Rafael
2009-01-01
We describe a procedure for inducing conditional densities within the mixtures of truncated exponentials (MTE) framework. We analyse possible conditional MTE speciﬁcations and propose a model selection scheme, based on the BIC score, for partitioning the domain of the conditioning variables....... Finally, experimental results demonstrate the applicability of the learning procedure as well as the expressive power of the conditional MTE distribution....
Genome-wide comparative analysis of phylogenetic trees: the prokaryotic forest of life.
Puigbò, Pere; Wolf, Yuri I; Koonin, Eugene V
2012-01-01
Genome-wide comparison of phylogenetic trees is becoming an increasingly common approach in evolutionary genomics, and a variety of approaches for such comparison have been developed. In this article, we present several methods for comparative analysis of large numbers of phylogenetic trees. To compare phylogenetic trees taking into account the bootstrap support for each internal branch, the Boot-Split Distance (BSD) method is introduced as an extension of the previously developed Split Distance method for tree comparison. The BSD method implements the straightforward idea that comparison of phylogenetic trees can be made more robust by treating tree splits differentially depending on the bootstrap support. Approaches are also introduced for detecting tree-like and net-like evolutionary trends in the phylogenetic Forest of Life (FOL), i.e., the entirety of the phylogenetic trees for conserved genes of prokaryotes. The principal method employed for this purpose includes mapping quartets of species onto trees to calculate the support of each quartet topology and so to quantify the tree and net contributions to the distances between species. We describe the application of these methods to analyze the FOL and the results obtained with these methods. These results support the concept of the Tree of Life (TOL) as a central evolutionary trend in the FOL as opposed to the traditional view of the TOL as a "species tree."
Multiple sequence alignment accuracy and phylogenetic inference.
Ogden, T Heath; Rosenberg, Michael S
2006-04-01
Phylogenies are often thought to be more dependent upon the specifics of the sequence alignment rather than on the method of reconstruction. Simulation of sequences containing insertion and deletion events was performed in order to determine the role that alignment accuracy plays during phylogenetic inference. Data sets were simulated for pectinate, balanced, and random tree shapes under different conditions (ultrametric equal branch length, ultrametric random branch length, nonultrametric random branch length). Comparisons between hypothesized alignments and true alignments enabled determination of two measures of alignment accuracy, that of the total data set and that of individual branches. In general, our results indicate that as alignment error increases, topological accuracy decreases. This trend was much more pronounced for data sets derived from more pectinate topologies. In contrast, for balanced, ultrametric, equal branch length tree shapes, alignment inaccuracy had little average effect on tree reconstruction. These conclusions are based on average trends of many analyses under different conditions, and any one specific analysis, independent of the alignment accuracy, may recover very accurate or inaccurate topologies. Maximum likelihood and Bayesian, in general, outperformed neighbor joining and maximum parsimony in terms of tree reconstruction accuracy. Results also indicated that as the length of the branch and of the neighboring branches increase, alignment accuracy decreases, and the length of the neighboring branches is the major factor in topological accuracy. Thus, multiple-sequence alignment can be an important factor in downstream effects on topological reconstruction.
Directory of Open Access Journals (Sweden)
Daniel R Brooks
2007-12-01
Full Text Available We review Hennigian phylogenetics and compare it with Maximum parsimony, Maximum likelihood, and Bayesian likelihood approaches. All methods use the principle of parsimony in some form. Hennigian-based approaches are justified ontologically by the Darwinian concepts of phylogenetic conservatism and cohesion of homologies, embodied in Hennig's Auxiliary Principle, and applied by outgroup comparisons. Parsimony is used as an epistemological tool, applied a posteriori to choose the most robust hypothesis when there are conflicting data. Quantitative methods use parsimony as an ontological criterion: Maximum parsimony analysis uses unweighted parsimony, Maximum likelihood weight all characters equally that explain the data, and Bayesian likelihood relying on weighting each character partition that explains the data. Different results most often stem from insufficient data, in which case each quantitative method treats ambiguities differently. All quantitative methods produce networks. The networks can be converted into trees by rooting them. If the rooting is done in accordance with Hennig's Auxiliary Principle, using outgroup comparisons, the resulting tree can then be interpreted as a phylogenetic hypothesis. As the size of the data set increases, likelihood methods select models that allow an increasingly greater number of a priori possibilities, converging on the Hennigian perspective that nothing is prohibited a priori. Thus, all methods produce similar results, regardless of data type, especially when their networks are rooted using outgroups. Appeals to Popperian philosophy cannot justify any kind of phylogenetic analysis, because they argue from effect to cause rather than from cause to effect. Nor can particular methods be justified on the basis of statistical consistency, because all may be consistent or inconsistent depending on the data. If analyses using different types of data and/or different methods of phylogeny reconstruction do not
The Nothoaspis amazoniensis Complete Mitogenome: A Comparative and Phylogenetic Analysis
Directory of Open Access Journals (Sweden)
Paulo H. C. Lima
2018-03-01
Full Text Available The molecular biology era, together with morphology, molecular phylogenetics, bioinformatics, and high-throughput sequencing technologies, improved the taxonomic identification of Argasidae family members, especially when considering specimens at different development stages, which remains a great difficulty for acarologists. These tools could provide important data and insights on the history and evolutionary relationships of argasids. To better understand these relationships, we sequenced and assembled the first complete mitochondrial genome of Nothoaspis amazoniensis. We used phylogenomics to identify the evolutionary history of this species of tick, comparing the data obtained with 26 complete mitochondrial sequences available in biological databases. The results demonstrated the absence of genetic rearrangements, high similarity and identity, and a close organizational link between the mitogenomes of N. amazoniensis and other argasids analyzed. In addition, the mitogenome had a monophyletic cladistic taxonomic arrangement, encompassed by representatives of the Afrotropical and Neotropical regions, with specific parasitism in bats, which may be indicative of an evolutionary process of cospeciation between vectors and the host.
Phylogenetic Analysis of Canine Parvovirus VP2 Gene in China.
Yi, L; Tong, M; Cheng, Y; Song, W; Cheng, S
2016-04-01
In this study, a total of 37 samples (58.0%) were found through PCR assay to be positive for canine parvovirus (CPV) of 66 suspected faecal samples of dogs collected from various cities throughout China. Eight CPV isolates could be obtained in the CRFK cell line. The sequencing of the VP2 gene of CPV identified the predominant CPV strain as CPV-2a (Ser297Ala), with two CPV-2b (Ser297Ala). Sequence comparison revealed homologies of 99.3-99.9%, 99.9% and 99.3-99.7% within the CPV 2a isolates, within the CPV 2b isolates and between the CPV 2a and 2b isolates, respectively. In addition, several non-synonymous and synonymous mutations were also recorded. The phylogenetic tree revealed that most of the CPV strains from different areas in China were located in the formation of a large branch, which were grouped together along with the KU143-09 strain from Thailand and followed the same evolution. In this study, we provide an updated molecular characterization of CPV 2 circulation in China. © 2014 Blackwell Verlag GmbH.
Molecular characterization and phylogenetic analysis of Fasciola hepatica from Peru.
Ichikawa-Seki, Madoka; Ortiz, Pedro; Cabrera, Maria; Hobán, Cristian; Itagaki, Tadashi
2016-06-01
The causative agent of fasciolosis in South America is thought to be Fasciola hepatica. In this study, Fasciola flukes from Peru were analyzed to investigate their genetic structure and phylogenetic relationships with those from other countries. Fasciola flukes were collected from the three definitive host species: cattle, sheep, and pigs. They were identified as F. hepatica because mature sperms were observed in their seminal vesicles, and also they displayed Fh type, which has an identical fragment pattern to F. hepatica in the nuclear internal transcribed spacer 1. Eight haplotypes were obtained from the mitochondrial NADH dehydrogenase subunit 1 (nad1) sequences of Peruvian F. hepatica; however, no special difference in genetic structure was observed between the three host species. Its extremely low genetic diversity suggests that the Peruvian population was introduced from other regions. Nad1 haplotypes identical to those of Peruvian F. hepatica were detected in China, Uruguay, Italy, Iran, and Australia. Our results indicate that F. hepatica rapidly expanded its range due to human migration. Future studies are required to elucidate dispersal route of F. hepatica from Europe, its probable origin, to other areas, including Peru. Copyright © 2015. Published by Elsevier Ireland Ltd.
Phylogenetic analysis reveals multiple introductions of Cynodon species in Australia.
Jewell, M; Frère, C H; Harris-Shultz, K; Anderson, W F; Godwin, I D; Lambrides, C J
2012-11-01
The distinction between native and introduced flora within isolated land masses presents unique challenges. The geological and colonisation history of Australia, the world's largest island, makes it a valuable system for studying species endemism, introduction, and phylogeny. Using this strategy we investigated Australian cosmopolitan grasses belonging to the genus Cynodon. While it is believed that seven species of Cynodon are present in Australia, no genetic analyses have investigated the origin, diversity and phylogenetic history of Cynodon within Australia. To address this gap, 147 samples (92 from across Australia and 55 representing global distribution) were sequenced for a total of 3336bp of chloroplast DNA spanning six genes. Data showed the presence of at least six putatively introduced Cynodon species (C. transvaalensis, C. incompletus, C. hirsutus, C. radiatus, C. plectostachyus and C. dactylon) in Australia and suggested multiple recent introductions. C. plectostachyus, a species often confused with C. nlemfuensis, was not previously considered to be present in Australia. Most significantly, we identified two common haplotypes that formed a monophyletic clade diverging from previously identified Cynodon species. We hypothesise that these two haplotypes may represent a previously undescribed species of Cynodon. We provide further evidence that two Australian native species, Brachyachne tenella and B. convergens belong in the genus Cynodon and, therefore, argue for the taxonomic revision of the genus Cynodon. Copyright © 2012 Elsevier Inc. All rights reserved.
Molecular phylogenetic analysis of Fasciola flukes from eastern India.
Hayashi, Kei; Ichikawa-Seki, Madoka; Mohanta, Uday Kumar; Singh, T Shantikumar; Shoriki, Takuya; Sugiyama, Hiromu; Itagaki, Tadashi
2015-10-01
Fasciola flukes from eastern India were characterized on the basis of spermatogenesis status and nuclear ITS1. Both Fasciola gigantica and aspermic Fasciola flukes were detected in Imphal, Kohima, and Gantoku districts. The sequences of mitochondrial nad1 were analyzed to infer their phylogenetical relationship with neighboring countries. The haplotypes of aspermic Fasciola flukes were identical or showed a single nucleotide substitution compared to those from populations in the neighboring countries, corroborating the previous reports that categorized them in the same lineage. However, the prevalence of aspermic Fasciola flukes in eastern India was lower than those in the neighboring countries, suggesting that they have not dispersed throughout eastern India. In contrast, F. gigantica was predominant and well diversified, and the species was thought to be distributed in the area for a longer time than the aspermic Fasciola flukes. Fasciola gigantica populations from eastern India were categorized into two distinct haplogroups A and B. The level of their genetic diversity suggests that populations belonging to haplogroup A have dispersed from the west side of the Indian subcontinent to eastern India with the artificial movement of domestic cattle, Bos indicus, whereas populations belonging to haplogroup B might have spread from Myanmar to eastern India with domestic buffaloes, Bubalus bubalis. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
Sun, Ping; Clamp, John C; Xu, Dapeng
2010-07-01
Despite extensive previous morphological work, little agreement has been reached about phylogenetic relationships among peritrich ciliates, making it difficult to study the evolution of the group in a phylogenetic framework. In this study, the nucleotide characteristics and secondary structures of internal transcribed spacers 1 and 2 (ITS1 and ITS2) of 26 peritrich ciliates in 12 genera were analyzed. Information from secondary structures of ITS1 and ITS2 then was used to perform the first systematic study of ITS regions in peritrich ciliates, including one species of Rhabdostyla for which no sequence has been reported previously. Lengths of ITS1 and ITS2 sequences varied relatively little among taxa studied, but their G+C content was highly variable. General secondary structure models of ITS1 and ITS2 were proposed for peritrich ciliates and their reliability was assessed by compensatory base changes. The secondary structure of ITS1 contains three major helices in peritrich ciliates and deviations from this basic structure were found in all taxa examined. The core structure of peritrich ITS2 includes four helices, with helix III as the longest and containing a motif 5'-MAC versus GUK-3' at its apex as well as a YU-UY mismatch in helix II. In addition, the structural motifs of both ITS secondary structures were identified. Phylogenetic analyses using ITS data were performed by means of Bayesian inference, maximum likelihood and neighbor joining methods. Trees had a consistent branching pattern that included the following features: (1) Rhabdostyla always clustered with members of the family Vorticellidae, instead of members of the family Epistylididae, in which it is now classified on the basis of morphology. (2) The systematically questionable genus Ophrydium closely associated with Carchesium, forming a clearly defined, monophyletic group within the Vorticellidae. This supported the hypothesis derived from previous study based on small subunit rRNA gene sequences
Genotyping and phylogenetic analysis of Pneumocystis jirovecii isolates from India.
Gupta, Rashmi; Mirdha, Bijay Ranjan; Guleria, Randeep; Agarwal, Sanjay Kumar; Samantaray, Jyotish Chandra; Kumar, Lalit; Kabra, Sushil Kumar; Luthra, Kalpana; Sreenivas, Vishnubhatla
2010-08-01
Pneumocystis jirovecii is the cause of Pneumocystis pneumonia (PCP) in immuno-compromised individuals. The aim of this study was to describe the genotypes/haplotypes of P. jirovecii in immuno-compromised individuals with positive polymerase chain reaction (PCR) result for PCP. The typing was based on sequence polymorphism at internal transcribed spacer (ITS) regions of rRNA operon. Phylogenetic relationship between Indian and global haplotypes was also studied. Between January 2005 to October 2008, 43 patients were found to be positive for Pneumocystis using PCR targeting mitochondrial large subunit rRNA (mt LSU rRNA) and ITS region. Genotyping of all the positive samples was performed at the ITS locus by direct sequencing. Nine ITS1 alleles (all previously known) and 11 ITS2 alleles (nine previously defined and two new) were observed. A total of 19 ITS haplotypes, including five novel haplotypes (DEL1r, Edel2, Hr, Adel3 and SYD1a), were observed. The most prevalent type was SYD1g (16.3%), followed by types Ea (11.6%), Ec (9.3%), Eg (6.9%), DEL1r (6.9%), Ne (6.9%) and Ai (6.9%). To detect mixed infection, 30% of the positive isolates were cloned and 4-5 clones were sequenced from each specimen. Cloning and sequencing identified two more haplotypes in addition to the 19 types. Mixed infection was identified in 3 of the 13 cloned samples (23.1%). Upon construction of a haplotype network of 21 haplotypes, type Eg was identified as the most probable ancestral type. The present study is the first study that describes the haplotypes of P. jirovecii based on the ITS gene from India. The study suggests a high diversity of P. jirovecii haplotypes in the population. Copyright 2010 Elsevier B.V. All rights reserved.
Directory of Open Access Journals (Sweden)
Mariën J
2008-03-01
Full Text Available Abstract Background In recent years, several new hypotheses on phylogenetic relations among arthropods have been proposed on the basis of DNA sequences. One of the challenged hypotheses is the monophyly of hexapods. This discussion originated from analyses based on mitochondrial DNA datasets that, due to an unusual positioning of Collembola, suggested that the hexapod body plan evolved at least twice. Here, we re-evaluate the position of Collembola using ribosomal protein gene sequences. Results In total 48 ribosomal proteins were obtained for the collembolan Folsomia candida. These 48 sequences were aligned with sequence data on 35 other ecdysozoans. Each ribosomal protein gene was available for 25% to 86% of the taxa. However, the total sequence information was unequally distributed over the taxa and ranged between 4% and 100%. A concatenated dataset was constructed (5034 inferred amino acids in length, of which ~66% of the positions were filled. Phylogenetic tree reconstructions, using Maximum Likelihood, Maximum Parsimony, and Bayesian methods, resulted in a topology that supports monophyly of Hexapoda. Conclusion Although ribosomal proteins in general may not evolve independently, they once more appear highly valuable for phylogenetic reconstruction. Our analyses clearly suggest that Hexapoda is monophyletic. This underpins the inconsistency between nuclear and mitochondrial datasets when analyzing pancrustacean relationships. Caution is needed when applying mitochondrial markers in deep phylogeny.
Timmermans, M J T N; Roelofs, D; Mariën, J; van Straalen, N M
2008-03-12
In recent years, several new hypotheses on phylogenetic relations among arthropods have been proposed on the basis of DNA sequences. One of the challenged hypotheses is the monophyly of hexapods. This discussion originated from analyses based on mitochondrial DNA datasets that, due to an unusual positioning of Collembola, suggested that the hexapod body plan evolved at least twice. Here, we re-evaluate the position of Collembola using ribosomal protein gene sequences. In total 48 ribosomal proteins were obtained for the collembolan Folsomia candida. These 48 sequences were aligned with sequence data on 35 other ecdysozoans. Each ribosomal protein gene was available for 25% to 86% of the taxa. However, the total sequence information was unequally distributed over the taxa and ranged between 4% and 100%. A concatenated dataset was constructed (5034 inferred amino acids in length), of which ~66% of the positions were filled. Phylogenetic tree reconstructions, using Maximum Likelihood, Maximum Parsimony, and Bayesian methods, resulted in a topology that supports monophyly of Hexapoda. Although ribosomal proteins in general may not evolve independently, they once more appear highly valuable for phylogenetic reconstruction. Our analyses clearly suggest that Hexapoda is monophyletic. This underpins the inconsistency between nuclear and mitochondrial datasets when analyzing pancrustacean relationships. Caution is needed when applying mitochondrial markers in deep phylogeny.
Prediction and phylogenetic analysis of mammalian short interspersed elements (SINEs).
Rogozin, I B; Mayorov, V I; Lavrentieva, M V; Milanesi, L; Adkison, L R
2000-09-01
The presence of repetitive elements can create serious problems for sequence analysis, especially in the case of homology searches in nucleotide sequence databases. Repetitive elements should be treated carefully by using special programs and databases. In this paper, various aspects of SINE (short interspersed repetitive element) identification, analysis and evolution are discussed.
Caruso, Claudio; Dondo, Alessandro; Cerutti, Francesco; Masoero, Loretta; Rosamilia, Alfonso; Zoppi, Simona; D'Errico, Valeria; Grattarola, Carla; Acutis, Pier Luigi; Peletto, Simone
2014-07-01
We describe Aujeszky's disease in a female of red fox (Vulpes vulpes). Although wild boar (Sus scrofa) would be the expected source of infection, phylogenetic analysis suggested a domestic rather than a wild source of virus, underscoring the importance of biosecurity measures in pig farms to prevent contact with wild animals.
Sarcocystis nesbitti was first described by Mandour in 1969 from rhesus monkey muscle. Its definitive host remains unknown. 18SrRNA gene of Sarcocystis nesbitti was amplified, sequenced, and subjected to phylogenetic analysis. Among those congeners available for comparison, it shares closest affinit...
Soft-tissue anatomy of the extant hominoids: a review and phylogenetic analysis
Gibbs, S; Collard, M; Wood, B
2002-01-01
This paper reports the results of a literature search for information about the soft-tissue anatomy of the extant non-human hominoid genera, Pan, Gorilla, Pongo and Hylobates, together with the results of a phylogenetic analysis of these data plus comparable data for Homo. Information on the four extant non-human hominoid genera was located for 240 out of the 1783 soft-tissue structures listed in the Nomina Anatomica. Numerically these data are biased so that information about some systems (e.g. muscles) and some regions (e.g. the forelimb) are over-represented, whereas other systems and regions (e.g. the veins and the lymphatics of the vascular system, the head region) are either under-represented or not represented at all. Screening to ensure that the data were suitable for use in a phylogenetic analysis reduced the number of eligible soft-tissue structures to 171. These data, together with comparable data for modern humans, were converted into discontinuous character states suitable for phylogenetic analysis and then used to construct a taxon-by-character matrix. This matrix was used in two tests of the hypothesis that soft-tissue characters can be relied upon to reconstruct hominoid phylogenetic relationships. In the first, parsimony analysis was used to identify cladograms requiring the smallest number of character state changes. In the second, the phylogenetic bootstrap was used to determine the confidence intervals of the most parsimonious clades. The parsimony analysis yielded a single most parsimonious cladogram that matched the molecular cladogram. Similarly the bootstrap analysis yielded clades that were compatible with the molecular cladogram; a (Homo, Pan) clade was supported by 95% of the replicates, and a (Gorilla, Pan, Homo) clade by 96%. These are the first hominoid morphological data to provide statistically significant support for the clades favoured by the molecular evidence. PMID:11833653
Phylogenetic analysis of Tibetan mastiffs based on mitochondrial ...
Indian Academy of Sciences (India)
ZHANJUN REN
Tibetan mastiffs have close relationship with humans in guarding ... analysis of origin of dog by sequencing mtDNA has been reported. ... However, study on evolutionary relation- ..... tral test and mismatch distribution were explained by rapid.
Assessment of phylogenetic sensitivity for reconstructing HIV-1 epidemiological relationships.
Beloukas, Apostolos; Magiorkinis, Emmanouil; Magiorkinis, Gkikas; Zavitsanou, Asimina; Karamitros, Timokratis; Hatzakis, Angelos; Paraskevis, Dimitrios
2012-06-01
Phylogenetic analysis has been extensively used as a tool for the reconstruction of epidemiological relations for research or for forensic purposes. It was our objective to assess the sensitivity of different phylogenetic methods and various phylogenetic programs to reconstruct epidemiological links among HIV-1 infected patients that is the probability to reveal a true transmission relationship. Multiple datasets (90) were prepared consisting of HIV-1 sequences in protease (PR) and partial reverse transcriptase (RT) sampled from patients with documented epidemiological relationship (target population), and from unrelated individuals (control population) belonging to the same HIV-1 subtype as the target population. Each dataset varied regarding the number, the geographic origin and the transmission risk groups of the sequences among the control population. Phylogenetic trees were inferred by neighbor-joining (NJ), maximum likelihood heuristics (hML) and Bayesian methods. All clusters of sequences belonging to the target population were correctly reconstructed by NJ and Bayesian methods receiving high bootstrap and posterior probability (PP) support, respectively. On the other hand, TreePuzzle failed to reconstruct or provide significant support for several clusters; high puzzling step support was associated with the inclusion of control sequences from the same geographic area as the target population. In contrary, all clusters were correctly reconstructed by hML as implemented in PhyML 3.0 receiving high bootstrap support. We report that under the conditions of our study, hML using PhyML, NJ and Bayesian methods were the most sensitive for the reconstruction of epidemiological links mostly from sexually infected individuals. Copyright © 2012 Elsevier B.V. All rights reserved.
Bias in phylogenetic reconstruction of vertebrate rhodopsin sequences.
Chang, B S; Campbell, D L
2000-08-01
Two spurious nodes were found in phylogenetic analyses of vertebrate rhodopsin sequences in comparison with well-established vertebrate relationships. These spurious reconstructions were well supported in bootstrap analyses and occurred independently of the method of phylogenetic analysis used (parsimony, distance, or likelihood). Use of this data set of vertebrate rhodopsin sequences allowed us to exploit established vertebrate relationships, as well as the considerable amount known about the molecular evolution of this gene, in order to identify important factors contributing to the spurious reconstructions. Simulation studies using parametric bootstrapping indicate that it is unlikely that the spurious nodes in the parsimony analyses are due to long branches or other topological effects. Rather, they appear to be due to base compositional bias at third positions, codon bias, and convergent evolution at nucleotide positions encoding the hydrophobic residues isoleucine, leucine, and valine. LogDet distance methods, as well as maximum-likelihood methods which allow for nonstationary changes in base composition, reduce but do not entirely eliminate support for the spurious resolutions. Inclusion of five additional rhodopsin sequences in the phylogenetic analyses largely corrected one of the spurious reconstructions while leaving the other unaffected. The additional sequences not only were more proximal to the corrected node, but were also found to have intermediate levels of base composition and codon bias as compared with neighboring sequences on the tree. This study shows that the spurious reconstructions can be corrected either by excluding third positions, as well as those encoding the amino acids Ile, Val, and Leu (which may not be ideal, as these sites can contain useful phylogenetic signal for other parts of the tree), or by the addition of sequences that reduce problems associated with convergent evolution.
YOUSOFI, SUGOL; POUYANI, ESKANDAR RASTEGAR; HOJATI, VIDA
2015-01-01
We examined intraspecific relationships of the subspecies Pristurus rupestris iranicus from the northern Persian Gulf area (Hormozgan, Bushehr, and Sistan and Baluchestan provinces). Phylogenetic relationships among these samples were estimated based on the mitochondrial cytochrome b gene. We used three methods of phylogenetic tree reconstruction (maximum likelihood, maximum parsimony, and Bayesian inference). The sampled populations were divided into 5 clades but exhibit little genetic diver...
Phylogenetic analysis of the genus Anabaena based on PCR ...
African Journals Online (AJOL)
STORAGESEVER
2009-12-15
Dec 15, 2009 ... LTRR2. 5'- CTA TCA GGG ATT GAA AG-3'. Rasmussen and. Svenning, 1998 on a 1% agarose gel containing 0.5 X TBE (Tris Borate-EDTA) and. 0.5 µg/ml ethidium bromide. Data analysis. Fingerprints generated from different Anabaena species were compared and all bands were scored. The presence or ...
Construction of phylogenetic trees by kernel-based comparative analysis of metabolic networks.
Oh, S June; Joung, Je-Gun; Chang, Jeong-Ho; Zhang, Byoung-Tak
2006-06-06
To infer the tree of life requires knowledge of the common characteristics of each species descended from a common ancestor as the measuring criteria and a method to calculate the distance between the resulting values of each measure. Conventional phylogenetic analysis based on genomic sequences provides information about the genetic relationships between different organisms. In contrast, comparative analysis of metabolic pathways in different organisms can yield insights into their functional relationships under different physiological conditions. However, evaluating the similarities or differences between metabolic networks is a computationally challenging problem, and systematic methods of doing this are desirable. Here we introduce a graph-kernel method for computing the similarity between metabolic networks in polynomial time, and use it to profile metabolic pathways and to construct phylogenetic trees. To compare the structures of metabolic networks in organisms, we adopted the exponential graph kernel, which is a kernel-based approach with a labeled graph that includes a label matrix and an adjacency matrix. To construct the phylogenetic trees, we used an unweighted pair-group method with arithmetic mean, i.e., a hierarchical clustering algorithm. We applied the kernel-based network profiling method in a comparative analysis of nine carbohydrate metabolic networks from 81 biological species encompassing Archaea, Eukaryota, and Eubacteria. The resulting phylogenetic hierarchies generally support the tripartite scheme of three domains rather than the two domains of prokaryotes and eukaryotes. By combining the kernel machines with metabolic information, the method infers the context of biosphere development that covers physiological events required for adaptation by genetic reconstruction. The results show that one may obtain a global view of the tree of life by comparing the metabolic pathway structures using meta-level information rather than sequence
Construction of phylogenetic trees by kernel-based comparative analysis of metabolic networks
Directory of Open Access Journals (Sweden)
Chang Jeong-Ho
2006-06-01
Full Text Available Abstract Background To infer the tree of life requires knowledge of the common characteristics of each species descended from a common ancestor as the measuring criteria and a method to calculate the distance between the resulting values of each measure. Conventional phylogenetic analysis based on genomic sequences provides information about the genetic relationships between different organisms. In contrast, comparative analysis of metabolic pathways in different organisms can yield insights into their functional relationships under different physiological conditions. However, evaluating the similarities or differences between metabolic networks is a computationally challenging problem, and systematic methods of doing this are desirable. Here we introduce a graph-kernel method for computing the similarity between metabolic networks in polynomial time, and use it to profile metabolic pathways and to construct phylogenetic trees. Results To compare the structures of metabolic networks in organisms, we adopted the exponential graph kernel, which is a kernel-based approach with a labeled graph that includes a label matrix and an adjacency matrix. To construct the phylogenetic trees, we used an unweighted pair-group method with arithmetic mean, i.e., a hierarchical clustering algorithm. We applied the kernel-based network profiling method in a comparative analysis of nine carbohydrate metabolic networks from 81 biological species encompassing Archaea, Eukaryota, and Eubacteria. The resulting phylogenetic hierarchies generally support the tripartite scheme of three domains rather than the two domains of prokaryotes and eukaryotes. Conclusion By combining the kernel machines with metabolic information, the method infers the context of biosphere development that covers physiological events required for adaptation by genetic reconstruction. The results show that one may obtain a global view of the tree of life by comparing the metabolic pathway
Phylogenetic analysis of Hungarian goose parvovirus isolates and vaccine strains.
Tatár-Kis, Tímea; Mató, Tamás; Markos, Béla; Palya, Vilmos
2004-08-01
Polymerase chain reaction and sequencing were used to analyse goose parvovirus field isolates and vaccine strains. Two fragments of the genome were amplified. Fragment "A" represents a region of VP3 gene, while fragment "B" represents a region upstream of the VP3 gene, encompassing part of the VP1 gene. In the region of fragment "A" the deduced amino acid sequence of the strains was identical, therefore differentiation among strains could be done only at the nucleotide level, which resulted in the formation of three groups: Hungarian, West-European and Asian strains. In the region of fragment "B", separation of groups could be done by both nucleotide and deduced amino acid sequence level. The nucleotide sequences resulted in the same groups as for fragment "A" but with a different clustering pattern among the Hungarian strains. Within the "Hungarian" group most of the recent field isolates fell into one cluster, very closely related or identical to each other, indicating a very slow evolutionary change. The attenuated strains and field isolates from 1979/80 formed a separate cluster. When vaccine strains and field isolates were compared, two specific amino acid differences were found that can be considered as possible markers for vaccinal strains. Sequence analysis of fragment "B" seems to be a suitable method for differentiation of attenuated vaccine strains from virulent strains. Copyright 2004 Houghton Trust Ltd
Reeb, Valérie; Lutzoni, François; Roux, Claude
2004-09-01
Despite the recent progress in molecular phylogenetics, many of the deepest relationships among the main lineages of the largest fungal phylum, Ascomycota, remain unresolved. To increase both resolution and support on a large-scale phylogeny of lichenized and non-lichenized ascomycetes, we combined the protein coding-gene RPB2 with the traditionally used nuclear ribosomal genes SSU and LSU. Our analyses resulted in the naming of the new subclasses Acarosporomycetidae and Ostropomycetidae, and the new class Lichinomycetes, as well as the establishment of the phylogenetic placement and novel circumscription of the lichen-forming fungi family Acarosporaceae. The delimitation of this family has been problematic over the past century, because its main diagnostic feature, true polyspory (numerous spores issued from multiple post-meiosis mitoses) with over 100 spores per ascus, is probably not restricted to the Acarosporaceae. This observation was confirmed by our reconstruction of the origin and evolution of this form of true polyspory using maximum likelihood as the optimality criterion. The various phylogenetic analyses carried out on our data sets allowed us to conclude that: (1) the inclusion of phylogenetic signal from ambiguously aligned regions into the maximum parsimony analyses proved advantageous in reconstructing phylogeny; however, when more data become available, Bayesian analysis using different models of evolution is likely to be more efficient; (2) neighbor-joining bootstrap proportions seem to be more appropriate in detecting topological conflict between data partitions of large-scale phylogenies than posterior probabilities; and (3) Bayesian bootstrap proportion provides a compromise between posterior probability outcomes (i.e., higher accuracy, but with a higher number of significantly supported wrong internodes) vs. maximum likelihood bootstrap proportion outcomes (i.e., lower accuracy, with a lower number of significantly supported wrong internodes).
Liu, Jingjing; Wu, Weixiang; Chen, Chongjun; Sun, Faqian; Chen, Yingxu
2011-09-01
In order to obtain insight into the prokaryotic diversity and community in leachate sediment, a culture-independent DNA-based molecular phylogenetic approach was performed with archaeal and bacterial 16S rRNA gene clone libraries derived from leachate sediment of an aged landfill. A total of 59 archaeal and 283 bacterial rDNA phylotypes were identified in 425 archaeal and 375 bacterial analyzed clones. All archaeal clones distributed within two archaeal phyla of the Euryarchaeota and Crenarchaeota, and well-defined methanogen lineages, especially Methanosaeta spp., are the most numerically dominant species of the archaeal community. Phylogenetic analysis of the bacterial library revealed a variety of pollutant-degrading and biotransforming microorganisms, including 18 distinct phyla. A substantial fraction of bacterial clones showed low levels of similarity with any previously documented sequences and thus might be taxonomically new. Chemical characteristics and phylogenetic inferences indicated that (1) ammonium-utilizing bacteria might form consortia to alleviate or avoid the negative influence of high ammonium concentration on other microorganisms, and (2) members of the Crenarchaeota found in the sediment might be involved in ammonium oxidation. This study is the first to report the composition of the microbial assemblages and phylogenetic characteristics of prokaryotic populations extant in leachate sediment. Additional work on microbial activity and contaminant biodegradation remains to be explored.
Atibalentja, N; Noel, G R; Domier, L L
2000-03-01
A 1341 bp sequence of the 16S rDNA of an undescribed species of Pasteuria that parasitizes the soybean cyst nematode, Heterodera glycines, was determined and then compared with a homologous sequence of Pasteuria ramosa, a parasite of cladoceran water fleas of the family Daphnidae. The two Pasteuria sequences, which diverged from each other by a dissimilarity index of 7%, also were compared with the 16S rDNA sequences of 30 other bacterial species to determine the phylogenetic position of the genus Pasteuria among the Gram-positive eubacteria. Phylogenetic analyses using maximum-likelihood, maximum-parsimony and neighbour-joining methods showed that the Heterodera glycines-infecting Pasteuria and its sister species, P. ramosa, form a distinct line of descent within the Alicyclobacillus group of the Bacillaceae. These results are consistent with the view that the genus Pasteuria is a deeply rooted member of the Clostridium-Bacillus-Streptococcus branch of the Gram-positive eubacteria, neither related to the actinomycetes nor closely related to true endospore-forming bacteria.
Speiser, Daniel I; Pankey, M Sabrina; Zaharoff, Alexander K; Battelle, Barbara A; Bracken-Grissom, Heather D; Breinholt, Jesse W; Bybee, Seth M; Cronin, Thomas W; Garm, Anders; Lindgren, Annie R; Patel, Nipam H; Porter, Megan L; Protas, Meredith E; Rivera, Ajna S; Serb, Jeanne M; Zigler, Kirk S; Crandall, Keith A; Oakley, Todd H
2014-11-19
Tools for high throughput sequencing and de novo assembly make the analysis of transcriptomes (i.e. the suite of genes expressed in a tissue) feasible for almost any organism. Yet a challenge for biologists is that it can be difficult to assign identities to gene sequences, especially from non-model organisms. Phylogenetic analyses are one useful method for assigning identities to these sequences, but such methods tend to be time-consuming because of the need to re-calculate trees for every gene of interest and each time a new data set is analyzed. In response, we employed existing tools for phylogenetic analysis to produce a computationally efficient, tree-based approach for annotating transcriptomes or new genomes that we term Phylogenetically-Informed Annotation (PIA), which places uncharacterized genes into pre-calculated phylogenies of gene families. We generated maximum likelihood trees for 109 genes from a Light Interaction Toolkit (LIT), a collection of genes that underlie the function or development of light-interacting structures in metazoans. To do so, we searched protein sequences predicted from 29 fully-sequenced genomes and built trees using tools for phylogenetic analysis in the Osiris package of Galaxy (an open-source workflow management system). Next, to rapidly annotate transcriptomes from organisms that lack sequenced genomes, we repurposed a maximum likelihood-based Evolutionary Placement Algorithm (implemented in RAxML) to place sequences of potential LIT genes on to our pre-calculated gene trees. Finally, we implemented PIA in Galaxy and used it to search for LIT genes in 28 newly-sequenced transcriptomes from the light-interacting tissues of a range of cephalopod mollusks, arthropods, and cubozoan cnidarians. Our new trees for LIT genes are available on the Bitbucket public repository ( http://bitbucket.org/osiris_phylogenetics/pia/ ) and we demonstrate PIA on a publicly-accessible web server ( http://galaxy-dev.cnsi.ucsb.edu/pia/ ). Our new
Directory of Open Access Journals (Sweden)
Dwueng-Chwuan Jhwueng
Full Text Available Phylogenetic comparative methods (PCMs have been applied widely in analyzing data from related species but their fit to data is rarely assessed.Can one determine whether any particular comparative method is typically more appropriate than others by examining comparative data sets?I conducted a meta-analysis of 122 phylogenetic data sets found by searching all papers in JEB, Blackwell Synergy and JSTOR published in 2002-2005 for the purpose of assessing the fit of PCMs. The number of species in these data sets ranged from 9 to 117.I used the Akaike information criterion to compare PCMs, and then fit PCMs to bivariate data sets through REML analysis. Correlation estimates between two traits and bootstrapped confidence intervals of correlations from each model were also compared.For phylogenies of less than one hundred taxa, the Independent Contrast method and the independent, non-phylogenetic models provide the best fit.For bivariate analysis, correlations from different PCMs are qualitatively similar so that actual correlations from real data seem to be robust to the PCM chosen for the analysis. Therefore, researchers might apply the PCM they believe best describes the evolutionary mechanisms underlying their data.
Directory of Open Access Journals (Sweden)
Panca J. Santoso
2013-07-01
Full Text Available Twenty seven species of Durio have been identified in Sabah and Sarawak, Malaysia, but their relationships have not been studied. This study was conducted to analyse phylogenetic relationships amongst 10 Durio species in Malaysia using PCR-RFLP on two chloroplast DNA genes, i.e. ndhC-trnV and rbcL. DNAs were extracted from young leaves of 11 accessions from 10 Durio species collected from the Tenom Agriculture Research Station, Sabah, and University Agriculture Park, Universiti Putra Malaysia. Two pairs of oligonucleotide primers, N1-N2 and rbcL1-rbcL2, were used to flank the target regions ndhC-trnV and rbcL. Eight restriction enzymes, HindIII, BsuRI, PstI, TaqI, MspI, SmaI, BshNI, and EcoR130I, were used to digest the amplicons. Based on the results of PCR-RFLP on ndhC-trnV gene, the 10 Durio species were grouped into five distinct clusters, and the accessions generally showed high variations. However, based on the results of PCR-RFLP on the rbcL gene, the species were grouped into three distinct clusters, and generally showed low variations. This means that ndhC-trnV gene is more reliable for phylogenetic analysis in lower taxonomic level of Durio species or for diversity analysis, while rbcL gene is reliable marker for phylogenetic analysis at higher taxonomic level. PCR-RFLP on the ndhC-trnV and rbcL genes could therefore be considered as useful markers to phylogenetic analysis amongst Durio species. These finding might be used for further molecular marker assisted in Durio breeding program.
Izedin Goga; Kristaq Berxholi; Beqe Hulaj; Driton Sylejmani; Boris Yakobson; Yehuda Stram
2014-01-01
Three serum samples positive in Antigen ELISA BVDV have been tested to characterise genetic diversity of bovine viral diarrhea virus (BVDV) in Kosovo. Samples were obtained in 2011 from heifers and were amplified by reverse transcription-polymerase chain reaction, sequenced and analysed by computer-assisted phylogenetic analysis. Amplified products and nucleotide sequence showed that all 3 isolates belonged to BVDV 1 genotype and 1b sub genotype. These results enrich the extant knowledge of B...
Phylogenetic Analysis of Dengue Virus in Bangkalan, Madura Island, East Java Province, Indonesia.
Sucipto, Teguh Hari; Kotaki, Tomohiro; Mulyatno, Kris Cahyo; Churrotin, Siti; Labiqah, Amaliah; Soegijanto, Soegeng; Kameoka, Masanori
2018-01-01
Dengue virus (DENV) infection is a major health issue in tropical and subtropical areas. Indonesia is one of the biggest dengue endemic countries in the world. In the present study, the phylogenetic analysis of DENV in Bangkalan, Madura Island, Indonesia, was performed in order to obtain a clearer understanding of its dynamics in this country. A total of 359 blood samples from dengue-suspected patients were collected between 2012 and 2014. Serotyping was conducted using a multiplex Reverse Transcriptase-Polymerase Chain Reaction and a phylogenetic analysis of E gene sequences was performed using the Bayesian Markov chain Monte Carlo (MCMC) method. 17 out of 359 blood samples (4.7%) were positive for the isolation of DENV. Serotyping and the phylogenetic analysis revealed the predominance of DENV-1 genotype I (9/17, 52.9%), followed by DENV-2 Cosmopolitan type (7/17, 41.2%) and DENV-3 genotype I (1/17, 5.9%) . DENV-4 was not isolated. The Madura Island isolates showed high nucleotide similarity to other Indonesian isolates, indicating frequent virus circulation in Indonesia. The results of the present study highlight the importance of continuous viral surveillance in dengue endemic areas in order to obtain a clearer understanding of the dynamics of DENV in Indonesia.
Whole genome sequence phylogenetic analysis of four Mexican rabies viruses isolated from cattle.
Bárcenas-Reyes, I; Loza-Rubio, E; Cantó-Alarcón, G J; Luna-Cozar, J; Enríquez-Vázquez, A; Barrón-Rodríguez, R J; Milián-Suazo, F
2017-08-01
Phylogenetic analysis of the rabies virus in molecular epidemiology has been traditionally performed on partial sequences of the genome, such as the N, G, and P genes; however, that approach raises concerns about the discriminatory power compared to whole genome sequencing. In this study we characterized four strains of the rabies virus isolated from cattle in Querétaro, Mexico by comparing the whole genome sequence to that of strains from the American, European and Asian continents. Four cattle brain samples positive to rabies and characterized as AgV11, genotype 1, were used in the study. A cDNA sequence was generated by reverse transcription PCR (RT-PCR) using oligo dT. cDNA samples were sequenced in an Illumina NextSeq 500 platform. The phylogenetic analysis was performed with MEGA 6.0. Minimum evolution phylogenetic trees were constructed with the Neighbor-Joining method and bootstrapped with 1000 replicates. Three large and seven small clusters were formed with the 26 sequences used. The largest cluster grouped strains from different species in South America: Brazil, and the French Guyana. The second cluster grouped five strains from Mexico. A Mexican strain reported in a different study was highly related to our four strains, suggesting common source of infection. The phylogenetic analysis shows that the type of host is different for the different regions in the American Continent; rabies is more related to bats. It was concluded that the rabies virus in central Mexico is genetically stable and that it is transmitted by the vampire bat Desmodus rotundus. Copyright © 2017 Elsevier Ltd. All rights reserved.
Malyarchuk, Boris; Derenko, Miroslava; Mikhailova, Ekaterina; Denisova, Galina
2014-02-01
Phylogenetic and statistical analyses of DNA sequences of two genes, cytochrome oxidase subunit 1 (cox 1) of the mitochondrial DNA and 18S subunit of the nuclear ribosomal RNA (18S rRNA), was used to characterize Neoechinorhynchus species from fishes collected in different localities of North-East Asia. It has been found that four species can be clearly recognized using molecular markers-Neoechinorhynchus tumidus, Neoechinorhynchus beringianus, Neoechinorhynchus simansularis and Neoechinorhynchus salmonis. 18S sequences ascribed to Neoechinorhynchus crassus specimens from North-East Asia were identical to those of N. tumidus, but differed substantially from North American N. crassus. We renamed North-East Asian N. crassus specimens to N. sp., although the possibility that they represent a subspecies of N. tumidus cannot be excluded, taking into account a relatively small distance between cox 1 sequences of North-East Asian specimens of N. crassus and N. tumidus. Maximum likelihood, maximum parsimony and Bayesian inference analyses were performed for phylogeny reconstruction. All the phylogenetic trees showed that North-East Asian species of Neoechinorhynchus analyzed in this study represent independent clades, with the only exception of N. tumidus and N. sp. for 18S data. Phylogenetic analysis has shown that the majority of species sampled (N. tumidus+N. sp., N. simansularis and N. beringianus) are probably very closely related, while N. salmonis occupies separate position in the trees, possibly indicating a North American origin of this species. © 2013.
Phylogenetic analysis of several Thermus strains from Rehai of Tengchong, Yunnan, China.
Lin, Lianbing; Zhang, Jie; Wei, Yunlin; Chen, Chaoyin; Peng, Qian
2005-10-01
Several Thermus strains were isolated from 10 hot springs of the Rehai geothermal area in Tengchong, Yunnan province. The diversity of Thermus strains was examined by sequencing the 16S rRNA genes and comparing their sequences. Phylogenetic analysis showed that the 16S rDNA sequences from the Rehai geothermal isolates form four branches in the phylogenetic tree and had greater than 95.9% similarity in the phylogroup. Secondary structure comparison also indicated that the 16S rRNA from the Rehai geothermal isolates have unique secondary structure characteristics in helix 6, helix 9, and helix 10 (reference to Escherichia coli). This research is the first attempt to reveal the diversity of Thermus strains that are distributed in the Rehai geothermal area.
Yamasaki, Masahiro; Tsuboi, Yoshihiro; Taniyama, Yusuke; Uchida, Naohiro; Sato, Reeko; Nakamura, Kensuke; Ohta, Hiroshi; Takiguchi, Mitsuyoshi
2016-09-01
The Babesia gibsoni heat shock protein 90 (BgHSP90) gene was cloned and sequenced. The length of the gene was 2,610 bp with two introns. This gene was amplified from cDNA corresponding to full length coding sequence (CDS) with an open reading frame of 2,148 bp. A phylogenetic analysis of the CDS of HSP90 gene showed that B. gibsoni was most closely related to B. bovis and Babesia sp. BQ1/Lintan and lies within a phylogenetic cluster of protozoa. Moreover, mRNA transcription profile for BgHSP90 exposed to high temperature were examined by quantitative real-time reverse transcription-polymerase chain reaction. BgHSP90 levels were elevated when the parasites were incubated at 43°C for 1 hr.
DNA Translator and Aligner: HyperCard utilities to aid phylogenetic analysis of molecules.
Eernisse, D J
1992-04-01
DNA Translator and Aligner are molecular phylogenetics HyperCard stacks for Macintosh computers. They manipulate sequence data to provide graphical gene mapping, conversions, translations and manual multiple-sequence alignment editing. DNA Translator is able to convert documented GenBank or EMBL documented sequences into linearized, rescalable gene maps whose gene sequences are extractable by clicking on the corresponding map button or by selection from a scrolling list. Provided gene maps, complete with extractable sequences, consist of nine metazoan, one yeast, and one ciliate mitochondrial DNAs and three green plant chloroplast DNAs. Single or multiple sequences can be manipulated to aid in phylogenetic analysis. Sequences can be translated between nucleic acids and proteins in either direction with flexible support of alternate genetic codes and ambiguous nucleotide symbols. Multiple aligned sequence output from diverse sources can be converted to Nexus, Hennig86 or PHYLIP format for subsequent phylogenetic analysis. Input or output alignments can be examined with Aligner, a convenient accessory stack included in the DNA Translator package. Aligner is an editor for the manual alignment of up to 100 sequences that toggles between display of matched characters and normal unmatched sequences. DNA Translator also generates graphic displays of amino acid coding and codon usage frequency relative to all other, or only synonymous, codons for approximately 70 select organism-organelle combinations. Codon usage data is compatible with spreadsheet or UWGCG formats for incorporation of additional molecules of interest. The complete package is available via anonymous ftp and is free for non-commercial uses.
Pan, Ting Shuang; Nie, Pin
2013-07-01
Acanthocephalans are a small group of obligate endoparasites. They and rotifers are recently placed in a group called Syndermata. However, phylogenetic relationships within classes of acanthocephalans, and between them and rotifers, have not been well resolved, possibly due to the lack of molecular data suitable for such analysis. In this study, the mitochondrial (mt) genome was sequenced from Pallisentis celatus (Van Cleave, 1928), an acanthocephalan in the class Eoacanthocephala, an intestinal parasite of rice-field eel, Monopterus albus (Zuiew, 1793), in China. The complete mt genome sequence of P. celatus is 13 855 bp long, containing 36 genes including 12 protein-coding genes, 22 transfer RNAs (tRNAs) and 2 ribosomal RNAs (rRNAs) as reported for other acanthocephalan species. All genes are encoded on the same strand and in the same direction. Phylogenetic analysis indicated that acanthocephalans are closely related with a clade containing bdelloids, which then correlates with the clade containing monogononts. The class Eoacanthocephala, containing P. celatus and Paratenuisentis ambiguus (Van Cleave, 1921) was closely related to the Palaeacanthocephala. It is thus indicated that acanthocephalans may be just clustered among groups of rotifers. However, the resolving of phylogenetic relationship among all classes of acanthocephalans and between them and rotifers may require further sampling and more molecular data.
Kuleshov, K V; Markelov, M L; Dedkov, V G; Vodop'ianov, A S; Kermanov, A V; Pisanov, R V; Kruglikov, V D; Mazrukho, A B; Maleev, V V; Shipulin, G A
2013-01-01
Determination of origin of 2 Vibrio cholerae strains isolated on the territory of Rostov region by using full genome sequencing data. Toxigenic strain 2011 EL- 301 V. cholerae 01 El Tor Inaba No. 301 (ctxAB+, tcpA+) and nontoxigenic strain V. cholerae O1 Ogawa P- 18785 (ctxAB-, tcpA+) were studied. Sequencing was carried out on the MiSeq platform. Phylogenetic analysis of the genomes obtained was carried out based on comparison of conservative part of the studied and 54 previously sequenced genomes. 2011EL-301 strain genome was presented by 164 contigs with an average coverage of 100, N50 parameter was 132 kb, for strain P- 18785 - 159 contigs with a coverage of69, N50 - 83 kb. The contigs obtained for strain 2011 EL-301 were deposited in DDBJ/EMBL/GenBank databases with access code AJFN02000000, for strain P-18785 - ANHS00000000. 716 protein-coding orthologous genes were detected. Based on phylogenetic analysis strain P- 18785 belongs to PG-1 subgroup (a group of predecessor strains of the 7th pandemic). Strain 2011EL-301 belongs to groups of strains of the 7th pandemic and is included into the cluster with later isolates that are associated with cases of cholera in South Africa and cases of import of cholera to the USA from Pakistan. The data obtained allows to establish phylogenetic connections with V cholerae strains isolated earlier.
galaxie--CGI scripts for sequence identification through automated phylogenetic analysis.
Nilsson, R Henrik; Larsson, Karl-Henrik; Ursing, Björn M
2004-06-12
The prevalent use of similarity searches like BLAST to identify sequences and species implicitly assumes the reference database to be of extensive sequence sampling. This is often not the case, restraining the correctness of the outcome as a basis for sequence identification. Phylogenetic inference outperforms similarity searches in retrieving correct phylogenies and consequently sequence identities, and a project was initiated to design a freely available script package for sequence identification through automated Web-based phylogenetic analysis. Three CGI scripts were designed to facilitate qualified sequence identification from a Web interface. Query sequences are aligned to pre-made alignments or to alignments made by ClustalW with entries retrieved from a BLAST search. The subsequent phylogenetic analysis is based on the PHYLIP package for inferring neighbor-joining and parsimony trees. The scripts are highly configurable. A service installation and a version for local use are found at http://andromeda.botany.gu.se/galaxiewelcome.html and http://galaxie.cgb.ki.se
Phylogenetic and Diversity Analysis of Dactylis glomerata Subspecies Using SSR and IT-ISJ Markers.
Yan, Defei; Zhao, Xinxin; Cheng, Yajuan; Ma, Xiao; Huang, Linkai; Zhang, Xinquan
2016-10-31
The genus Dactylis , an important forage crop, has a wide geographical distribution in temperate regions. While this genus is thought to include a single species, Dactylis glomerata , this species encompasses many subspecies whose relationships have not been fully characterized. In this study, the genetic diversity and phylogenetic relationships of nine representative Dactylis subspecies were examined using SSR and IT-ISJ markers. In total, 21 pairs of SSR primers and 15 pairs of IT-ISJ primers were used to amplify 295 polymorphic bands with polymorphic rates of 100%. The average polymorphic information contents (PICs) of SSR and IT-ISJ markers were 0.909 and 0.780, respectively. The combined data of the two markers indicated a high level of genetic diversity among the nine D. glomerata subspecies, with a Nei's gene diversity index value of 0.283 and Shannon's diversity of 0.448. Preliminarily phylogenetic analysis results revealed that the 20 accessions could be divided into three groups (A, B, C). Furthermore, they could be divided into five clusters, which is similar to the structure analysis with K = 5. Phylogenetic placement in these three groups may be related to the distribution ranges and the climate types of the subspecies in each group. Group A contained eight accessions of four subspecies, originating from the west Mediterranean, while Group B contained seven accessions of three subspecies, originating from the east Mediterranean.
Expressed sequence tags as a tool for phylogenetic analysis of placental mammal evolution.
Directory of Open Access Journals (Sweden)
Morgan Kullberg
Full Text Available BACKGROUND: We investigate the usefulness of expressed sequence tags, ESTs, for establishing divergences within the tree of placental mammals. This is done on the example of the established relationships among primates (human, lagomorphs (rabbit, rodents (rat and mouse, artiodactyls (cow, carnivorans (dog and proboscideans (elephant. METHODOLOGY/PRINCIPAL FINDINGS: We have produced 2000 ESTs (1.2 mega bases from a marsupial mouse and characterized the data for their use in phylogenetic analysis. The sequences were used to identify putative orthologous sequences from whole genome projects. Although most ESTs stem from single sequence reads, the frequency of potential sequencing errors was found to be lower than allelic variation. Most of the sequences represented slowly evolving housekeeping-type genes, with an average amino acid distance of 6.6% between human and mouse. Positive Darwinian selection was identified at only a few single sites. Phylogenetic analyses of the EST data yielded trees that were consistent with those established from whole genome projects. CONCLUSIONS: The general quality of EST sequences and the general absence of positive selection in these sequences make ESTs an attractive tool for phylogenetic analysis. The EST approach allows, at reasonable costs, a fast extension of data sampling from species outside the genome projects.
Phylogenetic inference with weighted codon evolutionary distances.
Criscuolo, Alexis; Michel, Christian J
2009-04-01
We develop a new approach to estimate a matrix of pairwise evolutionary distances from a codon-based alignment based on a codon evolutionary model. The method first computes a standard distance matrix for each of the three codon positions. Then these three distance matrices are weighted according to an estimate of the global evolutionary rate of each codon position and averaged into a unique distance matrix. Using a large set of both real and simulated codon-based alignments of nucleotide sequences, we show that this approach leads to distance matrices that have a significantly better treelikeness compared to those obtained by standard nucleotide evolutionary distances. We also propose an alternative weighting to eliminate the part of the noise often associated with some codon positions, particularly the third position, which is known to induce a fast evolutionary rate. Simulation results show that fast distance-based tree reconstruction algorithms on distance matrices based on this codon position weighting can lead to phylogenetic trees that are at least as accurate as, if not better, than those inferred by maximum likelihood. Finally, a well-known multigene dataset composed of eight yeast species and 106 codon-based alignments is reanalyzed and shows that our codon evolutionary distances allow building a phylogenetic tree which is similar to those obtained by non-distance-based methods (e.g., maximum parsimony and maximum likelihood) and also significantly improved compared to standard nucleotide evolutionary distance estimates.
She, C-W; Jiang, X-H; Ou, L-J; Liu, J; Long, K-L; Zhang, L-H; Duan, W-T; Zhao, W; Hu, J-C
2015-01-01
The genomic organisation of the seven cultivated Vigna species, V. unguiculata, V. subterranea, V. angularis, V. umbellata, V. radiata, V. mungo and V. aconitifolia, was determined using sequential combined PI and DAPI (CPD) staining and dual-colour fluorescence in situ hybridisation (FISH) with 5S and 45S rDNA probes. For phylogenetic analyses, comparative genomic in situ hybridisation (cGISH) onto somatic chromosomes and sequence analysis of the internal transcribed spacer (ITS) of 45S rDNA were used. Quantitative karyotypes were established using chromosome measurements, fluorochrome bands and rDNA FISH signals. All species had symmetrical karyotypes composed of only metacentric or metacentric and submetacentric chromosomes. Distinct heterochromatin differentiation was revealed by CPD staining and DAPI counterstaining after FISH. The rDNA sites among all species differed in their number, location and size. cGISH of V. umbellata genomic DNA to the chromosomes of all species produced strong signals in all centromeric regions of V. umbellata and V. angularis, weak signals in all pericentromeric regions of V. aconitifolia, and CPD-banded proximal regions of V. mungo var. mungo. Molecular phylogenetic trees showed that V. angularis and V. umbellata were the closest relatives, and V. mungo and V. aconitifolia were relatively closely related; these species formed a group that was separated from another group comprising V. radiata, V. unguiculata ssp. sesquipedalis and V. subterranea. This result was consistent with the phylogenetic relationships inferred from the heterochromatin and cGISH patterns; thus, fluorochrome banding and cGISH are efficient tools for the phylogenetic analysis of Vigna species. © 2014 German Botanical Society and The Royal Botanical Society of the Netherlands.
Abdel-Shafi, Iman R; Shoieb, Eman Y; Attia, Samar S; Rubio, José M; Ta-Tang, Thuy-Huong; El-Badry, Ayman A
2017-03-01
Lymphatic filariasis (LF) is a serious vector-borne health problem, and Wuchereria bancrofti (W.b) is the major cause of LF worldwide and is focally endemic in Egypt. Identification of filarial infection using traditional morphologic and immunological criteria can be difficult and lead to misdiagnosis. The aim of the present study was molecular detection of W.b in residents in endemic areas in Egypt, sequence variance analysis, and phylogenetic analysis of W.b DNA. Collected blood samples from residents in filariasis endemic areas in five governorates were subjected to semi-nested PCR targeting repeated DNA sequence, for detection of W.b DNA. PCR products were sequenced; subsequently, a phylogenetic analysis of the obtained sequences was performed. Out of 300 blood samples, W.b DNA was identified in 48 (16%). Sequencing analysis confirmed PCR results identifying only W.b species. Sequence alignment and phylogenetic analysis indicated genetically distinct clusters of W.b among the study population. Study results demonstrated that the semi-nested PCR proved to be an effective diagnostic tool for accurate and rapid detection of W.b infections in nano-epidemics and is applicable for samples collected in the daytime as well as the night time. PCR products sequencing and phylogenitic analysis revealed three different nucleotide sequences variants. Further genetic studies of W.b in Egypt and other endemic areas are needed to distinguish related strains and the various ecological as well as drug effects exerted on them to support W.b elimination.
Directory of Open Access Journals (Sweden)
Jianguo Zhou
2018-02-01
Full Text Available Papaver rhoeas L. and P. orientale L., which belong to the family Papaveraceae, are used as ornamental and medicinal plants. The chloroplast genome has been used for molecular markers, evolutionary biology, and barcoding identification. In this study, the complete chloroplast genome sequences of P. rhoeas and P. orientale are reported. Results show that the complete chloroplast genomes of P. rhoeas and P. orientale have typical quadripartite structures, which are comprised of circular 152,905 and 152,799-bp-long molecules, respectively. A total of 130 genes were identified in each genome, including 85 protein-coding genes, 37 tRNA genes, and 8 rRNA genes. Sequence divergence analysis of four species from Papaveraceae indicated that the most divergent regions are found in the non-coding spacers with minimal differences among three Papaver species. These differences include the ycf1 gene and intergenic regions, such as rpoB-trnC, trnD-trnT, petA-psbJ, psbE-petL, and ccsA-ndhD. These regions are hypervariable regions, which can be used as specific DNA barcodes. This finding suggested that the chloroplast genome could be used as a powerful tool to resolve the phylogenetic positions and relationships of Papaveraceae. These results offer valuable information for future research in the identification of Papaver species and will benefit further investigations of these species.
An autochthonous case of hepatitis C virus genotype 5a in Brazil: phylogenetic analysis
DEFF Research Database (Denmark)
Ribeiro, L.C.; Souto, F.J.D.; do Espirito-Santo, M.P.
2009-01-01
Genotype 5 of hepatitis C virus (HCV) has been rarely identified in South America. A female of African descent who never left Brazil was found to be infected by this genotype in Mato Grosso state, Central Brazil. The patient denied drug injections and revealed that she had received blood...... transfusions several years before. One of her blood donors was identified and tested negative for anti-HCV and HCV RNA, as were her husband and offspring. Phylogenetic analysis of the E1 and NS5B regions confirmed that this HCV strain belonged to genotype 5a. However, the E1 region analysis indicates that our...
Ragan, Mark A; Murphy, Colleen A; Rand, Thomas G
2003-12-01
Ichthyosporea is a recently recognized group of morphologically simple eukaryotes, many of which cause disease in aquatic organisms. Ribosomal RNA sequence analyses place Ichthyosporea near the divergence of the animal and fungal lineages, but do not allow resolution of its exact phylogenetic position. Some of the best evidence for a specific grouping of animals and fungi (Opisthokonta) has come from elongation factor 1alpha, not only phylogenetic analysis of sequences but also the presence or absence of short insertions and deletions. We sequenced the EF-1alpha gene from the ichthyosporean parasite Ichthyophonus irregularis and determined its phylogenetic position using neighbor-joining, parsimony and Bayesian methods. We also sequenced EF-1alpha genes from four chytrids to provide broader representation within fungi. Sequence analyses and the presence of a characteristic 12 amino acid insertion strongly indicate that I. irregularis is a member of Opisthokonta, but do not resolve whether I. irregularis is a specific relative of animals or of fungi. However, the EF-1alpha of I. irregularis exhibits a two amino acid deletion heretofore reported only among fungi.
Directory of Open Access Journals (Sweden)
Bazzicalupo Marco
2008-12-01
Full Text Available Abstract Background Phylogenetic methods are well-established bioinformatic tools for sequence analysis, allowing to describe the non-independencies of sequences because of their common ancestor. However, the evolutionary profiles of bacterial genes are often complicated by hidden paralogy and extensive and/or (multiple horizontal gene transfer (HGT events which make bifurcating trees often inappropriate. In this context, plasmid sequences are paradigms of network-like relationships characterizing the evolution of prokaryotes. Actually, they can be transferred among different organisms allowing the dissemination of novel functions, thus playing a pivotal role in prokaryotic evolution. However, the study of their evolutionary dynamics is complicated by the absence of universally shared genes, a prerequisite for phylogenetic analyses. Results To overcome such limitations we developed a bioinformatic package, named Blast2Network (B2N, allowing the automatic phylogenetic profiling and the visualization of homology relationships in a large number of plasmid sequences. The software was applied to the study of 47 completely sequenced plasmids coming from Escherichia, Salmonella and Shigella spps. Conclusion The tools implemented by B2N allow to describe and visualize in a new way some of the evolutionary features of plasmid molecules of Enterobacteriaceae; in particular it helped to shed some light on the complex history of Escherichia, Salmonella and Shigella plasmids and to focus on possible roles of unannotated proteins. The proposed methodology is general enough to be used for comparative genomic analyses of bacteria.
Directory of Open Access Journals (Sweden)
Mingsheng Yang
2015-03-01
Full Text Available Satyrinae is one of twelve subfamilies of the butterfly family Nymphalidae, which currently includes nine tribes. However, phylogenetic relationships among them remain largely unresolved, though different researches have been conducted based on both morphological and molecular data. However, ribosomal genes have never been used in tribe level phylogenetic analyses of Satyrinae. In this study we investigate for the first time the phylogenetic relationships among the tribes Elymniini, Amathusiini, Zetherini and Melanitini which are indicated to be a monophyletic group, and the Satyrini, using two ribosomal genes (28s rDNA and 16s rDNA and four protein-coding genes (EF-1α, COI, COII and Cytb. We mainly aim to assess the phylogenetic informativeness of the ribosomal genes as well as clarify the relationships among different tribes. Our results show the two ribosomal genes generally have the same high phylogenetic informativeness compared with EF-1α; and we infer the 28s rDNA would show better informativeness if the 28s rDNA sequence data for each sampling taxon are obtained in this study. The placement of the monotypic genus Callarge Leech in Zetherini is confirmed for the first time based on molecular evidence. In addition, our maximum likelihood (ML and Bayesian inference (BI trees consistently show that the involved Satyrinae including the Amathusiini is monophyletic with high support values. Although the relationships among the five tribes are identical among ML and BI analyses and are mostly strongly-supported in BI analysis, those in ML analysis are lowly- or moderately- supported. Therefore, the relationships among the related five tribes recovered herein need further verification based on more sampling taxa.
Wei, Kai-Fa; Chen, Juan; Chen, Yan-Feng; Wu, Ling-Juan; Xie, Dao-Xin
2012-04-01
The WRKY transcription factors function in plant growth and development, and response to the biotic and abiotic stresses. Although many studies have focused on the functional identification of the WRKY transcription factors, much less is known about molecular phylogenetic and global expression analysis of the complete WRKY family in maize. In this study, we identified 136 WRKY proteins coded by 119 genes in the B73 inbred line from the complete genome and named them in an orderly manner. Then, a comprehensive phylogenetic analysis of five species was performed to explore the origin and evolutionary patterns of these WRKY genes, and the result showed that gene duplication is the major driving force for the origin of new groups and subgroups and functional divergence during evolution. Chromosomal location analysis of maize WRKY genes indicated that 20 gene clusters are distributed unevenly in the genome. Microarray-based expression analysis has revealed that 131 WRKY transcripts encoded by 116 genes may participate in the regulation of maize growth and development. Among them, 102 transcripts are stably expressed with a coefficient of variation (CV) value of WRKY genes with the CV value of >15% are further analysed to discover new organ- or tissue-specific genes. In addition, microarray analyses of transcriptional responses to drought stress and fungal infection showed that maize WRKY proteins are involved in stress responses. All these results contribute to a deep probing into the roles of WRKY transcription factors in maize growth and development and stress tolerance.
Phylogenetic study of Class Armophorea (Alveolata, Ciliophora based on 18S-rDNA data
Directory of Open Access Journals (Sweden)
Thiago da Silva Paiva
2013-01-01
Full Text Available The 18S rDNA phylogeny of Class Armophorea, a group of anaerobic ciliates, is proposed based on an analysis of 44 sequences (out of 195 retrieved from the NCBI/GenBank database. Emphasis was placed on the use of two nucleotide alignment criteria that involved variation in the gap-opening and gap-extension parameters and the use of rRNA secondary structure to orientate multiple-alignment. A sensitivity analysis of 76 data sets was run to assess the effect of variations in indel parameters on tree topologies. Bayesian inference, maximum likelihood and maximum parsimony phylogenetic analyses were used to explore how different analytic frameworks influenced the resulting hypotheses. A sensitivity analysis revealed that the relationships among higher taxa of the Intramacronucleata were dependent upon how indels were determined during multiple-alignment of nucleotides. The phylogenetic analyses rejected the monophyly of the Armophorea most of the time and consistently indicated that the Metopidae and Nyctotheridae were related to the Litostomatea. There was no consensus on the placement of the Caenomorphidae, which could be a sister group of the Metopidae + Nyctorheridae, or could have diverged at the base of the Spirotrichea branch or the Intramacronucleata tree.
Molecular and phylogenetic evidence of chikungunya virus circulating in Assam, India
Directory of Open Access Journals (Sweden)
Prafulla Dutta
2017-01-01
Full Text Available Purpose: Northeast Region of India possesses an abundant number of Aedes mosquitoes, the common vector for Dengue and Chikungunya (CHIK. Dengue is reported every year from Assam, but active surveillance for CHIK virus (CHIKV infection is lacking in this part of India. Therefore, this present study has been undertaken to detect any CHIKV infection during a dengue outbreak in Assam. Materials and Methods: A total of 42 dengue negative samples collected from Guwahati were screened for the presence of CHIK IgM antibodies. Further, all the samples were processed for CHIKV RNA detection by reverse transcriptase-polymerase chain reaction (RT-PCR. Phylogenetic analysis was done by Maximum Likelihood method using Kimura-2 parameter model. Results: No IgM positivity was found in the processed samples; however, 7 samples were positive for CHIKV by RT-PCR. Phylogenetic analysis revealed that the circulating CHIKV belonged to Eastern, Central and Southern African genotype. Sequence analysis showed two uniform nucleotide substitutions and very less amino acid substitution. Conclusion: Silent existence of CHIKV beside dengue is reported from this study. Therefore, CHIKV diagnosis should be included as a regular practice for active surveillance of the disease and its accomplishment before commencing an outbreak.
International Nuclear Information System (INIS)
Mapaco, L.P.; Monjane, I.V.A.; Nhamusso, A.E.; Viljoen, G.J; Dundon, W.G.; Achá, S.J.
2016-01-01
Full text: The complete sequence of the fusion (F) protein gene from eleven Newcastle disease viruses (NDV) isolated from commercial poultry in Mozambique between 2011 and 2016 has been generated. The F gene cleavage site motif for all eleven isolates was 112RRRKRF117 indicating that the viruses are virulent. A phylogenetic analysis using the full F gene sequence revealed that the viruses clustered within genotype VIIh and showed a higher similarity to NDVs from South Africa, China and Southeast Asia than to viruses previously described in Mozambique in 1994 to 1995 and 2005. The characterization of these new NDVs has important implications for Newcastle disease management and control in Mozambique. (author)
Phylogenetic analysis of Monascus and new species from honey, pollen and nests of stingless bees
DEFF Research Database (Denmark)
Barbosa, R. N.; Leong, Su-lin L.; Vinnere-Pettersson, O.
2017-01-01
on this polyphasic approach, the genus Monascus is resolved in nine species, including three new species associated with stingless bees (M. flavipigmentosus sp. nov., M. mellicola sp. nov., M. recifensis sp. nov., M. argentinensis, M. floridanus, M. lunisporas, M. pallens, M. purpureus, M. ruber), and split in two...... new sections (section Floridani sect. nov., section Rubri sect. nov.). Phylogenetic analysis showed that the xerophile Monascus eremophilus does not belong in Monascus and monophyly in Monascus is restored with the transfer of M. eremophilus to Penicillium (P. eremophilum comb. nov.). A list...
Directory of Open Access Journals (Sweden)
Izedin Goga
2014-03-01
Full Text Available Three serum samples positive in Antigen ELISA BVDV have been tested to characterise genetic diversity of bovine viral diarrhea virus (BVDV in Kosovo. Samples were obtained in 2011 from heifers and were amplified by reverse transcription-polymerase chain reaction, sequenced and analysed by computer-assisted phylogenetic analysis. Amplified products and nucleotide sequence showed that all 3 isolates belonged to BVDV 1 genotype and 1b sub genotype. These results enrich the extant knowledge of BVDV and represent the first documented data about Kosovo BVDV isolates.
Directory of Open Access Journals (Sweden)
Gu Xun
2007-03-01
Full Text Available Abstract Background Phylogenetically related miRNAs (miRNA families convey important information of the function and evolution of miRNAs. Due to the special sequence features of miRNAs, pair-wise sequence identity between miRNA precursors alone is often inadequate for unequivocally judging the phylogenetic relationships between miRNAs. Most of the current methods for miRNA classification rely heavily on manual inspection and lack measurements of the reliability of the results. Results In this study, we designed an analysis pipeline (the Phylogeny-Bootstrap-Cluster (PBC pipeline to identify miRNA families based on branch stability in the bootstrap trees derived from overlapping genome-wide miRNA sequence sets. We tested the PBC analysis pipeline with the miRNAs from six animal species, H. sapiens, M. musculus, G. gallus, D. rerio, D. melanogaster, and C. elegans. The resulting classification was compared with the miRNA families defined in miRBase. The two classifications were largely consistent. Conclusion The PBC analysis pipeline is an efficient method for classifying large numbers of heterogeneous miRNA sequences. It requires minimum human involvement and provides measurements of the reliability of the classification results.
GapCoder automates the use of indel characters in phylogenetic analysis.
Young, Nelson D; Healy, John
2003-02-19
Several ways of incorporating indels into phylogenetic analysis have been suggested. Simple indel coding has two strengths: (1) biological realism and (2) efficiency of analysis. In the method, each indel with different start and/or end positions is considered to be a separate character. The presence/absence of these indel characters is then added to the data set. We have written a program, GapCoder to automate this procedure. The program can input PIR format aligned datasets, find the indels and add the indel-based characters. The output is a NEXUS format file, which includes a table showing what region each indel characters is based on. If regions are excluded from analysis, this table makes it easy to identify the corresponding indel characters for exclusion. Manual implementation of the simple indel coding method can be very time-consuming, especially in data sets where indels are numerous and/or overlapping. GapCoder automates this method and is therefore particularly useful during procedures where phylogenetic analyses need to be repeated many times, such as when different alignments are being explored or when various taxon or character sets are being explored. GapCoder is currently available for Windows from http://www.home.duq.edu/~youngnd/GapCoder.
Genetic characterization and phylogenetic analysis of porcine circovirus type 2 (PCV2) in Serbia.
Savic, Bozidar; Milicevic, Vesna; Jakic-Dimic, Dobrila; Bojkovski, Jovan; Prodanovic, Radisa; Kureljusic, Branislav; Potkonjak, Aleksandar; Savic, Borivoje
2012-01-01
Porcine circovirus type 2 (PCV2) is the main causative agent of postweaning multisystemic wasting syndrome (PMWS). To characterize and determine the genetic diversity of PCV2 in the porcine population of Serbia, nucleotide and deduced amino acid sequences of the open reading frame 2 (ORF2) of PCV2 collected from the tissues of pigs that either had died as a result of PMWS or did not exhibit disease symptoms were analyzed. Sequencing and phylogenetic analysis showed considerable diversity among PCV2 ORF2 sequences and the existence of two main PCV2 genotypes, PCV2b and PCV2a, with at least three clusters, 1A/B, 1C and 2D. In order to provide further proof that the 1C strain is circulating in the porcine population, the whole viral genome of one PCV2 isolate was sequenced. Genotyping and phylogenetic analysis using the entire viral genome sequences confirmed that there was a PMWS-associated 1C strain emerging in Serbia. Our analysis also showed that PCV2b is dominant in the porcine population, and that it is exclusively associated with PMWS occurrences in the country. These data constitute a useful basis for further epidemiological studies regarding the heterogeneity of PCV2 strains on the European continent.
Zhang, Meiping; Rong, Ying; Lee, Mi-Kyung; Zhang, Yang; Stelly, David M; Zhang, Hong-Bin
2015-10-01
Cotton is the world's leading textile fiber crop and is also grown as a bioenergy and food crop. Knowledge of the phylogeny of closely related species and the genome origin and evolution of polyploid species is significant for advanced genomics research and breeding. We have reconstructed the phylogeny of the cotton genus, Gossypium L., and deciphered the genome origin and evolution of its five polyploid species by restriction fragment analysis of repeated sequences. Nuclear DNA of 84 accessions representing 35 species and all eight genomes of the genus were analyzed. The phylogenetic tree of the genus was reconstructed using the parsimony method on 1033 polymorphic repeated sequence restriction fragments. The genome origin of its polyploids was determined by calculating the diploid-polyploid restriction fragment correspondence (RFC). The tree is consistent with the morphological classification, genome designation and geographic distribution of the species at subgenus, section and subsection levels. Gossypium lobatum (D7) was unambiguously shown to have the highest RFC with the D-subgenomes of all five polyploids of the genus, while the common ancestor of Gossypium herbaceum (A1) and Gossypium arboreum (A2) likely contributed to the A-subgenomes of the polyploids. These results provide a comprehensive phylogenetic tree of the cotton genus and new insights into the genome origin and evolution of its polyploid species. The results also further demonstrate a simple, rapid and inexpensive method suitable for phylogenetic analysis of closely related species, especially congeneric species, and the inference of genome origin of polyploids that constitute over 70 % of flowering plants.
Guo, Zhong-Long; Wang, Juan; Shen, Yu-Ying
2015-01-01
Insect mitochondrial genome (mitogenome) are the most extensively used genetic information for molecular evolution, phylogenetics and population genetics. Pentatomomorpha (>14,000 species) is the second largest infraorder of Heteroptera and of great economic importance. To better understand the diversity and phylogeny within Pentatomomorpha, we sequenced and annotated the complete mitogenome of Corizus tetraspilus (Hemiptera: Rhopalidae), an important pest of alfalfa in China. We analyzed the main features of the C. tetraspilus mitogenome, and provided a comparative analysis with four other Coreoidea species. Our results reveal that gene content, gene arrangement, nucleotide composition, codon usage, rRNA structures and sequences of mitochondrial transcription termination factor are conserved in Coreoidea. Comparative analysis shows that different protein-coding genes have been subject to different evolutionary rates correlated with the G+C content. All the transfer RNA genes found in Coreoidea have the typical clover leaf secondary structure, except for trnS1 (AGN) which lacks the dihydrouridine (DHU) arm and possesses a unusual anticodon stem (9 bp vs. the normal 5 bp). The control regions (CRs) among Coreoidea are highly variable in size, of which the CR of C. tetraspilus is the smallest (440 bp), making the C. tetraspilus mitogenome the smallest (14,989 bp) within all completely sequenced Coreoidea mitogenomes. No conserved motifs are found in the CRs of Coreoidea. In addition, the A+T content (60.68%) of the CR of C. tetraspilus is much lower than that of the entire mitogenome (74.88%), and is lowest among Coreoidea. Phylogenetic analyses based on mitogenomic data support the monophyly of each superfamily within Pentatomomorpha, and recognize a phylogenetic relationship of (Aradoidea + (Pentatomoidea + (Lygaeoidea + (Pyrrhocoroidea + Coreoidea)))). PMID:26042898
Directory of Open Access Journals (Sweden)
William G. Parker
2016-01-01
Full Text Available Aetosauria is an early-diverging clade of pseudosuchians (crocodile-line archosaurs that had a global distribution and high species diversity as a key component of various Late Triassic terrestrial faunas. It is one of only two Late Triassic clades of large herbivorous archosaurs, and thus served a critical ecological role. Nonetheless, aetosaur phylogenetic relationships are still poorly understood, owing to an overreliance on osteoderm characters, which are often poorly constructed and suspected to be highly homoplastic. A new phylogenetic analysis of the Aetosauria, comprising 27 taxa and 83 characters, includes more than 40 new characters that focus on better sampling the cranial and endoskeletal regions, and represents the most comprenhensive phylogeny of the clade to date. Parsimony analysis recovered three most parsimonious trees; the strict consensus of these trees finds an Aetosauria that is divided into two main clades: Desmatosuchia, which includes the Desmatosuchinae and the Stagonolepidinae, and Aetosaurinae, which includes the Typothoracinae. As defined Desmatosuchinae now contains Neoaetosauroides engaeus and several taxa that were previously referred to the genus Stagonolepis, and a new clade, Desmatosuchini, is erected for taxa more closely related to Desmatosuchus. Overall support for some clades is still weak, and Partitioned Bremer Support (PBS is applied for the first time to a strictly morphological dataset demonstrating that this weak support is in part because of conflict in the phylogenetic signals of cranial versus postcranial characters. PBS helps identify homoplasy among characters from various body regions, presumably the result of convergent evolution within discrete anatomical modules. It is likely that at least some of this character conflict results from different body regions evolving at different rates, which may have been under different selective pressures.
van der Kuyl, Antoinette C.; Ph Ballasina, Donato L.; Dekker, John T.; Maas, Jolanda; Willemsen, Ronald E.; Goudsmit, Jaap
2002-01-01
To test phylogenetic relationships within the genus Testudo (Testudines: Testudinidae), we have sequenced a fragment of the mitochondrial (mt) 12S rRNA gene of 98 tortoise specimens belonging to the genera Testudo, Indotestudo, and Geochelone. Maximum likelihood and neighbor-joining methods identify
EM for phylogenetic topology reconstruction on nonhomogeneous data.
Ibáñez-Marcelo, Esther; Casanellas, Marta
2014-06-17
The reconstruction of the phylogenetic tree topology of four taxa is, still nowadays, one of the main challenges in phylogenetics. Its difficulties lie in considering not too restrictive evolutionary models, and correctly dealing with the long-branch attraction problem. The correct reconstruction of 4-taxon trees is crucial for making quartet-based methods work and being able to recover large phylogenies. We adapt the well known expectation-maximization algorithm to evolutionary Markov models on phylogenetic 4-taxon trees. We then use this algorithm to estimate the substitution parameters, compute the corresponding likelihood, and to infer the most likely quartet. In this paper we consider an expectation-maximization method for maximizing the likelihood of (time nonhomogeneous) evolutionary Markov models on trees. We study its success on reconstructing 4-taxon topologies and its performance as input method in quartet-based phylogenetic reconstruction methods such as QFIT and QuartetSuite. Our results show that the method proposed here outperforms neighbor-joining and the usual (time-homogeneous continuous-time) maximum likelihood methods on 4-leaved trees with among-lineage instantaneous rate heterogeneity, and perform similarly to usual continuous-time maximum-likelihood when data satisfies the assumptions of both methods. The method presented in this paper is well suited for reconstructing the topology of any number of taxa via quartet-based methods and is highly accurate, specially regarding largely divergent trees and time nonhomogeneous data.
Directory of Open Access Journals (Sweden)
McClure Myra
2011-07-01
Full Text Available Abstract Background To combat the pandemic of human immunodeficiency virus 1 (HIV-1, a successful vaccine will need to cope with the variability of transmissible viruses. Human hosts infected with HIV-1 potentially harbour many viral variants but very little is known about viruses that are likely to be transmitted, or even if there are viral characteristics that predict enhanced transmission in vivo. We show for the first time that genetic divergence consistent with a single transmission event in vivo can represent several years of pre-transmission evolution. Results We describe a highly unusual case consistent with a single donor transmitting highly related but distinct HIV-1 variants to two individuals on the same evening. We confirm that the clustering of viral genetic sequences, present within each recipient, is consistent with the history of a single donor across the viral env, gag and pol genes by maximum likelihood and Bayesian Markov Chain Monte Carlo based phylogenetic analyses. Based on an uncorrelated, lognormal relaxed clock of env gene evolution calibrated with other datasets, the time since the most recent common ancestor is estimated as 2.86 years prior to transmission (95% confidence interval 1.28 to 4.54 years. Conclusion Our results show that an effective design for a preventative vaccine will need to anticipate extensive HIV-1 diversity within an individual donor as well as diversity at the population level.
Boykin, L M; Shatters, R G; Hall, D G; Burns, R E; Franqui, R A
2006-10-01
Anastrepha suspensa (Loew) is an economically important pest, restricted to the Greater Antilles and southern Florida. It infests a wide variety of hosts and is of quarantine importance in citrus, a multi-million dollar industry in Florida. The observed recent increase in citrus infested with A. suspensa in Florida has raised questions regarding host-specificity of certain populations and genetic diversity of the pest throughout its geographical distribution. Cytochrome oxidase I (COI) DNA sequence data was used to characterize the genetic diversity of A. suspensa from Florida and Caribbean populations reared from different host plants. Maximum likelihood and Bayesian phylogenetic methods were used to analyse COI data. Sequence variation among mitochondrial COI genes from 107 A. suspensa samples collected throughout Florida and the Caribbean ranged between 0 and 10% and placed all A. suspensa as a monophyletic group that united all A. suspensa in a clade sister to a Central American group of the A. fraterculus paraphyletic species complex. The most likely tree of the COI locus indicated that COI sequence variation was too low to provide resolution at the subspecies level, therefore monophyletic groups based on host-plant use, geography (Florida, Jamaica, Cayman Islands, Puerto Rico or Dominican Republic) or population sampled are not supported. This result indicates that either no population segregation has occurred based on these biological or geographical distinctions and that this is a generalist, polyphagous invasive genotype. Alternatively, if populations are distinct, the segregation event was more recent than can be distinguished based on COI sequence variation.
Phylogenetic analysis of rabbit haemorrhagic disease virus (RHDV) strains isolated in Poland.
Fitzner, Andrzej; Niedbalski, Wieslaw
2017-10-01
The aim of this study was to characterise the nucleotide and amino acid sequence of complete genomes (7.5 kb) from RHDV strains isolated in Poland and estimate the genetic variability in different elements of the viral RNA. In addition, the sequence of Polish RHDV isolates isolated from 1988-2015 was compared with the sequences of other European RHDV, including the RHDVa and RHDV2/RHDVb subtypes. The complete sequence was developed by the compilation of partial nucleotide sequences. This sequence consisted of approximately 7428 nucleotides. For comparison of nucleotide sequences and the development of phylogenetic trees of Polish RHDV isolates and reference RHDV strains representing the main phylogenetic groups of classical RHDV, RHDVa and RHDV2 as well as the non-pathogenic rabbit lagovirus RCV, the BLAST software with blastn and MEGA6 with neighbour-joining method was applied. The complete nucleotide sequence of Polish isolates of RHDV has also been entered into GenBank. For comparative analysis, nineteen complete sequences representing the main RHDV genetic types available in GenBank were used. The results of phylogenetic analysis of Polish RHDV strains reveals the presence of three classical RHDV genogroups (G2, G4 and G5) and an RHDVa variant (G6). The oldest RHDV isolates (KGM 1988, PD 1989 and MAL 1994) belong to genogroup G2. It can be assumed that the elimination of these strains from the environment probably occurred at the turn of 1994 and 1995. Genogroup G2 was replaced by the phylogenetically younger BLA 1994 and OPO 2004 strains from genogroup G4, which probably originated from the G3 lineage, represented by the Italian strains BS89. The last representatives of classical RHDV in Poland are isolates GSK 1988 and ZD0 2000 from genogroup G5. A single clade contains the Polish RHDV strains from 2004-2015 (GRZ 2004, KRY 2004, L145 2004, W147 2005, SKO 2013, GLE 2013, RED1 2013, STR 2012, STR2 2013, STR 2014, BIE 2015) identified as RHDVa, which clustered
PyElph - a software tool for gel images analysis and phylogenetics
Directory of Open Access Journals (Sweden)
Pavel Ana Brânduşa
2012-01-01
Full Text Available Abstract Background This paper presents PyElph, a software tool which automatically extracts data from gel images, computes the molecular weights of the analyzed molecules or fragments, compares DNA patterns which result from experiments with molecular genetic markers and, also, generates phylogenetic trees computed by five clustering methods, using the information extracted from the analyzed gel image. The software can be successfully used for population genetics, phylogenetics, taxonomic studies and other applications which require gel image analysis. Researchers and students working in molecular biology and genetics would benefit greatly from the proposed software because it is free, open source, easy to use, has a friendly Graphical User Interface and does not depend on specific image acquisition devices like other commercial programs with similar functionalities do. Results PyElph software tool is entirely implemented in Python which is a very popular programming language among the bioinformatics community. It provides a very friendly Graphical User Interface which was designed in six steps that gradually lead to the results. The user is guided through the following steps: image loading and preparation, lane detection, band detection, molecular weights computation based on a molecular weight marker, band matching and finally, the computation and visualization of phylogenetic trees. A strong point of the software is the visualization component for the processed data. The Graphical User Interface provides operations for image manipulation and highlights lanes, bands and band matching in the analyzed gel image. All the data and images generated in each step can be saved. The software has been tested on several DNA patterns obtained from experiments with different genetic markers. Examples of genetic markers which can be analyzed using PyElph are RFLP (Restriction Fragment Length Polymorphism, AFLP (Amplified Fragment Length Polymorphism, RAPD
Directory of Open Access Journals (Sweden)
Junha Shin
Full Text Available Phylogenetic profiling, a network inference method based on gene inheritance profiles, has been widely used to construct functional gene networks in microbes. However, its utility for network inference in higher eukaryotes has been limited. An improved algorithm with an in-depth understanding of pathway evolution may overcome this limitation. In this study, we investigated the effects of taxonomic structures on co-inheritance analysis using 2,144 reference species in four query species: Escherichia coli, Saccharomyces cerevisiae, Arabidopsis thaliana, and Homo sapiens. We observed three clusters of reference species based on a principal component analysis of the phylogenetic profiles, which correspond to the three domains of life-Archaea, Bacteria, and Eukaryota-suggesting that pathways inherit primarily within specific domains or lower-ranked taxonomic groups during speciation. Hence, the co-inheritance pattern within a taxonomic group may be eroded by confounding inheritance patterns from irrelevant taxonomic groups. We demonstrated that co-inheritance analysis within domains substantially improved network inference not only in microbe species but also in the higher eukaryotes, including humans. Although we observed two sub-domain clusters of reference species within Eukaryota, co-inheritance analysis within these sub-domain taxonomic groups only marginally improved network inference. Therefore, we conclude that co-inheritance analysis within domains is the optimal approach to network inference with the given reference species. The construction of a series of human gene networks with increasing sample sizes of the reference species for each domain revealed that the size of the high-accuracy networks increased as additional reference species genomes were included, suggesting that within-domain co-inheritance analysis will continue to expand human gene networks as genomes of additional species are sequenced. Taken together, we propose that co
Analysis of Domain Architecture and Phylogenetics of Family 2 Glycoside Hydrolases (GH2.
Directory of Open Access Journals (Sweden)
David Talens-Perales
Full Text Available In this work we report a detailed analysis of the topology and phylogenetics of family 2 glycoside hydrolases (GH2. We distinguish five topologies or domain architectures based on the presence and distribution of protein domains defined in Pfam and Interpro databases. All of them share a central TIM barrel (catalytic module with two β-sandwich domains (non-catalytic at the N-terminal end, but differ in the occurrence and nature of additional non-catalytic modules at the C-terminal region. Phylogenetic analysis was based on the sequence of the Pfam Glyco_hydro_2_C catalytic module present in most GH2 proteins. Our results led us to propose a model in which evolutionary diversity of GH2 enzymes is driven by the addition of different non-catalytic domains at the C-terminal region. This model accounts for the divergence of β-galactosidases from β-glucuronidases, the diversification of β-galactosidases with different transglycosylation specificities, and the emergence of bicistronic β-galactosidases. This study also allows the identification of groups of functionally uncharacterized protein sequences with potential biotechnological interest.
Directory of Open Access Journals (Sweden)
M.O.Baba Sheikh
2017-09-01
Full Text Available Canine parvovirus (CPV remains the most significant viral cause of haemorrhagic enteritis and bloody diarrhoea in puppies over the age of 12 weeks. The objective of the present study was to detect and genotype CPV-2 by polymerase chain reaction (PCR and to perform phylogenetic analysis using partial VP2 gene sequences. We analysed eight faecal samples of unvaccinated dogs with signs of vomiting and bloody diarrhoea during the period from December 2013 to May 2014 in different locations in Sulaimani, Kurdistan, Iraq. After PCR detection, we found that all viral sequences in our study were CPV-2b variants, which differed genetically by 0.8% to 3.6% from five commercially available vaccines. Alignment between eight nucleotides of field virus sequences showed 95% to 99.5% similarity. The phylogenetic analysis for the 8 field sequences formed two distinct clusters with two sequences belonging to strains from China and Thailand and the other six – with a strain from Egypt. Molecular characterisation and CPV typing are crucial in epidemiological studies for future prevention and control of the disease.
Identification and phylogenetic analysis of novel cytochrome P450 1A genes from ungulate species.
Darwish, Wageh Sobhy; Kawai, Yusuke; Ikenaka, Yoshinori; Yamamoto, Hideaki; Muroya, Tarou; Ishizuka, Mayumi
2010-09-01
As part of an ongoing effort to understand the biological response of wild and domestic ungulates to different environmental pollutants such as dioxin-like compounds, cDNAs encoding for CYP1A1 and CYP1A2 were cloned and characterized. Four novel CYP1A cDNA fragments from the livers of four wild ungulates (elephant, hippopotamus, tapir and deer) were identified. Three fragments from hippopotamus, tapir and deer were classified as CYP1A2, and the other fragment from elephant was designated as CYP1A1/2. The deduced amino acid sequences of these fragment CYP1As showed identities ranging from 76 to 97% with other animal CYP1As. The phylogenetic analysis of these fragments showed that both elephant and hippopotamus CYP1As made separate branches, while tapir and deer CYP1As were located beside that of horse and cattle respectively in the phylogenetic tree. Analysis of dN/dS ratio among the identified CYP1As indicated that odd toed ungulate CYP1A2s were exposed to different selection pressure.
Kovács, Endre R; Benko, Mária
2009-03-01
Partial genome characterisation of a novel adenovirus, found recently in organ samples of multiple species of dead birds of prey, was carried out by sequence analysis of PCR-amplified DNA fragments. The virus, named as raptor adenovirus 1 (RAdV-1), has originally been detected by a nested PCR method with consensus primers targeting the adenoviral DNA polymerase gene. Phylogenetic analysis with the deduced amino acid sequence of the small PCR product has implied a new siadenovirus type present in the samples. Since virus isolation attempts remained unsuccessful, further characterisation of this putative novel siadenovirus was carried out with the use of PCR on the infected organ samples. The DNA sequence of the central genome part of RAdV-1, encompassing nine full (pTP, 52K, pIIIa, III, pVII, pX, pVI, hexon, protease) and two partial (DNA polymerase and DBP) genes and exceeding 12 kb pairs in size, was determined. Phylogenetic tree reconstructions, based on several genes, unambiguously confirmed the preliminary classification of RAdV-1 as a new species within the genus Siadenovirus. Further study of RAdV-1 is of interest since it represents a rare adenovirus genus of yet undetermined host origin.
Directory of Open Access Journals (Sweden)
Yuan Wang
Full Text Available Insect mitochondrial genomes (mitogenomes are of great interest in exploring molecular evolution, phylogenetics and population genetics. Only two mitogenomes have been previously released in the insect group Aphididae, which consists of about 5,000 known species including some agricultural, forestry and horticultural pests. Here we report the complete 16,317 bp mitogenome of Cavariella salicicola and two nearly complete mitogenomes of Aphis glycines and Pterocomma pilosum. We also present a first comparative analysis of mitochondrial genomes of aphids. Results showed that aphid mitogenomes share conserved genomic organization, nucleotide and amino acid composition, and codon usage features. All 37 genes usually present in animal mitogenomes were sequenced and annotated. The analysis of gene evolutionary rate revealed the lowest and highest rates for COI and ATP8, respectively. A unique repeat region exclusively in aphid mitogenomes, which included variable numbers of tandem repeats in a lineage-specific manner, was highlighted for the first time. This region may have a function as another origin of replication. Phylogenetic reconstructions based on protein-coding genes and the stem-loop structures of control regions confirmed a sister relationship between Cavariella and pterocommatines. Current evidence suggest that pterocommatines could be formally transferred into Macrosiphini. Our paper also offers methodological instructions for obtaining other Aphididae mitochondrial genomes.
Wang, Chuan; Zhang, Chaowu; Pei, Xiaofang; Liu, Hengchuan
2007-11-01
For being further applied and studied, one strain of Lactobacillus delbrueckii subsp. bulgaricus (wch9901) separated from yoghourt which had been identified by phenotype characteristic analysis was identified by 16S rDNA and phylogenetic analyzed. The 16S rDNA of wch9901 was amplified with the genomic DNA of wch9901 as template, and the conservative sequences of the 16S rDNA as primers. Inserted 16S rDNA amplified into clonal vector pGEM-T under the function of T4 DNA ligase to construct recombined plasmid pGEM-wch9901 16S rDNA. The recombined plasmid was identified by restriction enzyme digestion, and the eligible plasmid was presented to sequencing company for DNA sequencing. Nucleic acid sequence was blast in GenBank and phylogenetic tree was constructed using neighbor-joining method of distance methods by Mega3.1 soft. Results of blastn showed that the homology of 16S rDNA of wch9901 with the 16S rDNA of Lactobacillus delbrueckii subsp. bulgaricus strains was higher than 96%. On the phylogenetic tree, wch9901 formed a separate branch and located between Lactobacillus delbrueckii subsp. bulgaricus LGM2 evolution branch and another evolution branch which was composed of Lactobacillus delbrueckii subsp. bulgaricus DL2 evolution cluster and Lactobacillus delbrueckii subsp. bulgaricus JSQ evolution cluster. The distance between wch9901 evolution branch and Lactobacillus delbrueckii subsp. bulgaricus LGM2 evolution branch was the closest. wch9901 belonged to Lactobacillus delbrueckii subsp. bulgaricus. wch9901 showed the closest evolution relationship to Lactobacillus delbrueckii subsp. bulgaricus LGM2.
Phylogenetic analysis of Demodex caprae based on mitochondrial 16S rDNA sequence.
Zhao, Ya-E; Hu, Li; Ma, Jun-Xian
2013-11-01
Demodex caprae infests the hair follicles and sebaceous glands of goats worldwide, which not only seriously impairs goat farming, but also causes a big economic loss. However, there are few reports on the DNA level of D. caprae. To reveal the taxonomic position of D. caprae within the genus Demodex, the present study conducted phylogenetic analysis of D. caprae based on mt16S rDNA sequence data. D. caprae adults and eggs were obtained from a skin nodule of the goat suffering demodicidosis. The mt16S rDNA sequences of individual mite were amplified using specific primers, and then cloned, sequenced, and aligned. The sequence divergence, genetic distance, and transition/transversion rate were computed, and the phylogenetic trees in Demodex were reconstructed. Results revealed the 339-bp partial sequences of six D. caprae isolates were obtained, and the sequence identity was 100% among isolates. The pairwise divergences between D. caprae and Demodex canis or Demodex folliculorum or Demodex brevis were 22.2-24.0%, 24.0-24.9%, and 22.9-23.2%, respectively. The corresponding average genetic distances were 2.840, 2.926, and 2.665, and the average transition/transversion rates were 0.70, 0.55, and 0.54, respectively. The divergences, genetic distances, and transition/transversion rates of D. caprae versus the other three species all reached interspecies level. The five phylogenetic trees all presented that D. caprae clustered with D. brevis first, and then with D. canis, D. folliculorum, and Demodex injai in sequence. In conclusion, D. caprae is an independent species, and it is closer to D. brevis than to D. canis, D. folliculorum, or D. injai.
Using genes as characters and a parsimony analysis to explore the phylogenetic position of turtles.
Directory of Open Access Journals (Sweden)
Bin Lu
Full Text Available The phylogenetic position of turtles within the vertebrate tree of life remains controversial. Conflicting conclusions from different studies are likely a consequence of systematic error in the tree construction process, rather than random error from small amounts of data. Using genomic data, we evaluate the phylogenetic position of turtles with both conventional concatenated data analysis and a "genes as characters" approach. Two datasets were constructed, one with seven species (human, opossum, zebra finch, chicken, green anole, Chinese pond turtle, and western clawed frog and 4584 orthologous genes, and the second with four additional species (soft-shelled turtle, Nile crocodile, royal python, and tuatara but only 1638 genes. Our concatenated data analysis strongly supported turtle as the sister-group to archosaurs (the archosaur hypothesis, similar to several recent genomic data based studies using similar methods. When using genes as characters and gene trees as character-state trees with equal weighting for each gene, however, our parsimony analysis suggested that turtles are possibly sister-group to diapsids, archosaurs, or lepidosaurs. None of these resolutions were strongly supported by bootstraps. Furthermore, our incongruence analysis clearly demonstrated that there is a large amount of inconsistency among genes and most of the conflict relates to the placement of turtles. We conclude that the uncertain placement of turtles is a reflection of the true state of nature. Concatenated data analysis of large and heterogeneous datasets likely suffers from systematic error and over-estimates of confidence as a consequence of a large number of characters. Using genes as characters offers an alternative for phylogenomic analysis. It has potential to reduce systematic error, such as data heterogeneity and long-branch attraction, and it can also avoid problems associated with computation time and model selection. Finally, treating genes as
Using Genes as Characters and a Parsimony Analysis to Explore the Phylogenetic Position of Turtles
Lu, Bin; Yang, Weizhao; Dai, Qiang; Fu, Jinzhong
2013-01-01
The phylogenetic position of turtles within the vertebrate tree of life remains controversial. Conflicting conclusions from different studies are likely a consequence of systematic error in the tree construction process, rather than random error from small amounts of data. Using genomic data, we evaluate the phylogenetic position of turtles with both conventional concatenated data analysis and a “genes as characters” approach. Two datasets were constructed, one with seven species (human, opossum, zebra finch, chicken, green anole, Chinese pond turtle, and western clawed frog) and 4584 orthologous genes, and the second with four additional species (soft-shelled turtle, Nile crocodile, royal python, and tuatara) but only 1638 genes. Our concatenated data analysis strongly supported turtle as the sister-group to archosaurs (the archosaur hypothesis), similar to several recent genomic data based studies using similar methods. When using genes as characters and gene trees as character-state trees with equal weighting for each gene, however, our parsimony analysis suggested that turtles are possibly sister-group to diapsids, archosaurs, or lepidosaurs. None of these resolutions were strongly supported by bootstraps. Furthermore, our incongruence analysis clearly demonstrated that there is a large amount of inconsistency among genes and most of the conflict relates to the placement of turtles. We conclude that the uncertain placement of turtles is a reflection of the true state of nature. Concatenated data analysis of large and heterogeneous datasets likely suffers from systematic error and over-estimates of confidence as a consequence of a large number of characters. Using genes as characters offers an alternative for phylogenomic analysis. It has potential to reduce systematic error, such as data heterogeneity and long-branch attraction, and it can also avoid problems associated with computation time and model selection. Finally, treating genes as characters
Using ESTs for phylogenomics: Can one accurately infer a phylogenetic tree from a gappy alignment?
Directory of Open Access Journals (Sweden)
Hartmann Stefanie
2008-03-01
Full Text Available Abstract Background While full genome sequences are still only available for a handful of taxa, large collections of partial gene sequences are available for many more. The alignment of partial gene sequences results in a multiple sequence alignment containing large gaps that are arranged in a staggered pattern. The consequences of this pattern of missing data on the accuracy of phylogenetic analysis are not well understood. We conducted a simulation study to determine the accuracy of phylogenetic trees obtained from gappy alignments using three commonly used phylogenetic reconstruction methods (Neighbor Joining, Maximum Parsimony, and Maximum Likelihood and studied ways to improve the accuracy of trees obtained from such datasets. Results We found that the pattern of gappiness in multiple sequence alignments derived from partial gene sequences substantially compromised phylogenetic accuracy even in the absence of alignment error. The decline in accuracy was beyond what would be expected based on the amount of missing data. The decline was particularly dramatic for Neighbor Joining and Maximum Parsimony, where the majority of gappy alignments contained 25% to 40% incorrect quartets. To improve the accuracy of the trees obtained from a gappy multiple sequence alignment, we examined two approaches. In the first approach, alignment masking, potentially problematic columns and input sequences are excluded from from the dataset. Even in the absence of alignment error, masking improved phylogenetic accuracy up to 100-fold. However, masking retained, on average, only 83% of the input sequences. In the second approach, alignment subdivision, the missing data is statistically modelled in order to retain as many sequences as possible in the phylogenetic analysis. Subdivision resulted in more modest improvements to alignment accuracy, but succeeded in including almost all of the input sequences. Conclusion These results demonstrate that partial gene
Using ESTs for phylogenomics: can one accurately infer a phylogenetic tree from a gappy alignment?
Hartmann, Stefanie; Vision, Todd J
2008-03-26
While full genome sequences are still only available for a handful of taxa, large collections of partial gene sequences are available for many more. The alignment of partial gene sequences results in a multiple sequence alignment containing large gaps that are arranged in a staggered pattern. The consequences of this pattern of missing data on the accuracy of phylogenetic analysis are not well understood. We conducted a simulation study to determine the accuracy of phylogenetic trees obtained from gappy alignments using three commonly used phylogenetic reconstruction methods (Neighbor Joining, Maximum Parsimony, and Maximum Likelihood) and studied ways to improve the accuracy of trees obtained from such datasets. We found that the pattern of gappiness in multiple sequence alignments derived from partial gene sequences substantially compromised phylogenetic accuracy even in the absence of alignment error. The decline in accuracy was beyond what would be expected based on the amount of missing data. The decline was particularly dramatic for Neighbor Joining and Maximum Parsimony, where the majority of gappy alignments contained 25% to 40% incorrect quartets. To improve the accuracy of the trees obtained from a gappy multiple sequence alignment, we examined two approaches. In the first approach, alignment masking, potentially problematic columns and input sequences are excluded from from the dataset. Even in the absence of alignment error, masking improved phylogenetic accuracy up to 100-fold. However, masking retained, on average, only 83% of the input sequences. In the second approach, alignment subdivision, the missing data is statistically modelled in order to retain as many sequences as possible in the phylogenetic analysis. Subdivision resulted in more modest improvements to alignment accuracy, but succeeded in including almost all of the input sequences. These results demonstrate that partial gene sequences and gappy multiple sequence alignments can pose a
Romanovskaia, V A; Rokitko, P V
2011-01-01
To determine a possibility of application of phylogenetic criteria for estimating the taxa rank, the intra- and interspecies, as well as intergeneric relatedness of methanotrophs on the basis of 16S rRNA gene sequences was estimated. We used sequences of 16S rRNA genes of the studied isolates of obligate methanotrophs which have been deposited in UCM (Ukrainian Collection of Microorganisms), and of type strains of other obligate methanotrophs species (from GenBank database). It is shown, that the levels of interspecies and intergeneric relatedness in different families of methanotrophs are not identical, and therefore they can be used for differentiation of taxa only within one family. The carried out analysis has shown, that it is necessary to reconsider taxonomic position: (1) of two phenotypically similar species of Methylomonas (M. aurantiaca and M. fodinarum), similarity of 16S rRNA genes which is 99.4%, similarity of their total DNA--up to 80% that rather testifies to strain differences, than to species differences; (2) of species Methylomicrobium agile and M album which are phylogenetically more related to genus Methylobacter (97% of affinity), than Methylomicrobium (94% of affinity); (3) of genera of the family Beijerinckiaceae (Methylocella and Methylocapsa), and also genera of the family Methylocystaceae (Methylosinus and Methylocystis), whereas high level of relatedness (97% and more) of these bacteria with other methanotrophic genera (within one family) practically corresponds to a range of relatedness of species (within some genera) in the family Methylococcaceae. When determining phylogenetic criteria which can characterize the ranks of taxa, it was revealed, that the levels of interspecies relatedness of methanotrophic genera of the families Methylocystaceae and Beijerinckiaceae (97.8-99.1% and 97.8%, accordingly) considerably exceed the level of genera formation in the family Methylococcaceae (94.0-98.2%) and, moreover, approach the value of
Hasanpour, Mojtaba; Najafi, Akram
2017-06-01
Uropathogenic Escherichia coli (UPEC) is among major pathogens causing 80-90% of all episodes of urinary tract infections (UTIs). Recently, E. coli strains are divided into eight main phylogenetic groups including A, B1, B2, C, D, E, F, and clade I. This study was aimed to develop a rapid, sensitive, and specific multiplex real time PCR method capable of detecting phylogenetic groups of E. coli strains. This study was carried out on E. coli strains (isolated from the patient with UTI) in which the presence of all seven target genes had been confirmed in our previous phylogenetic study. An EvaGreen-based singleplex and multiplex real-time PCR with melting curve analysis was designed for simultaneous detection and differentiation of these genes. The primers were selected mainly based on the production of amplicons with melting temperatures (T m ) ranging from 82°C to 93°C and temperature difference of more than 1.5°C between each peak.The multiplex real-time PCR assays that have been developed in the present study were successful in detecting the eight main phylogenetic groups. Seven distinct melting peaks were discriminated, with Tm value of 93±0.8 for arpA, 89.2±0.1for chuA, 86.5±0.1 for yjaA, 82.3±0.2 for TspE4C2, 87.8±0.1for trpAgpC, 85.4±0.6 for arpAgpE genes, and 91±0.5 for the internal control. To our knowledge, this study is the first melting curve-based real-time PCR assay developed for simultaneous and discrete detection of these seven target genes. Our findings showed that this assay has the potential to be a rapid, reliable and cost-effective alternative for routine phylotyping of E. coli strains. Copyright © 2017 Elsevier B.V. All rights reserved.
Social Mating System and Sex-Biased Dispersal in Mammals and Birds: A Phylogenetic Analysis
Mabry, Karen E.; Shelley, Erin L.; Davis, Katie E.; Blumstein, Daniel T.; Van Vuren, Dirk H.
2013-01-01
The hypothesis that patterns of sex-biased dispersal are related to social mating system in mammals and birds has gained widespread acceptance over the past 30 years. However, two major complications have obscured the relationship between these two behaviors: 1) dispersal frequency and dispersal distance, which measure different aspects of the dispersal process, have often been confounded, and 2) the relationship between mating system and sex-biased dispersal in these vertebrate groups has not been examined using modern phylogenetic comparative methods. Here, we present a phylogenetic analysis of the relationship between mating system and sex-biased dispersal in mammals and birds. Results indicate that the evolution of female-biased dispersal in mammals may be more likely on monogamous branches of the phylogeny, and that females may disperse farther than males in socially monogamous mammalian species. However, we found no support for a relationship between social mating system and sex-biased dispersal in birds when the effects of phylogeny are taken into consideration. We caution that although there are larger-scale behavioral differences in mating system and sex-biased dispersal between mammals and birds, mating system and sex-biased dispersal are far from perfectly associated within these taxa. PMID:23483957
The performance of the Congruence Among Distance Matrices (CADM test in phylogenetic analysis
Directory of Open Access Journals (Sweden)
Lapointe François-Joseph
2011-03-01
Full Text Available Abstract Background CADM is a statistical test used to estimate the level of Congruence Among Distance Matrices. It has been shown in previous studies to have a correct rate of type I error and good power when applied to dissimilarity matrices and to ultrametric distance matrices. Contrary to most other tests of incongruence used in phylogenetic analysis, the null hypothesis of the CADM test assumes complete incongruence of the phylogenetic trees instead of congruence. In this study, we performed computer simulations to assess the type I error rate and power of the test. It was applied to additive distance matrices representing phylogenies and to genetic distance matrices obtained from nucleotide sequences of different lengths that were simulated on randomly generated trees of varying sizes, and under different evolutionary conditions. Results Our results showed that the test has an accurate type I error rate and good power. As expected, power increased with the number of objects (i.e., taxa, the number of partially or completely congruent matrices and the level of congruence among distance matrices. Conclusions Based on our results, we suggest that CADM is an excellent candidate to test for congruence and, when present, to estimate its level in phylogenomic studies where numerous genes are analysed simultaneously.
MulRF: a software package for phylogenetic analysis using multi-copy gene trees.
Chaudhary, Ruchi; Fernández-Baca, David; Burleigh, John Gordon
2015-02-01
MulRF is a platform-independent software package for phylogenetic analysis using multi-copy gene trees. It seeks the species tree that minimizes the Robinson-Foulds (RF) distance to the input trees using a generalization of the RF distance to multi-labeled trees. The underlying generic tree distance measure and fast running time make MulRF useful for inferring phylogenies from large collections of gene trees, in which multiple evolutionary processes as well as phylogenetic error may contribute to gene tree discord. MulRF implements several features for customizing the species tree search and assessing the results, and it provides a user-friendly graphical user interface (GUI) with tree visualization. The species tree search is implemented in C++ and the GUI in Java Swing. MulRF's executable as well as sample datasets and manual are available at http://genome.cs.iastate.edu/CBL/MulRF/, and the source code is available at https://github.com/ruchiherself/MulRFRepo. ruchic@ufl.edu Supplementary data are available at Bioinformatics online. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Directory of Open Access Journals (Sweden)
Bertoni Giuseppe
2011-06-01
Full Text Available Abstract Background Small Ruminant Lentiviruses (SRLV are widespread in Canadian sheep and goats and represent an important health issue in these animals. There is however no data about the genetic diversity of Caprine Arthritis Encephalitis Virus (CAEV or Maedi Visna Virus (MVV in this country. Findings We performed a molecular and phylogenetic analysis of sheep and goat lentiviruses from a small geographic area in Canada using long sequences from the gag region of 30 infected sheep and 36 infected goats originating from 14 different flocks. Pairwise DNA distance and phylogenetic analyses revealed that all SRLV sequences obtained from sheep clustered tightly with prototypical Maedi visna sequences from America. Similarly, all SRLV strains obtained from goats clustered tightly with prototypical US CAEV-Cork strain. Conclusions The data reported in this study suggests that Canadian and US SRLV strains share common origins. In addition, the molecular data failed to bring to light any evidence of past cross species transmission between sheep and goats, which is consistent with the type of farming practiced in this part of the country where single species flocks predominate and where opportunities of cross species transmissions are proportionately low.
Molecular and phylogenetic analysis of HIV-1 variants circulating in Italy
Directory of Open Access Journals (Sweden)
Sbreglia Costanza
2008-10-01
Full Text Available Abstract Objective The continuous identification of HIV-1 non-B subtypes and recombinant forms in Italy indicates the need of constant molecular epidemiology survey of genetic forms circulating and transmitted in the resident population. Methods The distribution of HIV-1 subtypes has been evaluated in 25 seropositive individuals residing in Italy, most of whom were infected through a sexual route during the 1995–2005 period. Each sample has been characterized by detailed molecular and phylogenetic analyses. Results 18 of the 25 samples were positive at HIV-1 PCR amplification. Three samples showed a nucleotide divergence compatible with a non-B subtype classification. The phylogenetic analysis, performed on both HIV-1 env and gag regions, confirms the molecular sub-typing prediction, given that 1 sample falls into the C subtype and 2 into the G subtype. The B subtype isolates show high levels of intra-subtype nucleotide divergence, compatible with a long-lasting epidemic and a progressive HIV-1 molecular diversification. Conclusion The Italian HIV-1 epidemic is still mostly attributable to the B subtype, regardless the transmission route, which shows an increasing nucleotide heterogeneity. Heterosexual transmission and the interracial blending, however, are slowly introducing novel HIV-1 subtypes. Therefore, a molecular monitoring is needed to follow the constant evolution of the HIV-1 epidemic.
Phylogenetic analysis of Tibetan mastiffs based on mitochondrial hypervariable region I.
Ren, Zhanjun; Chen, Huiling; Yang, Xuejiao; Zhang, Chengdong
2017-03-01
Recently, the number of Tibetan mastiffs, which is a precious germplasm resource and cultural heritage, is decreasing sharply. Therefore, the genetic diversity of Tibetan mastiffs needs to be studied to clarify its phylogenetics relationships and lay the foundation for resource protection, rational development and utilization of Tibetan mastiffs. We sequenced hypervariable region I of mitochondrial DNA (mtDNA) of 110 individuals from Tibet region and Gansu province. A total of 12 polymorphic sites were identified which defined eight haplotypes of which H4 and H8 were unique to Tibetan population with H8 being identified first. The haplotype diversity (Hd: 0.808), nucleotide diversity (Pi: 0.603%), the average number of nucleotide difference (K: 3.917) of Tibetan mastiffs from Gansu were higher than those from Tibet region (Hd: 0.794; Pi: 0.589%; K: 3.831), which revealed higher genetic diversity in Gansu. In terms of total population, the genetic variation was low. The median-joining network and phylogenetic tree based on the mtDNA hypervariable region I showed that Tibetan mastiffs originated from grey wolves, as the other domestic dogs and had different history of maternal origin. The mismatch distribution analysis and neutrality tests indicated that Tibetan mastiffs were in genetic equilibrium or in a population decline.
The performance of the Congruence Among Distance Matrices (CADM) test in phylogenetic analysis
2011-01-01
Background CADM is a statistical test used to estimate the level of Congruence Among Distance Matrices. It has been shown in previous studies to have a correct rate of type I error and good power when applied to dissimilarity matrices and to ultrametric distance matrices. Contrary to most other tests of incongruence used in phylogenetic analysis, the null hypothesis of the CADM test assumes complete incongruence of the phylogenetic trees instead of congruence. In this study, we performed computer simulations to assess the type I error rate and power of the test. It was applied to additive distance matrices representing phylogenies and to genetic distance matrices obtained from nucleotide sequences of different lengths that were simulated on randomly generated trees of varying sizes, and under different evolutionary conditions. Results Our results showed that the test has an accurate type I error rate and good power. As expected, power increased with the number of objects (i.e., taxa), the number of partially or completely congruent matrices and the level of congruence among distance matrices. Conclusions Based on our results, we suggest that CADM is an excellent candidate to test for congruence and, when present, to estimate its level in phylogenomic studies where numerous genes are analysed simultaneously. PMID:21388552
Directory of Open Access Journals (Sweden)
Paola Pedraza-Penalosa
2015-04-01
Full Text Available The blueberry tribe Vaccinieae (Ericaceae is particularly diverse in South America and underwent extensive radiation in Colombia where many endemics occur. Recent fieldwork in Colombia has resulted in valuable additions to the phylogeny and as well in the discovery of morphologically noteworthy new species that need to be phylogenetically placed before being named. This is particularly important, as the monophyly of many of the studied genera have not been confirmed. In order to advance our understanding of the relationships within neotropical Vaccinieae and advice the taxonomy of the new blueberry relatives, here we present the most comprehensive phylogenetic analysis for the Andean clade. Anthopterus, Demosthenesia, and Pellegrinia are among the putative Andean genera recovered as monophyletic, while other eight Andean genera were not. The analyses also showed that genera that have been traditionally widely defined are non-monophyletic and could be further split into more discrete groups. Four newly discovered Colombian Vaccinieae are placed in the monophyletic Satyria s.s. and the Psammisia I clade. Although these new species are endemic to the Colombian Western Cordillera and Chocó biogeographic region and three are not known outside of Las Orquídeas National Park, they do not form sister pairs.
Roukaerts, Inge D M; Theuns, Sebastiaan; Taffin, Elien R L; Daminet, Sylvie; Nauwynck, Hans J
2015-01-22
Feline immunodeficiency virus (FIV) is a major pathogen in feline populations worldwide, with seroprevalences up to 26%. Virus strains circulating in domestic cats are subdivided into different phylogenetic clades (A-E), based on the genetic diversity of the V3-V4 region of the env gene. In this report, a phylogenetic analysis of the V3-V4 env region, and a variable region in the gag gene was made for 36 FIV strains isolated in Belgium and The Netherlands. All newly generated gag sequences clustered together with previously known clade A FIV viruses, confirming the dominance of clade A viruses in Northern Europe. The same was true for the obtained env sequences, with only one sample of an unknown env subtype. Overall, the genetic diversity of FIV strains sequenced in this report was low. This indicates a relatively recent introduction of FIV in Belgium and The Netherlands. However, the sample with an unknown env subtype indicates that new introductions of FIV from unknown origin do occur and this will likely increase genetic variability in time. Copyright © 2014 Elsevier B.V. All rights reserved.
Directory of Open Access Journals (Sweden)
Gabriella Linc
Full Text Available This paper reports detailed FISH-based karyotypes for three diploid wheatgrass species Agropyron cristatum (L. Beauv., Thinopyrum bessarabicum (Savul.&Rayss A. Löve, Pseudoroegneria spicata (Pursh A. Löve, the supposed ancestors of hexaploid Thinopyrum intermedium (Host Barkworth & D.R.Dewey, compiled using DNA repeats and comparative genome analysis based on COS markers. Fluorescence in situ hybridization (FISH with repetitive DNA probes proved suitable for the identification of individual chromosomes in the diploid JJ, StSt and PP genomes. Of the seven microsatellite markers tested only the (GAAn trinucleotide sequence was appropriate for use as a single chromosome marker for the P. spicata AS chromosome. Based on COS marker analysis, the phylogenetic relationship between diploid wheatgrasses and the hexaploid bread wheat genomes was established. These findings confirmed that the J and E genomes are in neighbouring clusters.
Genome-wide analysis of SINA family in plants and their phylogenetic relationships.
Wang, Meng; Jin, Ying; Fu, Junjie; Zhu, Yun; Zheng, Jun; Hu, Jian; Wang, Guoying
2008-06-01
SINA genes in plants are part of a multigene family with 5 members in Arabidopsis thaliana, 10 members in Populus trichocarpa, 6 members in Oryza sativa, at least 6 members in Zea mays and at least 1 member in Physcomitrella patens. Six members in maize were confirmed by RT-PCR. All SINAs have one RING domain and one SINA domain. These two domains are highly conserved in plants. According to the motif organization and phylogenetic tree, SINA family members were divided into 2 groups. In addition, through semi-quantitative RT-PCR analysis of maize members and Digital Northern analysis of Arabidopsis and rice members, we found that the tissue expression patterns are more diverse in monocot than in Arabidopsis.
Phylogenetic analysis of West Nile virus isolated in Italy in 2008.
Savini, G; Monaco, F; Calistri, P; Lelli, R
2008-11-27
In Italy the first occurrence of West Nile virus (WNV) infection was reported in Tuscany region during the late summer of 1998. In August 2008, the WNV infection re-emerged in Italy, in areas surrounding the Po river delta, and involving three regions Lombardy, Emilia Romagna and Veneto. WNV was isolated from blood and organs samples of one horse, one donkey, one pigeon (Columba livia) and three magpies (Pica pica). The phylogenetic analysis of the isolates, conducted on 255 bp in the region coding for the E protein, indicates that these isolates belong to the lineage I among the European strains. According to the analysis, both the 1998 and 2008 Italian strains as well as isolates from Romania, Russia, Senegal and Kenya fell in the same sub-cluster.
da Silva, Roberta dos Santos; Mejdalani, Gabriel; Cavichioli, Rodney R.
2015-01-01
Abstract The South American sharpshooter genus Subrasaca comprises 14 species. Some species of this genus are quite common in the Brazilian Atlantic Rainforest. In this paper, a phylogenetic analysis of Subrasaca, based on a matrix of 20 terminal taxa and 72 morphological characters of the head, thorax, and male and female genitalia, is presented. The analysis yielded six equally most parsimonious trees (197 steps, CI = 0.6091, RI = 0.5722, and RC = 0.3486). The results suggest that Subrasaca is a monophyletic taxon, although the genus branch is not robust. The clade showing the highest bootstrap and Bremer scores is formed by species with longitudinal dark brown to black stripes on the forewings (Subrasaca bimaculata, Subrasaca constricta, Subrasaca curvovittata, and Subrasaca flavolineata), followed by Subrasaca atronasa + Subrasaca austera. PMID:25829841
Kubo, Yuji; Rooney, Alejandro P; Tsukakoshi, Yoshiki; Nakagawa, Rikio; Hasegawa, Hiromasa; Kimura, Keitarou
2011-09-01
Spore-forming Bacillus strains that produce extracellular poly-γ-glutamic acid were screened for their application to natto (fermented soybean food) fermentation. Among the 424 strains, including Bacillus subtilis and B. amyloliquefaciens, which we isolated from rice straw, 59 were capable of fermenting natto. Biotin auxotrophism was tightly linked to natto fermentation. A multilocus nucleotide sequence of six genes (rpoB, purH, gyrA, groEL, polC, and 16S rRNA) was used for phylogenetic analysis, and amplified fragment length polymorphism (AFLP) analysis was also conducted on the natto-fermenting strains. The ability to ferment natto was inferred from the two principal components of the AFLP banding pattern, and natto-fermenting strains formed a tight cluster within the B. subtilis subsp. subtilis group.
Kubo, Yuji; Rooney, Alejandro P.; Tsukakoshi, Yoshiki; Nakagawa, Rikio; Hasegawa, Hiromasa; Kimura, Keitarou
2011-01-01
Spore-forming Bacillus strains that produce extracellular poly-γ-glutamic acid were screened for their application to natto (fermented soybean food) fermentation. Among the 424 strains, including Bacillus subtilis and B. amyloliquefaciens, which we isolated from rice straw, 59 were capable of fermenting natto. Biotin auxotrophism was tightly linked to natto fermentation. A multilocus nucleotide sequence of six genes (rpoB, purH, gyrA, groEL, polC, and 16S rRNA) was used for phylogenetic analysis, and amplified fragment length polymorphism (AFLP) analysis was also conducted on the natto-fermenting strains. The ability to ferment natto was inferred from the two principal components of the AFLP banding pattern, and natto-fermenting strains formed a tight cluster within the B. subtilis subsp. subtilis group. PMID:21764950
Frank, Daniel N
2008-10-07
Advances in automated DNA sequencing technology have accelerated the generation of metagenomic DNA sequences, especially environmental ribosomal RNA gene (rDNA) sequences. As the scale of rDNA-based studies of microbial ecology has expanded, need has arisen for software that is capable of managing, annotating, and analyzing the plethora of diverse data accumulated in these projects. XplorSeq is a software package that facilitates the compilation, management and phylogenetic analysis of DNA sequences. XplorSeq was developed for, but is not limited to, high-throughput analysis of environmental rRNA gene sequences. XplorSeq integrates and extends several commonly used UNIX-based analysis tools by use of a Macintosh OS-X-based graphical user interface (GUI). Through this GUI, users may perform basic sequence import and assembly steps (base-calling, vector/primer trimming, contig assembly), perform BLAST (Basic Local Alignment and Search Tool; 123) searches of NCBI and local databases, create multiple sequence alignments, build phylogenetic trees, assemble Operational Taxonomic Units, estimate biodiversity indices, and summarize data in a variety of formats. Furthermore, sequences may be annotated with user-specified meta-data, which then can be used to sort data and organize analyses and reports. A document-based architecture permits parallel analysis of sequence data from multiple clones or amplicons, with sequences and other data stored in a single file. XplorSeq should benefit researchers who are engaged in analyses of environmental sequence data, especially those with little experience using bioinformatics software. Although XplorSeq was developed for management of rDNA sequence data, it can be applied to most any sequencing project. The application is available free of charge for non-commercial use at http://vent.colorado.edu/phyloware.
Long-branch attraction bias and inconsistency in Bayesian phylogenetics.
Kolaczkowski, Bryan; Thornton, Joseph W
2009-12-09
Bayesian inference (BI) of phylogenetic relationships uses the same probabilistic models of evolution as its precursor maximum likelihood (ML), so BI has generally been assumed to share ML's desirable statistical properties, such as largely unbiased inference of topology given an accurate model and increasingly reliable inferences as the amount of data increases. Here we show that BI, unlike ML, is biased in favor of topologies that group long branches together, even when the true model and prior distributions of evolutionary parameters over a group of phylogenies are known. Using experimental simulation studies and numerical and mathematical analyses, we show that this bias becomes more severe as more data are analyzed, causing BI to infer an incorrect tree as the maximum a posteriori phylogeny with asymptotically high support as sequence length approaches infinity. BI's long branch attraction bias is relatively weak when the true model is simple but becomes pronounced when sequence sites evolve heterogeneously, even when this complexity is incorporated in the model. This bias--which is apparent under both controlled simulation conditions and in analyses of empirical sequence data--also makes BI less efficient and less robust to the use of an incorrect evolutionary model than ML. Surprisingly, BI's bias is caused by one of the method's stated advantages--that it incorporates uncertainty about branch lengths by integrating over a distribution of possible values instead of estimating them from the data, as ML does. Our findings suggest that trees inferred using BI should be interpreted with caution and that ML may be a more reliable framework for modern phylogenetic analysis.
Long-branch attraction bias and inconsistency in Bayesian phylogenetics.
Directory of Open Access Journals (Sweden)
Bryan Kolaczkowski
Full Text Available Bayesian inference (BI of phylogenetic relationships uses the same probabilistic models of evolution as its precursor maximum likelihood (ML, so BI has generally been assumed to share ML's desirable statistical properties, such as largely unbiased inference of topology given an accurate model and increasingly reliable inferences as the amount of data increases. Here we show that BI, unlike ML, is biased in favor of topologies that group long branches together, even when the true model and prior distributions of evolutionary parameters over a group of phylogenies are known. Using experimental simulation studies and numerical and mathematical analyses, we show that this bias becomes more severe as more data are analyzed, causing BI to infer an incorrect tree as the maximum a posteriori phylogeny with asymptotically high support as sequence length approaches infinity. BI's long branch attraction bias is relatively weak when the true model is simple but becomes pronounced when sequence sites evolve heterogeneously, even when this complexity is incorporated in the model. This bias--which is apparent under both controlled simulation conditions and in analyses of empirical sequence data--also makes BI less efficient and less robust to the use of an incorrect evolutionary model than ML. Surprisingly, BI's bias is caused by one of the method's stated advantages--that it incorporates uncertainty about branch lengths by integrating over a distribution of possible values instead of estimating them from the data, as ML does. Our findings suggest that trees inferred using BI should be interpreted with caution and that ML may be a more reliable framework for modern phylogenetic analysis.
Directory of Open Access Journals (Sweden)
Jérôme Beaufays
Full Text Available BACKGROUND: During their blood meal, ticks secrete a wide variety of proteins that interfere with their host's defense mechanisms. Among these proteins, lipocalins play a major role in the modulation of the inflammatory response. METHODOLOGY/PRINCIPAL FINDINGS: Screening a cDNA library in association with RT-PCR and RACE methodologies allowed us to identify 14 new lipocalin genes in the salivary glands of the Ixodes ricinus hard tick. A computational in-depth structural analysis confirmed that LIRs belong to the lipocalin family. These proteins were called LIR for "Lipocalin from I. ricinus" and numbered from 1 to 14 (LIR1 to LIR14. According to their percentage identity/similarity, LIR proteins may be assigned to 6 distinct phylogenetic groups. The mature proteins have calculated pM and pI varying from 21.8 kDa to 37.2 kDa and from 4.45 to 9.57 respectively. In a western blot analysis, all recombinant LIRs appeared as a series of thin bands at 50-70 kDa, suggesting extensive glycosylation, which was experimentally confirmed by treatment with N-glycosidase F. In addition, the in vivo expression analysis of LIRs in I. ricinus, examined by RT-PCR, showed homogeneous expression profiles for certain phylogenetic groups and relatively heterogeneous profiles for other groups. Finally, we demonstrated that LIR6 codes for a protein that specifically binds leukotriene B4. CONCLUSIONS/SIGNIFICANCE: This work confirms that, regarding their biochemical properties, expression profile, and sequence signature, lipocalins in Ixodes hard tick genus, and more specifically in the Ixodes ricinus species, are segregated into distinct phylogenetic groups suggesting potential distinct function. This was particularly demonstrated by the ability of LIR6 to scavenge leukotriene B4. The other LIRs did not bind any of the ligands tested, such as 5-hydroxytryptamine, ADP, norepinephrine, platelet activating factor, prostaglandins D2 and E2, and finally leukotrienes B4 and C
Baños, Hector; Bushek, Nathaniel; Davidson, Ruth; Gross, Elizabeth; Harris, Pamela E.; Krone, Robert; Long, Colby; Stewart, Allen; Walker, Robert
2016-01-01
We introduce the package PhylogeneticTrees for Macaulay2 which allows users to compute phylogenetic invariants for group-based tree models. We provide some background information on phylogenetic algebraic geometry and show how the package PhylogeneticTrees can be used to calculate a generating set for a phylogenetic ideal as well as a lower bound for its dimension. Finally, we show how methods within the package can be used to compute a generating set for the join of any two ideals.
Zhang, Dong; Yan, Liping; Zhang, Ming; Chu, Hongjun; Cao, Jie; Li, Kai; Hu, Defu; Pape, Thomas
2016-01-01
The complete mitogenome of the horse stomach bot fly Gasterophilus pecorum (Fabricius) and a near-complete mitogenome of Wohlfahrt's wound myiasis fly Wohlfahrtia magnifica (Schiner) were sequenced. The mitogenomes contain the typical 37 mitogenes found in metazoans, organized in the same order and orientation as in other cyclorrhaphan Diptera. Phylogenetic analyses of mitogenomes from 38 calyptrate taxa with and without two non-calyptrate outgroups were performed using Bayesian Inference and Maximum Likelihood. Three sub-analyses were performed on the concatenated data: (1) not partitioned; (2) partitioned by gene; (3) 3rd codon positions of protein-coding genes omitted. We estimated the contribution of each of the mitochondrial genes for phylogenetic analysis, as well as the effect of some popular methodologies on calyptrate phylogeny reconstruction. In the favoured trees, the Oestroidea are nested within the muscoid grade. Relationships at the family level within Oestroidea are (remaining Calliphoridae (Sarcophagidae (Oestridae, Pollenia + Tachinidae))). Our mito-phylogenetic reconstruction of the Calyptratae presents the most extensive taxon coverage so far, and the risk of long-branch attraction is reduced by an appropriate selection of outgroups. We find that in the Calyptratae the ND2, ND5, ND1, COIII, and COI genes are more phylogenetically informative compared with other mitochondrial protein-coding genes. Our study provides evidence that data partitioning and the inclusion of conserved tRNA genes have little influence on calyptrate phylogeny reconstruction, and that the 3rd codon positions of protein-coding genes are not saturated and therefore should be included. PMID:27019632
Short segment search method for phylogenetic analysis using nested sliding windows
Iskandar, A. A.; Bustamam, A.; Trimarsanto, H.
2017-10-01
To analyze phylogenetics in Bioinformatics, coding DNA sequences (CDS) segment is needed for maximal accuracy. However, analysis by CDS cost a lot of time and money, so a short representative segment by CDS, which is envelope protein segment or non-structural 3 (NS3) segment is necessary. After sliding window is implemented, a better short segment than envelope protein segment and NS3 is found. This paper will discuss a mathematical method to analyze sequences using nested sliding window to find a short segment which is representative for the whole genome. The result shows that our method can find a short segment which more representative about 6.57% in topological view to CDS segment than an Envelope segment or NS3 segment.
Isolation and phylogenetic analysis of canine distemper virus among domestic dogs in Vietnam.
Nguyen, Dung Van; Suzuki, Junko; Minami, Shohei; Yonemitsu, Kenzo; Nagata, Nao; Kuwata, Ryusei; Shimoda, Hiroshi; Vu, Chien Kim; Truong, Thuy Quoc; Maeda, Ken
2017-01-20
Canine distemper virus (CDV) is one of the most serious pathogens found in many species of carnivores, including domestic dogs. In this study, hemagglutinin (H) genes were detected in five domestic Vietnamese dogs with diarrhea, and two CDVs were successfully isolated from dogs positive for H genes. The complete genome of one isolate, CDV/dog/HCM/33/140816, was determined. Phylogenetic analysis showed that all Vietnamese CDVs belonged to the Asia-1 genotype. In addition, the H proteins of Vietnamese CDV strains were the most homologous to those of Chinese CDVs (98.4% to 99.3% identity). These results indicated that the Asia-1 genotype of CDV was the predominant genotype circulating among the domestic dog population in Vietnam and that transboundary transmission of CDV has occurred between Vietnam and China.
Phylogenetic analysis of canine distemper virus in domestic dogs in Nanjing, China.
Bi, Zhenwei; Wang, Yongshan; Wang, Xiaoli; Xia, Xingxia
2015-02-01
Canine distemper virus (CDV) infects a broad range of carnivores, including wild and domestic Canidae. The hemagglutinin gene, which encodes the attachment protein that determines viral tropism, has been widely used to determine the relationship between CDV strains of different lineages circulating worldwide. We determined the full-length H gene sequences of seven CDV field strains detected in domestic dogs in Nanjing, China. A phylogenetic analysis of the H gene sequences of CDV strains from different geographic regions and vaccine strains was performed. Four of the seven CDV strains were grouped in the same cluster of the Asia-1 lineage to which the vast majority of Chinese CDV strains belong, whereas the other three were clustered within the Asia-4 lineage, which has never been detected in China. This represents the first record of detection of strains of the Asia-4 lineage in China since this lineage was reported in Thailand in 2013.
Directory of Open Access Journals (Sweden)
R. Kann
2006-06-01
Full Text Available Feline immunodeficiency virus (FIV, a lentivirus, is an important pathogen of domestic cats around the world and has many similarities to human immunodeficiency virus (HIV. A characteristic of these lentiviruses is their extensive genetic diversity, which has been an obstacle in the development of successful vaccines. Of the FIV genes, the envelope gene is the most variable and sequence differences in a portion of this gene have been used to define 5 FIV subtypes (A, B, C, D and E. In this study, the proviral DNA sequence of the V3-V5 region of the envelope gene was determined in blood samples from 31 FIV positive cats from 4 different regions of South Africa. Phylogenetic analysis demonstrated the presence of both subtypes A and C, with subtype A predominating. These findings contribute to the understanding of the genetic diversity of FIV.
Li, Chen; He, Liping; Chen, Chong; Cai, Lingchao; Chen, Pingping; Yang, Shoubao
2016-01-01
Sarcocheilichthys sinensis sinensis (Bleeker, 1871), is a small benthopelagic freshwater species with high nutritional and ornamental value. In this study, the complete mitochondrial genome of S. sinensis sinensis was determined; the phylogenetic analysis with another individual and closely related species of Sarcocheilichthys fishes was carried out. The complete mitogenome of S. sinensis sinensis was 16683 bp in length, consist of 13 protein-coding genes, 2 rRNA genes, 22 tRNA genes and 2 non-coding regions: (D-loop and OL). It indicated that D-loop, ND2, and CytB may be appropriate molecular markers for studying population genetics and conservation biology of Sarcocheilichthys fishes.
Directory of Open Access Journals (Sweden)
Heni Yohandini
2015-07-01
Full Text Available A community of thermophiles within Tanjung Sakti Hot Spring (South Sumatera have been cultivated and identified based on 16S ribosomal RNA gene sequence. The hot spring has temperature 80 °C–91 °C and pH 7–8. We used a simple method for culturing the microbes, by enriching the spring water with nutrient broth media. Phylogenetic analysis showed that the method could recover microbes, which clustered within four distinct taxonomic groups: Anoxybacillus, Geobacillus, Brevibacillus, and Bacillus. These microbes closely related to Anoxybacillus rupiensis, Anoxybacillus flavithermus, Geobacillus pallidus, Brevibacillus thermoruber, Bacillus licheniformis, and Bacillus thermoamylovorans. The 16S ribosomal RNA gene sequence of one isolate only had 96% similarity with Brevibacillus sequence in GenBank.
Directory of Open Access Journals (Sweden)
Weijuan Zhu and Xiaofeng Ren*
2012-05-01
Full Text Available Porcine circovirus type 2 (PCV2 is the major causative agent of post-weaning multisystemic wasting syndrome (PMWS. Infection by PCV2 may cause heavy losses in pig industry. In this study, we report the isolation of a newly emerging PCV2 from northeastern China. The complete genome of the PCV2 isolate named PCV2-LJR contains 1766 nucleotides and was compared with reference sequences published in GenBank followed by topology analysis of the resulting phylogenetic tree. The data indicated that the prevalent PCV2 isolates in the northeastern China had close relationship, although various genotypes of PCV2 existed. In addition, by gene recombination and transfection techniques, the PCV2 infectious clone was achieved and was able to rescue virus in vitro determined by indirect immunofluorescence assay and PCR. The obtained biological materials may be used for biological characterization of PCV2.
Amiroch, S.; Pradana, M. S.; Irawan, M. I.; Mukhlash, I.
2017-09-01
Multiple Alignment (MA) is a particularly important tool for studying the viral genome and determine the evolutionary process of the specific virus. Application of MA in the case of the spread of the Severe acute respiratory syndrome (SARS) epidemic is an interesting thing because this virus epidemic a few years ago spread so quickly that medical attention in many countries. Although there has been a lot of software to process multiple sequences, but the use of pairwise alignment to process MA is very important to consider. In previous research, the alignment between the sequences to process MA algorithm, Super Pairwise Alignment, but in this study used a dynamic programming algorithm Needleman wunchs simulated in Matlab. From the analysis of MA obtained and stable region and unstable which indicates the position where the mutation occurs, the system network topology that produced the phylogenetic tree of the SARS epidemic distance method, and system area networks mutation.
Borden, W Calvin
1999-02-01
Striated muscles of 15 species of unicornfishes (Naso, Acanthuridae) are described in detail. Of 93 muscles dissected, only five demonstrate intrageneric variation, providing only ten characters suitable for phylogenetic analysis. Thus, myology appears to be highly conservative at the species level and has been so for approximately 50-55 million years in this particular group of fishes. Furthermore, myology is static relative to osteology in Naso and at any taxonomic rank within fishes, implying osteology provides a larger but not necessarily more valuable data source for systematic studies. Although important for their epistomological value, these descriptions provide a basis for further studies ranging from functional, comparative, and systematic analyses, ultimately with the potential to address questions of historical ecology (i.e., speciation, adaptation, coevolution) within Naso. J. Morphol. 239:191-224, 1999. © 1999 Wiley-Liss, Inc. Copyright © 1999 Wiley-Liss, Inc.
Sequencing and phylogenetic analysis of tobacco virus 2, a polerovirus from Nicotiana tabacum.
Zhou, Benguo; Wang, Fang; Zhang, Xuesong; Zhang, Lina; Lin, Huafeng
2017-07-01
The complete genome sequence of a new virus, provisionally named tobacco virus 2 (TV2), was determined and identified from leaves of tobacco (Nicotiana tabacum) exhibiting leaf mosaic, yellowing, and deformity, in Anhui Province, China. The genome sequence of TV2 comprises 5,979 nucleotides, with 87% nucleotide sequence identity to potato leafroll virus (PLRV). Its genome organization is similar to that of PLRV, containing six open reading frames (ORFs) that potentially encode proteins with putative functions in cell-to-cell movement and suppression of RNA silencing. Phylogenetic analysis of the nucleotide sequence placed TV2 alongside members of the genus Polerovirus in the family Luteoviridae. To the best our knowledge, this study is the first report of a complete genome sequence of a new polerovirus identified in tobacco.
Phylogenetic Analysis of Petunia sensu Jussieu (Solanaceae) using Chloroplast DNA RFLP
ANDO, TOSHIO; KOKUBUN, HISASHI; WATANABE, HITOSHI; TANAKA, NORIO; YUKAWA, TOMOHISA; HASHIMOTO, GORO; MARCHESI, EDUARDO; SUÁREZ, ENRIQUE; BASUALDO, ISABEL L.
2005-01-01
• Background and Aims The phylogenetic relationships of Petunia sensu Jussieu (Petunia sensu Wijsman plus Calibrachoa) are unclear. This study aimed to resolve this uncertainty using molecular evidence.
Biswal, Devendra Kumar; Debnath, Manish; Kumar, Shakti; Tandon, Pramod
2012-01-01
The Nymphaeales (waterlilly and relatives) lineage has diverged as the second branch of basal angiosperms and comprises of two families: Cabombaceae and Nymphaceae. The classification of Nymphaeales and phylogeny within the flowering plants are quite intriguing as several systems (Thorne system, Dahlgren system, Cronquist system, Takhtajan system and APG III system (Angiosperm Phylogeny Group III system) have attempted to redefine the Nymphaeales taxonomy. There have been also fossil records consisting especially of seeds, pollen, stems, leaves and flowers as early as the lower Cretaceous. Here we present an in silico study of the order Nymphaeales taking maturaseK (matK) and internal transcribed spacer (ITS2) as biomarkers for phylogeny reconstruction (using character-based methods and Bayesian approach) and identification of motifs for DNA barcoding. The Maximum Likelihood (ML) and Bayesian approach yielded congruent fully resolved and well-supported trees using a concatenated (ITS2+ matK) supermatrix aligned dataset. The taxon sampling corroborates the monophyly of Cabombaceae. Nuphar emerges as a monophyletic clade in the family Nymphaeaceae while there are slight discrepancies in the monophyletic nature of the genera Nymphaea owing to Victoria-Euryale and Ondinea grouping in the same node of Nymphaeaceae. ITS2 secondary structures alignment corroborate the primary sequence analysis. Hydatellaceae emerged as a sister clade to Nymphaeaceae and had a basal lineage amongst the water lilly clades. Species from Cycas and Ginkgo were taken as outgroups and were rooted in the overall tree topology from various methods. MatK genes are fast evolving highly variant regions of plant chloroplast DNA that can serve as potential biomarkers for DNA barcoding and also in generating primers for angiosperms with identification of unique motif regions. We have reported unique genus specific motif regions in the Order Nymphaeles from matK dataset which can be further validated for
Monogenean anchor morphometry: systematic value, phylogenetic signal, and evolution
Soo, Oi Yoon Michelle; Tan, Wooi Boon; Lim, Lee Hong Susan
2016-01-01
Background. Anchors are one of the important attachment appendages for monogenean parasites. Common descent and evolutionary processes have left their mark on anchor morphometry, in the form of patterns of shape and size variation useful for systematic and evolutionary studies. When combined with morphological and molecular data, analysis of anchor morphometry can potentially answer a wide range of biological questions. Materials and Methods. We used data from anchor morphometry, body size and morphology of 13 Ligophorus (Monogenea: Ancyrocephalidae) species infecting two marine mugilid (Teleostei: Mugilidae) fish hosts: Moolgarda buchanani (Bleeker) and Liza subviridis (Valenciennes) from Malaysia. Anchor shape and size data (n = 530) were generated using methods of geometric morphometrics. We used 28S rRNA, 18S rRNA, and ITS1 sequence data to infer a maximum likelihood phylogeny. We discriminated species using principal component and cluster analysis of shape data. Adams’s Kmult was used to detect phylogenetic signal in anchor shape. Phylogeny-correlated size and shape changes were investigated using continuous character mapping and directional statistics, respectively. We assessed morphological constraints in anchor morphometry using phylogenetic regression of anchor shape against body size and anchor size. Anchor morphological integration was studied using partial least squares method. The association between copulatory organ morphology and anchor shape and size in phylomorphospace was used to test the Rohde-Hobbs hypothesis. We created monogeneaGM, a new R package that integrates analyses of monogenean anchor geometric morphometric data with morphological and phylogenetic data. Results. We discriminated 12 of the 13 Ligophorus species using anchor shape data. Significant phylogenetic signal was detected in anchor shape. Thus, we discovered new morphological characters based on anchor shaft shape, the length between the inner root point and the outer root
Goodwin, S B; Dunkle, L D; Zismann, V L
2001-07-01
ABSTRACT Most of the 3,000 named species in the genus Cercospora have no known sexual stage, although a Mycosphaerella teleomorph has been identified for a few. Mycosphaerella is an extremely large and important genus of plant pathogens, with more than 1,800 named species and at least 43 associated anamorph genera. The goal of this research was to perform a large-scale phylogenetic analysis to test hypotheses about the past evolutionary history of Cercospora and Mycosphaerella. Based on the phylogenetic analysis of internal transcribed spacer (ITS) sequence data (ITS1, 5.8S rRNA gene, ITS2), the genus Mycosphaerella is monophyletic. In contrast, many anamorph genera within Mycosphaerella were polyphyletic and were not useful for grouping species. One exception was Cercospora, which formed a highly supported monophyletic group. Most Cercospora species from cereal crops formed a subgroup within the main Cercospora cluster. Only species within the Cercospora cluster produced the toxin cercosporin, suggesting that the ability to produce this compound had a single evolutionary origin. Intraspecific variation for 25 taxa in the Mycosphaerella clade averaged 1.7 nucleotides (nts) in the ITS region. Thus, isolates with ITS sequences that differ by two or more nucleotides may be distinct species. ITS sequences of groups I and II of the gray leaf spot pathogen Cercospora zeae-maydis differed by 7 nts and clearly represent different species. There were 6.5 nt differences on average between the ITS sequences of the sorghum pathogen Cercospora sorghi and the maize pathogen Cercospora sorghi var. maydis, indicating that the latter is a separate species and not simply a variety of Cercospora sorghi. The large monophyletic Mycosphaerella cluster contained a number of anamorph genera with no known teleomorph associations. Therefore, the number of anamorph genera related to Mycosphaerella may be much larger than suspected previously.
Identification, expression and phylogenetic analysis of EgG1Y162 from Echinococcus granulosus
Zhang, Fengbo; Ma, Xiumin; Zhu, Yuejie; Wang, Hongying; Liu, Xianfei; Zhu, Min; Ma, Haimei; Wen, Hao; Fan, Haining; Ding, Jianbing
2014-01-01
Objective: This study was to clone, identify and analyze the characteristics of egG1Y162 gene from Echinococcus granulosus. Methods: Genomic DNA and total RNAs were extracted from four different developmental stages of protoscolex, germinal layer, adult and egg of Echinococcus granulosus, respectively. Fluorescent quantitative PCR was used for analyzing the expression of egG1Y162 gene. Prokaryotic expression plasmid of pET41a-EgG1Y162 was constructed to express recombinant His-EgG1Y162 antigen. Western blot analysis was performed to detect antigenicity of EgG1Y162 antigen. Gene sequence, amino acid alignment and phylogenetic tree of EgG1Y162 were analyzed by BLAST, online Spidey and MEGA4 software, respectively. Results: EgG1Y162 gene was expressed in four developmental stages of Echinococcus granulosus. And, egG1Y162 gene expression was the highest in the adult stage, with the relative value of 19.526, significantly higher than other three stages. Additionally, Western blot analysis revealed that EgG1Y162 recombinant protein had good reaction with serum samples from Echinococcus granulosus infected human and dog. Moreover, EgG1Y162 antigen was phylogenetically closest to EmY162 antigen, with the similarity over 90%. Conclusion: Our study identified EgG1Y162 antigen in Echinococcus granulosus for the first time. EgG1Y162 antigen had a high similarity with EmY162 antigen, with the genetic differences mainly existing in the intron region. And, EgG1Y162 recombinant protein showed good antigenicity. PMID:25337206
Detection and phylogenetic analysis of infectious pancreatic necrosis virus in Chile.
Tapia, D; Eissler, Y; Torres, P; Jorquera, E; Espinoza, J C; Kuznar, J
2015-10-27
Infectious pancreatic necrosis virus (IPNV) is the etiological agent of a highly contagious disease that is endemic to salmon farming in Chile and causes great economic losses to the industry. Here we compared different diagnostic methods to detect IPNV in field samples, including 3 real-time reverse transcription PCR (qRT-PCR) assays, cell culture isolation, and indirect fluorescent antibody test (IFAT). Additionally, we performed a phylogenetic analysis to investigate the genogroups prevailing in Chile, as well as their geographic distribution and virulence. The 3 qRT-PCR assays used primers that targeted regions of the VP2 and VP1 genes of the virus and were tested in 46 samples, presenting a fair agreement within their results. All samples were positive for at least 2 of the qRT-PCR assays, 29 were positive for cell culture, and 23 for IFAT, showing less sensitivity for these latter 2 methods. For the phylogenetic analysis, portions of 1180 and 523 bp of the VP2 region of segment A were amplified by RT-PCR, sequenced and compared with sequences from reference strains and from isolates reported by previous studies carried out in Chile. Most of the sequenced isolates belonged to genogroup 5 (European origin), and 5 were classified within genogroup 1 (American origin). Chilean isolates formed clusters within each of the genogroups found, evidencing a clear differentiation from the reference strains. To our knowledge, this is the most extensive study completed for IPNV in Chile, covering isolates from sea- and freshwater salmon farms and showing a high prevalence of this virus in the country.
Identification, expression and phylogenetic analysis of EgG1Y162 from Echinococcus granulosus.
Zhang, Fengbo; Ma, Xiumin; Zhu, Yuejie; Wang, Hongying; Liu, Xianfei; Zhu, Min; Ma, Haimei; Wen, Hao; Fan, Haining; Ding, Jianbing
2014-01-01
This study was to clone, identify and analyze the characteristics of egG1Y162 gene from Echinococcus granulosus. Genomic DNA and total RNAs were extracted from four different developmental stages of protoscolex, germinal layer, adult and egg of Echinococcus granulosus, respectively. Fluorescent quantitative PCR was used for analyzing the expression of egG1Y162 gene. Prokaryotic expression plasmid of pET41a-EgG1Y162 was constructed to express recombinant His-EgG1Y162 antigen. Western blot analysis was performed to detect antigenicity of EgG1Y162 antigen. Gene sequence, amino acid alignment and phylogenetic tree of EgG1Y162 were analyzed by BLAST, online Spidey and MEGA4 software, respectively. EgG1Y162 gene was expressed in four developmental stages of Echinococcus granulosus. And, egG1Y162 gene expression was the highest in the adult stage, with the relative value of 19.526, significantly higher than other three stages. Additionally, Western blot analysis revealed that EgG1Y162 recombinant protein had good reaction with serum samples from Echinococcus granulosus infected human and dog. Moreover, EgG1Y162 antigen was phylogenetically closest to EmY162 antigen, with the similarity over 90%. Our study identified EgG1Y162 antigen in Echinococcus granulosus for the first time. EgG1Y162 antigen had a high similarity with EmY162 antigen, with the genetic differences mainly existing in the intron region. And, EgG1Y162 recombinant protein showed good antigenicity.
Cruz, Alejandra; Marín, Patricia; González-Jaén, M Teresa; Aguilar, Kristel Grace I; Cumagun, Christian Joseph R
2013-09-01
Fusarium fujikuroi Nirenberg is a maize and rice pathogen causing important agricultural losses and produces fumonisins - mycotoxins which pose health risk to humans and farm animals. However, little information is available about the phylogenetics of this species and its ability to produce fumonisins in rice. We studied 32 strains isolated from rice in the Philippines and performed a phylogenetic analysis using the partial sequence of Elongation Factor 1 alpha (EF-1α) including isolates belonging to closely related species. Fumonisin B1 (FB1 ) production was analyzed in 7-day-old cultures grown in fumonisin-inducing medium by an enzyme-linked immunosorbent assay-based method and by real-time reverse transcriptase-polymerase chain reaction using primers for FUM1 gene, a key gene in fumonisin biosynthesis. Nucleotide diversities per site (π) were 0.00024 ± 0.00022 (standard deviation) for the 32 F. fujikuroi strains from the Philippines and 0.00189 ± 0.00143 for all 34 F. fujikuroi strains, respectively. F. fujikuroi isolates grouped into one cluster separated from the rest of isolates belonging to the closely related F. proliferatum and showed very low variability, irrespective of their geographic origin. The cluster containing strains of F. proliferatum showed higher intraspecific variability than F. fujikuroi. Thirteen of the 32 strains analyzed were FB1 producers (40.62%), with production ranging from 0.386 to 223.83 ppm. All isolates analyzed showed FUM1 gene expression above 1 and higher than the CT value of the non-template control sample. Both seedling stunting and elongation were induced by the isolates in comparison with the control. F. fujikuroi are distinct from F. proliferatum isolates based on phytogenetic analysis and are potential fumonisin producers because all are positive for FUM1 gene expression. No relationship between fumonisin production and pathogenicity could be observed. © 2013 Society of Chemical Industry.
The origin and evolution of Basigin(BSG) gene: A comparative genomic and phylogenetic analysis.
Zhu, Xinyan; Wang, Shenglan; Shao, Mingjie; Yan, Jie; Liu, Fei
2017-07-01
Basigin (BSG), also known as extracellular matrix metalloproteinase inducer (EMMPRIN) or cluster of differentiation 147 (CD147), plays various fundamental roles in the intercellular recognition involved in immunologic phenomena, differentiation, and development. In this study, we aimed to compare the similarities and differences of BSG among organisms and explore possible evolutionary relationships based on the comparison result. We used the extensive BLAST tool to search the metazoan genomes, N-glycosylation sites, the transmembrane region and other functional sites. We then identified BSG homologs from genomic sequences and analyzed their phylogenetic relationships. We identified that BSG genes exist not only in the vertebrate metazoans but also in the invertebrate metazoans such as Amphioxus B. floridae, D. melanogaster, A. mellifera, S. japonicum, C. gigas, and T. patagoniensis. After sequence analysis, we confirmed that only vertebrate metazoans and Cephalochordate (amphioxus B. floridae) have the classic structure (a signal peptide, two Ig-like domains (IgC2 and IgI), a transmembrane region, and an intracellular domain). The invertebrate metazoans (excluding amphioxus B. floridae) lack the N-terminal signal peptides and IgC2 domain. We then generated a phylogenetic tree, genome organization comparison, and chromosomal disposition analysis based on the biological information obtained from the NCBI and Ensembl databases. Finally, we established the possible evolutionary scenario of the BSG gene, which showed the restricted exon rearrangement that has occurred during evolution, forming the present-day BSG gene. Copyright © 2017 Elsevier Ltd. All rights reserved.
Weisrock, David W; Macey, J Robert; Matsui, Masafumi; Mulcahy, Daniel G; Papenfuss, Theodore J
2013-01-01
The salamander family Hynobiidae contains over 50 species and has been the subject of a number of molecular phylogenetic investigations aimed at reconstructing branches across the entire family. In general, studies using the greatest amount of sequence data have used reduced taxon sampling, while the study with the greatest taxon sampling has used a limited sequence data set. Here, we provide insights into the phylogenetic history of the Hynobiidae using both dense taxon sampling and a large mitochondrial DNA sequence data set. We report exclusive new mitochondrial DNA data of 2566 aligned bases (with 151 excluded sites, of included sites 1157 are variable with 957 parsimony informative). This is sampled from two genic regions encoding a 12S-16S region (the 3' end of 12S rRNA, tRNA(VAI), and the 5' end of 16S rRNA), and a ND2-COI region (ND2, tRNA(Trp), tRNA(Ala), tRNA(Asn), the origin for light strand replication--O(L), tRNA(Cys), tRNAT(Tyr), and the 5' end of COI). Analyses using parsimony, Bayesian, and maximum likelihood optimality criteria produce similar phylogenetic trees, with discordant branches generally receiving low levels of branch support. Monophyly of the Hynobiidae is strongly supported across all analyses, as is the sister relationship and deep divergence between the genus Onychodactylus with all remaining hynobiids. Within this latter grouping our phylogenetic results identify six clades that are relatively divergent from one another, but for which there is minimal support for their phylogenetic placement. This includes the genus Batrachuperus, the genus Hynobius, the genus Pachyhynobius, the genus Salamandrella, a clade containing the genera Ranodon and Paradactylodon, and a clade containing the genera Liua and Pseudohynobius. This latter clade receives low bootstrap support in the parsimony analysis, but is consistent across all three analytical methods. Our results also clarify a number of well-supported relationships within the larger
Hu, Chao; Tian, Huaizhen; Li, Hongqing; Hu, Aiqun; Xing, Fuwu; Bhattacharjee, Avishek; Hsu, Tianchuan; Kumar, Pankaj; Chung, Shihwen
2016-01-01
A molecular phylogeny of Asiatic species of Goodyera (Orchidaceae, Cranichideae, Goodyerinae) based on the nuclear ribosomal internal transcribed spacer (ITS) region and two chloroplast loci (matK and trnL-F) was presented. Thirty-five species represented by 132 samples of Goodyera were analyzed, along with other 27 genera/48 species, using Pterostylis longifolia and Chloraea gaudichaudii as outgroups. Bayesian inference, maximum parsimony and maximum likelihood methods were used to reveal th...
Phylogenetic analysis of local-scale tree soil associations in a lowland moist tropical forest.
Directory of Open Access Journals (Sweden)
Laura A Schreeg
Full Text Available BACKGROUND: Local plant-soil associations are commonly studied at the species-level, while associations at the level of nodes within a phylogeny have been less well explored. Understanding associations within a phylogenetic context, however, can improve our ability to make predictions across systems and can advance our understanding of the role of evolutionary history in structuring communities. METHODOLOGY/PRINCIPAL FINDINGS: Here we quantified evolutionary signal in plant-soil associations using a DNA sequence-based community phylogeny and several soil variables (e.g., extractable phosphorus, aluminum and manganese, pH, and slope as a proxy for soil water. We used published plant distributional data from the 50-ha plot on Barro Colorado Island (BCI, Republic of Panamá. Our results suggest some groups of closely related species do share similar soil associations. Most notably, the node shared by Myrtaceae and Vochysiaceae was associated with high levels of aluminum, a potentially toxic element. The node shared by Apocynaceae was associated with high extractable phosphorus, a nutrient that could be limiting on a taxon specific level. The node shared by the large group of Laurales and Magnoliales was associated with both low extractable phosphorus and with steeper slope. Despite significant node-specific associations, this study detected little to no phylogeny-wide signal. We consider the majority of the 'traits' (i.e., soil variables evaluated to fall within the category of ecological traits. We suggest that, given this category of traits, phylogeny-wide signal might not be expected while node-specific signals can still indicate phylogenetic structure with respect to the variable of interest. CONCLUSIONS: Within the BCI forest dynamics plot, distributions of some plant taxa are associated with local-scale differences in soil variables when evaluated at individual nodes within the phylogenetic tree, but they are not detectable by phylogeny