WorldWideScience

Sample records for speaker-independent phoneme alignment

  1. Speaker-specific variability of phoneme durations

    CSIR Research Space (South Africa)

    Van Heerden, CJ

    2007-11-01

Full Text Available The durations of phonemes vary across speakers. To this end, the correlations between phoneme durations across different speakers are studied, and a novel approach to predicting unknown phoneme durations from the values of known phoneme durations for a...
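
The core idea of the abstract — predicting a speaker's unknown phoneme durations from that speaker's known ones — can be sketched as a simple linear regression. This is not the paper's method; the linear model, the synthetic data, and the per-speaker "rate" factor driving the correlations are all assumptions made for illustration:

```python
import numpy as np

# Sketch (not the paper's method): predict a speaker's duration for one
# phoneme from that speaker's durations for other phonemes, using a
# linear model fitted across training speakers.
rng = np.random.default_rng(0)

# Synthetic training data: rows = speakers, cols = mean durations (ms)
# of 4 known phonemes; durations correlate through a speaker "rate".
rate = rng.normal(1.0, 0.1, size=50)            # per-speaker speech rate
base = np.array([80.0, 120.0, 60.0, 100.0])     # phoneme-typical durations
known = rate[:, None] * base + rng.normal(0, 2, size=(50, 4))
target = rate * 150.0 + rng.normal(0, 2, size=50)  # 5th phoneme's duration

# Fit least-squares weights (with intercept) on training speakers.
X = np.column_stack([known, np.ones(len(known))])
w, *_ = np.linalg.lstsq(X, target, rcond=None)

def predict_duration(known_durs):
    """Predict the unseen phoneme's duration from known durations."""
    return float(np.append(known_durs, 1.0) @ w)

# A new speaker speaking about 10% slower than average:
pred = predict_duration(1.1 * base)
```

Because all durations here are driven by a shared rate factor, the regression recovers the unseen phoneme's duration (about 165 ms for this speaker) almost exactly.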

  2. Phoneme Error Pattern by Heritage Speakers of Spanish on an English Word Recognition Test.

    Science.gov (United States)

    Shi, Lu-Feng

    2017-04-01

Heritage speakers acquire their native language from home use in their early childhood. As the native language is typically a minority language in the society, these individuals receive their formal education in the majority language and eventually develop greater competency in the majority language than in their native language. To date, there have not been specific research attempts to understand word recognition by heritage speakers. It is not clear if and to what degree we may infer from evidence based on bilingual listeners in general. This preliminary study investigated how heritage speakers of Spanish perform on an English word recognition test and analyzed their phoneme errors. A prospective, cross-sectional, observational design was employed. Twelve normal-hearing adult Spanish heritage speakers (four men, eight women, 20-38 yr old) participated in the study. Their language background was obtained through the Language Experience and Proficiency Questionnaire. Nine English monolingual listeners (three men, six women, 20-41 yr old) were also included for comparison purposes. Listeners were presented with 200 Northwestern University Auditory Test No. 6 words in quiet. They repeated each word orally and in writing. Their responses were scored by word, word-initial consonant, vowel, and word-final consonant. Performance was compared between groups with Student's t test or analysis of variance. Group-specific error patterns were primarily descriptive, but intergroup comparisons were made using 95% or 99% confidence intervals for proportional data. The two groups of listeners yielded comparable scores when their responses were examined by word, vowel, and final consonant. However, heritage speakers of Spanish misidentified significantly more word-initial consonants and had significantly more difficulty with initial /p, b, h/ than their monolingual peers. The two groups yielded similar patterns for vowel and word-final consonants, but heritage speakers made significantly

  3. Audiovisual perceptual learning with multiple speakers.

    Science.gov (United States)

    Mitchel, Aaron D; Gerfen, Chip; Weiss, Daniel J

    2016-05-01

One challenge for speech perception is between-speaker variability in the acoustic parameters of speech. For example, the same phoneme (e.g. the vowel in "cat") may have substantially different acoustic properties when produced by two different speakers and yet the listener must be able to interpret these disparate stimuli as equivalent. Perceptual tuning, the use of contextual information to adjust phonemic representations, may be one mechanism that helps listeners overcome obstacles they face due to this variability during speech perception. Here we test whether visual contextual cues to speaker identity may facilitate the formation and maintenance of distributional representations for individual speakers, allowing listeners to adjust phoneme boundaries in a speaker-specific manner. We familiarized participants to an audiovisual continuum between /aba/ and /ada/. During familiarization, the "b-face" mouthed /aba/ when an ambiguous token was played, while the "d-face" mouthed /ada/. At test, the same ambiguous token was more likely to be identified as /aba/ when paired with a stilled image of the "b-face" than with an image of the "d-face." This was not the case in the control condition when the two faces were paired equally with the ambiguous token. Together, these results suggest that listeners may form speaker-specific phonemic representations using facial identity cues.

  4. Relationships between Categorical Perception of Phonemes, Phoneme Awareness, and Visual Attention Span in Developmental Dyslexia.

    Directory of Open Access Journals (Sweden)

    Rachel Zoubrinetzky

Full Text Available We tested the hypothesis that the categorical perception deficit of speech sounds in developmental dyslexia is related to phoneme awareness skills, whereas a visual attention (VA) span deficit constitutes an independent deficit. Phoneme awareness tasks, VA span tasks and categorical perception tasks of phoneme identification and discrimination using a d/t voicing continuum were administered to 63 dyslexic children and 63 control children matched on chronological age. Results showed significant differences in categorical perception between the dyslexic and control children. Significant correlations were found between categorical perception skills, phoneme awareness and reading. Although VA span correlated with reading, no significant correlations were found between either categorical perception or phoneme awareness and VA span. Mediation analyses performed on the whole dyslexic sample suggested that the effect of categorical perception on reading might be mediated by phoneme awareness. This relationship was independent of the participants' VA span abilities. Two groups of dyslexic children with a single phoneme awareness or a single VA span deficit were then identified. The phonologically impaired group showed lower categorical perception skills than the control group but categorical perception was similar in the VA span impaired dyslexic and control children. The overall findings suggest that the link between categorical perception, phoneme awareness and reading is independent from VA span skills. These findings provide new insights on the heterogeneity of developmental dyslexia. They suggest that phonological processes and VA span independently affect reading acquisition.
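
The mediation analysis the abstract reports can be illustrated with a generic product-of-coefficients sketch on synthetic data. This is not the study's data or its exact statistical procedure; the variable names and effect sizes below are invented for illustration:

```python
import numpy as np

# Sketch of a regression-based mediation analysis (product of
# coefficients), of the kind used to test whether the effect of
# categorical perception (X) on reading (Y) runs through phoneme
# awareness (M). All data here are synthetic.
rng = np.random.default_rng(1)
n = 200
X = rng.normal(size=n)                                  # categorical perception
M = 0.6 * X + rng.normal(scale=0.5, size=n)             # phoneme awareness
Y = 0.7 * M + 0.1 * X + rng.normal(scale=0.5, size=n)   # reading

def ols(y, *cols):
    """Least-squares coefficients for y ~ 1 + cols."""
    A = np.column_stack([np.ones(len(y)), *cols])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    return beta

a = ols(M, X)[1]        # path X -> M
b = ols(Y, X, M)[2]     # path M -> Y, controlling for X
c = ols(Y, X)[1]        # total effect of X on Y
indirect = a * b        # mediated (indirect) effect
direct = c - indirect   # remaining direct effect
```

A substantial `indirect` relative to `direct` is the pattern consistent with "categorical perception affects reading via phoneme awareness".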

  7. Data-Model Relationship in Text-Independent Speaker Recognition

    Directory of Open Access Journals (Sweden)

    Stapert Robert

    2005-01-01

Full Text Available Text-independent speaker recognition systems such as those based on Gaussian mixture models (GMMs) do not include time sequence information (TSI) within the model itself. The level of importance of TSI in speaker recognition is an interesting question and one addressed in this paper. Recent work has shown that the utilisation of higher-level information such as idiolect, pronunciation, and prosodics can be useful in reducing speaker recognition error rates. In accordance with these developments, the aim of this paper is to show that as more data becomes available, the basic GMM can be enhanced by utilising TSI, even in a text-independent mode. This paper presents experimental work incorporating TSI into the conventional GMM. The resulting system, known as the segmental mixture model (SMM), embeds dynamic time warping (DTW) into a GMM framework. Results are presented on the 2000-speaker SpeechDat Welsh database which show improved speaker recognition performance with the SMM.
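
The idea of embedding DTW in a Gaussian framework can be sketched as follows. This is an illustrative simplification, not the paper's SMM: test frames are warped against a left-to-right sequence of speaker-specific Gaussian "segments", with negative Gaussian log-likelihood as the local alignment cost:

```python
import numpy as np

# Illustrative sketch (not the paper's exact SMM): align a test
# sequence of feature frames to a left-to-right sequence of
# speaker-specific Gaussian segments with dynamic time warping,
# using negative Gaussian log-likelihood as the local cost.
def gauss_nll(x, mean, var):
    """Negative log-likelihood of frame x under a diagonal Gaussian."""
    return 0.5 * np.sum(np.log(2 * np.pi * var) + (x - mean) ** 2 / var)

def dtw_score(frames, means, variances):
    """Best warped alignment cost of frames against segment models."""
    T, S = len(frames), len(means)
    cost = np.full((T + 1, S + 1), np.inf)
    cost[0, 0] = 0.0
    for t in range(1, T + 1):
        for s in range(1, S + 1):
            local = gauss_nll(frames[t - 1], means[s - 1], variances[s - 1])
            cost[t, s] = local + min(cost[t - 1, s],      # stay in segment
                                     cost[t - 1, s - 1])  # advance segment
    return cost[T, S]

rng = np.random.default_rng(2)
means = np.array([[0.0, 0.0], [3.0, 3.0], [0.0, 3.0]])  # 3 segments, 2-D
variances = np.ones_like(means)

# Test utterance that follows the speaker's segment sequence:
frames = np.vstack([rng.normal(m, 1.0, size=(5, 2)) for m in means])
matched = dtw_score(frames, means, variances)
# Same frames scored against a mismatched (reversed) segment order:
mismatched = dtw_score(frames, means[::-1], variances[::-1])
```

Unlike a plain GMM score, which is invariant to frame order, the aligned cost penalises an utterance whose temporal structure does not match the model, which is exactly the time-sequence information the abstract refers to.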

  8. Recognition of speaker-dependent continuous speech with KEAL

    Science.gov (United States)

    Mercier, G.; Bigorgne, D.; Miclet, L.; Le Guennec, L.; Querre, M.

    1989-04-01

A description of the speaker-dependent continuous speech recognition system KEAL is given. An unknown utterance is recognized by means of the following procedures: acoustic analysis, phonetic segmentation and identification, word and sentence analysis. The combination of feature-based, speaker-independent coarse phonetic segmentation with speaker-dependent statistical classification techniques is one of the main design features of the acoustic-phonetic decoder. The lexical access component is essentially based on a statistical dynamic programming technique which aims at matching a phonemic lexical entry containing various phonological forms, against a phonetic lattice. Sentence recognition is achieved by use of a context-free grammar and a parsing algorithm derived from Earley's parser. A speaker adaptation module allows some of the system parameters to be adjusted by matching known utterances with their acoustical representation. The task to be performed, described by its vocabulary and its grammar, is given as a parameter of the system. Continuously spoken sentences extracted from a 'pseudo-Logo' language are analyzed and results are presented.
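
The lexical access step — dynamic-programming matching of phonemic lexical entries with pronunciation variants against the decoded phones — can be sketched with a simple edit-distance search. This is a simplification of KEAL's actual component: a flat phone string stands in for the phonetic lattice, and the mini-lexicon (French-like command words, in the spirit of a 'pseudo-Logo' language) is entirely hypothetical:

```python
import numpy as np

def edit_distance(a, b):
    """Levenshtein distance between two phoneme sequences."""
    d = np.zeros((len(a) + 1, len(b) + 1), dtype=int)
    d[:, 0] = np.arange(len(a) + 1)
    d[0, :] = np.arange(len(b) + 1)
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            sub = d[i - 1, j - 1] + (a[i - 1] != b[j - 1])
            d[i, j] = min(sub, d[i - 1, j] + 1, d[i, j - 1] + 1)
    return int(d[len(a), len(b)])

# Hypothetical mini-lexicon: each word lists phonological variants.
lexicon = {
    "avance": [("a", "v", "an", "s"), ("a", "v", "an", "s", "e")],
    "recule": [("r", "e", "k", "y", "l")],
}

def lexical_access(phones):
    """Return the word whose best pronunciation variant matches phones."""
    return min(lexicon, key=lambda w: min(edit_distance(phones, v)
                                          for v in lexicon[w]))

word = lexical_access(("a", "v", "an", "s"))
```

Replacing the unit insert/delete/substitute costs with phone-confusion log-probabilities, and the input string with a lattice, turns this toy into the statistical matching the abstract describes.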

  9. Evaluation of soft segment modeling on a context independent phoneme classification system

    International Nuclear Information System (INIS)

    Razzazi, F.; Sayadiyan, A.

    2007-01-01

The geometric distribution of state durations is one of the main performance-limiting assumptions of hidden Markov modeling of speech signals. Stochastic segment models in general, and segmental HMMs in particular, partly overcome this deficiency at the cost of more complexity in both the training and recognition phases. In addition to this assumption, the gradual temporal changes of speech statistics have not been modeled in HMMs. In this paper, a new duration modeling approach is presented. The main idea of the model is to consider the effect of adjacent segments on the probability density function estimation and evaluation of each acoustic segment. This idea not only makes the model robust against segmentation errors, but also models the gradual change from one segment to the next with a minimum set of parameters. The proposed idea is analytically formulated and tested on a TIMIT-based context-independent phoneme classification system. During the test procedure, phoneme classification of different phoneme classes was performed by applying the various proposed recognition algorithms. The system was optimized and the results have been compared with a continuous density hidden Markov model (CDHMM) of similar computational complexity. The results show an 8-10% improvement in phoneme recognition rate in comparison with the standard continuous density hidden Markov model. This indicates improved compatibility of the proposed model with the nature of speech. (author)
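
The geometric-duration limitation the abstract starts from can be made concrete numerically. The sketch below (illustrative parameter values, not from the paper) contrasts the duration distribution implied by an HMM self-loop with a gamma-shaped alternative of the kind segment models can use:

```python
import numpy as np
from math import gamma as gamma_fn

# Sketch of the duration-modeling issue: an HMM state with self-loop
# probability p implies a geometric duration distribution, whereas real
# phone durations are better matched by a peaked density such as a gamma.
def geometric_pmf(d, p):
    """P(duration = d frames) for an HMM state with self-loop prob p."""
    return (1 - p) * p ** (d - 1)

durations = np.arange(1, 40)
p = 0.9  # self-loop probability -> mean duration 1/(1-p) = 10 frames
geo = geometric_pmf(durations, p)

# Geometric mass is always maximal at d = 1: very short durations are
# never penalised, which is unrealistic for phones.
assert np.argmax(geo) == 0

# A gamma-shaped duration model (shape k > 1) instead peaks at a
# typical duration, at the mode (k - 1) * theta = 8 frames here.
k, theta = 5.0, 2.0
gam = durations ** (k - 1) * np.exp(-durations / theta) / (gamma_fn(k) * theta ** k)
```

The geometric curve decays from frame 1, while the gamma curve rises to a realistic modal duration before decaying; this mismatch is what explicit duration modeling repairs.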

  10. An orthographic effect in phoneme processing, and its limitations

    Directory of Open Access Journals (Sweden)

Anne Cutler

    2012-02-01

Full Text Available To examine whether lexically stored knowledge about spelling influences phoneme evaluation, we conducted three experiments with a low-level phonetic judgement task: phoneme goodness rating. In each experiment, listeners heard phonetic tokens varying along a continuum centred on /s/, occurring finally in isolated word or nonword tokens. An effect of spelling appeared in Experiment 1: Native English speakers’ goodness ratings for the best /s/ tokens were significantly higher in words spelled with S (e.g., bless) than in words spelled with C (e.g., voice). No such difference appeared when nonnative speakers rated the same materials in Experiment 2, indicating that the difference could not be due to acoustic characteristics of the S- versus C-words. In Experiment 3, nonwords with lexical neighbours consistently spelled with S (e.g., pless) versus with C (e.g., floice) failed to elicit orthographic neighbourhood effects; no significant difference appeared in native English speakers’ ratings for the S-consistent versus the C-consistent sets. Obligatory influence of lexical knowledge on phonemic processing would have predicted such neighbourhood effects; the findings are thus better accommodated by models in which phonemic decisions draw strategically upon lexical information.

  11. Speech rate normalization used to improve speaker verification

    CSIR Research Space (South Africa)

    Van Heerden, CJ

    2006-11-01

Full Text Available the normalized durations is then compared with the EER using unnormalized durations, and also with the EER when duration information is not employed. 2. Proposed phoneme duration modeling 2.1. Choosing parametric models Since the duration of a phoneme... the known transcription and the speaker-specific acoustic model described above. Only one pronunciation per word was allowed, thus resulting in 49 triphones. To decide which parametric model to use for the duration density functions of the triphones...

  12. Effects of emotion on different phoneme classes

    Science.gov (United States)

    Lee, Chul Min; Yildirim, Serdar; Bulut, Murtaza; Busso, Carlos; Kazemzadeh, Abe; Lee, Sungbok; Narayanan, Shrikanth

    2004-10-01

This study investigates the effects of emotion on different phoneme classes using short-term spectral features. In the research on emotion in speech, most studies have focused on prosodic features of speech. In this study, based on the hypothesis that different emotions have varying effects on the properties of the different speech sounds, we investigate the usefulness of phoneme-class level acoustic modeling for automatic emotion classification. Hidden Markov models (HMM) based on short-term spectral features for five broad phonetic classes are used for this purpose using data obtained from recordings of two actresses. Each speaker produces 211 sentences with four different emotions (neutral, sad, angry, happy). Using the speech material we trained and compared the performances of two sets of HMM classifiers: a generic set of "emotional speech" HMMs (one for each emotion) and a set of broad phonetic-class based HMMs (vowel, glide, nasal, stop, fricative) for each emotion type considered. Comparison of classification results indicates that different phoneme classes were affected differently by emotional change and that the vowel sounds are the most important indicator of emotions in speech. Detailed results and their implications on the underlying speech articulation will be discussed.
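
The phoneme-class-conditional modeling scheme can be sketched as follows. This is a deliberate simplification of the paper's setup: each (emotion, phonetic class) HMM is collapsed to a single diagonal Gaussian, and all data are synthetic, with emotions separating vowel features most strongly:

```python
import numpy as np

# Sketch of phoneme-class-conditional emotion classification. The paper
# uses HMMs per broad phonetic class; here each (emotion, class) model
# is collapsed to a single diagonal Gaussian for brevity.
rng = np.random.default_rng(3)
EMOTIONS = ["neutral", "sad", "angry", "happy"]
CLASSES = ["vowel", "glide", "nasal", "stop", "fricative"]

# Synthetic "spectral feature" means: emotions shift vowel features most.
true_mean = {(e, c): np.array([i * 1.0, j * (2.0 if c == "vowel" else 0.5)])
             for i, c in enumerate(CLASSES)
             for j, e in enumerate(EMOTIONS)}

def train(n=100):
    """Fit one Gaussian per (emotion, phonetic class) from samples."""
    models = {}
    for key, mu in true_mean.items():
        data = rng.normal(mu, 0.3, size=(n, 2))
        models[key] = (data.mean(axis=0), data.var(axis=0) + 1e-6)
    return models

def classify(models, segments):
    """segments: list of (phonetic_class, feature_vector) pairs.
    Sum per-segment Gaussian log-likelihoods under each emotion."""
    def loglik(x, mu, var):
        return -0.5 * np.sum(np.log(2 * np.pi * var) + (x - mu) ** 2 / var)
    scores = {e: sum(loglik(x, *models[(e, c)]) for c, x in segments)
              for e in EMOTIONS}
    return max(scores, key=scores.get)

models = train()
utterance = [(c, rng.normal(true_mean[("angry", c)], 0.3)) for c in CLASSES]
pred = classify(models, utterance)
```

Because the synthetic vowel features carry the largest emotional shift, they dominate the summed log-likelihood, mirroring the paper's finding that vowels are the strongest emotion indicator.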

  13. Speaker Input Variability Does Not Explain Why Larger Populations Have Simpler Languages.

    Science.gov (United States)

    Atkinson, Mark; Kirby, Simon; Smith, Kenny

    2015-01-01

    A learner's linguistic input is more variable if it comes from a greater number of speakers. Higher speaker input variability has been shown to facilitate the acquisition of phonemic boundaries, since data drawn from multiple speakers provides more information about the distribution of phonemes in a speech community. It has also been proposed that speaker input variability may have a systematic influence on individual-level learning of morphology, which can in turn influence the group-level characteristics of a language. Languages spoken by larger groups of people have less complex morphology than those spoken in smaller communities. While a mechanism by which the number of speakers could have such an effect is yet to be convincingly identified, differences in speaker input variability, which is thought to be larger in larger groups, may provide an explanation. By hindering the acquisition, and hence faithful cross-generational transfer, of complex morphology, higher speaker input variability may result in structural simplification. We assess this claim in two experiments which investigate the effect of such variability on language learning, considering its influence on a learner's ability to segment a continuous speech stream and acquire a morphologically complex miniature language. We ultimately find no evidence to support the proposal that speaker input variability influences language learning and so cannot support the hypothesis that it explains how population size determines the structural properties of language.

  14. When speaker identity is unavoidable: Neural processing of speaker identity cues in natural speech.

    Science.gov (United States)

    Tuninetti, Alba; Chládková, Kateřina; Peter, Varghese; Schiller, Niels O; Escudero, Paola

    2017-11-01

    Speech sound acoustic properties vary largely across speakers and accents. When perceiving speech, adult listeners normally disregard non-linguistic variation caused by speaker or accent differences, in order to comprehend the linguistic message, e.g. to correctly identify a speech sound or a word. Here we tested whether the process of normalizing speaker and accent differences, facilitating the recognition of linguistic information, is found at the level of neural processing, and whether it is modulated by the listeners' native language. In a multi-deviant oddball paradigm, native and nonnative speakers of Dutch were exposed to naturally-produced Dutch vowels varying in speaker, sex, accent, and phoneme identity. Unexpectedly, the analysis of mismatch negativity (MMN) amplitudes elicited by each type of change shows a large degree of early perceptual sensitivity to non-linguistic cues. This finding on perception of naturally-produced stimuli contrasts with previous studies examining the perception of synthetic stimuli wherein adult listeners automatically disregard acoustic cues to speaker identity. The present finding bears relevance to speech normalization theories, suggesting that at an unattended level of processing, listeners are indeed sensitive to changes in fundamental frequency in natural speech tokens. Copyright © 2017 Elsevier Inc. All rights reserved.

  15. Can a linguistic serial founder effect originating in Africa explain the worldwide phonemic cline?

    Science.gov (United States)

    Fort, Joaquim; Pérez-Losada, Joaquim

    2016-04-01

    It has been proposed that a serial founder effect could have caused the present observed pattern of global phonemic diversity. Here we present a model that simulates the human range expansion out of Africa and the subsequent spatial linguistic dynamics until today. It does not assume copying errors, Darwinian competition, reduced contrastive possibilities or any other specific linguistic mechanism. We show that the decrease of linguistic diversity with distance (from the presumed origin of the expansion) arises under three assumptions, previously introduced by other authors: (i) an accumulation rate for phonemes; (ii) small phonemic inventories for the languages spoken before the out-of-Africa dispersal; (iii) an increase in the phonemic accumulation rate with the number of speakers per unit area. Numerical simulations show that the predictions of the model agree with the observed decrease of linguistic diversity with increasing distance from the most likely origin of the out-of-Africa dispersal. Thus, the proposal that a serial founder effect could have caused the present observed pattern of global phonemic diversity is viable, if three strong assumptions are satisfied. © 2016 The Authors.
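
The model's three assumptions can be sketched in a minimal simulation. All parameter values below are illustrative, not the paper's: the point is only that founding time plus density-proportional accumulation produces a monotonic cline away from the origin:

```python
import numpy as np

# Minimal sketch of the serial-founder model's three assumptions:
# (i) phonemes accumulate over time, (ii) founding populations start
# with a small inventory, (iii) the accumulation rate is proportional
# to the local density of speakers. Parameter values are invented.
steps = 100                 # founding events, ordered by distance from origin
years_per_step = 400.0      # assumed time between successive foundings
rate0 = 0.001               # assumed phoneme gain per year at unit density
inventory0 = 15.0           # small pre-dispersal inventory (assumption ii)
density = np.ones(steps)    # speakers per unit area (uniform here)

founded_at = np.arange(steps) * years_per_step  # farther -> founded later
present = founded_at[-1]

# Each region accumulates phonemes from its founding until the present,
# at a rate proportional to local speaker density (assumptions i, iii):
inventory = inventory0 + rate0 * density * (present - founded_at)
# inventory decreases monotonically with distance from the origin,
# reproducing the qualitative worldwide phonemic cline.
```

Varying `density` regionally modulates the slope of the cline, which is how assumption (iii) lets population size enter the model.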

  17. Moving-Talker, Speaker-Independent Feature Study, and Baseline Results Using the CUAVE Multimodal Speech Corpus

    Directory of Open Access Journals (Sweden)

    Patterson Eric K

    2002-01-01

Full Text Available Strides in computer technology and the search for deeper, more powerful techniques in signal processing have brought multimodal research to the forefront in recent years. Audio-visual speech processing has become an important part of this research because it holds great potential for overcoming certain problems of traditional audio-only methods. Difficulties due to background noise and multiple speakers in an application environment are significantly reduced by the additional information provided by visual features. This paper presents information on a new audio-visual database, a feature study on moving speakers, and baseline results for the whole speaker group. Although a few databases have been collected in this area, none has emerged as a standard for comparison. Also, efforts to date have often been limited, focusing on cropped video or stationary speakers. This paper seeks to introduce a challenging audio-visual database that is flexible and fairly comprehensive, yet easily available to researchers on one DVD. The Clemson University Audio-Visual Experiments (CUAVE) database is a speaker-independent corpus of both connected and continuous digit strings totaling over 7000 utterances. It contains a wide variety of speakers and is designed to meet several goals discussed in this paper. One of these goals is to allow testing of adverse conditions such as moving talkers and speaker pairs. A feature study of connected digit strings is also discussed. It compares stationary and moving talkers in a speaker-independent grouping. An image-processing-based contour technique, an image transform method, and a deformable template scheme are used in this comparison to obtain visual features. This paper also presents methods and results in an attempt to make these techniques more robust to speaker movement. Finally, initial baseline speaker-independent results are included using all speakers, and conclusions as well as suggested areas of research are

  18. Speaker recognition through NLP and CWT modeling.

    Energy Technology Data Exchange (ETDEWEB)

    Brown-VanHoozer, A.; Kercel, S. W.; Tucker, R. W.

    1999-06-23

The objective of this research is to develop a system capable of identifying speakers on wiretaps from a large database (>500 speakers) with a short search time (<30 seconds), and with better than 90% accuracy. Much previous research in speaker recognition has led to algorithms that produced encouraging preliminary results, but were overwhelmed when applied to populations of more than a dozen or so different speakers. The authors are investigating a solution to the "huge population" problem by seeking two completely different kinds of characterizing features. These features are extracted using the techniques of Neuro-Linguistic Programming (NLP) and the continuous wavelet transform (CWT). NLP extracts precise neurological, verbal and non-verbal information, and assimilates the information into useful patterns. These patterns are based on specific cues demonstrated by each individual, and provide ways of determining congruency between verbal and non-verbal cues. The primary NLP modalities are characterized through word spotting (verbal predicate cues, e.g., see, sound, feel, etc.), while the secondary modalities would be characterized through the speech transcription used by the individual. This has the practical effect of reducing the size of the search space, and greatly speeding up the process of identifying an unknown speaker. The wavelet-based line of investigation concentrates on using vowel phonemes and non-verbal cues, such as tempo. The rationale for concentrating on vowels is that there are a limited number of vowel phonemes, and at least one of them usually appears in even the shortest of speech segments. Using the fast CWT algorithm, the details of both the formant frequency and the glottal excitation characteristics can be easily extracted from voice waveforms. The differences in the glottal excitation waveforms as well as the formant frequency are evident in the CWT output. More significantly, the CWT reveals significant
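
The wavelet-based side of the approach can be sketched with a hand-rolled scalogram. This is not the paper's implementation: the "vowel" below is a crude synthetic signal (a 120 Hz fundamental plus a 700 Hz formant-like component), and the wavelet is a windowed complex exponential rather than a true Morlet:

```python
import numpy as np

# Sketch of CWT-style analysis of a vowel-like waveform: correlate the
# signal with windowed complex exponentials at several frequencies, so
# that energy concentrates at scales matching the fundamental and the
# formant-like component.
fs = 8000
t = np.arange(0, 0.1, 1 / fs)
signal = np.sin(2 * np.pi * 120 * t) + 0.5 * np.sin(2 * np.pi * 700 * t)

def cwt_magnitude(x, fs, freqs, cycles=6.0):
    """Scalogram magnitude at the given analysis frequencies."""
    out = np.empty((len(freqs), len(x)))
    for i, f in enumerate(freqs):
        dur = cycles / f                       # wavelet support in seconds
        tt = np.arange(-dur / 2, dur / 2, 1 / fs)
        wavelet = np.exp(2j * np.pi * f * tt) * np.hanning(len(tt))
        wavelet /= np.sum(np.abs(wavelet))     # normalise the envelope
        out[i] = np.abs(np.convolve(x, wavelet, mode="same"))
    return out

freqs = np.array([120.0, 300.0, 700.0, 1500.0])
scalogram = cwt_magnitude(signal, fs, freqs)
energy = scalogram.mean(axis=1)  # peaks at the 120 Hz and 700 Hz rows
```

The energy profile across scales separates the glottal-rate component from the formant-like one, which is the kind of structure the abstract says the CWT output makes evident.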

  19. Phonemic versus allophonic status modulates early brain responses to language sounds: an MEG/ERF study

    DEFF Research Database (Denmark)

    Nielsen, Andreas Højlund; Gebauer, Line; Mcgregor, William

allophonic sound contrasts. So far this has only been attested between languages. In the present study we wished to investigate this effect within the same language: Does the same sound contrast that is phonemic in one environment, but allophonic in another, elicit different MMNm responses in native … ‘that’). This allowed us to manipulate the phonemic/allophonic status of exactly the same sound contrast (/t/-/d/) by presenting it in different immediate phonetic contexts (preceding a vowel (CV) versus following a vowel (VC)), in order to investigate the auditory event-related fields of native Danish … listeners to a sound contrast that is both phonemic and allophonic within Danish. Methods: Relevant syllables were recorded by a male native Danish speaker. The stimuli were then created by cross-splicing the sounds so that the same vowel [æ] was used for all syllables, and the same [t] and [d] were used

  20. Do Adults with Cochlear Implants Rely on Different Acoustic Cues for Phoneme Perception than Adults with Normal Hearing?

    Science.gov (United States)

    Moberly, Aaron C.; Lowenstein, Joanna H.; Tarr, Eric; Caldwell-Tarr, Amanda; Welling, D. Bradley; Shahin, Antoine J.; Nittrouer, Susan

    2014-01-01

    Purpose: Several acoustic cues specify any single phonemic contrast. Nonetheless, adult, native speakers of a language share weighting strategies, showing preferential attention to some properties over others. Cochlear implant (CI) signal processing disrupts the salience of some cues: In general, amplitude structure remains readily available, but…

  1. Phonemes: Lexical access and beyond.

    Science.gov (United States)

    Kazanina, Nina; Bowers, Jeffrey S; Idsardi, William

    2018-04-01

Phonemes play a central role in traditional theories as units of speech perception and access codes to lexical representations. Phonemes have two essential properties: they are 'segment-sized' (the size of a consonant or vowel) and abstract (a single phoneme may have different acoustic realisations). Nevertheless, there is a long history of challenging the phoneme hypothesis, with some theorists arguing for differently sized phonological units (e.g. features or syllables) and others rejecting abstract codes in favour of representations that encode detailed acoustic properties of the stimulus. The phoneme hypothesis is the minority view today. We defend the phoneme hypothesis in two complementary ways. First, we show that rejection of phonemes is based on a flawed interpretation of empirical findings. For example, it is commonly argued that the failure to find acoustic invariances for phonemes rules out phonemes. However, the lack of invariance is only a problem on the assumption that speech perception is a bottom-up process. If learned sublexical codes are modified by top-down constraints (which they are), then this argument loses all force. Second, we provide strong positive evidence for phonemes on the basis of linguistic data. Almost all findings that are taken (incorrectly) as evidence against phonemes are based on psycholinguistic studies of single words. However, phonemes were first introduced in linguistics, and the best evidence for phonemes comes from linguistic analyses of complex word forms and sentences. In short, the rejection of phonemes is based on a false analysis and a too-narrow consideration of the relevant data.

  2. Integrated Phoneme Subspace Method for Speech Feature Extraction

    Directory of Open Access Journals (Sweden)

    Park Hyunsin

    2009-01-01

    Full Text Available Speech feature extraction has been a key focus in robust speech recognition research. In this work, we discuss data-driven linear feature transformations applied to feature vectors in the logarithmic mel-frequency filter bank domain. Transformations are based on principal component analysis (PCA), independent component analysis (ICA), and linear discriminant analysis (LDA). Furthermore, this paper introduces a new feature extraction technique that collects the correlation information among phoneme subspaces and reconstructs feature space for representing phonemic information efficiently. The proposed speech feature vector is generated by projecting an observed vector onto an integrated phoneme subspace (IPS) based on PCA or ICA. The performance of the new feature was evaluated for isolated word speech recognition. The proposed method provided higher recognition accuracy than conventional methods in clean and reverberant environments.
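The subspace projection at the heart of this record can be sketched with plain PCA. The following is a minimal illustration, not the paper's IPS construction: it projects stand-in log-mel feature frames onto the top principal components, while the step of combining per-phoneme subspaces into the integrated space is omitted; the random data and dimensions are assumptions.

```python
import numpy as np

def pca_projection(feats, k):
    """Project feature vectors onto the top-k principal components.

    feats: (n_frames, dim) array of (stand-in) log-mel filter-bank features.
    Returns an (n_frames, k) array of projected features.
    """
    centered = feats - feats.mean(axis=0)
    # Eigen-decomposition of the feature covariance matrix.
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)
    # eigh returns eigenvalues in ascending order; keep the top-k components.
    basis = eigvecs[:, np.argsort(eigvals)[::-1][:k]]
    return centered @ basis

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 24))  # toy stand-in for log-mel frames
Y = pca_projection(X, 8)
print(Y.shape)  # (200, 8)
```

A phoneme-specific subspace would be obtained the same way, but estimated only from frames aligned to that phoneme.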

  3. Phonemes as short time cognitive components

    DEFF Research Database (Denmark)

    Feng, Ling; Hansen, Lars Kai

    2006-01-01

    are the smallest contrastive unit in the sound system of a language. Generalizable components were found deriving from phonemes based on homomorphic filtering features with basic time scale (20 msec). We sparsified the features based on energy as a preprocessing means to eliminate the intrinsic noise. Independent...

  4. Discovering Phonemes of Bidayuh

    Directory of Open Access Journals (Sweden)

    Jecky Misieng

    2012-07-01

    Full Text Available There are generally three views of the notion of a phoneme. The structuralist view of the phoneme focuses on this language phenomenon as a phonetic reality. In discovering phonemes of a language, phonologists who hold this view will look for minimal contrasting pairs as a way to determine contrasting sounds of that language. They will also look for allophones or two sounds of the same phoneme which may appear in complementary distribution. This paper will discuss the possible application of the structuralist approach to analyzing the phonemes of a dialect of Bidayuh, one of the Malayo-Polynesian languages spoken in the northern region of Borneo.

  5. Assessing the Double Phonemic Representation in Bilingual Speakers of Spanish and English: An Electrophysiological Study

    Science.gov (United States)

    Garcia-Sierra, Adrian; Ramirez-Esparza, Nairan; Silva-Pereyra, Juan; Siard, Jennifer; Champlin, Craig A.

    2012-01-01

    Event Related Potentials (ERPs) were recorded from Spanish-English bilinguals (N = 10) to test pre-attentive speech discrimination in two language contexts. ERPs were recorded while participants silently read magazines in English or Spanish. Two speech contrast conditions were recorded in each language context. In the "phonemic in English"…

  6. Text-Independent Speaker Identification Using the Histogram Transform Model

    DEFF Research Database (Denmark)

    Ma, Zhanyu; Yu, Hong; Tan, Zheng-Hua

    2016-01-01

    In this paper, we propose a novel probabilistic method for the task of text-independent speaker identification (SI). In order to capture the dynamic information during SI, we design super-MFCC features by cascading three neighboring Mel-frequency cepstral coefficient (MFCC) frames together. These super-MFCC vectors are utilized for probabilistic model training such that the speaker’s characteristics can be sufficiently captured. The probability density function (PDF) of the aforementioned super-MFCC features is estimated by the recently proposed histogram transform (HT) method. To recedes...
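The super-MFCC construction (cascading each frame with its immediate neighbors) can be sketched as follows. The edge handling (repeating the first/last frame) and the toy array are assumptions standing in for real MFCC frames:

```python
import numpy as np

def stack_neighbors(mfcc, context=1):
    """Cascade each MFCC frame with its neighbors into one "super" vector.

    mfcc: (n_frames, n_coeffs) array. With context=1 this concatenates
    frames t-1, t, t+1, mirroring the three-frame super-MFCC idea.
    Edge frames are handled by repeating the first/last frame.
    """
    padded = np.vstack([mfcc[:1]] * context + [mfcc] + [mfcc[-1:]] * context)
    return np.hstack([padded[i:i + len(mfcc)] for i in range(2 * context + 1)])

frames = np.arange(5 * 13, dtype=float).reshape(5, 13)  # toy 13-dim MFCCs
supe = stack_neighbors(frames)
print(supe.shape)  # (5, 39)
```

Each row of the result is the concatenation of a frame with its left and right neighbors, tripling the feature dimension.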

  7. Evaluation of articulation of Turkish phonemes after removable partial denture application

    Directory of Open Access Journals (Sweden)

    Özbeki Murat

    2003-01-01

    Full Text Available In this study, the adaptation of patients to removable partial dentures was evaluated related to articulation of Turkish phonemes. Articulation of /t,d,n,l,r/, /g,k/, /b,p,m/ and /s,z,Õ,v,f,y,j,h,c/ phonemes were evaluated by three speech pathologists, on records taken from 15 patients before the insertion of a removable partial denture, just after insertion, and one week later. The test consisted of evaluation of phoneme articulation of independent syllables in terms of distortion, omission, substitution, mass effect, hypernasality and hyponasality. Data were evaluated with Cochrane Q, McNemar and Kruskal-Wallis tests. The results showed that for some phonemes, problems in articulation occurred after the insertion of a removable partial denture while for others a significant amelioration was observed after the insertion of a removable partial denture. In general, problems in articulation of evaluated phonemes were resolved after one week of use.

  8. A Novel Approach in Text-Independent Speaker Recognition in Noisy Environment

    Directory of Open Access Journals (Sweden)

    Nona Heydari Esfahani

    2014-10-01

    Full Text Available In this paper, robust text-independent speaker recognition is taken into consideration. The proposed method operates on manually silence-removed utterances that are segmented into smaller speech units containing a few phones and at least one vowel. The segments are the basic units for long-term feature extraction. Sub-band entropy is extracted directly in each segment. A robust vowel detection method is then applied to each segment to separate a high-energy vowel that is used as the unit for pitch frequency and formant extraction. By applying a clustering technique, the extracted short-term features, namely MFCC coefficients, are combined with the long-term features. Experiments using an MLP classifier show that the average speaker recognition accuracy is 97.33% for clean speech and 61.33% in a noisy environment at -2 dB SNR, which shows improvement compared to other conventional methods.

  9. Human phoneme recognition depending on speech-intrinsic variability.

    Science.gov (United States)

    Meyer, Bernd T; Jürgens, Tim; Wesker, Thorsten; Brand, Thomas; Kollmeier, Birger

    2010-11-01

    The influence of different sources of speech-intrinsic variation (speaking rate, effort, style and dialect or accent) on human speech perception was investigated. In listening experiments with 16 listeners, confusions of consonant-vowel-consonant (CVC) and vowel-consonant-vowel (VCV) sounds in speech-weighted noise were analyzed. Experiments were based on the OLLO logatome speech database, which was designed for a man-machine comparison. It contains utterances spoken by 50 speakers from five dialect/accent regions and covers several intrinsic variations. By comparing results depending on intrinsic and extrinsic variations (i.e., different levels of masking noise), the degradation induced by variabilities can be expressed in terms of the SNR. The spectral level distance between the respective speech segment and the long-term spectrum of the masking noise was found to be a good predictor for recognition rates, while phoneme confusions were influenced by the distance to spectrally close phonemes. An analysis based on transmitted information of articulatory features showed that voicing and manner of articulation are comparatively robust cues in the presence of intrinsic variations, whereas the coding of place is more degraded. The database and detailed results have been made available for comparisons between human speech recognition (HSR) and automatic speech recognizers (ASR).

  10. Impact of Cyrillic on Native English Speakers' Phono-lexical Acquisition of Russian.

    Science.gov (United States)

    Showalter, Catherine E

    2018-03-01

    We investigated the influence of grapheme familiarity and native language grapheme-phoneme correspondences during second language lexical learning. Native English speakers learned Russian-like words via auditory presentations containing only familiar first language phones, pictured meanings, and exposure to either Cyrillic orthographic forms (Orthography condition) or the sequence (No Orthography condition). Orthography participants saw three types of written forms: familiar-congruent (e.g., -[kom]), familiar-incongruent (e.g., -[rɑt]), and unfamiliar (e.g., -[fil]). At test, participants determined whether pictures and words matched according to what they saw during word learning. All participants performed near ceiling in all stimulus conditions, except for Orthography participants on words containing incongruent grapheme-phoneme correspondences. These results suggest that first language grapheme-phoneme correspondences can cause interference during second language phono-lexical acquisition. In addition, these results suggest that orthographic input effects are robust enough to interfere even when the input does not contain novel phones.

  11. Stochastic Model for Phonemes Uncovers an Author-Dependency of Their Usage.

    Science.gov (United States)

    Deng, Weibing; Allahverdyan, Armen E

    2016-01-01

    We study rank-frequency relations for phonemes, the minimal units that still relate to linguistic meaning. We show that these relations can be described by the Dirichlet distribution, a direct analogue of the ideal-gas model in statistical mechanics. This description allows us to demonstrate that the rank-frequency relations for phonemes of a text do depend on its author. The author-dependency effect is not caused by the author's vocabulary (common words used in different texts), and is confirmed by several alternative means. This suggests that it can be directly related to phonemes. These features contrast to rank-frequency relations for words, which are both author and text independent and are governed by Zipf's law.
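A rough feel for the Dirichlet description can be had by sampling: sorting each draw in descending order yields a rank-frequency curve. The inventory size and concentration parameter below are illustrative choices, not values from the study:

```python
import numpy as np

rng = np.random.default_rng(1)

# Draw phoneme-frequency vectors from a symmetric Dirichlet prior
# (alpha=0.5 and a 40-phoneme inventory are illustrative assumptions).
samples = rng.dirichlet(alpha=np.full(40, 0.5), size=1000)

# Rank-frequency curve: sort each draw descending, then average over draws.
rank_freq = np.sort(samples, axis=1)[:, ::-1].mean(axis=0)

print(rank_freq.shape)                         # (40,)
print(bool(np.all(np.diff(rank_freq) <= 0)))   # frequencies decay with rank
```

Comparing such model curves against empirical phoneme counts per author is the kind of check the abstract describes.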

  12. Voice-to-Phoneme Conversion Algorithms for Voice-Tag Applications in Embedded Platforms

    Directory of Open Access Journals (Sweden)

    Yan Ming Cheng

    2008-08-01

    Full Text Available We describe two voice-to-phoneme conversion algorithms for speaker-independent voice-tag creation specifically targeted at applications on embedded platforms. These algorithms (batch mode and sequential) are compared in speech recognition experiments where they are first applied in a same-language context in which both acoustic model training and voice-tag creation and application are performed on the same language. Then, their performance is tested in a cross-language setting where the acoustic models are trained on a particular source language while the voice-tags are created and applied on a different target language. In the same-language environment, both algorithms either perform comparably to or significantly better than the baseline where utterances are manually transcribed by a phonetician. In the cross-language context, the voice-tag performances vary depending on the source-target language pair, with the variation reflecting predicted phonological similarity between the source and target languages. Among the most similar languages, performance nears that of the native-trained models and surpasses the native reference baseline.

  13. Phonemic Awareness and Young Children.

    Science.gov (United States)

    Wasik, Barbara A.

    2001-01-01

    Asserts that regardless of the method used to teach reading, children first need a strong basis in phonemic awareness. Describes phonemic awareness, differentiates it from phonics, and presents available research findings. Advises on the development of phonemic awareness and creation of a classroom environment supportive of its development. (SD)

  14. An introduction to application-independent evaluation of speaker recognition systems

    NARCIS (Netherlands)

    Leeuwen, D.A. van; Brümmer, N.

    2007-01-01

    In the evaluation of speaker recognition systems - an important part of speaker classification [1] - the trade-off between missed speakers and false alarms has always been an important diagnostic tool. NIST has defined the task of speaker detection with the associated Detection Cost Function (DCF) to

  15. An automatic speech recognition system with speaker-independent identification support

    Science.gov (United States)

    Caranica, Alexandru; Burileanu, Corneliu

    2015-02-01

    The novelty of this work relies on the application of an open source research software toolkit (CMU Sphinx) to train, build and evaluate a speech recognition system, with speaker-independent support, for voice-controlled hardware applications. Moreover, we propose to use the trained acoustic model to successfully decode offline voice commands on embedded hardware, such as the Raspberry Pi, a low-cost ARMv6 SoC. This type of single-board computer, mainly used for educational and research activities, can serve as a proof-of-concept software and hardware stack for low cost voice automation systems.

  16. Stochastic Model for Phonemes Uncovers an Author-Dependency of Their Usage.

    Directory of Open Access Journals (Sweden)

    Weibing Deng

    Full Text Available We study rank-frequency relations for phonemes, the minimal units that still relate to linguistic meaning. We show that these relations can be described by the Dirichlet distribution, a direct analogue of the ideal-gas model in statistical mechanics. This description allows us to demonstrate that the rank-frequency relations for phonemes of a text do depend on its author. The author-dependency effect is not caused by the author's vocabulary (common words used in different texts), and is confirmed by several alternative means. This suggests that it can be directly related to phonemes. These features contrast to rank-frequency relations for words, which are both author and text independent and are governed by Zipf's law.

  17. On the status of the phoneme /b/ in heritage speakers of Spanish

    Directory of Open Access Journals (Sweden)

    Rajiv Rao

    2014-12-01

    Full Text Available This study examined intervocalic productions of /b/ in heritage speakers of Spanish residing in the United States. Eleven speakers were divided into two groups based on at-home exposure to Spanish, and subsequently completed reading and picture description tasks eliciting productions of intervocalic /b/ showing variation in word position, syllable stress, and orthography. The mixed-effects results revealed that while both groups manifested three clear phonetic categories, the group with more at-home experience followed a phonological rule of spirantization to a pure approximant to a higher degree across the data. The less-target-like stop and tense approximant allophones appeared more in the reading task, in stressed syllables, and in the less experienced group. Word boundary position interacted with group and task to induce less-target-like forms as well. The findings emphasize the influence of language background, linguistic context, orthography, and cognitive demands of tasks in accounting for heritage phonetics and phonology.

  18. Wavelet Packet Entropy in Speaker-Independent Emotional State Detection from Speech Signal

    OpenAIRE

    Mina Kadkhodaei Elyaderani; Seyed Hamid Mahmoodian; Ghazaal Sheikhi

    2015-01-01

    In this paper, wavelet packet entropy is proposed for speaker-independent emotion detection from speech. After pre-processing, wavelet packet decomposition using wavelet type db3 at level 4 is calculated and Shannon entropy in its nodes is calculated to be used as feature. In addition, prosodic features such as first four formants, jitter or pitch deviation amplitude, and shimmer or energy variation amplitude besides MFCC features are applied to complete the feature vector. Then, Support Vect...

  19. Multisensory speech perception in autism spectrum disorder: From phoneme to whole-word perception.

    Science.gov (United States)

    Stevenson, Ryan A; Baum, Sarah H; Segers, Magali; Ferber, Susanne; Barense, Morgan D; Wallace, Mark T

    2017-07-01

    Speech perception in noisy environments is boosted when a listener can see the speaker's mouth and integrate the auditory and visual speech information. Autistic children have a diminished capacity to integrate sensory information across modalities, which contributes to core symptoms of autism, such as impairments in social communication. We investigated the abilities of autistic and typically-developing (TD) children to integrate auditory and visual speech stimuli in various signal-to-noise ratios (SNR). Measurements of both whole-word and phoneme recognition were recorded. At the level of whole-word recognition, autistic children exhibited reduced performance in both the auditory and audiovisual modalities. Importantly, autistic children showed reduced behavioral benefit from multisensory integration with whole-word recognition, specifically at low SNRs. At the level of phoneme recognition, autistic children exhibited reduced performance relative to their TD peers in auditory, visual, and audiovisual modalities. However, and in contrast to their performance at the level of whole-word recognition, both autistic and TD children showed benefits from multisensory integration for phoneme recognition. In accordance with the principle of inverse effectiveness, both groups exhibited greater benefit at low SNRs relative to high SNRs. Thus, while autistic children showed typical multisensory benefits during phoneme recognition, these benefits did not translate to typical multisensory benefit of whole-word recognition in noisy environments. We hypothesize that sensory impairments in autistic children raise the SNR threshold needed to extract meaningful information from a given sensory input, resulting in subsequent failure to exhibit behavioral benefits from additional sensory information at the level of whole-word recognition. Autism Res 2017, 10: 1280-1290. © 2017 International Society for Autism Research, Wiley Periodicals, Inc.

  20. A Model of Classification of Phonemic and Phonetic Negative Transfer: The case of Turkish –English Interlanguage with Pedagogical Applications

    Directory of Open Access Journals (Sweden)

    Sinan Bayraktaroğlu

    2011-04-01

    Full Text Available This article introduces a model of classification of phonemic and phonetic negative transfer based on an empirical study of Turkish-English Interlanguage. The model sets out a hierarchy of difficulties, starting from the most crucial phonemic features affecting “intelligibility”, down to other distributional, phonetic, and allophonic features which need to be acquired if a “near-native” level of phonological competence is aimed at. Unlike previous theoretical studies of predictions of classification of phonemic and phonetic L1 interference (Moulton 1962a, 1962b; Wiik 1965), this model is based on an empirical study of the recorded materials of Turkish-English IL speakers transcribed allophonically using the IPA Alphabet and diacritics. For the different categories of observed systematic negative transfer, and to prevent their becoming “fossilized” in the IL process, remedial exercises are recommended for teaching and learning BBC Pronunciation. In conclusion, a few methodological phonetic techniques, approaches, and specifications are put forward for use in designing the curriculum and syllabus content of teaching L2 pronunciation.

  1. One-against-all weighted dynamic time warping for language-independent and speaker-dependent speech recognition in adverse conditions.

    Directory of Open Access Journals (Sweden)

    Xianglilan Zhang

    Full Text Available Considering personal privacy and the difficulty of obtaining training material for many seldom-used English words and (often non-English) names, language-independent (LI) with lightweight speaker-dependent (SD) automatic speech recognition (ASR) is a promising option to solve the problem. The dynamic time warping (DTW) algorithm is the state-of-the-art algorithm for small-footprint SD ASR applications with limited storage space and small vocabulary, such as voice dialing on mobile devices, menu-driven recognition, and voice control on vehicles and robotics. Even though we have successfully developed two fast and accurate DTW variations for clean speech data, speech recognition for adverse conditions is still a big challenge. In order to improve recognition accuracy in noisy environments and bad recording conditions such as too high or low volume, we introduce a novel one-against-all weighted DTW (OAWDTW). This method defines a one-against-all index (OAI) for each time frame of training data and applies the OAIs to the core DTW process. Given two speech signals, OAWDTW tunes their final alignment score by using OAI in the DTW process. Our method achieves better accuracies than DTW and merge-weighted DTW (MWDTW), as a 6.97% relative reduction of error rate (RRER) compared with DTW and a 15.91% RRER compared with MWDTW are observed in our extensive experiments on one representative SD dataset of four speakers' recordings. To the best of our knowledge, the OAWDTW approach is the first weighted DTW specially designed for speech data in adverse conditions.
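The core idea of weighting DTW local distances per training frame can be sketched as below. This is not the authors' OAWDTW: the one-against-all index computation is omitted, and a caller-supplied weight vector stands in for it.

```python
import numpy as np

def weighted_dtw(a, b, w):
    """DTW alignment cost between sequences a and b, with a per-frame
    weight w[i] applied to training frame a[i] (a stand-in for the
    paper's one-against-all index, which is not computed here).

    a: (n, d) array, b: (m, d) array, w: (n,) array of frame weights.
    """
    n, m = len(a), len(b)
    # Local distance matrix, scaled by the training-frame weights.
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1) * w[:, None]
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            acc[i, j] = d[i - 1, j - 1] + min(acc[i - 1, j],
                                              acc[i, j - 1],
                                              acc[i - 1, j - 1])
    return acc[n, m]

x = np.array([[0.0], [1.0], [2.0]])
print(weighted_dtw(x, x, np.ones(3)))  # 0.0 for identical sequences
```

Down-weighting frames that are uninformative for discriminating a template from all others is what lets the weighted variant outperform plain DTW in noise.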

  2. Is Cognitive Activity of Speech Based On Statistical Independence?

    DEFF Research Database (Denmark)

    Feng, Ling; Hansen, Lars Kai

    2008-01-01

    This paper explores the generality of COgnitive Component Analysis (COCA), which is defined as the process of unsupervised grouping of data such that the ensuing group structure is well-aligned with that resulting from human cognitive activity. The hypothesis of COCA is ecological: the essentially independent features in a context-defined ensemble can be efficiently coded using a sparse independent component representation. Our devised protocol aims at comparing the performance of supervised learning (invoking cognitive activity) and unsupervised learning (statistical regularities) based on similar representations, and the only difference lies in the human-inferred labels. Inspired by the previous research on COCA, we introduce a new pair of models, which directly employ the independence hypothesis. Statistical regularities are revealed at multiple time scales on phoneme, gender, age...

  3. Toward a model of phoneme perception.

    Science.gov (United States)

    Ardila, A

    1993-05-01

    Hemisphere asymmetry in phoneme perception was analyzed. Three basic mechanisms underlying phoneme perception are proposed. Left temporal lobe would be specialized in: (1) ultrashort auditory (echoic) memory; (2) higher resolution power for some language frequencies; and (3) recognition of rapidly changing and time-dependent auditory signals. An attempt was made to apply some neurophysiological mechanisms described for the visual system to phoneme recognition in the auditory system.

  4. Wavelet Packet Entropy in Speaker-Independent Emotional State Detection from Speech Signal

    Directory of Open Access Journals (Sweden)

    Mina Kadkhodaei Elyaderani

    2015-01-01

    Full Text Available In this paper, wavelet packet entropy is proposed for speaker-independent emotion detection from speech. After pre-processing, a wavelet packet decomposition using wavelet type db3 at level 4 is computed, and the Shannon entropy of its nodes is used as a feature. In addition, prosodic features such as the first four formants, jitter or pitch deviation amplitude, and shimmer or energy variation amplitude, besides MFCC features, are applied to complete the feature vector. Then, a Support Vector Machine (SVM) is used to classify the vectors in multi-class (all emotions) or two-class (each emotion versus normal state) format. 46 different utterances of a single sentence from the Berlin Emotional Speech Dataset are selected. These are uttered by 10 speakers in sadness, happiness, fear, boredom, anger, and normal emotional state. Experimental results show that the proposed features can improve emotional state detection accuracy in the multi-class situation. Furthermore, adding wavelet entropy coefficients to the other features increases the accuracy of two-class detection for anger, fear, and happiness.
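The entropy feature itself reduces to Shannon entropy over a node's energy distribution. In the paper the coefficients come from a level-4 db3 wavelet packet decomposition (e.g. via a wavelet library such as PyWavelets); the sketch below assumes such coefficients are already available and uses a toy vector in their place.

```python
import numpy as np

def shannon_entropy(coeffs):
    """Shannon entropy (in bits) of the energy distribution of a
    coefficient vector; an all-zero vector yields 0 by convention.
    """
    energy = np.asarray(coeffs, dtype=float) ** 2
    total = energy.sum()
    if total == 0:
        return 0.0
    p = energy / total
    p = p[p > 0]  # 0 * log(0) is taken as 0
    return float(-(p * np.log2(p)).sum())

# Toy stand-in for one wavelet packet node's coefficients.
node = np.array([1.0, 1.0, 1.0, 1.0])
print(shannon_entropy(node))  # 2.0 bits for a uniform energy spread
```

Applying this to each of the 16 level-4 nodes would give the 16-dimensional entropy feature the abstract describes.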

  5. Investigating lexical competition and the cost of phonemic restoration

    DEFF Research Database (Denmark)

    Balling, Laura Winther; Morris, David Jackson; Tøndering, John

    2017-01-01

    Due to phonemic restoration, listeners can reliably perceive words when a phoneme is replaced with noise. The cost associated with this process was investigated along with the effect of lexical uniqueness on phonemic restoration, using data from a lexical decision experiment where noise replaced phonemes that were either uniqueness points (the phoneme at which a word deviates from all nonrelated words that share the same onset) or phonemes immediately prior to these. A baseline condition was also included with no noise-interrupted stimuli. Results showed a significant cost of phonemic restoration, with 100 ms longer word identification times and a 14% decrease in word identification accuracy for interrupted stimuli compared to the baseline. Regression analysis of response times from the interrupted conditions showed no effect of whether the interrupted phoneme was a uniqueness point, but significant...

  7. Data requirements for speaker independent acoustic models

    CSIR Research Space (South Africa)

    Badenhorst, JAC

    2008-11-01

    Full Text Available When developing speech recognition systems in resource-constrained environments, careful design of the training corpus can play an important role in compensating for data scarcity. One of the factors to consider relates to the speaker composition...

  8. Experiments on Automatic Recognition of Nonnative Arabic Speech

    Directory of Open Access Journals (Sweden)

    Douglas O'Shaughnessy

    2008-05-01

    Full Text Available The automatic recognition of foreign-accented Arabic speech is a challenging task since it involves a large number of nonnative accents. As well, the nonnative speech data available for training are generally insufficient. Moreover, as compared to other languages, the Arabic language has sparked a relatively small number of research efforts. In this paper, we are concerned with the problem of nonnative speech in a speaker-independent, large-vocabulary speech recognition system for modern standard Arabic (MSA). We analyze some major differences at the phonetic level in order to determine which phonemes have a significant part in the recognition performance for both native and nonnative speakers. Special attention is given to specific Arabic phonemes. The performance of an HMM-based Arabic speech recognition system is analyzed with respect to speaker gender and its native origin. The WestPoint modern standard Arabic database from the language data consortium (LDC) and the hidden Markov Model Toolkit (HTK) are used throughout all experiments. Our study shows that the best performance in the overall phoneme recognition is obtained when nonnative speakers are involved in both training and testing phases. This is not the case when a language model and phonetic lattice networks are incorporated in the system. At the phonetic level, the results show that female nonnative speakers perform better than nonnative male speakers, and that emphatic phonemes yield a significant decrease in performance when they are uttered by both male and female nonnative speakers.

  9. Experiments on Automatic Recognition of Nonnative Arabic Speech

    Directory of Open Access Journals (Sweden)

    Selouani Sid-Ahmed

    2008-01-01

    Full Text Available The automatic recognition of foreign-accented Arabic speech is a challenging task since it involves a large number of nonnative accents. As well, the nonnative speech data available for training are generally insufficient. Moreover, as compared to other languages, the Arabic language has sparked a relatively small number of research efforts. In this paper, we are concerned with the problem of nonnative speech in a speaker-independent, large-vocabulary speech recognition system for modern standard Arabic (MSA). We analyze some major differences at the phonetic level in order to determine which phonemes have a significant part in the recognition performance for both native and nonnative speakers. Special attention is given to specific Arabic phonemes. The performance of an HMM-based Arabic speech recognition system is analyzed with respect to speaker gender and its native origin. The WestPoint modern standard Arabic database from the language data consortium (LDC) and the hidden Markov Model Toolkit (HTK) are used throughout all experiments. Our study shows that the best performance in the overall phoneme recognition is obtained when nonnative speakers are involved in both training and testing phases. This is not the case when a language model and phonetic lattice networks are incorporated in the system. At the phonetic level, the results show that female nonnative speakers perform better than nonnative male speakers, and that emphatic phonemes yield a significant decrease in performance when they are uttered by both male and female nonnative speakers.

  10. Extracting pronunciation rules for phonemic variants

    CSIR Research Space (South Africa)

    Davel, M

    2006-04-01

    Full Text Available Various automated techniques can be used to generalise from phonemic lexicons through the extraction of grapheme-to-phoneme rule sets. These techniques are particularly useful when developing pronunciation models for previously unmodelled languages...

  11. Analysis of human scream and its impact on text-independent speaker verification.

    Science.gov (United States)

    Hansen, John H L; Nandwana, Mahesh Kumar; Shokouhi, Navid

    2017-04-01

    Screams are defined as sustained, high-energy vocalizations that lack phonological structure. Lack of phonological structure is how a scream is distinguished from other forms of loud vocalization, such as "yell." This study investigates the acoustic aspects of screams and addresses those that are known to prevent standard speaker identification systems from recognizing the identity of screaming speakers. It is well established that speaker variability due to changes in vocal effort and Lombard effect contribute to degraded performance in automatic speech systems (i.e., speech recognition, speaker identification, diarization, etc.). However, previous research in the general area of speaker variability has concentrated on human speech production, whereas less is known about non-speech vocalizations. The UT-NonSpeech corpus is developed here to investigate speaker verification from scream samples. This study considers a detailed analysis in terms of fundamental frequency, spectral peak shift, frame energy distribution, and spectral tilt. It is shown that traditional speaker recognition based on the Gaussian mixture models-universal background model framework is unreliable when evaluated with screams.

  12. Popular Public Discourse at Speakers' Corner: Negotiating Cultural Identities in Interaction

    DEFF Research Database (Denmark)

    McIlvenny, Paul

    1996-01-01

    , religious and general topical 'soap-box' oration. However, audiences are not passive receivers of rhetorical messages. They are active negotiators of interpretations and alignments that may conflict with the speaker's and other audience members' orientations to prior talk. Speakers' Corner is a space...

  13. Investigating lexical competition and the cost of phonemic restoration.

    Science.gov (United States)

    Balling, Laura Winther; Morris, David Jackson; Tøndering, John

    2017-12-01

    Due to phonemic restoration, listeners can reliably perceive words when a phoneme is replaced with noise. The cost associated with this process was investigated along with the effect of lexical uniqueness on phonemic restoration, using data from a lexical decision experiment where noise replaced phonemes that were either uniqueness points (the phoneme at which a word deviates from all nonrelated words that share the same onset) or phonemes immediately prior to these. A baseline condition was also included with no noise-interrupted stimuli. Results showed a significant cost of phonemic restoration, with 100 ms longer word identification times and a 14% decrease in word identification accuracy for interrupted stimuli compared to the baseline. Regression analysis of response times from the interrupted conditions showed no effect of whether the interrupted phoneme was a uniqueness point, but significant effects for several temporal attributes of the stimuli, including the duration and position of the interrupted segment. These results indicate that uniqueness points are not distinct breakpoints in the cohort reduction that occurs during lexical processing, but that temporal properties of the interrupted stimuli are central to auditory word recognition. These results are interpreted in the context of models of speech perception.

  14. Signal-to-Signal Ratio Independent Speaker Identification for Co-channel Speech Signals

    DEFF Research Database (Denmark)

    Saeidi, Rahim; Mowlaee, Pejman; Kinnunen, Tomi

    2010-01-01

    In this paper, we consider speaker identification for the co-channel scenario, in which the speech mixture from two speakers is recorded by one microphone only. The goal is to identify both of the speakers from their mixed signal. High recognition accuracies have already been reported when an accurately...

  15. Residual dipolar couplings: are multiple independent alignments always possible?

    International Nuclear Information System (INIS)

    Higman, Victoria A.; Boyd, Jonathan; Smith, Lorna J.; Redfield, Christina

    2011-01-01

    RDCs for the 14 kDa protein hen egg-white lysozyme (HEWL) have been measured in eight different alignment media. The elongated shape and strongly positively charged surface of HEWL appear to limit the protein to four main alignment orientations. Furthermore, low levels of alignment and the protein’s interaction with some alignment media increase the experimental error. Together with heterogeneity across the alignment media, arising from constraints on temperature, pH and ionic strength for some alignment media, these data are suitable for structure refinement but not for the extraction of dynamic parameters. For an analysis of protein dynamics, the data must be obtained with very low errors in at least three or five independent alignment media (depending on the method used), and so far such data have only been reported for three small 6–8 kDa proteins with identical folds: ubiquitin, GB1 and GB3. Our results suggest that HEWL is likely to be representative of many other medium to large sized proteins commonly studied by solution NMR. Comparisons with over 60 high-resolution crystal structures of HEWL reveal that the highest resolution structures are not necessarily always the best models for the protein structure in solution.

  16. Phonemic awareness as a pathway to number transcoding

    Directory of Open Access Journals (Sweden)

    Júlia Beatriz Lopes-Silva

    2014-01-01

    Full Text Available Although verbal and numerical abilities have a well-established interaction, the impact of phonological processing on numeric abilities remains elusive. The aim of this study is to investigate the role of phonemic awareness in number processing and to explore its association with other functions such as working memory and magnitude processing. One hundred seventy-two children in 2nd grade to 4th grade were evaluated in terms of their intelligence, number transcoding, phonemic awareness, verbal and visuospatial working memory, and number sense (nonsymbolic magnitude comparison performance). All of the children had normal intelligence. Among these measurements of magnitude processing, working memory and phonemic awareness, only the last was retained in regression and path models predicting transcoding ability. Phonemic awareness mediated the influence of verbal working memory on number transcoding. The evidence suggests that phonemic awareness significantly affects number transcoding. Such an association is robust and should be considered in cognitive models of both dyslexia and dyscalculia.

  17. Cultural and biological evolution of phonemic speech

    NARCIS (Netherlands)

    de Boer, B.; Freitas, A.A.; Capcarrere, M.S.; Bentley, Peter J.; Johnson, Colin G.; Timmis, Jon

    2005-01-01

    This paper investigates the interaction between cultural evolution and biological evolution in the emergence of phonemic coding in speech. It is observed that our nearest relatives, the primates, use holistic utterances, whereas humans use phonemic utterances. It can therefore be argued that our

  18. Joint Single-Channel Speech Separation and Speaker Identification

    DEFF Research Database (Denmark)

    Mowlaee, Pejman; Saeidi, Rahim; Tan, Zheng-Hua

    2010-01-01

    In this paper, we propose a closed loop system to improve the performance of single-channel speech separation in a speaker-independent scenario. The system is composed of two interconnected blocks: a separation block and a speaker identification block. The improvement is accomplished by incorporating speaker identification into the separation loop, which enhances the quality of the separated output signals. To assess the improvements, the results are reported in terms of PESQ for both target and masked signals.

  19. A Text-Independent Speaker Authentication System for Mobile Devices

    Directory of Open Access Journals (Sweden)

    Florentin Thullier

    2017-09-01

    Full Text Available This paper presents a text-independent speaker authentication method adapted to mobile devices. Special attention was placed on delivering a fully operational application that admits a sufficient reliability level and efficient functioning. To this end, we have excluded the need for any network communication. Hence, we opted for the completion of both the training and the identification processes directly on the mobile device, through the extraction of linear prediction cepstral coefficients and the naive Bayes algorithm as the classifier. Furthermore, the authentication decision is enhanced to overcome misidentification through access privileges that the user should attribute to each application beforehand. To evaluate the proposed authentication system, eleven participants were involved in the experiment, conducted in quiet and noisy environments. Public speech corpora were also employed to compare this implementation to existing methods. The results showed efficient use of mobile resources, and the overall classification performance was accurate with a small number of samples. Our authentication system might therefore be used as a first security layer, but also as part of a multilayer authentication or as a fall-back mechanism.
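    As a rough illustration of the classification stage described above, the following sketch implements a minimal Gaussian naive Bayes classifier over per-frame feature vectors and authenticates by summing frame log-scores across an utterance. It is an assumption-laden stand-in: the actual system uses linear prediction cepstral coefficients, which are not computed here; the 12-dimensional Gaussian features and two-speaker setup are synthetic.

```python
import numpy as np

class GaussianNB:
    """Minimal Gaussian naive Bayes over per-frame feature vectors."""
    def fit(self, X, y):
        self.classes = np.unique(y)
        self.mu = np.array([X[y == c].mean(0) for c in self.classes])
        self.var = np.array([X[y == c].var(0) + 1e-6 for c in self.classes])
        self.logp = np.log(np.array([(y == c).mean() for c in self.classes]))
        return self
    def log_scores(self, X):
        diff = X[:, None, :] - self.mu[None]
        return (self.logp[None]
                - 0.5 * np.log(2 * np.pi * self.var).sum(1)[None]
                - 0.5 * (diff ** 2 / self.var[None]).sum(2))
    def predict_speaker(self, X):
        # decide per utterance by summing frame log-scores
        return self.classes[self.log_scores(X).sum(0).argmax()]

rng = np.random.default_rng(3)
# hypothetical enrollment data: two speakers, 12-dim cepstral-like frames
Xa = rng.normal(0.0, 1.0, (300, 12))
Xb = rng.normal(0.8, 1.0, (300, 12))
X = np.vstack([Xa, Xb])
y = np.array([0] * 300 + [1] * 300)
clf = GaussianNB().fit(X, y)
probe = rng.normal(0.8, 1.0, (40, 12))   # probe frames from speaker 1
print(clf.predict_speaker(probe))
```

    A deployed system would replace the synthetic Gaussian frames with LPCC vectors extracted on-device and layer the access-privilege logic described in the abstract on top of the classifier's decision.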

  20. Hybrid Speaker Recognition Using Universal Acoustic Model

    Science.gov (United States)

    Nishimura, Jun; Kuroda, Tadahiro

    We propose a novel speaker recognition approach using a speaker-independent universal acoustic model (UAM) for sensornet applications. In sensornet applications such as “Business Microscope”, interactions among knowledge workers in an organization can be visualized by sensing face-to-face communication using wearable sensor nodes. In conventional studies, speakers are detected by comparing the energy of input speech signals among the nodes. However, there are often synchronization errors among the nodes which degrade the speaker recognition performance. By focusing on a property of the speaker's acoustic channel, UAM can provide robustness against the synchronization error. The overall speaker recognition accuracy is improved by combining UAM with the energy-based approach. For 0.1 s speech inputs and 4 subjects, speaker recognition accuracy of 94% is achieved for synchronization errors of less than 100 ms.

  1. The Linguistic Affiliation Constraint and Phoneme Recognition in Diglossic Arabic

    Science.gov (United States)

    Saiegh-Haddad, Elinor; Levin, Iris; Hende, Nareman; Ziv, Margalit

    2011-01-01

    This study tested the effect of the phoneme's linguistic affiliation (Standard Arabic versus Spoken Arabic) on phoneme recognition among five-year-old Arabic native speaking kindergarteners (N=60). Using a picture selection task of words beginning with the same phoneme, and through careful manipulation of the phonological properties of target…

  2. Phonemic restoration in developmental dyslexia

    Directory of Open Access Journals (Sweden)

    Stephanie N. Del Tufo

    2014-06-01

    Full Text Available The comprehension of fluent speech in one’s native language requires that listeners integrate the detailed acoustic-phonetic information available in the sound signal with linguistic knowledge. This interplay is especially apparent in the phoneme restoration effect, a phenomenon in which a missing phoneme is ‘restored’ via the influence of top-down information from the lexicon and through bottom-up acoustic processing. Developmental dyslexia is a disorder characterized by an inability to read at the level of one’s peers without any clear failure attributable to environmental influences. In the current study we utilized the phonemic restoration illusion paradigm to examine individual differences in phonemic restoration across a range of reading ability, from very good to dyslexic readers. Results demonstrate that restoration occurs less in those who have high scores on measures of phonological processing. Based on these results, we suggest that the processing or representation of acoustic detail may not be as reliable in poor and dyslexic readers, with the result that lexical information is more likely to override acoustic properties of the stimuli. This pattern of increased restoration could result from a failure of perceptual tuning, in which unstable representations of speech sounds lead to the acceptance of non-speech sounds as speech. An additional or alternative theory is that degraded or impaired phonological processing at the speech sound level may reflect an architecture that is overly plastic and consequently fails to stabilize appropriately for speech sound representations. The inability to separate speech from noise may therefore reflect a deficit in segregating noise from the acoustic signal.

  3. Auditory Phoneme Discrimination in Illiterates: Mismatch Negativity--A Question of Literacy?

    Science.gov (United States)

    Schaadt, Gesa; Pannekamp, Ann; van der Meer, Elke

    2013-01-01

    These days, illiteracy is still a major problem. There is empirical evidence that auditory phoneme discrimination is one of the factors contributing to written language acquisition. The current study investigated auditory phoneme discrimination in participants who did not acquire written language sufficiently. Auditory phoneme discrimination was…

  4. Physiological Indices of Bilingualism: Oral–Motor Coordination and Speech Rate in Bengali–English Speakers

    Science.gov (United States)

    Chakraborty, Rahul; Goffman, Lisa; Smith, Anne

    2009-01-01

    Purpose: To examine how age of immersion and proficiency in a 2nd language influence speech movement variability and speaking rate in both a 1st language and a 2nd language. Method: A group of 21 Bengali–English bilingual speakers participated. Lip and jaw movements were recorded. For all 21 speakers, lip movement variability was assessed based on productions of Bengali (L1; 1st language) and English (L2; 2nd language) sentences. For analyses related to the influence of L2 proficiency on speech production processes, participants were sorted into low- (n = 7) and high-proficiency (n = 7) groups. Lip movement variability and speech rate were evaluated for both of these groups across L1 and L2 sentences. Results: Surprisingly, adult bilingual speakers produced equally consistent speech movement patterns in their production of L1 and L2. When groups were sorted according to proficiency, highly proficient speakers were marginally more variable in their L1. In addition, there were some phoneme-specific effects, most markedly that segments not shared by both languages were treated differently in production. Consistent with previous studies, movement durations were longer for less proficient speakers in both L1 and L2. Interpretation: In contrast to those of child learners, the speech motor systems of adult L2 speakers show a high degree of consistency. Such lack of variability presumably contributes to protracted difficulties with acquiring nativelike pronunciation in L2. The proficiency results suggest bidirectional interactions across L1 and L2, which is consistent with hypotheses regarding interference and the sharing of phonological space. A slower speech rate in less proficient speakers implies that there are increased task demands on speech production processes. PMID:18367680

  5. The absoluteness of semantic processing: lessons from the analysis of temporal clusters in phonemic verbal fluency.

    Directory of Open Access Journals (Sweden)

    Isabelle Vonberg

    Full Text Available For word production, we may consciously pursue semantic or phonological search strategies, but it is uncertain whether we can retrieve the different aspects of lexical information independently from each other. We therefore studied the spread of semantic information into words produced under exclusively phonemic task demands. 42 subjects participated in a letter verbal fluency task, demanding the production of as many s-words as possible in two minutes. Based on curve fittings for the time courses of word production, output spurts (temporal clusters), considered to reflect rapid lexical retrieval based on automatic activation spread, were identified. Semantic and phonemic word relatedness within versus between these clusters was assessed by respective scores (0 meaning no relation, 4 maximum relation). Subjects produced 27.5 (±9.4) words belonging to 6.7 (±2.4) clusters. Words were more related both phonemically and semantically within clusters than between clusters (phonemic: 0.33±0.22 vs. 0.19±0.17, p<.01; semantic: 0.65±0.29 vs. 0.37±0.29, p<.01). Whereas the extent of phonemic relatedness correlated with high task performance, the contrary was the case for the extent of semantic relatedness. The results indicate that semantic information spread occurs even if the consciously pursued word search strategy is purely phonological. This, together with the negative correlation between semantic relatedness and verbal output, suits the idea of a semantic default mode of lexical search, acting against rapid task performance in the given scenario of phonemic verbal fluency. The simultaneity of enhanced semantic and phonemic word relatedness within the same temporal cluster boundaries suggests an interaction between content and sound-related information whenever a new semantic field has been opened.

  6. Continuing Medical Education Speakers with High Evaluation Scores Use more Image-based Slides

    Directory of Open Access Journals (Sweden)

    Ferguson, Ian

    2017-01-01

    Full Text Available Although continuing medical education (CME) presentations are common across health professions, it is unknown whether slide design is independently associated with audience evaluations of the speaker. Based on the conceptual framework of Mayer’s theory of multimedia learning, this study aimed to determine whether image use and text density in presentation slides are associated with overall speaker evaluations. This retrospective analysis of six sequential CME conferences (two annual emergency medicine conferences over a three-year period) used a mixed linear regression model to assess whether postconference speaker evaluations were associated with image fraction (percentage of image-based slides per presentation) and text density (number of words per slide). A total of 105 unique lectures were given by 49 faculty members, and 1,222 evaluations (70.1% response rate) were available for analysis. On average, 47.4% (SD=25.36) of slides had at least one educationally-relevant image (image fraction). Image fraction significantly predicted overall higher evaluation scores [F(1, 100.676)=6.158, p=0.015] in the mixed linear regression model. The mean (SD) text density was 25.61 (8.14) words/slide but was not a significant predictor [F(1, 86.293)=0.55, p=0.815]. Of note, the individual speaker [χ2(1)=2.952, p=0.003] and speaker seniority [F(3, 59.713)=4.083, p=0.011] significantly predicted higher scores. This is the first published study to date assessing the linkage between slide design and CME speaker evaluations by an audience of practicing clinicians. The incorporation of images was associated with higher evaluation scores, in alignment with Mayer’s theory of multimedia learning. Contrary to this theory, however, text density showed no significant association, suggesting that these scores may be multifactorial. Professional development efforts should focus on teaching best practices in both slide design and presentation skills.

  7. Continuing Medical Education Speakers with High Evaluation Scores Use more Image-based Slides.

    Science.gov (United States)

    Ferguson, Ian; Phillips, Andrew W; Lin, Michelle

    2017-01-01

    Although continuing medical education (CME) presentations are common across health professions, it is unknown whether slide design is independently associated with audience evaluations of the speaker. Based on the conceptual framework of Mayer's theory of multimedia learning, this study aimed to determine whether image use and text density in presentation slides are associated with overall speaker evaluations. This retrospective analysis of six sequential CME conferences (two annual emergency medicine conferences over a three-year period) used a mixed linear regression model to assess whether post-conference speaker evaluations were associated with image fraction (percentage of image-based slides per presentation) and text density (number of words per slide). A total of 105 unique lectures were given by 49 faculty members, and 1,222 evaluations (70.1% response rate) were available for analysis. On average, 47.4% (SD=25.36) of slides had at least one educationally-relevant image (image fraction). Image fraction significantly predicted overall higher evaluation scores [F(1, 100.676)=6.158, p=0.015] in the mixed linear regression model. The mean (SD) text density was 25.61 (8.14) words/slide but was not a significant predictor [F(1, 86.293)=0.55, p=0.815]. Of note, the individual speaker [χ2(1)=2.952, p=0.003] and speaker seniority [F(3, 59.713)=4.083, p=0.011] significantly predicted higher scores. This is the first published study to date assessing the linkage between slide design and CME speaker evaluations by an audience of practicing clinicians. The incorporation of images was associated with higher evaluation scores, in alignment with Mayer's theory of multimedia learning. Contrary to this theory, however, text density showed no significant association, suggesting that these scores may be multifactorial. Professional development efforts should focus on teaching best practices in both slide design and presentation skills.

  8. LEARNING VECTOR QUANTIZATION FOR ADAPTED GAUSSIAN MIXTURE MODELS IN AUTOMATIC SPEAKER IDENTIFICATION

    Directory of Open Access Journals (Sweden)

    IMEN TRABELSI

    2017-05-01

    Full Text Available Speaker identification (SI) aims at automatically identifying an individual by extracting and processing information from his/her voice. Speaker voice is a robust biometric modality that has a strong impact in several application areas. In this study, a new combination learning scheme is proposed based on the Gaussian mixture model-universal background model (GMM-UBM) and learning vector quantization (LVQ) for automatic text-independent speaker identification. Feature vectors, constituted by the Mel-frequency cepstral coefficients (MFCC) extracted from the speech signal, are used for training on the New England subset of the TIMIT database. The best results obtained on test data using 36 MFCC features were 90% for gender-independent speaker identification, 97% for male speakers, and 93% for female speakers.
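    The GMM-UBM side of such a scheme can be sketched as follows. This is a hedged illustration, not the paper's implementation: it omits the LVQ stage and EM training of the UBM, uses a hand-specified two-component, one-dimensional UBM in place of MFCC features, and MAP-adapts only the component means (the relevance factor r=16 is an assumed value).

```python
import numpy as np

def gmm_logpdf(X, w, mu, var):
    """Per-frame log-likelihood under a diagonal-covariance GMM."""
    d = X.shape[1]
    diff = X[:, None, :] - mu[None, :, :]                      # (n, m, d)
    log_comp = (np.log(w)[None, :]
                - 0.5 * (d * np.log(2 * np.pi) + np.log(var).sum(1))[None, :]
                - 0.5 * (diff ** 2 / var[None, :, :]).sum(2))  # (n, m)
    mx = log_comp.max(1, keepdims=True)
    return (mx + np.log(np.exp(log_comp - mx).sum(1, keepdims=True)))[:, 0]

def map_adapt_means(X, w, mu, var, r=16.0):
    """MAP-adapt UBM means toward enrollment frames X (relevance factor r)."""
    diff = X[:, None, :] - mu[None, :, :]
    log_comp = (np.log(w)[None, :]
                - 0.5 * np.log(var).sum(1)[None, :]
                - 0.5 * (diff ** 2 / var[None, :, :]).sum(2))
    post = np.exp(log_comp - log_comp.max(1, keepdims=True))
    post /= post.sum(1, keepdims=True)                 # responsibilities (n, m)
    n_k = post.sum(0)                                  # soft counts (m,)
    ex = post.T @ X / np.maximum(n_k, 1e-9)[:, None]   # first-order statistics
    alpha = (n_k / (n_k + r))[:, None]
    return alpha * ex + (1 - alpha) * mu               # interpolate toward UBM

rng = np.random.default_rng(2)
w = np.array([0.5, 0.5])
mu = np.array([[-2.0], [2.0]])     # toy two-component "UBM"
var = np.ones((2, 1))
spk = rng.normal(2.5, 1.0, (200, 1))                   # enrollment frames
mu_spk = map_adapt_means(spk, w, mu, var)
test = rng.normal(2.5, 1.0, (50, 1))                   # test frames, same speaker
score = gmm_logpdf(test, w, mu_spk, var).mean() - gmm_logpdf(test, w, mu, var).mean()
print(score > 0)   # adapted model fits the speaker better than the UBM
```

    The score is the average log-likelihood ratio between the adapted speaker model and the UBM, which is the standard verification statistic in this framework.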

  9. Functions of graphemic and phonemic codes in visual word-recognition.

    Science.gov (United States)

    Meyer, D E; Schvaneveldt, R W; Ruddy, M G

    1974-03-01

    Previous investigators have argued that printed words are recognized directly from visual representations and/or phonological representations obtained through phonemic recoding. The present research tested these hypotheses by manipulating graphemic and phonemic relations within various pairs of letter strings. Ss in two experiments classified the pairs as words or nonwords. Reaction times and error rates were relatively small for word pairs (e.g., BRIBE-TRIBE) that were both graphemically, and phonemically similar. Graphemic similarity alone inhibited performance on other word pairs (e.g., COUCH-TOUCH). These and other results suggest that phonological representations play a significant role in visual word recognition and that there is a dependence between successive phonemic-encoding operations. An encoding-bias model is proposed to explain the data.

  10. Nurturing Phonemic Awareness and Alphabetic Knowledge in Pre-Kindergartners.

    Science.gov (United States)

    Steinhaus, Patricia L.

    Reading research continues to identify phonemic awareness and knowledge of the alphabetic principle as key factors in the literacy acquisition process and to indicate that they greatly facilitate decoding efforts. While research indicates that phonemic awareness and alphabetic knowledge are necessary to literacy acquisition, many early childhood…

  11. Phoneme Similarity and Confusability

    Science.gov (United States)

    Bailey, T.M.; Hahn, U.

    2005-01-01

    Similarity between component speech sounds influences language processing in numerous ways. Explanation and detailed prediction of linguistic performance consequently requires an understanding of these basic similarities. The research reported in this paper contrasts two broad classes of approach to the issue of phoneme similarity-theoretically…

  12. Spectro-Temporal Analysis of Speech for Spanish Phoneme Recognition

    DEFF Research Database (Denmark)

    Sharifzadeh, Sara; Serrano, Javier; Carrabina, Jordi

    2012-01-01

    are considered. This has improved the recognition performance, especially in noisy conditions and for phonemes with time-domain modulations such as stops. In this method, the 2D Discrete Cosine Transform (DCT) is applied to small overlapped 2D Hamming-windowed patches of the spectrograms of Spanish phonemes...
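    The patch-based 2D DCT feature extraction described above can be sketched as follows; the patch size, hop, and number of retained coefficients are illustrative assumptions, not the paper's settings, and a random array stands in for a real spectrogram.

```python
import numpy as np

def dct2_matrix(n):
    """Orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n)[:, None]
    t = np.arange(n)[None, :]
    C = np.cos(np.pi * (2 * t + 1) * k / (2 * n))
    C[0] *= 1.0 / np.sqrt(n)
    C[1:] *= np.sqrt(2.0 / n)
    return C

def patch_dct_features(spec, patch=(8, 8), step=(4, 4), n_coef=10):
    """Slide overlapping 2D Hamming-windowed patches over a spectrogram and
    keep low-order 2D DCT coefficients of each patch (a top-left block is
    used here instead of zig-zag ordering, for brevity)."""
    win = np.outer(np.hamming(patch[0]), np.hamming(patch[1]))
    Cf = dct2_matrix(patch[0])   # frequency-axis basis
    Ct = dct2_matrix(patch[1])   # time-axis basis
    feats = []
    for i in range(0, spec.shape[0] - patch[0] + 1, step[0]):
        for j in range(0, spec.shape[1] - patch[1] + 1, step[1]):
            p = spec[i:i + patch[0], j:j + patch[1]] * win
            coefs = Cf @ p @ Ct.T          # separable 2D DCT-II
            feats.append(coefs[:3, :4].ravel()[:n_coef])
    return np.array(feats)

spec = np.abs(np.random.default_rng(1).standard_normal((32, 40)))  # stand-in spectrogram
F = patch_dct_features(spec)
print(F.shape)
```

    Keeping only low-order coefficients acts as a smooth summary of local spectro-temporal modulation, which is what makes such features comparatively robust to noise.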

  13. What Does the Right Hemisphere Know about Phoneme Categories?

    Science.gov (United States)

    Wolmetz, Michael; Poeppel, David; Rapp, Brenda

    2011-01-01

    Innate auditory sensitivities and familiarity with the sounds of language give rise to clear influences of phonemic categories on adult perception of speech. With few exceptions, current models endorse highly left-hemisphere-lateralized mechanisms responsible for the influence of phonemic category on speech perception, based primarily on results…

  14. Phonetic basis of phonemic paraphasias in aphasia: Evidence for cascading activation.

    Science.gov (United States)

    Kurowski, Kathleen; Blumstein, Sheila E

    2016-02-01

    Phonemic paraphasias are a common presenting symptom in aphasia and are thought to reflect a deficit in which selecting an incorrect phonemic segment results in the clear-cut substitution of one phonemic segment for another. The current study re-examines the basis of these paraphasias. Seven left-hemisphere-damaged aphasics, with a range of left-hemisphere lesions and clinical diagnoses including Broca's, Conduction, and Wernicke's aphasia, were asked to produce syllable-initial voiced and voiceless fricative consonants, [z] and [s], in CV syllables followed by one of five vowels [i e a o u], in isolation and in a carrier phrase. Acoustic analyses were conducted focusing on two acoustic parameters signaling voicing in fricative consonants: duration and amplitude properties of the fricative noise. Results show that for all participants, regardless of clinical diagnosis or lesion site, phonemic paraphasias leave an acoustic trace of the original target in the error production. These findings challenge the view that phonemic paraphasias arise from a mis-selection of phonemic units followed by its correct implementation, as traditionally proposed. Rather, they appear to derive from a common mechanism with speech errors, reflecting the co-activation of a target and a competitor and resulting in speech output that has some phonetic properties of both segments. Copyright © 2015 Elsevier Ltd. All rights reserved.

  15. Pitch Correlogram Clustering for Fast Speaker Identification

    Directory of Open Access Journals (Sweden)

    Nitin Jhanwar

    2004-12-01

    Full Text Available Gaussian mixture models (GMMs) are commonly used in text-independent speaker identification systems. However, for large speaker databases, their high computational run-time limits their use in online or real-time speaker identification situations. Two-stage identification systems, in which the database is partitioned into clusters based on some proximity criterion and only a single-cluster GMM is run in every test, have been suggested in the literature to speed up the identification process. However, most clustering algorithms used have shown limited success, apparently because the clustering and GMM feature spaces used are derived from similar speech characteristics. This paper presents a new clustering approach based on the concept of a pitch correlogram that captures frame-to-frame pitch variations of a speaker rather than short-time spectral characteristics such as cepstral coefficients, spectral slopes, and so forth. The effectiveness of this two-stage identification process is demonstrated on the IVIE corpus of 110 speakers. The overall system achieves a run-time advantage of 500% as well as a 10% reduction of error in overall speaker identification.

  16. Speaker diarization system on the 2007 NIST rich transcription meeting recognition evaluation

    Science.gov (United States)

    Sun, Hanwu; Nwe, Tin Lay; Koh, Eugene Chin Wei; Bin, Ma; Li, Haizhou

    2007-09-01

    This paper presents a speaker diarization system developed at the Institute for Infocomm Research (I2R) for the NIST Rich Transcription 2007 (RT-07) evaluation task. We describe in detail our primary approaches for speaker diarization under the Multiple Distant Microphones (MDM) condition in the conference room scenario. Our proposed system consists of six modules: 1) a normalized least-mean-square (NLMS) adaptive filter for speaker direction estimation via Time Difference of Arrival (TDOA); 2) an initial speaker clustering via a two-stage TDOA histogram distribution quantization approach; 3) multiple-microphone speaker data alignment via GCC-PHAT Time Delay Estimation (TDE) among all the distant microphone channel signals; 4) a speaker clustering algorithm based on a GMM modeling approach; 5) non-speech removal via a speech/non-speech verification mechanism; and 6) silence removal via a "Double-Layer Windowing" (DLW) method. We achieve an error rate of 31.02% on the 2006 Spring (RT-06s) MDM evaluation task and a competitive overall error rate of 15.32% on the NIST Rich Transcription 2007 (RT-07) MDM evaluation task.
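    The GCC-PHAT time-delay estimation used for aligning distant-microphone channels (module 3 of the system above) can be sketched as follows; the FFT length and the synthetic delayed signal are illustrative, not the system's configuration.

```python
import numpy as np

def gcc_phat(sig, ref, fs=1):
    """Estimate the delay of `sig` relative to `ref` (in samples when fs=1)
    via the Generalized Cross-Correlation with PHAT weighting."""
    n = sig.shape[0] + ref.shape[0]            # zero-pad to avoid circular wrap
    SIG = np.fft.rfft(sig, n=n)
    REF = np.fft.rfft(ref, n=n)
    R = SIG * np.conj(REF)
    R /= np.abs(R) + 1e-15                     # PHAT weighting: keep phase only
    cc = np.fft.irfft(R, n=n)
    max_shift = n // 2
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    shift = np.argmax(np.abs(cc)) - max_shift
    return shift / fs

rng = np.random.default_rng(0)
ref = rng.standard_normal(4096)
delay = 37
sig = np.concatenate((np.zeros(delay), ref))[:4096]   # ref delayed by 37 samples
print(gcc_phat(sig, ref))
```

    With the PHAT weighting, only the cross-spectrum phase is retained, which sharpens the correlation peak and makes the delay estimate robust to reverberant magnitude distortions.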

  17. Why not model spoken word recognition instead of phoneme monitoring?

    NARCIS (Netherlands)

    Vroomen, J.; de Gelder, B.

    2000-01-01

    Norris, McQueen & Cutler present a detailed account of the decision stage of the phoneme monitoring task. However, we question whether this contributes to our understanding of the speech recognition process itself, and we fail to see why phonotactic knowledge is playing a role in phoneme

  18. Relating Pitch Awareness to Phonemic Awareness in Children: Implications for Tone-Deafness and Dyslexia

    Directory of Open Access Journals (Sweden)

    Psyche Loui

    2011-05-01

    Full Text Available Language and music are complex cognitive and neural functions that rely on awareness of one’s own sound productions. Information on the awareness of vocal pitch, and its relation to phonemic awareness, which is crucial for learning to read, will be important for understanding the relationship between tone-deafness and developmental language disorders such as dyslexia. Here we show that phonemic awareness skills are positively correlated with pitch perception-production skills in children. Children between the ages of 7 and 9 were tested on pitch perception and production, phonemic awareness, and IQ. Results showed a significant positive correlation between pitch perception-production and phonemic awareness, suggesting that the relationship between musical and linguistic sound processing is intimately linked to awareness at the level of pitch and phonemes. Since tone-deafness is a pitch-related impairment and dyslexia is a deficit of phonemic awareness, we suggest that dyslexia and tone-deafness may have a shared and/or common neural basis.

  19. Independent alignment of RNA for dynamic studies using residual dipolar couplings

    Energy Technology Data Exchange (ETDEWEB)

    Bardaro, Michael F.; Varani, Gabriele, E-mail: varani@chem.washington.edu [University of Washington, Department of Chemistry (United States)

    2012-09-15

    Molecular motion and dynamics play an essential role in the biological function of many RNAs. An important source of information on biomolecular motion can be found in residual dipolar couplings, which contain dynamics information over the entire ms–ps timescale. However, these methods are not fully applicable to RNA because nucleic acid molecules tend to align in a highly collinear manner in different alignment media. As a consequence, the information on dynamics that can be obtained with this method is limited. In order to overcome this limitation, we have generated a chimeric RNA containing both the wild-type TAR RNA, the target of our investigation of dynamics, and the binding site for the U1A protein. When U1A protein was bound to the portion of the chimeric RNA containing its binding site, we obtained independent alignment of TAR by exploiting the physicochemical characteristics of this protein. This technique can allow the extraction of new information on RNA dynamics, which is particularly important for time scales not covered by relaxation methods where important RNA motions occur.

  20. Varying acoustic-phonemic ambiguity reveals that talker normalization is obligatory in speech processing.

    Science.gov (United States)

    Choi, Ja Young; Hu, Elly R; Perrachione, Tyler K

    2018-04-01

    The nondeterministic relationship between speech acoustics and abstract phonemic representations imposes a challenge for listeners to maintain perceptual constancy despite the highly variable acoustic realization of speech. Talker normalization facilitates speech processing by reducing the degrees of freedom for mapping between encountered speech and phonemic representations. While this process has been proposed to facilitate the perception of ambiguous speech sounds, it is currently unknown whether talker normalization is affected by the degree of potential ambiguity in acoustic-phonemic mapping. We explored the effects of talker normalization on speech processing in a series of speeded classification paradigms, parametrically manipulating the potential for inconsistent acoustic-phonemic relationships across talkers for both consonants and vowels. Listeners identified words with varying potential acoustic-phonemic ambiguity across talkers (e.g., beet/boat vs. boot/boat) spoken by single or mixed talkers. Auditory categorization of words was always slower when listening to mixed talkers compared to a single talker, even when there was no potential acoustic ambiguity between target sounds. Moreover, the processing cost imposed by mixed talkers was greatest when words had the most potential acoustic-phonemic overlap across talkers. Models of acoustic dissimilarity between target speech sounds did not account for the pattern of results. These results suggest (a) that talker normalization incurs the greatest processing cost when disambiguating highly confusable sounds and (b) that talker normalization appears to be an obligatory component of speech perception, taking place even when the acoustic-phonemic relationships across sounds are unambiguous.

  1. De novo determination of internuclear vector orientations from residual dipolar couplings measured in three independent alignment media

    International Nuclear Information System (INIS)

    Ruan Ke; Briggman, Kathryn B.; Tolman, Joel R.

    2008-01-01

    The straightforward interpretation of solution state residual dipolar couplings (RDCs) in terms of internuclear vector orientations generally requires prior knowledge of the alignment tensor, which in turn is normally estimated using a structural model. We have developed a protocol which allows the requirement for prior structural knowledge to be dispensed with as long as RDC measurements can be made in three independent alignment media. This approach, called Rigid Structure from Dipolar Couplings (RSDC), allows vector orientations and alignment tensors to be determined de novo from just three independent sets of RDCs. It is shown that complications arising from the existence of multiple solutions can be overcome by careful consideration of alignment tensor magnitudes in addition to the agreement between measured and calculated RDCs. Extensive simulations as well as applications to the proteins ubiquitin and Staphylococcal protein GB1 demonstrate that this method can provide robust determinations of alignment tensors and amide N-H bond orientations, often with better than 10° accuracy, even in the presence of modest levels of internal dynamics.
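For reference, the alignment-tensor formalism that RDC analyses such as RSDC build on is usually written in the principal axis system of the alignment tensor (this is the standard textbook expression, not taken from the abstract itself):

```latex
D(\theta,\phi) \;=\; D_a \left[\, \left(3\cos^2\theta - 1\right) \;+\; \tfrac{3}{2}\, R \,\sin^2\theta \,\cos 2\phi \,\right]
```

Here \(\theta\) and \(\phi\) are the polar angles of the internuclear vector in the alignment frame, \(D_a\) is the axial component of the alignment tensor, and \(R\) is its rhombicity; measuring \(D\) in three independent alignment media constrains \((\theta,\phi)\) without requiring a prior structural model.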

  2. Physiological responses at short distances from a parametric speaker

    Directory of Open Access Journals (Sweden)

    Lee Soomin

    2012-06-01

    Full Text Available Abstract In recent years, parametric speakers have been used in various circumstances. In our previous studies, we verified that the physiological burden of the sound of a parametric speaker set at 2.6 m from the subjects was lower than that of a general speaker. However, nothing has yet been demonstrated about the effects of the sound of a parametric speaker at shorter distances between the parametric speaker and the human body. Therefore, we studied this effect on physiological functions and task performance. Nine male subjects participated in this study. They completed three consecutive sessions: a 20-minute quiet period as a baseline, a 30-minute mental task period with general speakers or parametric speakers, and a 20-minute recovery period. We measured the electrocardiogram (ECG), photoplethysmogram (PTG), electroencephalogram (EEG), and systolic and diastolic blood pressure. Four experiments, crossing a speaker condition (general speaker vs. parametric speaker) with a distance condition (0.3 m and 1.0 m), were conducted at the same time of day on separate days. To examine the effects of speaker and distance, three-way repeated measures ANOVAs (speaker factor x distance factor x time factor) were conducted. In conclusion, we found that the physiological responses were not significantly different between the speaker conditions or the distance conditions. Meanwhile, the physiological burden increased with the progress of time, independently of speaker condition and distance condition. In summary, the effects of the parametric speaker observed at the 2.6 m distance were not obtained at distances of 1 m or less.

  3. FPGA Implementation for GMM-Based Speaker Identification

    Directory of Open Access Journals (Sweden)

    Phaklen EhKan

    2011-01-01

    Full Text Available In today's society, highly accurate personal identification systems are required. Passwords or PIN numbers can be forgotten or forged and are no longer considered to offer a high level of security. The use of biological features, biometrics, is becoming widely accepted as the next level for security systems. Biometric-based speaker identification is a method of identifying persons from their voice. Speaker-specific characteristics exist in speech signals due to different speakers having different resonances of the vocal tract. These differences can be exploited by extracting feature vectors such as Mel-Frequency Cepstral Coefficients (MFCCs) from the speech signal. A well-known statistical modelling process, the Gaussian Mixture Model (GMM), then models the distribution of each speaker's MFCCs in a multidimensional acoustic space. The GMM-based speaker identification system has features that make it promising for hardware acceleration. This paper describes the hardware implementation for classification of a text-independent GMM-based speaker identification system. The aim was to produce a system that can perform simultaneous identification of large numbers of voice streams in real time. This has important potential applications in security and in automated call centre applications. A speedup factor of ninety was achieved compared to a software implementation on a standard PC.
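The enrol-then-score pipeline described above (one GMM per speaker, identification by highest likelihood) can be sketched in a few lines. This is an illustrative toy: synthetic Gaussian vectors stand in for real MFCC frames, and the speaker names, dimensions and GMM settings are assumptions, not taken from the paper.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Synthetic stand-ins for per-frame 13-dim MFCC vectors from two speakers.
# A real system would extract MFCCs from audio instead.
speaker_a_train = rng.normal(loc=0.0, scale=1.0, size=(500, 13))
speaker_b_train = rng.normal(loc=2.0, scale=1.0, size=(500, 13))

# One GMM per enrolled speaker, modelling that speaker's feature distribution.
models = {}
for name, feats in [("A", speaker_a_train), ("B", speaker_b_train)]:
    gmm = GaussianMixture(n_components=4, covariance_type="diag", random_state=0)
    models[name] = gmm.fit(feats)

def identify(frames):
    """Return the enrolled speaker whose GMM gives the highest average log-likelihood."""
    return max(models, key=lambda name: models[name].score(frames))

# Unseen frames drawn from speaker B's distribution.
test_frames = rng.normal(loc=2.0, scale=1.0, size=(200, 13))
print(identify(test_frames))  # "B"
```

The per-speaker `score()` call is the software analogue of the likelihood computation that the paper accelerates in hardware for many voice streams at once.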

  4. Processing of syllable stress is functionally different from phoneme processing and does not profit from literacy acquisition

    Directory of Open Access Journals (Sweden)

    Ulrike eSchild

    2014-06-01

    Full Text Available Speech is characterized by phonemes and prosody. Neurocognitive evidence supports the separate processing of each type of information. Therefore, one might suggest individual development of both pathways. In this study, we examine literacy acquisition in middle childhood. Children become aware of the phonemes in speech at that time and refine phoneme processing when they acquire an alphabetic writing system. We test whether an enhanced sensitivity to phonemes in middle childhood extends to other aspects of the speech signal, such as prosody. To investigate prosodic processing, we used stress priming. Spoken stressed and unstressed syllables (primes) preceded spoken German words with stress on the first syllable (targets). We orthogonally varied stress overlap and phoneme overlap between the primes and onsets of the targets. Lexical decisions and Event-Related Potentials (ERPs) for the targets were obtained for pre-reading preschoolers, reading pupils and adults. The behavioral and ERP results were largely comparable across all groups. The fastest responses were observed when the first syllable of the target word shared stress and phonemes with the preceding prime. ERP stress priming and ERP phoneme priming started 200 ms after the target word onset. Bilateral ERP stress priming was characterized by enhanced ERP amplitudes for stress overlap. Left-lateralized ERP phoneme priming replicates previously observed reduced ERP amplitudes for phoneme overlap. Groups differed in the strength of the behavioral phoneme priming and in the late ERP phoneme priming effect. The present results show that enhanced phonological processing in middle childhood is restricted to phonemes and does not extend to prosody. These results are indicative of two parallel processing systems for phonemes and prosody that might follow different developmental trajectories in middle childhood as a function of alphabetic literacy.

  5. Universal and Language-Specific Constraints on Phonemic Awareness: Evidence from Russian-Hebrew Bilingual Children

    Science.gov (United States)

    Saiegh-Haddad, Elinor; Kogan, Nadya; Walters, Joel

    2010-01-01

    The study tested phonemic awareness in the two languages of Russian (L1)-Hebrew (L2) sequential bilingual children (N = 20) using phoneme deletion tasks where the phoneme to be deleted occurred word-initially or word-finally, as a singleton or as part of a cluster, in long and short words and in stressed and unstressed syllables. The experiments were…

  6. Constitutive equations for the Doi-Edwards model without independent alignment

    DEFF Research Database (Denmark)

    Hassager, Ole; Hansen, Rasmus

    2010-01-01

    We present two representations of the Doi-Edwards model without Independent Alignment explicitly expressed in terms of the Finger strain tensor, its inverse and its invariants. The two representations provide explicit expressions for the stress prior to and after Rouse relaxation of chain stretch, respectively. The maximum deviations from the exact representations in simple shear, biaxial extension and uniaxial extension are of order 2%. Based on these two representations, we propose a framework for Doi-Edwards models including chain stretch in the memory integral form.

  7. Using Start/End Timings of Spectral Transitions Between Phonemes in Concatenative Speech Synthesis

    OpenAIRE

    Toshio Hirai; Seiichi Tenpaku; Kiyohiro Shikano

    2002-01-01

    The definition of "phoneme boundary timing" in a speech corpus affects the quality of concatenative speech synthesis systems. For example, if the selected speech unit does not appropriately match the speech unit of the required phoneme environment, the quality may be degraded. In this paper, a dynamic segment boundary definition is proposed. In the definition, the concatenation point is chosen from the start or end timings of spectral transition depending on the phoneme environment at the ...

  8. New Grapheme Generation Rules for Two-Stage Modelbased Grapheme-to-Phoneme Conversion

    Directory of Open Access Journals (Sweden)

    Seng Kheang

    2015-01-01

    Full Text Available The precise conversion of arbitrary text into its corresponding phoneme sequence (grapheme-to-phoneme or G2P conversion) is implemented in speech synthesis and recognition, pronunciation learning software, spoken term detection and spoken document retrieval systems. Because the quality of this module plays an important role in the performance of such systems and many problems regarding G2P conversion have been reported, we propose a novel two-stage model-based approach, which is implemented using an existing weighted finite-state transducer-based G2P conversion framework, to improve the performance of the G2P conversion model. The first-stage model is built for automatic conversion of words to phonemes, while the second-stage model utilizes the input graphemes and output phonemes obtained from the first stage to determine the best final output phoneme sequence. Additionally, we designed new grapheme generation rules, which enable extra detail for the vowel and consonant graphemes appearing within a word. When compared with previous approaches, the evaluation results indicate that our approach using rules focusing on the vowel graphemes slightly improved the accuracy of the out-of-vocabulary dataset and consistently increased the accuracy of the in-vocabulary dataset.
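As a point of reference for what a first-stage grapheme-to-phoneme pass does, here is a deliberately tiny greedy longest-match converter. The rule table and phone labels are invented for illustration only and bear no relation to the paper's learned WFST models:

```python
# Toy greedy longest-match G2P; real systems (including the WFST-based
# framework in the paper) learn grapheme-to-phoneme mappings from data.
RULES = {  # hypothetical grapheme -> phoneme table
    "ph": "F", "sh": "SH", "ee": "IY", "th": "TH",
    "a": "AE", "b": "B", "c": "K", "e": "EH", "n": "N",
    "o": "AA", "p": "P", "s": "S", "t": "T",
}

def g2p(word):
    phones, i = [], 0
    while i < len(word):
        # Prefer two-letter graphemes over single letters (longest match first).
        for length in (2, 1):
            chunk = word[i:i + length]
            if chunk in RULES:
                phones.append(RULES[chunk])
                i += length
                break
        else:
            i += 1  # skip graphemes not covered by the toy table
    return phones

print(g2p("sheep"))  # ['SH', 'IY', 'P']
```

The multi-letter graphemes ("ph", "ee") illustrate why grapheme generation rules that capture vowel and consonant detail within a word matter: a naive letter-by-letter mapping would get such words wrong.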

  9. A Phonemic and Acoustic Analysis of Hindko Oral Stops

    Directory of Open Access Journals (Sweden)

    Haroon Ur RASHID

    2014-12-01

    Full Text Available Hindko is an Indo-Aryan language that is mainly spoken in the Khyber Pakhtunkhwa province of Pakistan. This work aims to identify the oral stops of Hindko and determine the intrinsic acoustic cues for them. The phonemic analysis is done with the help of minimal pairs and phoneme distribution in contrastive environments, which reveals that Hindko has twelve oral stops in a three-way series. The acoustic analysis of these segments shows that, intrinsically, voice onset time (VOT), closure duration and burst are reliable and distinguishing cues for stops in Hindko.

  10. Google Home: smart speaker as environmental control unit.

    Science.gov (United States)

    Noda, Kenichiro

    2017-08-23

    Environmental Control Units (ECU) are devices or systems that allow a person to control appliances in their home or work environment. Such a system can be utilized by clients with physical and/or functional disabilities to enhance their ability to control their environment, to promote independence and to improve their quality of life. Over the last several years, several inexpensive, commercially available, voice-activated smart speakers, such as Google Home and Amazon Echo, have emerged onto the market. These smart speakers are equipped with far-field microphones that support voice recognition and allow for completely hands-free operation for various purposes, including playing music, information retrieval and, most importantly, environmental control. Clients with disability could utilize these features to turn the unit into a simple ECU that is completely voice activated and wirelessly connected to appliances. Smart speakers, with their ease of setup, low cost and versatility, may be a more affordable and accessible alternative to the traditional ECU. Implications for Rehabilitation: Environmental Control Units (ECU) enable independence for physically and functionally disabled clients, and reduce the burden and frequency of demands on carers. Traditional ECUs can be costly and may require clients to learn specialized skills to use. Smart speakers have the potential to be used as a new-age ECU by overcoming these barriers, and can be used by a wider range of clients.

  11. Teacher candidates' mastery of phoneme-grapheme correspondence: massed versus distributed practice in teacher education.

    Science.gov (United States)

    Sayeski, Kristin L; Earle, Gentry A; Eslinger, R Paige; Whitenton, Jessy N

    2017-04-01

    Matching phonemes (speech sounds) to graphemes (letters and letter combinations) is an important aspect of decoding (translating print to speech) and encoding (translating speech to print). Yet, many teacher candidates do not receive explicit training in phoneme-grapheme correspondence. Difficulty with accurate phoneme production and/or lack of understanding of sound-symbol correspondence can make it challenging for teachers to (a) identify student errors on common assessments and (b) serve as a model for students when teaching beginning reading or providing remedial reading instruction. For students with dyslexia, lack of teacher proficiency in this area is particularly problematic. This study examined differences between two learning conditions (massed and distributed practice) on teacher candidates' development of phoneme-grapheme correspondence knowledge and skills. An experimental, pretest-posttest-delayed test design was employed with teacher candidates (n = 52) to compare a massed practice condition (one 60-min session) to a distributed practice condition (four 15-min sessions distributed over 4 weeks) for learning phonemes associated with letters and letter combinations. Participants in the distributed practice condition significantly outperformed participants in the massed practice condition on their ability to correctly produce phonemes associated with different letters and letter combinations. Implications for teacher preparation are discussed.

  12. Phonological abilities in literacy-impaired children: Brain potentials reveal deficient phoneme discrimination, but intact prosodic processing

    Directory of Open Access Journals (Sweden)

    Claudia Männel

    2017-02-01

    Full Text Available Intact phonological processing is crucial for successful literacy acquisition. While individuals with difficulties in reading and spelling (i.e., developmental dyslexia) are known to experience deficient phoneme discrimination (i.e., segmental phonology), findings concerning their prosodic processing (i.e., suprasegmental phonology) are controversial. Because there are no behavior-independent studies on the underlying neural correlates of prosodic processing in dyslexia, these controversial findings might be explained by different task demands. To provide an objective, behavior-independent picture of segmental and suprasegmental phonological processing in impaired literacy acquisition, we investigated event-related brain potentials during passive listening in typically and poor-spelling German school children. For segmental phonology, we analyzed the Mismatch Negativity (MMN) during vowel length discrimination, capturing automatic auditory deviancy detection in repetitive contexts. For suprasegmental phonology, we analyzed the Closure Positive Shift (CPS) that automatically occurs in response to prosodic boundaries. Our results revealed spelling group differences for the MMN, but not for the CPS, indicating deficient segmental, but intact suprasegmental phonological processing in poor spellers. The present findings point towards a differential role of segmental and suprasegmental phonology in literacy disorders and call for interventions that invigorate impaired literacy by utilizing intact prosody in addition to training deficient phonemic awareness.

  13. Decoding speech perception by native and non-native speakers using single-trial electrophysiological data.

    Directory of Open Access Journals (Sweden)

    Alex Brandmeyer

    Full Text Available Brain-computer interfaces (BCIs) are systems that use real-time analysis of neuroimaging data to determine the mental state of their user for purposes such as providing neurofeedback. Here, we investigate the feasibility of a BCI based on speech perception. Multivariate pattern classification methods were applied to single-trial EEG data collected during speech perception by native and non-native speakers. Two principal questions were asked: (1) Can differences in the perceived categories of pairs of phonemes be decoded at the single-trial level? (2) Can these same categorical differences be decoded across participants, within or between native-language groups? Results indicated that classification performance progressively increased with respect to the categorical status (within, boundary, or across) of the stimulus contrast, and was also influenced by the native language of individual participants. Classifier performance showed strong relationships with traditional event-related potential measures and behavioral responses. The results of the cross-participant analysis indicated an overall increase in average classifier performance when trained on data from all participants (native and non-native). A second cross-participant classifier trained only on data from native speakers led to an overall improvement in performance for native speakers, but a reduction in performance for non-native speakers. We also found that the native language of a given participant could be decoded on the basis of EEG data with accuracy above 80%. These results indicate that electrophysiological responses underlying speech perception can be decoded at the single-trial level, and that decoding performance systematically reflects graded changes in the responses related to the phonological status of the stimuli. This approach could be used in extensions of the BCI paradigm to support perceptual learning during second language acquisition.
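A minimal sketch of the multivariate single-trial decoding idea, using synthetic "EEG" trials and a cross-validated linear classifier. The feature counts, effect size and classifier choice here are illustrative assumptions, not the study's actual pipeline:

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

rng = np.random.default_rng(1)

# Synthetic single-trial data: 200 trials x (32 channels * 50 time samples),
# with a small class-dependent shift standing in for a phoneme-evoked response.
n_trials, n_features = 200, 32 * 50
labels = rng.integers(0, 2, size=n_trials)   # phoneme category per trial
trials = rng.normal(size=(n_trials, n_features))
trials[labels == 1, :100] += 0.5             # weak, noisy class signal

# Standardize features, then decode with a regularized linear classifier,
# estimating accuracy with 5-fold cross-validation.
clf = make_pipeline(StandardScaler(), LinearSVC(C=0.01))
scores = cross_val_score(clf, trials, labels, cv=5)
print(scores.mean())  # well above the 0.5 chance level
```

Cross-validated accuracy is the usual proxy for "decodability" at the single-trial level; graded signal strength across stimulus contrasts would translate into graded classifier performance, as reported in the abstract.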

  14. eMatchSite: sequence order-independent structure alignments of ligand binding pockets in protein models.

    Directory of Open Access Journals (Sweden)

    Michal Brylinski

    2014-09-01

    Full Text Available Detecting similarities between ligand binding sites in the absence of global homology between target proteins has been recognized as one of the critical components of modern drug discovery. Local binding site alignments can be constructed using sequence order-independent techniques; however, to achieve a high accuracy, many current algorithms for binding site comparison require high-quality experimental protein structures, preferably in the bound conformational state. This, in turn, complicates proteome-scale applications, where only various-quality structure models are available for the majority of gene products. To improve the state-of-the-art, we developed eMatchSite, a new method for constructing sequence order-independent alignments of ligand binding sites in protein models. Large-scale benchmarking calculations using adenine-binding pockets in crystal structures demonstrate that eMatchSite generates accurate alignments for almost three times more protein pairs than SOIPPA. More importantly, eMatchSite offers a high tolerance to structural distortions in ligand binding regions in protein models. For example, the percentage of correctly aligned pairs of adenine-binding sites in weakly homologous protein models is only 4-9% lower than for those aligned using crystal structures. This represents a significant improvement over other algorithms, e.g. the performance of eMatchSite in recognizing similar binding sites is 6% and 13% higher than that of SiteEngine using high- and moderate-quality protein models, respectively. Constructing biologically correct alignments using predicted ligand binding sites in protein models opens up the possibility of investigating drug-protein interaction networks for complete proteomes, with prospective systems-level applications in polypharmacology and rational drug repositioning.
eMatchSite is freely available to the academic community as a web-server and a stand-alone software distribution at http://www.brylinski.org/ematchsite.

  15. Speaker segmentation and clustering

    OpenAIRE

    Kotti, M; Moschou, V; Kotropoulos, C

    2008-01-01

    This survey focuses on two challenging speech processing topics, namely: speaker segmentation and speaker clustering. Speaker segmentation aims at finding speaker change points in an audio stream, whereas speaker clustering aims at grouping speech segments based on speaker characteristics. Model-based, metric-based, and hybrid speaker segmentation algorithms are reviewed. Concerning speaker...

  16. Speaker Recognition

    DEFF Research Database (Denmark)

    Mølgaard, Lasse Lohilahti; Jørgensen, Kasper Winther

    2005-01-01

    Speaker recognition is basically divided into speaker identification and speaker verification. Verification is the task of automatically determining if a person really is the person he or she claims to be. This technology can be used as a biometric feature for verifying the identity of a person...

  17. Automatic phoneme category selectivity in the dorsal auditory stream.

    Science.gov (United States)

    Chevillet, Mark A; Jiang, Xiong; Rauschecker, Josef P; Riesenhuber, Maximilian

    2013-03-20

    Debates about motor theories of speech perception have recently been reignited by a burst of reports implicating premotor cortex (PMC) in speech perception. Often, however, these debates conflate perceptual and decision processes. Evidence that PMC activity correlates with task difficulty and subject performance suggests that PMC might be recruited, in certain cases, to facilitate category judgments about speech sounds (rather than speech perception, which involves decoding of sounds). However, it remains unclear whether PMC does, indeed, exhibit neural selectivity that is relevant for speech decisions. Further, it is unknown whether PMC activity in such cases reflects input via the dorsal or ventral auditory pathway, and whether PMC processing of speech is automatic or task-dependent. In a novel modified categorization paradigm, we presented human subjects with paired speech sounds from a phonetic continuum but diverted their attention from phoneme category using a challenging dichotic listening task. Using fMRI rapid adaptation to probe neural selectivity, we observed acoustic-phonetic selectivity in left anterior and left posterior auditory cortical regions. Conversely, we observed phoneme-category selectivity in left PMC that correlated with explicit phoneme-categorization performance measured after scanning, suggesting that PMC recruitment can account for performance on phoneme-categorization tasks. Structural equation modeling revealed connectivity from posterior, but not anterior, auditory cortex to PMC, suggesting a dorsal route for auditory input to PMC. Our results provide evidence for an account of speech processing in which the dorsal stream mediates automatic sensorimotor integration of speech and may be recruited to support speech decision tasks.

  18. Shared perceptual processes in phoneme and word perception: Evidence from aphasia

    Directory of Open Access Journals (Sweden)

    Heather Raye Dial

    2014-04-01

    Replicating previous studies, performance on the two word recognition tasks without closely matched distractors (WAB and PWM) was at ceiling for some subjects with impairments on consonant discrimination (see Figures 1a/1b). However, as shown in Figures 1c/1d, for word processing tasks matched in phonological discriminability to the consonant discrimination task, scores on consonant discrimination and word processing were highly correlated, and no individual demonstrated substantially better performance on word than phoneme perception. One patient demonstrated worse performance on lexical decision (d′ = .21) than phoneme perception (d′ = 1.72), which can be attributed to impaired lexical or semantic processing. These data argue against the hypothesis that phoneme and word perception rely on different perceptual processes/routes for processing, and instead indicate that word perception depends on perception of sublexical units.

  19. Performance of svm, k-nn and nbc classifiers for text-independent speaker identification with and without modelling through merging models

    Directory of Open Access Journals (Sweden)

    Yussouf Nahayo

    2016-04-01

    Full Text Available This paper proposes some methods of robust text-independent speaker identification based on the Gaussian Mixture Model (GMM). We implemented a combination of the GMM with a set of classifiers: Support Vector Machine (SVM), K-Nearest Neighbour (K-NN), and Naive Bayes Classifier (NBC). In order to improve the identification rate, we developed a combination of hybrid systems using a validation technique. The experiments were performed on the dialect DR1 of the TIMIT corpus. The results showed better performance for the developed technique compared to the individual techniques.
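A hedged sketch of the classifier-fusion idea using scikit-learn's soft-voting combiner over SVM, k-NN and naive Bayes. The synthetic features stand in for GMM-derived speaker representations, and the specific hyperparameters are assumptions rather than the paper's settings:

```python
import numpy as np
from sklearn.ensemble import VotingClassifier
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

rng = np.random.default_rng(2)

# Synthetic speaker-level feature vectors for three enrolled speakers.
X = np.vstack([rng.normal(loc=m, scale=1.0, size=(60, 10)) for m in (0.0, 1.5, 3.0)])
y = np.repeat([0, 1, 2], 60)

# Soft-voting fusion of the three classifier families named in the abstract.
fusion = VotingClassifier([
    ("svm", SVC(probability=True)),
    ("knn", KNeighborsClassifier(n_neighbors=5)),
    ("nbc", GaussianNB()),
], voting="soft")
fusion.fit(X, y)

# A probe drawn from speaker 2's feature distribution.
probe = rng.normal(loc=3.0, scale=1.0, size=(1, 10))
pred = fusion.predict(probe)
print(pred[0])
```

Soft voting averages the class probabilities of the member classifiers, which is one simple way a hybrid system can beat each individual classifier when their errors are not perfectly correlated.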

  20. Using generalized maxout networks and phoneme mapping for low resource ASR- a case study on Flemish-Afrikaans

    CSIR Research Space (South Africa)

    Sahraeian, R

    2015-11-01

    Full Text Available -driven phoneme mapping we propose to use an approximation of Kullback Leibler Divergence (KLD) to generate a confusion matrix and find the best matching phonemes of the target language for each individual phoneme in the donor language. Moreover, we explore...

  1. Developing consistent pronunciation models for phonemic variants

    CSIR Research Space (South Africa)

    Davel, M

    2006-09-01

    Full Text Available Pronunciation lexicons often contain pronunciation variants. This can create two problems: It can be difficult to define these variants in an internally consistent way and it can also be difficult to extract generalised grapheme-to-phoneme rule sets...

  2. Optimization of multilayer neural network parameters for speaker recognition

    Science.gov (United States)

    Tovarek, Jaromir; Partila, Pavol; Rozhon, Jan; Voznak, Miroslav; Skapa, Jan; Uhrin, Dominik; Chmelikova, Zdenka

    2016-05-01

    This article discusses the impact of multilayer neural network parameters on speaker identification. The main task of speaker identification is to find a specific person in a known set of speakers, i.e., to determine whether the voice of an unknown speaker (the wanted person) belongs to a group of reference speakers from the voice database. One of the requirements was to develop a text-independent system, which means classifying the wanted person regardless of content and language. A multilayer neural network was used for speaker identification in this research. An artificial neural network (ANN) requires setting parameters such as the activation function of the neurons, the steepness of the activation functions, the learning rate, the maximum number of iterations, and the number of neurons in the hidden and output layers. ANN accuracy and validation time are directly influenced by these parameter settings, and different roles require different settings. Identification accuracy and ANN validation time were evaluated with the same input data but different parameter settings. The goal was to find the parameters giving the neural network the highest precision and shortest validation time. The input data of the neural network are Mel-frequency cepstral coefficients (MFCCs), which describe the properties of the vocal tract. Audio samples were recorded for all speakers in a laboratory environment. The training, test and validation data sets were split 70%, 15% and 15%. The result of the research described in this article is a parameter setting for the multilayer neural network for four speakers.
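The parameter search described above can be sketched as a small grid over hidden-layer size and learning rate, with a 70/15/15 train/test/validation split. Everything here (the synthetic feature statistics, the candidate values, the network sizes) is illustrative rather than the authors' actual configuration:

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(3)

# Stand-in 13-dim MFCC feature vectors for four speakers, as in the study.
X = np.vstack([rng.normal(loc=m, size=(100, 13)) for m in (0, 2, 4, 6)])
y = np.repeat(np.arange(4), 100)

# 70/15/15 split into training, test and validation sets.
X_train, X_rest, y_train, y_rest = train_test_split(X, y, train_size=0.70, random_state=0)
X_test, X_val, y_test, y_val = train_test_split(X_rest, y_rest, test_size=0.50, random_state=0)

# Grid over two of the parameters the article varies: hidden-layer size
# and learning rate; keep the setting with the best test accuracy.
best = None
for hidden in [(16,), (32,), (64,)]:
    for lr in [1e-3, 1e-2]:
        net = MLPClassifier(hidden_layer_sizes=hidden, learning_rate_init=lr,
                            max_iter=500, random_state=0).fit(X_train, y_train)
        acc = net.score(X_test, y_test)
        if best is None or acc > best[0]:
            best = (acc, hidden, lr, net)

val_acc = best[3].score(X_val, y_val)  # final check on held-out validation data
print(val_acc)
```

A real study would also time each trained network's forward pass, since the article optimizes validation time alongside accuracy.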

  3. On Low-level Cognitive Components of Speech

    DEFF Research Database (Denmark)

    Feng, Ling; Hansen, Lars Kai

    2005-01-01

    In this paper we analyze speech for low-level cognitive features using linear component analysis. We demonstrate generalizable component 'fingerprints' stemming from both phonemes and speakers. Phoneme fingerprints are found at the basic analysis window time scale (20 msec), while speaker 'voiceprints' are found at time scales around 1000 msec. The analysis is based on homomorphic filtering features and energy-based sparsification.
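For readers unfamiliar with homomorphic filtering: its canonical example in speech is the real cepstrum, obtained by taking the inverse transform of the log-magnitude spectrum, which separates the slowly varying vocal-tract envelope (low quefrency) from the excitation (high quefrency). A minimal sketch, with an arbitrary window choice and frame length:

```python
import numpy as np

def real_cepstrum(frame):
    """Homomorphic filtering: log-magnitude spectrum followed by an inverse FFT."""
    spectrum = np.fft.rfft(frame * np.hamming(len(frame)))
    return np.fft.irfft(np.log(np.abs(spectrum) + 1e-12))

# A toy 512-sample frame; real features would come from windowed speech audio.
frame = np.random.default_rng(4).normal(size=512)
cep = real_cepstrum(frame)
print(cep.shape)  # (512,)
```

Features of this kind, computed at ~20 msec windows, carry phoneme information; pooling them over ~1000 msec is what allows speaker-level 'voiceprints' to emerge, as the abstract describes.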

  4. Analysis of a Phonological Variation in Oraukwu Dialect of Igbo: A ...

    African Journals Online (AJOL)

    This paper is an analysis of the use of the phoneme /Ɩ/ for /r/ in the Oraukwu dialect of Igbo. It takes cognizance of the phonetic–phonological variability of ordinary speech. Oraukwu speakers virtually do not use the phoneme /r/ in their speech. Rather, they use the phoneme /Ɩ/ where the phoneme /r/ should occur. For instance ...

  5. Phonemic Transcriptions in British and American Dictionaries

    Directory of Open Access Journals (Sweden)

    Rastislav Šuštaršič

    2005-06-01

    Full Text Available In view of recent criticisms concerning vowel symbols in some British English dictionaries, in particular by J. Windsor Lewis in JIPA (Windsor Lewis, 2003) with regard to the Oxford Dictionary of Pronunciation (Upton, 2001), this article extends the discussion on English phonemic transcriptions by including those that typically occur in standard American dictionaries, and by comparing the most common conventions of British and American dictionaries. In addition to symbols for both vowels and consonants, the paper also deals with the different representations of word accentuation and the issue of consistency regarding the application of phonemic (systemic, broad) rather than phonetic (allophonic, narrow) transcription. The different transcriptions are assessed from the points of view of their departures from the International Phonetic Alphabet, their overlap with orthographic representation (spelling), and their appropriateness in terms of reflecting actual pronunciation in standard British and/or American pronunciation.

  6. Auditory ERB like admissible wavelet packet features for TIMIT phoneme recognition

    Directory of Open Access Journals (Sweden)

    P.K. Sahu

    2014-09-01

    Full Text Available In recent years the wavelet transform has been found to be an effective tool for time–frequency analysis. The wavelet transform has been used for feature extraction in speech recognition applications and has proved to be an effective technique for unvoiced phoneme classification. In this paper a new filter structure using admissible wavelet packets is analyzed for English phoneme recognition. These filters have the benefit of frequency band spacing similar to the auditory Equivalent Rectangular Bandwidth (ERB) scale, whose centre frequencies are equally distributed along the frequency response of the human cochlea. A new set of features is derived using the wavelet packet transform's multi-resolution capabilities and is found to be better than conventional features for unvoiced phoneme problems. Some of the noises from the NOISEX-92 database have been used to prepare an artificial noisy database to test the robustness of the wavelet-based features.
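
    The ERB-rate spacing the abstract describes can be sketched as follows. This is a minimal illustration using the Glasberg and Moore approximation of the ERB-rate scale, not the paper's actual wavelet-packet filter design; the band edges (100–8000 Hz) and band count are arbitrary choices for the example:

    ```python
    import math

    def erb_rate(f_hz):
        """Glasberg & Moore (1990) ERB-rate (in Cams) for a frequency in Hz."""
        return 21.4 * math.log10(1.0 + 0.00437 * f_hz)

    def inverse_erb_rate(cams):
        """Invert the ERB-rate mapping back to Hz."""
        return (10.0 ** (cams / 21.4) - 1.0) / 0.00437

    def erb_center_frequencies(n_bands, f_min=100.0, f_max=8000.0):
        """Centre frequencies equally spaced on the ERB-rate scale."""
        lo, hi = erb_rate(f_min), erb_rate(f_max)
        return [inverse_erb_rate(lo + i * (hi - lo) / (n_bands - 1))
                for i in range(n_bands)]

    centers = erb_center_frequencies(24)
    ```

    The spacing in Hz grows with frequency, so low-frequency bands are narrow and high-frequency bands are wide, mimicking cochlear frequency resolution.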

  7. Phonological processing skills in 6 year old blind and sighted Persian speakers

    Directory of Open Access Journals (Sweden)

    Maryam Sadat Momen Vaghefi

    2013-03-01

    Full Text Available Background and Aim: Phonological processing skills include the abilities to restore, retrieve and use memorized phonological codes. The purpose of this research is to compare and evaluate phonological processing skills in 6-7 year old blind and sighted Persian speakers in Tehran, Iran. Methods: This research is an analysis-comparison study. The subjects were 24 blind and 24 sighted children. The evaluation test of reading and writing disorders in primary school students, the linguistic and cognitive abilities test, and the naming subtest of the aphasia evaluation test were used as research tools. Results: Sighted children were found to perform better on the phoneme recognition of nonwords and flower naming subtests; the difference was significant (p<0.001). Blind children performed better in word and sentence memory; the difference was significant (p<0.001). There were no significant differences in other subtests. Conclusion: Blind children's better performance in memory tasks is due to the fact that they have powerful auditory memory.

  8. The Phonemic Awareness Skills of Cochlear Implant Children and Children with Normal Hearing in Primary School

    Directory of Open Access Journals (Sweden)

    Aliakbar Dashtelei

    2015-12-01

    Full Text Available Objectives: Phonemic awareness skills have a significant impact on children's speech and language. The purpose of this study was to investigate the phonemic awareness skills of children with cochlear implants and their normal-hearing peers in primary school. Methods: The phonemic awareness subscales of a phonological awareness test were administered to 30 children with cochlear implants in the first to sixth grades of primary school and 30 age-matched children with normal hearing. All children were between 6 and 11 years old. Children with cochlear implants had at least 1 to 2 years of implant experience and were over 5 years old when they received the implant. Children with cochlear implants were selected from special education centers in Tehran, and children with normal hearing were recruited from primary schools in Tehran. The phonemic awareness skills were assessed in both groups. Results: The mean scores of phonemic awareness skills in children with cochlear implants were significantly lower than in children with normal hearing (P<.0001). Discussion: Children with cochlear implants, despite the implanted prosthesis, performed worse on phonemic awareness than their normal-hearing peers. Therefore, given the importance of phonemic awareness skills in learning literacy skills, and the deficits of these skills in children with cochlear implants, these skills should be assessed carefully and rehabilitative interventions should be considered.

  9. Low-Frequency Cortical Entrainment to Speech Reflects Phoneme-Level Processing.

    Science.gov (United States)

    Di Liberto, Giovanni M; O'Sullivan, James A; Lalor, Edmund C

    2015-10-05

    The human ability to understand speech is underpinned by a hierarchical auditory system whose successive stages process increasingly complex attributes of the acoustic input. It has been suggested that to produce categorical speech perception, this system must elicit consistent neural responses to speech tokens (e.g., phonemes) despite variations in their acoustics. Here, using electroencephalography (EEG), we provide evidence for this categorical phoneme-level speech processing by showing that the relationship between continuous speech and neural activity is best described when that speech is represented using both low-level spectrotemporal information and categorical labeling of phonetic features. Furthermore, the mapping between phonemes and EEG becomes more discriminative for phonetic features at longer latencies, in line with what one might expect from a hierarchical system. Importantly, these effects are not seen for time-reversed speech. These findings may form the basis for future research on natural language processing in specific cohorts of interest and for broader insights into how brains transform acoustic input into meaning. Copyright © 2015 Elsevier Ltd. All rights reserved.

  10. The Effect of Adaptive Nonlinear Frequency Compression on Phoneme Perception.

    Science.gov (United States)

    Glista, Danielle; Hawkins, Marianne; Bohnert, Andrea; Rehmann, Julia; Wolfe, Jace; Scollie, Susan

    2017-12-12

    This study implemented a fitting method, developed for use with frequency lowering hearing aids, across multiple testing sites, participants, and hearing aid conditions to evaluate speech perception with a novel type of frequency lowering. A total of 8 participants, including children and young adults, participated in real-world hearing aid trials. A blinded crossover design, including posttrial withdrawal testing, was used to assess aided phoneme perception. The hearing aid conditions included adaptive nonlinear frequency compression (NFC), static NFC, and conventional processing. Enabling either adaptive NFC or static NFC improved group-level detection and recognition results for some high-frequency phonemes, when compared with conventional processing. Mean results for the distinction component of the Phoneme Perception Test (Schmitt, Winkler, Boretzki, & Holube, 2016) were similar to those obtained with conventional processing. Findings suggest that both types of NFC tested in this study provided a similar amount of speech perception benefit, when compared with group-level performance with conventional hearing aid technology. Individual-level results are presented with discussion around patterns of results that differ from the group average.

  11. Speaker Authentication

    CERN Document Server

    Li, Qi (Peter)

    2012-01-01

    This book focuses on the use of voice as a biometric measure for personal authentication. In particular, "Speaker Recognition" covers two approaches in speaker authentication: speaker verification (SV) and verbal information verification (VIV). The SV approach attempts to verify a speaker’s identity based on his/her voice characteristics, while the VIV approach validates a speaker’s identity through verification of the content of his/her utterance(s). SV and VIV can be combined for new applications. This is still a new research topic with significant potential applications. The book provides a broad overview of the recent advances in speaker authentication while giving enough attention to advanced and useful algorithms and techniques. It also provides a step-by-step introduction to the current state of speaker authentication technology, from the fundamental concepts to advanced algorithms. We will also present major design methodologies and share our experience in developing real and successful speake...

  12. On Low-level Cognitive Components of Speech

    DEFF Research Database (Denmark)

    Feng, Ling; Hansen, Lars Kai

    2006-01-01

    In this paper we analyze speech for low-level cognitive features using linear component analysis. We demonstrate generalizable component ‘fingerprints’ stemming from both phonemes and speakers. Phonemes are fingerprints found at the basic analysis window time scale (20 msec), while speaker ‘voiceprints’ are found at time scales around 1000 msec. The analysis is based on homomorphic filtering features and energy-based sparsification.

  13. Analysis of Acoustic Features in Speakers with Cognitive Disorders and Speech Impairments

    Science.gov (United States)

    Saz, Oscar; Simón, Javier; Rodríguez, W. Ricardo; Lleida, Eduardo; Vaquero, Carlos

    2009-12-01

    This work presents the results of an analysis of the acoustic features (formants and the three suprasegmental features: tone, intensity and duration) of vowel production in a group of 14 young speakers suffering from different kinds of speech impairments due to physical and cognitive disorders. A corpus of unimpaired children's speech is used to determine the reference values for these features in speakers without any kind of speech impairment within the same domain as the impaired speakers, namely 57 isolated words. The signal processing to extract the formant and pitch values is based on a Linear Prediction Coefficients (LPC) analysis of the segments considered as vowels in a Hidden Markov Model (HMM) based Viterbi forced alignment. Intensity and duration are also based on the outcome of the automated segmentation. As the main conclusion of the work, it is shown that intelligibility of vowel production is lowered in impaired speakers even when the vowel is perceived as correct by human labelers. The decrease in intelligibility is due to a 30% increase in confusability in the formant map, a 50% reduction in the discriminative power of energy between stressed and unstressed vowels, and a 50% increase in the standard deviation of vowel length. On the other hand, impaired speakers keep good control of tone in the production of stressed and unstressed vowels.
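
    The LPC analysis step mentioned above is classically done with the autocorrelation method and the Levinson-Durbin recursion. The following is a generic sketch (not the authors' code), checked on a synthetic all-pole signal whose true coefficients are known:

    ```python
    def autocorr(x, max_lag):
        """One-sided autocorrelation sums r[0..max_lag]."""
        return [sum(x[i] * x[i + k] for i in range(len(x) - k))
                for k in range(max_lag + 1)]

    def levinson_durbin(r, order):
        """Solve the LPC normal equations; returns A(z) coefficients
        [1, a1, ..., ap] and the final prediction-error energy."""
        a = [1.0] + [0.0] * order
        e = r[0]
        for i in range(1, order + 1):
            acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
            k = -acc / e                # reflection coefficient
            new_a = a[:]
            for j in range(1, i):
                new_a[j] = a[j] + k * a[i - j]
            new_a[i] = k
            a = new_a
            e *= 1.0 - k * k
        return a, e

    # Synthetic all-pole signal: x[n] = 0.9*x[n-1] - 0.5*x[n-2] + impulse
    x = []
    for n in range(200):
        x.append((1.0 if n == 0 else 0.0)
                 + (0.9 * x[n - 1] if n >= 1 else 0.0)
                 - (0.5 * x[n - 2] if n >= 2 else 0.0))
    a, err = levinson_durbin(autocorr(x, 2), 2)
    # a recovers [1, -0.9, 0.5]; formants would then follow from the
    # roots of A(z) evaluated at the signal's sampling rate
    ```

    In practice a higher order (often 10-16 at 8-16 kHz) is used so that each pole pair can model one formant.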

  14. Speaker identification for the improvement of the security communication between law enforcement units

    Science.gov (United States)

    Tovarek, Jaromir; Partila, Pavol

    2017-05-01

    This article discusses speaker identification for the improvement of secure communication between law enforcement units. The main task of this research was to develop a text-independent speaker identification system which can be used for real-time recognition. This system is designed for identification in the open set, meaning that the unknown speaker can be anyone. Communication itself is secured, but we have to check the authorization of the communicating parties and decide whether the unknown speaker is authorized for the given action. The calls are recorded by an IP telephony server and these recordings are then evaluated using classification. If the system determines that the speaker is not authorized, it sends a warning message to the administrator. This message can indicate, for example, a stolen phone or another unusual situation. The administrator then performs the appropriate actions. Our proposed system uses a multilayer neural network for classification, consisting of three layers (input layer, hidden layer, and output layer). The number of neurons in the input layer corresponds to the length of the speech feature vector, and the output layer represents the classified speakers. The artificial neural network classifies the speech signal frame by frame, but the final decision is made over the complete recording; this rule substantially increases the accuracy of the classification. Input data for the neural network are thirteen Mel-frequency cepstral coefficients, which describe the behavior of the vocal tract and are the features most used for speaker recognition. Parameters for training, testing and validation were extracted from recordings of authorized users. Recording conditions for the training data correspond with the real traffic of the system (sampling frequency, bit rate). The main benefit of the research is the developed text-independent speaker identification system, which is applied to secure communication between law enforcement units.
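
    The rule of classifying frame by frame but deciding over the complete recording can be sketched as follows. The per-frame posteriors and speaker names here are invented for illustration; in the described system they would come from the trained MLP applied to each 13-dimensional MFCC frame:

    ```python
    import math
    from collections import Counter

    def classify_recording(frame_scores, speakers):
        """frame_scores: list of per-frame posterior dicts {speaker: prob}.
        Each frame is classified independently, but the final decision
        aggregates over the whole recording, which is far more robust
        than trusting any single short analysis frame."""
        # Rule 1: majority vote over per-frame decisions
        votes = Counter(max(f, key=f.get) for f in frame_scores)
        by_vote = votes.most_common(1)[0][0]
        # Rule 2: sum of log-posteriors over the recording (product rule)
        loglik = {s: sum(math.log(f[s] + 1e-12) for f in frame_scores)
                  for s in speakers}
        by_loglik = max(loglik, key=loglik.get)
        return by_vote, by_loglik

    # Toy recording: 10 frames, mostly favoring "alice"
    frames = ([{"alice": 0.7, "bob": 0.3}] * 8
              + [{"alice": 0.4, "bob": 0.6}] * 2)
    vote, loglik = classify_recording(frames, ["alice", "bob"])
    ```

    For open-set identification, one would additionally reject the top speaker when the aggregated score falls below a threshold tuned on held-out impostor recordings.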

  15. Parietotemporal Stimulation Affects Acquisition of Novel Grapheme-Phoneme Mappings in Adult Readers

    Directory of Open Access Journals (Sweden)

    Jessica W. Younger

    2018-03-01

    Full Text Available Neuroimaging work from developmental and reading intervention research has suggested a cause of reading failure may be lack of engagement of parietotemporal cortex during initial acquisition of grapheme-phoneme (letter-sound) mappings. Parietotemporal activation increases following grapheme-phoneme learning and successful reading intervention. Further, stimulation of parietotemporal cortex improves reading skill in lower ability adults. However, it is unclear whether these improvements following stimulation are due to enhanced grapheme-phoneme mapping abilities. To test this hypothesis, we used transcranial direct current stimulation (tDCS) to manipulate parietotemporal function in adult readers as they learned a novel artificial orthography with new grapheme-phoneme mappings. Participants received real or sham stimulation to the left inferior parietal lobe (L IPL) for 20 min before training. They received explicit training over the course of 3 days on 10 novel words each day. Learning of the artificial orthography was assessed at a pre-training baseline session, the end of each of the three training sessions, an immediate post-training session and a delayed post-training session about 4 weeks after training. Stimulation interacted with baseline reading skill to affect learning of trained words and transfer to untrained words. Lower skill readers showed better acquisition, whereas higher skill readers showed worse acquisition, when training was paired with real stimulation, as compared to readers who received sham stimulation. However, readers of all skill levels showed better maintenance of trained material following parietotemporal stimulation, indicating a differential effect of stimulation on initial learning and consolidation. Overall, these results indicate that parietotemporal stimulation can enhance learning of new grapheme-phoneme relationships in readers with lower reading skill. Yet, while parietotemporal function is critical to new

  16. Parietotemporal Stimulation Affects Acquisition of Novel Grapheme-Phoneme Mappings in Adult Readers

    Science.gov (United States)

    Younger, Jessica W.; Booth, James R.

    2018-01-01

    Neuroimaging work from developmental and reading intervention research has suggested a cause of reading failure may be lack of engagement of parietotemporal cortex during initial acquisition of grapheme-phoneme (letter-sound) mappings. Parietotemporal activation increases following grapheme-phoneme learning and successful reading intervention. Further, stimulation of parietotemporal cortex improves reading skill in lower ability adults. However, it is unclear whether these improvements following stimulation are due to enhanced grapheme-phoneme mapping abilities. To test this hypothesis, we used transcranial direct current stimulation (tDCS) to manipulate parietotemporal function in adult readers as they learned a novel artificial orthography with new grapheme-phoneme mappings. Participants received real or sham stimulation to the left inferior parietal lobe (L IPL) for 20 min before training. They received explicit training over the course of 3 days on 10 novel words each day. Learning of the artificial orthography was assessed at a pre-training baseline session, the end of each of the three training sessions, an immediate post-training session and a delayed post-training session about 4 weeks after training. Stimulation interacted with baseline reading skill to affect learning of trained words and transfer to untrained words. Lower skill readers showed better acquisition, whereas higher skill readers showed worse acquisition, when training was paired with real stimulation, as compared to readers who received sham stimulation. However, readers of all skill levels showed better maintenance of trained material following parietotemporal stimulation, indicating a differential effect of stimulation on initial learning and consolidation. Overall, these results indicate that parietotemporal stimulation can enhance learning of new grapheme-phoneme relationships in readers with lower reading skill. 
Yet, while parietotemporal function is critical to new learning, its role in

  17. Oscillatory Dynamics Underlying Perceptual Narrowing of Native Phoneme Mapping from 6 to 12 Months of Age.

    Science.gov (United States)

    Ortiz-Mantilla, Silvia; Hämäläinen, Jarmo A; Realpe-Bonilla, Teresa; Benasich, April A

    2016-11-30

    During the first months of life, human infants process phonemic elements from all languages similarly. However, by 12 months of age, as language-specific phonemic maps are established, infants respond preferentially to their native language. This process, known as perceptual narrowing, supports neural representation and thus efficient processing of the distinctive phonemes within the sound environment. Although oscillatory mechanisms underlying processing of native and non-native phonemic contrasts were recently delineated in 6-month-old infants, the maturational trajectory of these mechanisms remained unclear. A group of typically developing infants born into monolingual English families was followed from 6 to 12 months and presented with English and Spanish syllable contrasts varying in voice-onset time. Brain responses were recorded with high-density electroencephalogram, and sources of event-related potential generators were identified at the right and left auditory cortices at 6 and 12 months, and also at the frontal cortex at 6 months. Time-frequency analyses conducted at the source level found variations in both θ and γ ranges across age. Compared with 6-month-olds, 12-month-olds' responses to native phonemes showed smaller and faster phase synchronization and less spectral power in the θ range, and increases in left phase synchrony as well as induced high-γ activity in both frontal and left auditory sources. These results demonstrate that infants become more automatized and efficient in processing their native language as they approach 12 months of age via the interplay between θ and γ oscillations. We suggest that, while θ oscillations support syllable processing, γ oscillations underlie phonemic perceptual narrowing, progressively favoring mapping of native over non-native language across the first year of life. During early language acquisition, typically developing infants gradually construct phonemic maps of their native language in auditory cortex. It is well

  18. Multiple functional units in the preattentive segmentation of speech in Japanese: evidence from word illusions.

    Science.gov (United States)

    Nakamura, Miyoko; Kolinsky, Régine

    2014-12-01

    We explored the functional units of speech segmentation in Japanese using dichotic presentation and a detection task requiring no intentional sublexical analysis. Indeed, illusory perception of a target word might result from preattentive migration of phonemes, morae, or syllables from one ear to the other. In Experiment 1, Japanese listeners detected targets presented in hiragana and/or kanji. Phoneme migrations did occur, suggesting that orthography-independent sublexical constituents play some role in segmentation. However, syllable and especially mora migrations were more numerous. This pattern of results was not observed in French speakers (Experiment 2), suggesting that it reflects native segmentation in Japanese. To control for the intervention of kanji representations (many words are written in kanji, and one kanji often corresponds to one syllable), in Experiment 3, Japanese listeners were presented with target loanwords that can be written only in katakana. Again, phoneme migrations occurred, while the first mora and syllable led to similar rates of illusory percepts. No migration occurred for the second, "special" mora (/J/ or /N/), probably because this constitutes the latter part of a heavy syllable. Overall, these findings suggest that multiple units, such as morae, syllables, and even phonemes, function independently of orthographic knowledge in Japanese preattentive speech segmentation.

  19. On Spoken English Phoneme Evaluation Method Based on Sphinx-4 Computer System

    Directory of Open Access Journals (Sweden)

    Li Qin

    2017-12-01

    Full Text Available In oral English learning, HDPs (phonemes that are hard to distinguish) are areas where Chinese students frequently make pronunciation mistakes. This paper studies a speech phoneme evaluation method for HDPs, hoping to improve the ability of individualized evaluation of HDPs and to help provide a personalized learning platform for English learners. First of all, this paper briefly introduces relevant speech recognition technologies and pronunciation evaluation algorithms, and describes the phonetic retrieving, phonetic decoding and phonetic knowledge base in the Sphinx-4 computer system, which constitute the technological foundation for phoneme evaluation. Then it proposes an HDP evaluation model, which integrates the reliability of the speech processing system and the individualization of spoken English learners into the evaluation system. After collecting HDPs of spoken English learners and sorting them into different sets, it uses the evaluation system to recognize these HDP sets and finally analyzes the experimental results of HDP evaluation, which proves the effectiveness of the HDP evaluation model.

  20. Comparison of HMM and DTW methods in automatic recognition of pathological phoneme pronunciation

    OpenAIRE

    Wielgat, Robert; Zielinski, Tomasz P.; Swietojanski, Pawel; Zoladz, Piotr; Król, Daniel; Wozniak, Tomasz; Grabias, Stanislaw

    2007-01-01

    In the paper, the recently proposed Human Factor Cepstral Coefficients (HFCC) are used for automatic recognition of pathological phoneme pronunciation in the speech of impaired children, and the efficiency of this approach is compared to application of the standard Mel-Frequency Cepstral Coefficients (MFCC) as a feature vector. Both dynamic time warping (DTW), working on whole words or embedded phoneme patterns, and hidden Markov models (HMM) are used as classifiers in the presented research. Obtained resul...

  1. Measuring prosodic deficits in oral discourse by speakers with fluent aphasia

    Directory of Open Access Journals (Sweden)

    Tan Lee

    2015-04-01

    Full Text Available Introduction: Prosody refers to the part of phonology that includes speech rhythm, stress, and intonation (Gandour, 1998). Proper intonation is crucial for expressing one’s emotion and linguistic meaning. Most studies examining prosodic deficits have focused primarily on Broca’s aphasia (e.g., Danly & Shapiro, 1982). The current study proposed a computer-assisted method for systematic and objective examination of intonation patterns in aphasic oral discourse. The speech materials were in Hong Kong Cantonese, which is known for being rich in tones. Since the surface F0 contour is the result of a complicated interplay between sentence-level intonation and syllable-level lexical tones, the challenge is how to extract meaningful representations of intonation from acoustic signals. Methods: Four individuals with fluent aphasia (two anomic and two transcortical sensory) and four gender-, age-, and education-matched controls participated. Based on the Cantonese AphasiaBank protocol (Kong, Law, & Lee, 2009), narrative samples and corresponding audio recordings were collected using discourse tasks of personal monologue, picture and sequential description, and story-telling. There were eight recordings for each subject. Each oral discourse was divided into sentences by manual inspection of the orthographic transcription and the respective acoustic signal. A sentence was defined as a sequence of words that in principle covers a complete thought. However, it was common in spontaneous oral discourse, especially in aphasia, that some of the sentences did not end with a completed expression but switched to a new topic. Occasionally, an obvious interjection was observed during an attempt at restarting a statement. Phoneme-level automatic time alignment was performed on each audio recording using a hidden Markov model (HMM) based forced alignment technique (Lee, Kong, Chan, & Wang, 2013). F0 was estimated from the acoustic signal at intervals of 0.01 second by
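
    Frame-wise F0 estimation at 0.01-second intervals, as in the final step above, is commonly done by autocorrelation. The following is a minimal generic sketch, not the authors' implementation; the pitch range and frame length are arbitrary example values:

    ```python
    import math

    def estimate_f0(frame, fs, f_min=75.0, f_max=400.0):
        """Pick the lag with maximal autocorrelation within the plausible
        pitch range; returns F0 in Hz (0.0 for a silent frame)."""
        lo, hi = int(fs / f_max), int(fs / f_min)
        if sum(x * x for x in frame) == 0.0:
            return 0.0
        best_lag, best_r = 0, 0.0
        for lag in range(lo, min(hi, len(frame) - 1)):
            r = sum(frame[i] * frame[i + lag]
                    for i in range(len(frame) - lag))
            if r > best_r:
                best_r, best_lag = r, lag
        return fs / best_lag if best_lag else 0.0

    fs = 8000
    # 50-ms frame of a pure 200 Hz tone; in practice the estimator is
    # slid along the recording with a 0.01-s hop to get the F0 contour
    frame = [math.sin(2 * math.pi * 200 * n / fs) for n in range(400)]
    f0 = estimate_f0(frame, fs)
    silent = estimate_f0([0.0] * 400, fs)
    ```

    Real systems add voicing decisions and octave-error correction on top of this raw per-frame estimate before intonation analysis.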

  2. Using Reversed MFCC and IT-EM for Automatic Speaker Verification

    Directory of Open Access Journals (Sweden)

    Sheeraz Memon

    2012-01-01

    Full Text Available This paper proposes a text-independent automatic speaker verification system using IMFCC (Inverse/Reverse Mel Frequency Cepstral Coefficients) and IT-EM (Information Theoretic Expectation Maximization). For speaker verification, feature extraction using the Mel scale has been widely applied with good results. The IMFCC is based on an inverted Mel scale and effectively captures information available in the high-frequency formants which is ignored by the MFCC. In this paper the fusion of MFCC and IMFCC at the input level is proposed. GMMs (Gaussian Mixture Models) trained with EM (Expectation Maximization) have been widely used for classification in text-independent verification; however, EM suffers from slow convergence. In this paper we use our proposed IT-EM, which converges faster, to train the speaker models. IT-EM uses information-theoretic principles such as PDE (Parzen Density Estimation) and the KL (Kullback-Leibler) divergence measure. IT-EM adapts the weights, means and covariances, like EM; however, the IT-EM process is performed not on feature vector sets but on a set of centroids obtained using an IT (Information Theoretic) metric. The IT-EM process simultaneously diminishes the divergence measure between the PDE estimate of the feature distribution within a given class and the centroid distribution within the same class. The feature-level fusion and IT-EM are tested on the task of speaker verification using NIST2001 and NIST2004. The experimental evaluation validates that MFCC/IMFCC gives better results than the conventional delta/MFCC feature set. The MFCC/IMFCC feature vector is also much smaller than delta MFCC, reducing the computational burden as well. The IT-EM method also showed faster convergence than the conventional EM method, leading to higher speaker recognition scores.
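
    The inverted Mel scale behind the IMFCC can be illustrated by mirroring Mel-spaced filter centres about the band midpoint, so that the narrow, closely spaced filters land at high frequencies instead of low ones. This sketch uses the common 2595·log10(1 + f/700) Mel approximation; the band edges and filter count are hypothetical, and the paper's exact filterbank may differ:

    ```python
    import math

    def hz_to_mel(f):
        return 2595.0 * math.log10(1.0 + f / 700.0)

    def mel_to_hz(m):
        return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

    def mel_centers(n, f_min, f_max):
        """n filter centres equally spaced on the Mel scale."""
        lo, hi = hz_to_mel(f_min), hz_to_mel(f_max)
        return [mel_to_hz(lo + i * (hi - lo) / (n + 1))
                for i in range(1, n + 1)]

    def inverted_mel_centers(n, f_min, f_max):
        """Mirror the Mel centres about the band midpoint: narrow filters
        now sit at high frequencies, emphasizing the formant region that
        the ordinary Mel filterbank deweights."""
        return [f_min + f_max - c
                for c in reversed(mel_centers(n, f_min, f_max))]

    mel = mel_centers(20, 0.0, 4000.0)
    imel = inverted_mel_centers(20, 0.0, 4000.0)
    ```

    Cepstral coefficients computed from the inverted filterbank's log energies then complement the MFCCs at the input-level fusion the paper proposes.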

  3. Contribution to automatic speech recognition. Analysis of the direct acoustical signal. Recognition of isolated words and phoneme identification

    International Nuclear Information System (INIS)

    Dupeyrat, Benoit

    1981-01-01

    This report deals with the acoustical-phonetic step of automatic speech recognition. The parameters used are the extrema of the acoustical signal (coded in amplitude and duration). This coding method, whose properties are described, is simple and well adapted to digital processing. The quality and the intelligibility of the coded signal after reconstruction are particularly satisfactory. An experiment in automatic recognition of isolated words has been carried out using this coding system. We have designed a filtering algorithm operating on the parameters of the coding; thus the characteristics of the formants can be derived under certain conditions, which are discussed. Using these characteristics, the identification of a large part of the phonemes for a given speaker was achieved. Carrying on the studies required the development of a particular real-time processing methodology which allowed immediate evaluation of improvements to the programs. Such processing on temporal coding of the acoustical signal is extremely powerful and could, used in connection with other methods, represent an efficient tool for automatic processing of speech. (author) [fr

  4. Large-corpus phoneme and word recognition and the generality of lexical context in CVC word perception.

    Science.gov (United States)

    Gelfand, Jessica T; Christie, Robert E; Gelfand, Stanley A

    2014-02-01

    Speech recognition may be analyzed in terms of recognition probabilities for perceptual wholes (e.g., words) and parts (e.g., phonemes), where j or the j-factor reveals the number of independent perceptual units required for recognition of the whole (Boothroyd, 1968b; Boothroyd & Nittrouer, 1988; Nittrouer & Boothroyd, 1990). For consonant-vowel-consonant (CVC) nonsense syllables, j ∼ 3 because all 3 phonemes are needed to identify the syllable, but j ∼ 2.5 for real-word CVCs (revealing ∼2.5 independent perceptual units) because higher level contributions such as lexical knowledge enable word recognition even if less than 3 phonemes are accurately received. These findings were almost exclusively determined with the 120-word corpus of the isophonemic word lists (Boothroyd, 1968a; Boothroyd & Nittrouer, 1988), presented one word at a time. It is therefore possible that its generality or applicability may be limited. This study thus determined j by using a much larger and less restricted corpus of real-word CVCs presented in 3-word groups as well as whether j is influenced by test size. The j-factor for real-word CVCs was derived from the recognition performance of 223 individuals with a broad range of hearing sensitivity by using the Tri-Word Test (Gelfand, 1998), which involves 50 three-word presentations and a corpus of 450 words. The influence of test size was determined from a subsample of 96 participants with separate scores for the first 10, 20, and 25 (and all 50) presentation sets of the full test. The mean value of j was 2.48 with a 95% confidence interval of 2.44-2.53, which is in good agreement with values obtained with isophonemic word lists, although its value varies among individuals. A significant correlation was found between percent-correct scores and j, but it was small and accounted for only 12.4% of the variance in j for phoneme scores ≥60%. Mean j-factors for the 10-, 20-, 25-, and 50-set test sizes were between 2.49 and 2.53 and were not
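
    The j-factor relation the abstract relies on is j = log p(whole) / log p(part), so that p(whole) = p(part)^j under independence. A small worked sketch with made-up recognition probabilities:

    ```python
    import math

    def j_factor(p_whole, p_part):
        """Boothroyd's j: the number of independent perceptual units
        implied by whole-item probability p_whole given part (phoneme)
        probability p_part, via p_whole = p_part ** j."""
        return math.log(p_whole) / math.log(p_part)

    p_phoneme = 0.8                       # hypothetical phoneme score
    # Nonsense CVCs: all three phonemes needed, so p_word = p_phoneme**3
    j_nonsense = j_factor(p_phoneme ** 3, p_phoneme)   # -> 3.0
    # Real words: lexical context lets listeners recover the word from
    # fewer effective units (the study's mean j was about 2.48)
    j_word = j_factor(p_phoneme ** 2.48, p_phoneme)    # -> 2.48
    ```

    In practice p(whole) and p(part) are the measured word and phoneme percent-correct scores, and j is computed per listener from those two numbers.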

  5. Paced Reading in Semantic Dementia: Word Knowledge Contributes to Phoneme Binding in Rapid Speech Production

    Science.gov (United States)

    Jefferies, Elizabeth; Grogan, John; Mapelli, Cristina; Isella, Valeria

    2012-01-01

    Patients with semantic dementia (SD) show deficits in phoneme binding in immediate serial recall: when attempting to reproduce a sequence of words that they no longer fully understand, they show frequent migrations of phonemes between items (e.g., cap, frog recalled as "frap, cog"). This suggests that verbal short-term memory emerges directly from…

  6. Cortical oscillations related to processing congruent and incongruent grapheme-phoneme pairs.

    Science.gov (United States)

    Herdman, Anthony T; Fujioka, Takako; Chau, Wilkin; Ross, Bernhard; Pantev, Christo; Picton, Terence W

    2006-05-15

    In this study, we investigated changes in cortical oscillations following congruent and incongruent grapheme-phoneme stimuli. Hiragana graphemes and phonemes were simultaneously presented as congruent or incongruent audiovisual stimuli to native Japanese-speaking participants. The discriminative reaction time was 57 ms shorter for congruent than incongruent stimuli. Analysis of MEG responses using synthetic aperture magnetometry (SAM) revealed that congruent stimuli evoked larger 2-10 Hz activity in the left auditory cortex within the first 250 ms after stimulus onset, and smaller 2-16 Hz activity in bilateral visual cortices between 250 and 500 ms. These results indicate that congruent visual input can modify cortical activity in the left auditory cortex.

  7. Multiresolution analysis applied to text-independent phone segmentation

    International Nuclear Information System (INIS)

    Cherniz, AnalIa S; Torres, MarIa E; Rufiner, Hugo L; Esposito, Anna

    2007-01-01

Automatic speech segmentation is of fundamental importance in different speech applications. The most common implementations are based on hidden Markov models. They use a statistical modelling of the phonetic units to align the data along a known transcription. This is an expensive and time-consuming process, because of the huge amount of data needed to train the system. Text-independent speech segmentation procedures have been developed to overcome some of these problems. These methods detect transitions in the evolution of the time-varying features that represent the speech signal. Speech representation plays a central role in the segmentation task. In this work, two new speech parameterizations based on the continuous multiresolution entropy, using Shannon entropy, and the continuous multiresolution divergence, using Kullback-Leibler distance, are proposed. These approaches have been compared with the classical Melbank parameterization. The proposed encodings significantly increase segmentation performance. Parameterization based on the continuous multiresolution divergence shows the best results, increasing the number of correctly detected boundaries and decreasing the amount of erroneously inserted points. This suggests that parameterization based on multiresolution information measures provides information related to acoustic features that take into account phonemic transitions.
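The boundary-detection idea behind these parameterizations can be sketched in a few lines. The sketch below is a deliberate simplification: it uses a single-resolution, frame-based Shannon entropy rather than the paper's continuous multiresolution entropy over wavelet scales, and the frame length and threshold are arbitrary choices:

```python
import math

def frame_entropy(frame):
    """Shannon entropy of a frame's normalized energy distribution."""
    energy = [x * x for x in frame]
    total = sum(energy) or 1.0
    probs = [e / total for e in energy if e > 0]
    return -sum(p * math.log2(p) for p in probs)

def boundary_candidates(signal, frame_len=64, threshold=0.5):
    """Flag frame indices where the entropy profile jumps.

    A rough proxy for phone transitions: entropy is computed per frame,
    and a boundary is hypothesized wherever consecutive frames differ
    by more than `threshold` bits.
    """
    entropies = [frame_entropy(signal[i:i + frame_len])
                 for i in range(0, len(signal) - frame_len + 1, frame_len)]
    return [i for i in range(1, len(entropies))
            if abs(entropies[i] - entropies[i - 1]) > threshold]

# Toy signal: an impulse-like (low-entropy) stretch followed by a
# flat (high-entropy) stretch; the transition lands at frame 2.
sig = [1.0 if i % 64 == 0 else 0.0 for i in range(128)] + [1.0] * 128
print(boundary_candidates(sig))  # → [2]
```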

  8. Working with Speakers.

    Science.gov (United States)

    Pestel, Ann

    1989-01-01

    The author discusses working with speakers from business and industry to present career information at the secondary level. Advice for speakers is presented, as well as tips for program coordinators. (CH)

  9. On the optimization of a mixed speaker array in an enclosed space using the virtual-speaker weighting method

    Science.gov (United States)

    Peng, Bo; Zheng, Sifa; Liao, Xiangning; Lian, Xiaomin

    2018-03-01

    In order to achieve sound field reproduction in a wide frequency band, multiple-type speakers are used. The reproduction accuracy is not only affected by the signals sent to the speakers, but also depends on the position and the number of each type of speaker. The method of optimizing a mixed speaker array is investigated in this paper. A virtual-speaker weighting method is proposed to optimize both the position and the number of each type of speaker. In this method, a virtual-speaker model is proposed to quantify the increment of controllability of the speaker array when the speaker number increases. While optimizing a mixed speaker array, the gain of the virtual-speaker transfer function is used to determine the priority orders of the candidate speaker positions, which optimizes the position of each type of speaker. Then the relative gain of the virtual-speaker transfer function is used to determine whether the speakers are redundant, which optimizes the number of each type of speaker. Finally the virtual-speaker weighting method is verified by reproduction experiments of the interior sound field in a passenger car. The results validate that the optimum mixed speaker array can be obtained using the proposed method.
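The two-stage use of the virtual-speaker transfer function (gain for ranking candidate positions, relative gain for pruning redundant speakers) can be sketched as a greedy selection. Everything here is a hypothetical stand-in: the gains would come from the paper's virtual-speaker model of the enclosed sound field, and the redundancy threshold is an assumed parameter:

```python
def select_speakers(candidate_gains, gain_floor=0.05):
    """Greedy sketch of the virtual-speaker weighting idea.

    `candidate_gains` maps candidate position -> virtual-speaker
    transfer-function gain (assumed precomputed). Positions are ranked
    by gain, which sets the priority order; a position whose gain
    relative to the best falls below `gain_floor` is treated as
    redundant and dropped, which fixes the speaker count.
    """
    ranked = sorted(candidate_gains.items(), key=lambda kv: kv[1], reverse=True)
    best = ranked[0][1]
    return [pos for pos, gain in ranked if gain / best >= gain_floor]

# Hypothetical gains for three candidate positions:
print(select_speakers({"A": 1.0, "B": 0.5, "C": 0.01}))  # → ['A', 'B']
```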

  10. Effects of lips and hands on auditory learning of second-language speech sounds.

    Science.gov (United States)

    Hirata, Yukari; Kelly, Spencer D

    2010-04-01

Previous research has found that auditory training helps native English speakers to perceive phonemic vowel length contrasts in Japanese, but their performance did not reach native levels after training. Given that multimodal information, such as lip movement and hand gesture, influences many aspects of native language processing, the authors examined whether multimodal input helps to improve native English speakers' ability to perceive Japanese vowel length contrasts. Sixty native English speakers participated in 1 of 4 types of training: (a) audio-only; (b) audio-mouth; (c) audio-hands; and (d) audio-mouth-hands. Before and after training, participants were given phoneme perception tests that measured their ability to identify short and long vowels in Japanese (e.g., short /kato/ vs. long /kaato/). Although all 4 groups improved from pre- to posttest (replicating previous research), the participants in the audio-mouth condition improved more than those in the audio-only condition, whereas the 2 conditions involving hand gestures did not. Seeing lip movements during training significantly helps learners to perceive difficult second-language phonemic contrasts, but seeing hand gestures does not. The authors discuss possible benefits and limitations of using multimodal information in second-language phoneme learning.

  11. Gesturing by Speakers with Aphasia: How Does It Compare?

    Science.gov (United States)

    Mol, Lisette; Krahmer, Emiel; van de Sandt-Koenderman, Mieke

    2013-01-01

    Purpose: To study the independence of gesture and verbal language production. The authors assessed whether gesture can be semantically compensatory in cases of verbal language impairment and whether speakers with aphasia and control participants use similar depiction techniques in gesture. Method: The informativeness of gesture was assessed in 3…

  12. Activations of human auditory cortex to phonemic and nonphonemic vowels during discrimination and memory tasks.

    Science.gov (United States)

    Harinen, Kirsi; Rinne, Teemu

    2013-08-15

    We used fMRI to investigate activations within human auditory cortex (AC) to vowels during vowel discrimination, vowel (categorical n-back) memory, and visual tasks. Based on our previous studies, we hypothesized that the vowel discrimination task would be associated with increased activations in the anterior superior temporal gyrus (STG), while the vowel memory task would enhance activations in the posterior STG and inferior parietal lobule (IPL). In particular, we tested the hypothesis that activations in the IPL during vowel memory tasks are associated with categorical processing. Namely, activations due to categorical processing should be higher during tasks performed on nonphonemic (hard to categorize) than on phonemic (easy to categorize) vowels. As expected, we found distinct activation patterns during vowel discrimination and vowel memory tasks. Further, these task-dependent activations were different during tasks performed on phonemic or nonphonemic vowels. However, activations in the IPL associated with the vowel memory task were not stronger during nonphonemic than phonemic vowel blocks. Together these results demonstrate that activations in human AC to vowels depend on both the requirements of the behavioral task and the phonemic status of the vowels. Copyright © 2013 Elsevier Inc. All rights reserved.

  13. Word and phoneme frequency of occurrence in conversational ...

    African Journals Online (AJOL)

Frequency of occurrence of Setswana consonant phonemes in a conversational sample of 49 358 Setswana words was determined. The percentage of occurrence of each consonant in each of the three positions of words was found to be 53.9% in the initial, 41.8% in the medial and 4.1% in the final positions. In addition ...
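Position-wise phoneme counts of this kind are straightforward to compute. The sketch below treats each character as a phoneme, which is an oversimplification for Setswana (digraphs would need a proper phonemic transcription), and the sample words and consonant set are placeholders:

```python
from collections import Counter

def consonant_position_counts(words, consonants):
    """Tally consonant occurrences by word position (initial/medial/final).

    Each character is treated as one phoneme; a real analysis would
    operate on phonemic transcriptions rather than raw orthography.
    """
    counts = {"initial": Counter(), "medial": Counter(), "final": Counter()}
    for word in words:
        for i, ch in enumerate(word):
            if ch not in consonants:
                continue
            if i == 0:
                pos = "initial"
            elif i == len(word) - 1:
                pos = "final"
            else:
                pos = "medial"
            counts[pos][ch] += 1
    return counts

# Placeholder word list and consonant inventory:
c = consonant_position_counts(["pula", "tau"], set("pltn"))
print(c["initial"])  # word-initial consonant tallies
```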

  14. Behavioural evidence of a dissociation between voice gender categorization and phoneme categorization using auditory morphed stimuli

    Directory of Open Access Journals (Sweden)

    Cyril R Pernet

    2014-01-01

Full Text Available Both voice gender and speech perception rely on neuronal populations located in the peri-sylvian areas. However, whilst functional imaging studies suggest a left versus right hemisphere and anterior versus posterior dissociation between voice and speech categorization, psycholinguistic studies on talker variability suggest that these two processes (voice and speech categorization) share common mechanisms. In this study, we investigated the categorical perception of voice gender (male vs. female) and phonemes (/pa/ vs. /ta/) using the same stimulus continua generated by morphing. This allowed the investigation of behavioural differences while controlling acoustic characteristics, since the same stimuli were used in both tasks. Despite a higher acoustic dissimilarity between items during the phoneme categorization task (a male and female voice producing the same phonemes) than the gender task (the same person producing 2 phonemes), results showed that speech information is processed much faster than voice information. In addition, f0 or timbre equalization did not affect RT, which disagrees with the classical psycholinguistic models in which voice information is stripped away or normalized to access phonetic content. Also, despite similar response (percentage) and perceptual (d’) curves, a reverse correlation analysis on acoustic features revealed, as expected, that the formant frequencies of the consonant distinguished stimuli in the phoneme task, but that only the vowel formant frequencies distinguished stimuli in the gender task. The 2nd set of results thus also disagrees with models postulating that the same acoustic information is used for voice and speech. Altogether these results suggest that voice gender categorization and phoneme categorization are dissociated at an early stage on the basis of different enhanced acoustic features that are diagnostic to the task at hand.

  15. Incorporating Pass-Phrase Dependent Background Models for Text-Dependent Speaker verification

    DEFF Research Database (Denmark)

    Sarkar, Achintya Kumar; Tan, Zheng-Hua

    2018-01-01

In this paper, we propose pass-phrase dependent background models (PBMs) for text-dependent (TD) speaker verification (SV) to integrate the pass-phrase identification process into the conventional TD-SV system, where a PBM is derived from a text-independent background model through adaptation using the utterances of a particular pass-phrase. During training, pass-phrase specific target speaker models are derived from the particular PBM using the training data for the respective target model. While testing, the best PBM is first selected for the test utterance in the maximum likelihood (ML) sense. We show that the proposed method significantly reduces the error rates of text-dependent speaker verification for the non-target types target-wrong and impostor-wrong, while it maintains comparable TD-SV performance with respect to the conventional system when impostors speak a correct utterance.
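The test-time procedure of the PBM scheme (pick the best pass-phrase background model in the ML sense, then score against the matching target model) can be sketched with toy one-dimensional Gaussian models standing in for the adapted background and speaker models; all model values below are invented for illustration:

```python
import math

def gauss_loglik(features, mean, var=1.0):
    """Log-likelihood of a 1-D feature sequence under a Gaussian 'model'."""
    return sum(-0.5 * (math.log(2 * math.pi * var) + (v - mean) ** 2 / var)
               for v in features)

def td_sv_score(features, pbms, target_models):
    """Toy sketch of PBM-based text-dependent speaker verification scoring.

    1. Select the pass-phrase background model with maximum likelihood.
    2. Return the log-likelihood ratio of the matching target model
       (derived from that PBM) against the selected PBM itself.
    """
    best = max(pbms, key=lambda p: gauss_loglik(features, pbms[p]))
    llr = (gauss_loglik(features, target_models[best])
           - gauss_loglik(features, pbms[best]))
    return best, llr

# Invented models: two pass-phrases, each with a PBM and a target model.
pbms = {"phrase_a": 0.0, "phrase_b": 5.0}
targets = {"phrase_a": 0.5, "phrase_b": 5.5}
best, llr = td_sv_score([0.6, 0.5, 0.7], pbms, targets)
print(best, llr > 0)  # target-like features: positive LLR for phrase_a
```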

  16. Comparing grapheme-based and phoneme-based speech recognition for Afrikaans

    CSIR Research Space (South Africa)

    Basson, WD

    2012-11-01

    Full Text Available This paper compares the recognition accuracy of a phoneme-based automatic speech recognition system with that of a grapheme-based system, using Afrikaans as case study. The first system is developed using a conventional pronunciation dictionary...

  17. Supervised and Unsupervised Speaker Adaptation in the NIST 2005 Speaker Recognition Evaluation

    National Research Council Canada - National Science Library

    Hansen, Eric G; Slyh, Raymond E; Anderson, Timothy R

    2006-01-01

    Starting in 2004, the annual NIST Speaker Recognition Evaluation (SRE) has added an optional unsupervised speaker adaptation track where test files are processed sequentially and one may update the target model...

  18. The voiced pronunciation of initial phonemes predicts the gender of names.

    Science.gov (United States)

    Slepian, Michael L; Galinsky, Adam D

    2016-04-01

Although it is known that certain names gain popularity within a culture because of historical events, it is unknown how names become associated with different social categories in the first place. We propose that vocal cord vibration during the pronunciation of an initial phoneme plays a critical role in explaining which names are assigned to males versus females. This produces a voiced gendered name effect, whereby voiced phonemes (vibration of the vocal cords) are more associated with male names, and unvoiced phonemes (no vibration of the vocal cords) are more associated with female names. Eleven studies test this association between voiced names and gender (a) using 270 million names (more than 80,000 unique names) given to children over 75 years, (b) names across 2 cultures (the U.S. and India), and (c) hundreds of novel names. The voiced gendered name effect was mediated through how hard or soft names sounded, and moderated by gender stereotype endorsement. Although extensive work has demonstrated morphological and physical cues to gender (e.g., facial, bodily, vocal), this work provides a systematic account of name-based cues to gender. Overall, the current research extends work on sound symbolism to names; the way in which a name sounds can be symbolically related to stereotypes associated with its social category. (c) 2016 APA, all rights reserved.
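The voiced/unvoiced split on name-initial phonemes can be approximated at the orthographic level. The letter set below is my assumption, a crude proxy for the study's phonemic coding: letters like 'c' and 'x' are ambiguous in English, and digraphs such as 'sh' or 'th' are ignored:

```python
# Assumed letter-level proxy: letters whose typical English realization
# is an unvoiced obstruent. Vowels and sonorants are all voiced.
UNVOICED_INITIALS = {"p", "t", "k", "f", "s", "h", "c", "q", "x"}

def initial_is_voiced(name: str) -> bool:
    """True if the name's first letter typically maps to a voiced phoneme.

    Orthographic stand-in only: real coding would use the phonemic
    transcription of the name's first segment.
    """
    return name[0].lower() not in UNVOICED_INITIALS

# Per the reported effect, voiced onsets skew male, unvoiced female:
for name in ("Bob", "Gary", "Tina", "Kate"):
    print(name, "voiced" if initial_is_voiced(name) else "unvoiced")
```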

  19. A Comparison of Coverbal Gesture Use in Oral Discourse Among Speakers With Fluent and Nonfluent Aphasia

    Science.gov (United States)

    Law, Sam-Po; Chak, Gigi Wan-Chi

    2017-01-01

    Purpose Coverbal gesture use, which is affected by the presence and degree of aphasia, can be culturally specific. The purpose of this study was to compare gesture use among Cantonese-speaking individuals: 23 neurologically healthy speakers, 23 speakers with fluent aphasia, and 21 speakers with nonfluent aphasia. Method Multimedia data of discourse samples from these speakers were extracted from the Cantonese AphasiaBank. Gestures were independently annotated on their forms and functions to determine how gesturing rate and distribution of gestures differed across speaker groups. A multiple regression was conducted to determine the most predictive variable(s) for gesture-to-word ratio. Results Although speakers with nonfluent aphasia gestured most frequently, the rate of gesture use in counterparts with fluent aphasia did not differ significantly from controls. Different patterns of gesture functions in the 3 speaker groups revealed that gesture plays a minor role in lexical retrieval whereas its role in enhancing communication dominates among the speakers with aphasia. The percentages of complete sentences and dysfluency strongly predicted the gesturing rate in aphasia. Conclusions The current results supported the sketch model of language–gesture association. The relationship between gesture production and linguistic abilities and clinical implications for gesture-based language intervention for speakers with aphasia are also discussed. PMID:28609510

  20. Training Phoneme Blending Skills in Children with Down Syndrome

    Science.gov (United States)

    Burgoyne, Kelly; Duff, Fiona; Snowling, Maggie; Buckley, Sue; Hulme, Charles

    2013-01-01

    This article reports the evaluation of a 6-week programme of teaching designed to support the development of phoneme blending skills in children with Down syndrome (DS). Teaching assistants (TAs) were trained to deliver the intervention to individual children in daily 10-15-minute sessions, within a broader context of reading and language…

  1. Morpho-phonemic analysis boosts word reading for adult struggling readers.

    Science.gov (United States)

    Gray, Susan H; Ehri, Linnea C; Locke, John L

    2018-01-01

    A randomized control trial compared the effects of two kinds of vocabulary instruction on component reading skills of adult struggling readers. Participants seeking alternative high school diplomas received 8 h of scripted tutoring to learn forty academic vocabulary words embedded within a civics curriculum. They were matched for language background and reading levels, then randomly assigned to either morpho-phonemic analysis teaching word origins, morpheme and syllable structures, or traditional whole word study teaching multiple sentence contexts, meaningful connections, and spellings. Both groups made comparable gains in learning the target words, but the morpho-phonemic group showed greater gains in reading unfamiliar words on standardized tests of word reading, including word attack and word recognition. Findings support theories of word learning and literacy that promote explicit instruction in word analysis to increase poor readers' linguistic awareness by revealing connections between morphological, phonological, and orthographic structures within words.

  2. Computer game as a tool for training the identification of phonemic length.

    Science.gov (United States)

    Pennala, Riitta; Richardson, Ulla; Ylinen, Sari; Lyytinen, Heikki; Martin, Maisa

    2014-12-01

    Computer-assisted training of Finnish phonemic length was conducted with 7-year-old Russian-speaking second-language learners of Finnish. Phonemic length plays a different role in these two languages. The training included game activities with two- and three-syllable word and pseudo-word minimal pairs with prototypical vowel durations. The lowest accuracy scores were recorded for two-syllable words. Accuracy scores were higher for the minimal pairs with larger rather than smaller differences in duration. Accuracy scores were lower for long duration than for short duration. The ability to identify quantity degree was generalized to stimuli used in the identification test in two of the children. Ideas for improving the game are introduced.

  3. Phonemic verbal fluency and severity of anxiety disorders in young children

    Directory of Open Access Journals (Sweden)

    Rudineia Toazza

Full Text Available Abstract Introduction: Previous studies have implicated impaired verbal fluency as being associated with anxiety disorders in adolescents. Objectives: To replicate and extend previously reported evidence by investigating whether performance in phonemic verbal fluency tasks is related to severity of anxiety symptoms in young children with anxiety disorders. We also aim to investigate whether putative associations are independent of co-occurring attention deficit hyperactivity disorder (ADHD) symptoms. Methods: Sixty children (6-12 years old) with primary diagnoses of anxiety disorders participated in this study. Severity of symptoms was measured using clinician-based, parent-rated and self-rated validated scales. Verbal fluency was assessed using a simple task that measures the number of words evoked in 1 minute with the letter F, from which we quantified the number of isolated words, number of clusters (groups of similar words) and number of switches (transitions between clusters and/or between isolated words). Results: There was a significant association between the number of clusters and anxiety scores. Further analysis revealed associations were independent of co-occurring ADHD symptoms. Conclusion: We replicate and extend previous findings showing that verbal fluency is consistently associated with severity in anxiety disorders in children. Further studies should explore the potential effect of cognitive training on symptoms of anxiety disorders.
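The cluster/switch quantification can be sketched as a run-length pass over the response list. The similarity rule is supplied by the caller; the two-letter-prefix rule in the example is a simplification of the clinical scoring criteria:

```python
def fluency_measures(words, same_cluster):
    """Count clusters, isolated words, and switches in a fluency response.

    `same_cluster(a, b)` decides whether consecutive words belong together.
    A run of >= 2 similar words is a cluster, a run of 1 is an isolated
    word, and a switch is any transition between runs.
    """
    if not words:
        return {"clusters": 0, "isolated": 0, "switches": 0}
    runs, run_len = [], 1
    for prev, cur in zip(words, words[1:]):
        if same_cluster(prev, cur):
            run_len += 1
        else:
            runs.append(run_len)
            run_len = 1
    runs.append(run_len)
    return {
        "clusters": sum(1 for r in runs if r >= 2),
        "isolated": sum(1 for r in runs if r == 1),
        "switches": len(runs) - 1,  # transitions between runs
    }

# Simplified similarity rule: shared two-letter prefix.
m = fluency_measures(["fish", "fist", "fig", "fan", "foot"],
                     lambda a, b: a[:2] == b[:2])
print(m)  # → {'clusters': 1, 'isolated': 2, 'switches': 2}
```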

  4. Multistage Data Selection-based Unsupervised Speaker Adaptation for Personalized Speech Emotion Recognition

    NARCIS (Netherlands)

    Kim, Jaebok; Park, Jeong-Sik

    This paper proposes an efficient speech emotion recognition (SER) approach that utilizes personal voice data accumulated on personal devices. A representative weakness of conventional SER systems is the user-dependent performance induced by the speaker independent (SI) acoustic model framework. But,

  5. Similar speaker recognition using nonlinear analysis

    International Nuclear Information System (INIS)

    Seo, J.P.; Kim, M.S.; Baek, I.C.; Kwon, Y.H.; Lee, K.S.; Chang, S.W.; Yang, S.I.

    2004-01-01

Speech features of the conventional speaker identification system are usually obtained by linear methods in spectral space. However, these methods have the drawback that speakers with similar voices cannot be distinguished, because the characteristics of their voices are also similar in spectral space. To overcome this difficulty of linear methods, we propose to use the correlation exponent in the nonlinear space as a new feature vector for speaker identification among persons with similar voices. We show that our proposed method surprisingly reduces the error rate of the speaker identification system for speakers with similar voices.

6. Evidences of Factorial Structure and Precision of Phonemic Awareness Tasks (TCFe)

    Directory of Open Access Journals (Sweden)

    Dalva Maria Alves Godoy

    2015-12-01

Full Text Available Abstract: To assess phonological awareness - a decisive skill for learning to read and write - it is necessary to provide evidence about an instrument's construct to present trustworthy parameters for both empirical research and the development of educational intervention and rehabilitation programs. In Brazil, at this moment, there are no studies regarding the internal structure of tests of phonological awareness. This article shows the factorial validity of a test of phonological awareness composed of three sub-tests: two tasks of subtraction of the initial phoneme and one of phonemic segmentation. A multidimensional confirmatory factorial analysis was applied to a sample of 176 Brazilian students (Mage = 9.3 years) from the first to fifth grade of elementary school. Results indicated a well-adjusted model, with items of intermediate difficulty and high factor loadings, corroborating the internal structure and the theoretical conception of the instrument.

  7. Exploring the role of hand gestures in learning novel phoneme contrasts and vocabulary in a second language

    Science.gov (United States)

    Kelly, Spencer D.; Hirata, Yukari; Manansala, Michael; Huang, Jessica

    2014-01-01

    Co-speech hand gestures are a type of multimodal input that has received relatively little attention in the context of second language learning. The present study explored the role that observing and producing different types of gestures plays in learning novel speech sounds and word meanings in an L2. Naïve English-speakers were taught two components of Japanese—novel phonemic vowel length contrasts and vocabulary items comprised of those contrasts—in one of four different gesture conditions: Syllable Observe, Syllable Produce, Mora Observe, and Mora Produce. Half of the gestures conveyed intuitive information about syllable structure, and the other half, unintuitive information about Japanese mora structure. Within each Syllable and Mora condition, half of the participants only observed the gestures that accompanied speech during training, and the other half also produced the gestures that they observed along with the speech. The main finding was that participants across all four conditions had similar outcomes in two different types of auditory identification tasks and a vocabulary test. The results suggest that hand gestures may not be well suited for learning novel phonetic distinctions at the syllable level within a word, and thus, gesture-speech integration may break down at the lowest levels of language processing and learning. PMID:25071646

  8. A Brief Critique of Chomsky's Challenge to Classical Phonemic Phonology.

    Science.gov (United States)

    Liu, Ngar-Fun

    1994-01-01

    Phonemic phonology became important because it provided a descriptive account of dialects and languages that had never been transcribed before, and it derives its greatest strength from its practical orientation, which has proved beneficial to language teaching and learning. Noam Chomsky's criticisms of it are largely unjust because he has not…

  9. Grammatical Planning Units during Real-Time Sentence Production in Speakers with Agrammatic Aphasia and Healthy Speakers

    Science.gov (United States)

    Lee, Jiyeon; Yoshida, Masaya; Thompson, Cynthia K.

    2015-01-01

    Purpose: Grammatical encoding (GE) is impaired in agrammatic aphasia; however, the nature of such deficits remains unclear. We examined grammatical planning units during real-time sentence production in speakers with agrammatic aphasia and control speakers, testing two competing models of GE. We queried whether speakers with agrammatic aphasia…

  10. Phoneme Awareness, Visual-Verbal Paired-Associate Learning, and Rapid Automatized Naming as Predictors of Individual Differences in Reading Ability

    Science.gov (United States)

    Warmington, Meesha; Hulme, Charles

    2012-01-01

    This study examines the concurrent relationships between phoneme awareness, visual-verbal paired-associate learning, rapid automatized naming (RAN), and reading skills in 7- to 11-year-old children. Path analyses showed that visual-verbal paired-associate learning and RAN, but not phoneme awareness, were unique predictors of word recognition,…

  11. Detection of target phonemes in spontaneous and read speech.

    Science.gov (United States)

    Mehta, G; Cutler, A

    1988-01-01

    Although spontaneous speech occurs more frequently in most listeners' experience than read speech, laboratory studies of human speech recognition typically use carefully controlled materials read from a script. The phonological and prosodic characteristics of spontaneous and read speech differ considerably, however, which suggests that laboratory results may not generalise to the recognition of spontaneous speech. In the present study listeners were presented with both spontaneous and read speech materials, and their response time to detect word-initial target phonemes was measured. Responses were, overall, equally fast in each speech mode. However, analysis of effects previously reported in phoneme detection studies revealed significant differences between speech modes. In read speech but not in spontaneous speech, later targets were detected more rapidly than targets preceded by short words. In contrast, in spontaneous speech but not in read speech, targets were detected more rapidly in accented than in unaccented words and in strong than in weak syllables. An explanation for this pattern is offered in terms of characteristic prosodic differences between spontaneous and read speech. The results support claims from previous work that listeners pay great attention to prosodic information in the process of recognising speech.

  12. Encoding, rehearsal, and recall in signers and speakers: shared network but differential engagement.

    Science.gov (United States)

    Bavelier, D; Newman, A J; Mukherjee, M; Hauser, P; Kemeny, S; Braun, A; Boutla, M

    2008-10-01

    Short-term memory (STM), or the ability to hold verbal information in mind for a few seconds, is known to rely on the integrity of a frontoparietal network of areas. Here, we used functional magnetic resonance imaging to ask whether a similar network is engaged when verbal information is conveyed through a visuospatial language, American Sign Language, rather than speech. Deaf native signers and hearing native English speakers performed a verbal recall task, where they had to first encode a list of letters in memory, maintain it for a few seconds, and finally recall it in the order presented. The frontoparietal network described to mediate STM in speakers was also observed in signers, with its recruitment appearing independent of the modality of the language. This finding supports the view that signed and spoken STM rely on similar mechanisms. However, deaf signers and hearing speakers differentially engaged key structures of the frontoparietal network as the stages of STM unfold. In particular, deaf signers relied to a greater extent than hearing speakers on passive memory storage areas during encoding and maintenance, but on executive process areas during recall. This work opens new avenues for understanding similarities and differences in STM performance in signers and speakers.

  13. The 2016 NIST Speaker Recognition Evaluation

    Science.gov (United States)

    2017-08-20

Index Terms: NIST evaluation, NIST SRE, speaker detection, speaker recognition, speaker verification. There were two training conditions in SRE16, namely fixed and open. In the fixed training condition, participants were only

  14. Arctic Visiting Speakers Series (AVS)

    Science.gov (United States)

    Fox, S. E.; Griswold, J.

    2011-12-01

The Arctic Visiting Speakers (AVS) Series funds researchers and other arctic experts to travel and share their knowledge in communities where they might not otherwise connect. Speakers cover a wide range of arctic research topics and can address a variety of audiences including K-12 students, graduate and undergraduate students, and the general public. Host applications are accepted on an on-going basis, depending on funding availability. Applications need to be submitted at least 1 month prior to the expected tour dates. Interested hosts can choose speakers from an online Speakers Bureau or invite a speaker of their choice. Preference is given to individuals and organizations to host speakers that reach a broad audience and the general public. AVS tours are encouraged to span several days, allowing ample time for interactions with faculty, students, local media, and community members. Applications for both domestic and international visits will be considered. Applications for international visits should involve participation of more than one host organization and must include either a US-based speaker or a US-based organization. This is a small but important program that educates the public about Arctic issues. There have been 27 tours since 2007 that have impacted communities across the globe including: Gatineau, Quebec, Canada; St. Petersburg, Russia; Piscataway, New Jersey; Cordova, Alaska; Nuuk, Greenland; Elizabethtown, Pennsylvania; Oslo, Norway; Inari, Finland; Borgarnes, Iceland; San Francisco, California and Wolcott, Vermont to name a few. Tours have included lectures to K-12 schools, college and university students, tribal organizations, Boy Scout troops, science center and museum patrons, and the general public. There are approximately 300 attendees enjoying each AVS tour; roughly 4100 people have been reached since 2007. The expectations for each tour are extremely manageable. Hosts must submit a schedule of events and a tour summary to be posted online.

  15. Simple and robust generation of ultrafast laser pulse trains using polarization-independent parallel-aligned thin films

    Science.gov (United States)

    Wang, Andong; Jiang, Lan; Li, Xiaowei; Wang, Zhi; Du, Kun; Lu, Yongfeng

    2018-05-01

    Ultrafast laser pulse temporal shaping has been widely applied in various important applications such as laser materials processing, coherent control of chemical reactions, and ultrafast imaging. However, temporal pulse shaping has been limited to only-in-lab technique due to the high cost, low damage threshold, and polarization dependence. Herein we propose a novel design of ultrafast laser pulse train generation device, which consists of multiple polarization-independent parallel-aligned thin films. Various pulse trains with controllable temporal profile can be generated flexibly by multi-reflections within the splitting films. Compared with other pulse train generation techniques, this method has advantages of compact structure, low cost, high damage threshold and polarization independence. These advantages endow it with high potential for broad utilization in ultrafast applications.

  16. Understanding native Russian listeners' errors on an English word recognition test: model-based analysis of phoneme confusion.

    Science.gov (United States)

    Shi, Lu-Feng; Morozova, Natalia

    2012-08-01

    Word recognition is a basic component in a comprehensive hearing evaluation, but data are lacking for listeners speaking two languages. This study obtained such data for Russian natives in the US and analysed the data using the perceptual assimilation model (PAM) and speech learning model (SLM). Listeners were randomly presented 200 NU-6 words in quiet. Listeners responded verbally and in writing. Performance was scored on words and phonemes (word-initial consonants, vowels, and word-final consonants). Seven normal-hearing, adult monolingual English natives (NM), 16 English-dominant (ED), and 15 Russian-dominant (RD) Russian natives participated. ED and RD listeners differed significantly in their language background. Consistent with the SLM, NM outperformed ED listeners and ED outperformed RD listeners, whether responses were scored on words or phonemes. NM and ED listeners shared similar phoneme error patterns, whereas RD listeners' errors had unique patterns that could be largely understood via the PAM. RD listeners had particular difficulty differentiating vowel contrasts /i-I/, /æ-ε/, and /ɑ-Λ/, word-initial consonant contrasts /p-h/ and /b-f/, and word-final contrasts /f-v/. Both first-language phonology and second-language learning history affect word and phoneme recognition. Current findings may help clinicians differentiate word recognition errors due to language background from hearing pathologies.

  17. Computer-based auditory phoneme discrimination training improves speech recognition in noise in experienced adult cochlear implant listeners.

    Science.gov (United States)

    Schumann, Annette; Serman, Maja; Gefeller, Olaf; Hoppe, Ulrich

    2015-03-01

    Specific computer-based auditory training may be a useful complement to the rehabilitation process for cochlear implant (CI) listeners working to achieve sufficient speech intelligibility. This study evaluated the effectiveness of a computerized phoneme-discrimination training programme. The study employed a pretest-post-test design; participants were randomly assigned to the training or control group. Over a period of three weeks, the training group was instructed to train in phoneme discrimination via computer, twice a week. Sentence recognition in different noise conditions (moderate to difficult) was tested pre- and post-training, and six months after the training was completed. The control group was tested and retested within one month. Twenty-seven adult CI listeners who had been using cochlear implants for more than two years participated in the programme: 15 adults in the training group and 12 in the control group. Besides significant improvements on the trained phoneme-identification task, a generalized training effect was noted via significantly improved sentence recognition in moderate noise. No significant changes were noted in the difficult noise conditions. Improved performance was maintained over an extended period. Phoneme-discrimination training improves experienced CI listeners' speech perception in noise. Additional research is needed to optimize auditory training for individual benefit.

  18. Eliciting the Dutch loan phoneme /g/ with the Menu Task

    NARCIS (Netherlands)

    Hamann, S.; de Jonge, A.

    2015-01-01

    This article introduces the menu task, which can be used to elicit infrequent sounds such as loan phonemes that only occur in a restricted set of words. The menu task is similar to the well-known map task and involves the interaction of two participants to create a menu on the basis of a list of

  19. Phoneme categorization and discrimination in younger and older adults: a comparative analysis of perceptual, lexical, and attentional factors.

    Science.gov (United States)

    Mattys, Sven L; Scharenborg, Odette

    2014-03-01

    This study investigates the extent to which age-related language processing difficulties are due to a decline in sensory processes or to a deterioration of cognitive factors, specifically, attentional control. Two facets of attentional control were examined: inhibition of irrelevant information and divided attention. Younger and older adults were asked to categorize the initial phoneme of spoken syllables ("Was it m or n?"), trying to ignore the lexical status of the syllables. The phonemes were manipulated to range in eight steps from m to n. Participants also did a discrimination task on syllable pairs ("Were the initial sounds the same or different?"). Categorization and discrimination were performed under either divided attention (concurrent visual-search task) or focused attention (no visual task). The results showed that even when the younger and older adults were matched on their discrimination scores: (1) the older adults had more difficulty inhibiting lexical knowledge than did younger adults, (2) divided attention weakened lexical inhibition in both younger and older adults, and (3) divided attention impaired sound discrimination more in older than younger listeners. The results confirm the independent and combined contribution of sensory decline and deficit in attentional control to language processing difficulties associated with aging. The relative weight of these variables and their mechanisms of action are discussed in the context of theories of aging and language. (c) 2014 APA, all rights reserved.

  20. Gender Identification of the Speaker Using VQ Method

    Directory of Open Access Journals (Sweden)

    Vasif V. Nabiyev

    2009-11-01

    Full Text Available Speaking is the easiest and most natural form of communication between people. Intensive studies have been made in order to provide this communication between people via computers. Systems using voice biometric technology are attracting attention, especially with regard to cost and usability. Compared with other biometric systems, the application is much more practical. For example, by using a microphone placed in the environment, a voice recording can be obtained even without notifying the user, and the system can be applied. Moreover, remote access is another advantage of voice biometry. In this study, the aim is to automatically determine the gender of the speaker from the speech waves, which carry personal information. If the speaker's gender can be determined when composing models according to gender information, the success of voice recognition systems can be increased to a significant degree. Generally, all speaker recognition systems are composed of two parts: feature extraction and matching. Feature extraction is the procedure in which a minimal set of features representing the speech and the speaker is derived from the voice signal. There are different features used in voice applications, such as LPC, MFCC, and PLP. In this study, MFCC is used as the feature vector. Feature matching is the procedure in which the features derived from unknown speakers and a known speaker group are compared. According to the text used in the comparison, systems are divided into two types: text dependent and text independent. While the same text is used in text-dependent systems, different texts are used in text-independent systems. Nowadays, DTW and HMM are text-dependent matching methods, while VQ and GMM are text-independent ones. In this study, due to its high success ratio and simple application, the VQ approach is used. In this study, a text-independent system which determines the speaker gender automatically is proposed.
The proposed
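The VQ matching stage described in the abstract can be sketched as follows. This is a hypothetical illustration, not the authors' implementation: random Gaussian vectors stand in for MFCC frames, plain k-means stands in for LBG codebook training, and classification picks the gender codebook with the lowest average quantization distortion.

```python
import numpy as np

def train_codebook(features, k, iters=20, seed=0):
    """Naive k-means codebook (stand-in for the LBG algorithm)."""
    rng = np.random.default_rng(seed)
    codebook = features[rng.choice(len(features), k, replace=False)]
    for _ in range(iters):
        # assign each frame to its nearest codeword
        d = np.linalg.norm(features[:, None, :] - codebook[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                codebook[j] = features[labels == j].mean(axis=0)
    return codebook

def distortion(features, codebook):
    """Average distance from each frame to its nearest codeword."""
    d = np.linalg.norm(features[:, None, :] - codebook[None, :, :], axis=2)
    return d.min(axis=1).mean()

def classify(features, codebooks):
    # the codebook that quantizes the frames most cheaply wins
    return min(codebooks, key=lambda label: distortion(features, codebooks[label]))

# Synthetic 12-dimensional "MFCC" clouds for two speaker groups (invented data):
rng = np.random.default_rng(1)
male = rng.normal(0.0, 1.0, (200, 12))
female = rng.normal(3.0, 1.0, (200, 12))
books = {"male": train_codebook(male, 8), "female": train_codebook(female, 8)}
```

In a real system the feature matrices would come from MFCC extraction on speech frames, and each gender codebook would be trained on pooled frames from many speakers of that gender.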

  1. Detection of target phonemes in spontaneous and read speech

    OpenAIRE

    Mehta, G.; Cutler, A.

    1988-01-01

    Although spontaneous speech occurs more frequently in most listeners’ experience than read speech, laboratory studies of human speech recognition typically use carefully controlled materials read from a script. The phonological and prosodic characteristics of spontaneous and read speech differ considerably, however, which suggests that laboratory results may not generalize to the recognition of spontaneous and read speech materials, and their response time to detect word-initial target phonem...

  2. Complimenting Functions by Native English Speakers and Iranian EFL Learners: A Divergence or Convergence

    Directory of Open Access Journals (Sweden)

    Ali Akbar Ansarin

    2016-01-01

    Full Text Available The compliment speech act has been investigated on many occasions in recent years. In this study, an attempt is made to explore appraisals performed by native English speakers and Iranian EFL learners to find out how these two groups diverge or converge with regard to complimenting patterns and norms. The participants of the study were 60 advanced Iranian EFL learners who spoke Persian as their first language and 60 native English speakers. Through a written Discourse Completion Task comprising eight different scenarios, compliments were analyzed with regard to topics (performance, personality, possession, and skill), functions (explicit, implicit, and opt-out), gender differences, and the common positive adjectives used by the two groups of native and nonnative participants. The findings suggested that native English speakers praised individuals more implicitly than Iranian EFL learners did, and that native speakers provided opt-outs more frequently than Iranian EFL learners did. Chi-square analysis showed that gender and macro functions are independent of each other in Iranian EFL learners' compliments, while for native speakers, gender played a significant role in the distribution of appraisals. Iranian EFL learners' complimenting patterns converge more towards those of native English speakers. Moreover, both groups favored explicit compliments; however, Iranian EFL learners were more inclined to provide explicit compliments. It can be concluded that there were more similarities than differences between Iranian EFL learners and native English speakers regarding the compliment speech act. The results of this study can benefit researchers, teachers, material developers, and EFL learners.

  3. Speech Rate Normalization and Phonemic Boundary Perception in Cochlear-Implant Users

    Science.gov (United States)

    Jaekel, Brittany N.; Newman, Rochelle S.; Goupell, Matthew J.

    2017-01-01

    Purpose: Normal-hearing (NH) listeners rate normalize, temporarily remapping phonemic category boundaries to account for a talker's speech rate. It is unknown if adults who use auditory prostheses called cochlear implants (CI) can rate normalize, as CIs transmit degraded speech signals to the auditory nerve. Ineffective adjustment to rate…

  4. Audiovisual alignment of co-speech gestures to speech supports word learning in 2-year-olds.

    Science.gov (United States)

    Jesse, Alexandra; Johnson, Elizabeth K

    2016-05-01

    Analyses of caregiver-child communication suggest that an adult tends to highlight objects in a child's visual scene by moving them in a manner that is temporally aligned with the adult's speech productions. Here, we used the looking-while-listening paradigm to examine whether 25-month-olds use audiovisual temporal alignment to disambiguate and learn novel word-referent mappings in a difficult word-learning task. Videos of two equally interesting and animated novel objects were simultaneously presented to children, but the movement of only one of the objects was aligned with an accompanying object-labeling audio track. No social cues (e.g., pointing, eye gaze, touch) were available to the children because the speaker was edited out of the videos. Immediately afterward, toddlers were presented with still images of the two objects and asked to look at one or the other. Toddlers looked reliably longer to the labeled object, demonstrating their acquisition of the novel word-referent mapping. A control condition showed that children's performance was not solely due to the single unambiguous labeling that had occurred at experiment onset. We conclude that the temporal link between a speaker's utterances and the motion they imposed on the referent object helps toddlers to deduce a speaker's intended reference in a difficult word-learning scenario. In combination with our previous work, these findings suggest that intersensory redundancy is a source of information used by language users of all ages. That is, intersensory redundancy is not just a word-learning tool used by young infants. Copyright © 2015 Elsevier Inc. All rights reserved.

  5. The Influence of Spanish Vocabulary and Phonemic Awareness on Beginning English Reading Development: A Three-Year (K-2nd) Longitudinal Study

    Science.gov (United States)

    Kelley, Michael F.; Roe, Mary; Blanchard, Jay; Atwill, Kim

    2015-01-01

    This investigation examined the influence of varying levels of Spanish receptive vocabulary and phonemic awareness ability on beginning English vocabulary, phonemic awareness, word reading fluency, and reading comprehension development across kindergarten through second grade. The 80 respondents were Spanish speaking children with no English…

  6. Multimodal Speaker Diarization.

    Science.gov (United States)

    Noulas, A; Englebienne, G; Krose, B J A

    2012-01-01

    We present a novel probabilistic framework that fuses information from the audio and video modalities to perform speaker diarization. The proposed framework is a Dynamic Bayesian Network (DBN) that extends a factorial Hidden Markov Model (fHMM) and models the people appearing in an audiovisual recording as multimodal entities that generate observations in the audio stream, the video stream, and the joint audiovisual space. The framework is very robust to different contexts, makes no assumptions about the location of the recording equipment, and does not require labeled training data, as it acquires the model parameters using the Expectation Maximization (EM) algorithm. We apply the proposed model to two meeting videos and a news broadcast video, all of which come from publicly available data sets. The speaker diarization results favor the proposed multimodal framework, which outperforms single-modality analysis and improves over state-of-the-art audio-based speaker diarization.

  7. Phonemic Awareness and the Teaching of Reading. A Position Statement from the Board of Directors of the International Reading Association.

    Science.gov (United States)

    International Reading Association, Newark, DE.

    This position paper considers the complex relation between phonemic awareness and reading. The paper seeks to define phonemic awareness (although there is no single definition), stating that it is typically described as an insight about oral language and in particular about the segmentation of sounds that are used in speech communication. It also…

  8. Alveolar and Velarized Laterals in Albanian and in the Viennese Dialect.

    Science.gov (United States)

    Moosmüller, Sylvia; Schmid, Carolin; Kasess, Christian H

    2016-12-01

    A comparison of alveolar and velarized lateral realizations in two language varieties, Albanian and the Viennese dialect, has been performed. Albanian distinguishes the two laterals phonemically, whereas in the Viennese dialect, the velarized lateral was introduced by language contact with Czech immigrants. A categorical distinction between the two lateral phonemes is fully maintained in Albanian. Results are not as straightforward in the Viennese dialect. Most prominently, female speakers, if at all, realize the velarized lateral in word-final position, thus indicating the application of a phonetically motivated process. The realization of the velarized lateral by male speakers, on the other hand, indicates that the velarized lateral replaced the former alveolar lateral phoneme. Alveolar laterals are either realized in perceptually salient positions, thus governed by an input-switch rule, or in front vowel contexts, thus subject to coarticulatory influences. Our results illustrate the subtle interplay of phonology, phonetics and sociolinguistics.

  9. A Nonverbal Phoneme Deletion Task Administered in a Dynamic Assessment Format

    Science.gov (United States)

    Gillam, Sandra Laing; Fargo, Jamison; Foley, Beth; Olszewski, Abbie

    2011-01-01

    Purpose: The purpose of the project was to design a nonverbal dynamic assessment of phoneme deletion that may prove useful with individuals who demonstrate complex communication needs (CCN) and are unable to communicate using natural speech or who present with moderate-severe speech impairments. Method: A nonverbal dynamic assessment of phoneme…

  10. English Speakers Attend More Strongly than Spanish Speakers to Manner of Motion when Classifying Novel Objects and Events

    Science.gov (United States)

    Kersten, Alan W.; Meissner, Christian A.; Lechuga, Julia; Schwartz, Bennett L.; Albrechtsen, Justin S.; Iglesias, Adam

    2010-01-01

    Three experiments provide evidence that the conceptualization of moving objects and events is influenced by one's native language, consistent with linguistic relativity theory. Monolingual English speakers and bilingual Spanish/English speakers tested in an English-speaking context performed better than monolingual Spanish speakers and bilingual…

  11. MUON DETECTORS: ALIGNMENT

    CERN Multimedia

    G. Gomez

    Since December, the muon alignment community has focused on analyzing the data recorded so far in order to produce new DT and CSC Alignment Records for the second reprocessing of CRAFT data. Two independent algorithms were developed which align the DT chambers using global tracks, thus providing, for the first time, a relative alignment of the barrel with respect to the tracker. These results are an important ingredient for the second CRAFT reprocessing and allow, for example, a more detailed study of any possible mis-modelling of the magnetic field in the muon spectrometer. Both algorithms are constructed in such a way that the resulting alignment constants are not affected, to first order, by any such mis-modelling. The CSC chambers have not yet been included in this global track-based alignment due to a lack of statistics, since only a few cosmics go through the tracker and the CSCs. A strategy exists to align the CSCs using the barrel as a reference until collision tracks become available. Aligning the ...

  12. A Prerequisite to L1 Homophone Effects in L2 Spoken-Word Recognition

    Science.gov (United States)

    Nakai, Satsuki; Lindsay, Shane; Ota, Mitsuhiko

    2015-01-01

    When both members of a phonemic contrast in L2 (second language) are perceptually mapped to a single phoneme in one's L1 (first language), L2 words containing a member of that contrast can spuriously activate L2 words in spoken-word recognition. For example, upon hearing cattle, Dutch speakers of English are reported to experience activation…

  13. Utterance Verification for Text-Dependent Speaker Recognition

    DEFF Research Database (Denmark)

    Kinnunen, Tomi; Sahidullah, Md; Kukanov, Ivan

    2016-01-01

    Text-dependent automatic speaker verification naturally calls for the simultaneous verification of speaker identity and spoken content. These two tasks can be achieved with automatic speaker verification (ASV) and utterance verification (UV) technologies. While both have been addressed previously...

  14. Modulating phonemic fluency performance in healthy subjects with transcranial magnetic stimulation over the left or right lateral frontal cortex.

    Science.gov (United States)

    Smirni, Daniela; Turriziani, Patrizia; Mangano, Giuseppa Renata; Bracco, Martina; Oliveri, Massimiliano; Cipolotti, Lisa

    2017-07-28

    A growing body of evidence has suggested that non-invasive brain stimulation techniques, such as transcranial magnetic stimulation (TMS) and transcranial direct current stimulation (tDCS), can improve the performance of aphasic patients in language tasks. For example, application of inhibitory rTMS or tDCS over the right frontal lobe of dysphasic patients resulted in improved naming abilities. Several studies have also reported that in healthy controls (HC), tDCS application over the left prefrontal cortex (PFC) improves performance in naming and semantic fluency tasks. The aim of this study was to investigate in HC, for the first time, the effects of inhibitory repetitive TMS (rTMS) over the left and right lateral frontal cortex (BA 47) on two phonemic fluency tasks (FAS or FPL). 44 right-handed HCs were administered rTMS or sham over the left or right lateral frontal cortex in two separate testing sessions, with a 24-h interval, followed by the two phonemic fluency tasks. To account for possible practice effects, an additional 22 HCs were tested only on the phonemic fluency task across two sessions with no stimulation. We found that rTMS inhibition over the left lateral frontal cortex significantly worsened phonemic fluency performance when compared to sham. In contrast, rTMS inhibition over the right lateral frontal cortex significantly improved phonemic fluency performance when compared to sham. These results were not accounted for by practice effects. We speculated that rTMS over the right lateral frontal cortex may induce plastic neural changes in the left lateral frontal cortex by suppressing interhemispheric inhibitory interactions. This resulted in an increased excitability (disinhibition) of the contralateral unstimulated left lateral frontal cortex, consequently enhancing phonemic fluency performance. 
Conversely, application of rTMS over the left lateral frontal cortex may induce a temporary, virtual lesion, with effects similar to those reported in left frontal

  15. First Language Grapheme-Phoneme Transparency Effects in Adult Second Language Learning

    Science.gov (United States)

    Ijalba, Elizabeth; Obler, Loraine K.

    2015-01-01

    The Spanish writing system has consistent grapheme-to-phoneme correspondences (GPC), rendering it more transparent than English. We compared first-language (L1) orthographic transparency on how monolingual English- and Spanish-readers learned a novel writing system with a 1:1 (LT) and a 1:2 (LO) GPC. Our dependent variables were learning time,…

  16. The Speaker Gender Gap at Critical Care Conferences.

    Science.gov (United States)

    Mehta, Sangeeta; Rose, Louise; Cook, Deborah; Herridge, Margaret; Owais, Sawayra; Metaxa, Victoria

    2018-06-01

    To review women's participation as faculty at five critical care conferences over 7 years. Retrospective analysis of five scientific programs to identify the proportion of female speakers and each speaker's profession, based on conference conveners, program documents, or internet research. Three international (European Society of Intensive Care Medicine, International Symposium on Intensive Care and Emergency Medicine, Society of Critical Care Medicine) and two national (Critical Care Canada Forum, U.K. Intensive Care Society State of the Art Meeting) annual critical care conferences held between 2010 and 2016. Female faculty speakers. None. Male speakers outnumbered female speakers at all five conferences, in all 7 years. Overall, women represented 5-31% of speakers, and female physicians represented 5-26% of speakers. Nursing and allied health professional faculty represented 0-25% of speakers; in general, more than 50% of allied health professionals were women. Over the 7 years, the Society of Critical Care Medicine had the highest representation of female (27% overall) and nursing/allied health professional (16-25%) speakers; notably, male physicians substantially outnumbered female physicians in all years (62-70% vs 10-19%, respectively). Women's representation on conference program committees ranged from 0% to 40%, with the Society of Critical Care Medicine having the highest representation of women (26-40%). The proportions of female speakers, physician speakers, and program committee members increased significantly over time at the Society of Critical Care Medicine and U.K. Intensive Care Society State of the Art Meeting conferences. There is a speaker gender gap at critical care conferences, with male faculty outnumbering female faculty. This gap is more marked among physician speakers than among speakers representing nursing and allied health professionals. Several organizational strategies can address this gender gap.

  17. Neural speech recognition: continuous phoneme decoding using spatiotemporal representations of human cortical activity

    Science.gov (United States)

    Moses, David A.; Mesgarani, Nima; Leonard, Matthew K.; Chang, Edward F.

    2016-10-01

    Objective. The superior temporal gyrus (STG) and neighboring brain regions play a key role in human language processing. Previous studies have attempted to reconstruct speech information from brain activity in the STG, but few of them incorporate the probabilistic framework and engineering methodology used in modern speech recognition systems. In this work, we describe the initial efforts toward the design of a neural speech recognition (NSR) system that performs continuous phoneme recognition on English stimuli with arbitrary vocabulary sizes using the high gamma band power of local field potentials in the STG and neighboring cortical areas obtained via electrocorticography. Approach. The system implements a Viterbi decoder that incorporates phoneme likelihood estimates from a linear discriminant analysis model and transition probabilities from an n-gram phonemic language model. Grid searches were used in an attempt to determine optimal parameterizations of the feature vectors and Viterbi decoder. Main results. The performance of the system was significantly improved by using spatiotemporal representations of the neural activity (as opposed to purely spatial representations) and by including language modeling and Viterbi decoding in the NSR system. Significance. These results emphasize the importance of modeling the temporal dynamics of neural responses when analyzing their variations with respect to varying stimuli and demonstrate that speech recognition techniques can be successfully leveraged when decoding speech from neural signals. Guided by the results detailed in this work, further development of the NSR system could have applications in the fields of automatic speech recognition and neural prosthetics.
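The decoding step described above (a Viterbi search combining per-frame phoneme likelihoods with n-gram transition probabilities) can be sketched in a few lines. Everything here is illustrative: the toy three-phoneme inventory and all probabilities are invented, and the emission matrix stands in for the LDA model's per-frame likelihood estimates.

```python
import numpy as np

def viterbi(log_emit, log_trans, log_init):
    """log_emit: (T, N) frame log-likelihoods; log_trans: (N, N); log_init: (N,)."""
    T, N = log_emit.shape
    delta = log_init + log_emit[0]          # best score ending in each state
    back = np.zeros((T, N), dtype=int)      # backpointers for path recovery
    for t in range(1, T):
        scores = delta[:, None] + log_trans  # (prev_state, cur_state)
        back[t] = scores.argmax(axis=0)
        delta = scores.max(axis=0) + log_emit[t]
    # trace the best path backwards from the best final state
    path = [int(delta.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]

phones = ["b", "ae", "t"]                   # toy inventory
log_init = np.log([0.8, 0.1, 0.1])          # invented unigram priors
log_trans = np.log([[0.2, 0.7, 0.1],        # bigram LM: b -> ae likely
                    [0.1, 0.3, 0.6],        # ae -> t likely
                    [0.1, 0.1, 0.8]])
log_emit = np.log([[0.9, 0.05, 0.05],       # stand-in for LDA likelihoods
                   [0.1, 0.8, 0.1],
                   [0.1, 0.2, 0.7]])
decoded = [phones[i] for i in viterbi(log_emit, log_trans, log_init)]
# decoded -> ["b", "ae", "t"]
```

Working in log space keeps the products of many small probabilities numerically stable, which matters once T grows beyond a few frames.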

  18. Aligning the CMS Muon Chambers with the Muon Alignment System during an Extended Cosmic Ray Run

    CERN Document Server

    Chatrchyan, S; Sirunyan, A M; Adam, W; Arnold, B; Bergauer, H; Bergauer, T; Dragicevic, M; Eichberger, M; Erö, J; Friedl, M; Frühwirth, R; Ghete, V M; Hammer, J; Hänsel, S; Hoch, M; Hörmann, N; Hrubec, J; Jeitler, M; Kasieczka, G; Kastner, K; Krammer, M; Liko, D; Magrans de Abril, I; Mikulec, I; Mittermayr, F; Neuherz, B; Oberegger, M; Padrta, M; Pernicka, M; Rohringer, H; Schmid, S; Schöfbeck, R; Schreiner, T; Stark, R; Steininger, H; Strauss, J; Taurok, A; Teischinger, F; Themel, T; Uhl, D; Wagner, P; Waltenberger, W; Walzel, G; Widl, E; Wulz, C E; Chekhovsky, V; Dvornikov, O; Emeliantchik, I; Litomin, A; Makarenko, V; Marfin, I; Mossolov, V; Shumeiko, N; Solin, A; Stefanovitch, R; Suarez Gonzalez, J; Tikhonov, A; Fedorov, A; Karneyeu, A; Korzhik, M; Panov, V; Zuyeuski, R; Kuchinsky, P; Beaumont, W; Benucci, L; Cardaci, M; De Wolf, E A; Delmeire, E; Druzhkin, D; Hashemi, M; Janssen, X; Maes, T; Mucibello, L; Ochesanu, S; Rougny, R; Selvaggi, M; Van Haevermaet, H; Van Mechelen, P; Van Remortel, N; Adler, V; Beauceron, S; Blyweert, S; D'Hondt, J; De Weirdt, S; Devroede, O; Heyninck, J; Kalogeropoulos, A; Maes, J; Maes, M; Mozer, M U; Tavernier, S; Van Doninck, W; Van Mulders, P; Villella, I; Bouhali, O; Chabert, E C; Charaf, O; Clerbaux, B; De Lentdecker, G; Dero, V; Elgammal, S; Gay, A P R; Hammad, G H; Marage, P E; Rugovac, S; Vander Velde, C; Vanlaer, P; Wickens, J; Grunewald, M; Klein, B; Marinov, A; Ryckbosch, D; Thyssen, F; Tytgat, M; Vanelderen, L; Verwilligen, P; Basegmez, S; Bruno, G; Caudron, J; Delaere, C; Demin, P; Favart, D; Giammanco, A; Grégoire, G; Lemaitre, V; Militaru, O; Ovyn, S; Piotrzkowski, K; Quertenmont, L; Schul, N; Beliy, N; Daubie, E; Alves, G A; Pol, M E; Souza, M H G; Carvalho, W; De Jesus Damiao, D; De Oliveira Martins, C; Fonseca De Souza, S; Mundim, L; Oguri, V; Santoro, A; Silva Do Amaral, S M; Sznajder, A; Fernandez Perez Tomei, T R; Ferreira Dias, M A; Gregores, E M; Novaes, S F; Abadjiev, K; Anguelov, T; Damgov, J; Darmenov, 
N; Dimitrov, L; Genchev, V; Iaydjiev, P; Piperov, S; Stoykova, S; Sultanov, G; Trayanov, R; Vankov, I; Dimitrov, A; Dyulendarova, M; Kozhuharov, V; Litov, L; Marinova, E; Mateev, M; Pavlov, B; Petkov, P; Toteva, Z; Chen, G M; Chen, H S; Guan, W; Jiang, C H; Liang, D; Liu, B; Meng, X; Tao, J; Wang, J; Wang, Z; Xue, Z; Zhang, Z; Ban, Y; Cai, J; Ge, Y; Guo, S; Hu, Z; Mao, Y; Qian, S J; Teng, H; Zhu, B; Avila, C; Baquero Ruiz, M; Carrillo Montoya, C A; Gomez, A; Gomez Moreno, B; Ocampo Rios, A A; Osorio Oliveros, A F; Reyes Romero, D; Sanabria, J C; Godinovic, N; Lelas, K; Plestina, R; Polic, D; Puljak, I; Antunovic, Z; Dzelalija, M; Brigljevic, V; Duric, S; Kadija, K; Morovic, S; Fereos, R; Galanti, M; Mousa, J; Papadakis, A; Ptochos, F; Razis, P A; Tsiakkouri, D; Zinonos, Z; Hektor, A; Kadastik, M; Kannike, K; Müntel, M; Raidal, M; Rebane, L; Anttila, E; Czellar, S; Härkönen, J; Heikkinen, A; Karimäki, V; Kinnunen, R; Klem, J; Kortelainen, M J; Lampén, T; Lassila-Perini, K; Lehti, S; Lindén, T; Luukka, P; Mäenpää, T; Nysten, J; Tuominen, E; Tuominiemi, J; Ungaro, D; Wendland, L; Banzuzi, K; Korpela, A; Tuuva, T; Nedelec, P; Sillou, D; Besancon, M; Chipaux, R; Dejardin, M; Denegri, D; Descamps, J; Fabbro, B; Faure, J L; Ferri, F; Ganjour, S; Gentit, F X; Givernaud, A; Gras, P; Hamel de Monchenault, G; Jarry, P; Lemaire, M C; Locci, E; Malcles, J; Marionneau, M; Millischer, L; Rander, J; Rosowsky, A; Rousseau, D; Titov, M; Verrecchia, P; Baffioni, S; Bianchini, L; Bluj, M; Busson, P; Charlot, C; Dobrzynski, L; Granier de Cassagnac, R; Haguenauer, M; Miné, P; Paganini, P; Sirois, Y; Thiebaux, C; Zabi, A; Agram, J L; Besson, A; Bloch, D; Bodin, D; Brom, J M; Conte, E; Drouhin, F; Fontaine, J C; Gelé, D; Goerlach, U; Gross, L; Juillot, P; Le Bihan, A C; Patois, Y; Speck, J; Van Hove, P; Baty, C; Bedjidian, M; Blaha, J; Boudoul, G; Brun, H; Chanon, N; Chierici, R; Contardo, D; Depasse, P; Dupasquier, T; El Mamouni, H; Fassi, F; Fay, J; Gascon, S; Ille, B; Kurca, T; Le 
Grand, T; Lethuillier, M; Lumb, N; Mirabito, L; Perries, S; Vander Donckt, M; Verdier, P; Djaoshvili, N; Roinishvili, N; Roinishvili, V; Amaglobeli, N; Adolphi, R; Anagnostou, G; Brauer, R; Braunschweig, W; Edelhoff, M; Esser, H; Feld, L; Karpinski, W; Khomich, A; Klein, K; Mohr, N; Ostaptchouk, A; Pandoulas, D; Pierschel, G; Raupach, F; Schael, S; Schultz von Dratzig, A; Schwering, G; Sprenger, D; Thomas, M; Weber, M; Wittmer, B; Wlochal, M; Actis, O; Altenhöfer, G; Bender, W; Biallass, P; Erdmann, M; Fetchenhauer, G; Frangenheim, J; Hebbeker, T; Hilgers, G; Hinzmann, A; Hoepfner, K; Hof, C; Kirsch, M; Klimkovich, T; Kreuzer, P; Lanske, D; Merschmeyer, M; Meyer, A; Philipps, B; Pieta, H; Reithler, H; Schmitz, S A; Sonnenschein, L; Sowa, M; Steggemann, J; Szczesny, H; Teyssier, D; Zeidler, C; Bontenackels, M; Davids, M; Duda, M; Flügge, G; Geenen, H; Giffels, M; Haj Ahmad, W; Hermanns, T; Heydhausen, D; Kalinin, S; Kress, T; Linn, A; Nowack, A; Perchalla, L; Poettgens, M; Pooth, O; Sauerland, P; Stahl, A; Tornier, D; Zoeller, M H; Aldaya Martin, M; Behrens, U; Borras, K; Campbell, A; Castro, E; Dammann, D; Eckerlin, G; Flossdorf, A; Flucke, G; Geiser, A; Hatton, D; Hauk, J; Jung, H; Kasemann, M; Katkov, I; Kleinwort, C; Kluge, H; Knutsson, A; Kuznetsova, E; Lange, W; Lohmann, W; Mankel, R; Marienfeld, M; Meyer, A B; Miglioranzi, S; Mnich, J; Ohlerich, M; Olzem, J; Parenti, A; Rosemann, C; Schmidt, R; Schoerner-Sadenius, T; Volyanskyy, D; Wissing, C; Zeuner, W D; Autermann, C; Bechtel, F; Draeger, J; Eckstein, D; Gebbert, U; Kaschube, K; Kaussen, G; Klanner, R; Mura, B; Naumann-Emme, S; Nowak, F; Pein, U; Sander, C; Schleper, P; Schum, T; Stadie, H; Steinbrück, G; Thomsen, J; Wolf, R; Bauer, J; Blüm, P; Buege, V; Cakir, A; Chwalek, T; De Boer, W; Dierlamm, A; Dirkes, G; Feindt, M; Felzmann, U; Frey, M; Furgeri, A; Gruschke, J; Hackstein, C; Hartmann, F; Heier, S; Heinrich, M; Held, H; Hirschbuehl, D; Hoffmann, K H; Honc, S; Jung, C; Kuhr, T; Liamsuwan, T; Martschei, 
D; Mueller, S; Müller, Th; Neuland, M B; Niegel, M; Oberst, O; Oehler, A; Ott, J; Peiffer, T; Piparo, D; Quast, G; Rabbertz, K; Ratnikov, F; Ratnikova, N; Renz, M; Saout, C; Sartisohn, G; Scheurer, A; Schieferdecker, P; Schilling, F P; Schott, G; Simonis, H J; Stober, F M; Sturm, P; Troendle, D; Trunov, A; Wagner, W; Wagner-Kuhr, J; Zeise, M; Zhukov, V; Ziebarth, E B; Daskalakis, G; Geralis, T; Karafasoulis, K; Kyriakis, A; Loukas, D; Markou, A; Markou, C; Mavrommatis, C; Petrakou, E; Zachariadou, A; Gouskos, L; Katsas, P; Panagiotou, A; Evangelou, I; Kokkas, P; Manthos, N; Papadopoulos, I; Patras, V; Triantis, F A; Bencze, G; Boldizsar, L; Debreczeni, G; Hajdu, C; Hernath, S; Hidas, P; Horvath, D; Krajczar, K; Laszlo, A; Patay, G; Sikler, F; Toth, N; Vesztergombi, G; Beni, N; Christian, G; Imrek, J; Molnar, J; Novak, D; Palinkas, J; Szekely, G; Szillasi, Z; Tokesi, K; Veszpremi, V; Kapusi, A; Marian, G; Raics, P; Szabo, Z; Trocsanyi, Z L; Ujvari, B; Zilizi, G; Bansal, S; Bawa, H S; Beri, S B; Bhatnagar, V; Jindal, M; Kaur, M; Kaur, R; Kohli, J M; Mehta, M Z; Nishu, N; Saini, L K; Sharma, A; Singh, A; Singh, J B; Singh, S P; Ahuja, S; Arora, S; Bhattacharya, S; Chauhan, S; Choudhary, B C; Gupta, P; Jain, S; Jha, M; Kumar, A; Ranjan, K; Shivpuri, R K; Srivastava, A K; Choudhury, R K; Dutta, D; Kailas, S; Kataria, S K; Mohanty, A K; Pant, L M; Shukla, P; Topkar, A; Aziz, T; Guchait, M; Gurtu, A; Maity, M; Majumder, D; Majumder, G; Mazumdar, K; Nayak, A; Saha, A; Sudhakar, K; Banerjee, S; Dugad, S; Mondal, N K; Arfaei, H; Bakhshiansohi, H; Fahim, A; Jafari, A; Mohammadi Najafabadi, M; Moshaii, A; Paktinat Mehdiabadi, S; Rouhani, S; Safarzadeh, B; Zeinali, M; Felcini, M; Abbrescia, M; Barbone, L; Chiumarulo, F; Clemente, A; Colaleo, A; Creanza, D; Cuscela, G; De Filippis, N; De Palma, M; De Robertis, G; Donvito, G; Fedele, F; Fiore, L; Franco, M; Iaselli, G; Lacalamita, N; Loddo, F; Lusito, L; Maggi, G; Maggi, M; Manna, N; Marangelli, B; My, S; Natali, S; Nuzzo, S; 
Papagni, G; Piccolomo, S; Pierro, G A; Pinto, C; Pompili, A; Pugliese, G; Rajan, R; Ranieri, A; Romano, F; Roselli, G; Selvaggi, G; Shinde, Y; Silvestris, L; Tupputi, S; Zito, G; Abbiendi, G; Bacchi, W; Benvenuti, A C; Boldini, M; Bonacorsi, D; Braibant-Giacomelli, S; Cafaro, V D; Caiazza, S S; Capiluppi, P; Castro, A; Cavallo, F R; Codispoti, G; Cuffiani, M; D'Antone, I; Dallavalle, G M; Fabbri, F; Fanfani, A; Fasanella, D; Giacomelli, P; Giordano, V; Giunta, M; Grandi, C; Guerzoni, M; Marcellini, S; Masetti, G; Montanari, A; Navarria, F L; Odorici, F; Pellegrini, G; Perrotta, A; Rossi, A M; Rovelli, T; Siroli, G; Torromeo, G; Travaglini, R; Albergo, S; Costa, S; Potenza, R; Tricomi, A; Tuve, C; Barbagli, G; Broccolo, G; Ciulli, V; Civinini, C; D'Alessandro, R; Focardi, E; Frosali, S; Gallo, E; Genta, C; Landi, G; Lenzi, P; Meschini, M; Paoletti, S; Sguazzoni, G; Tropiano, A; Benussi, L; Bertani, M; Bianco, S; Colafranceschi, S; Colonna, D; Fabbri, F; Giardoni, M; Passamonti, L; Piccolo, D; Pierluigi, D; Ponzio, B; Russo, A; Fabbricatore, P; Musenich, R; Benaglia, A; Calloni, M; Cerati, G B; D'Angelo, P; De Guio, F; Farina, F M; Ghezzi, A; Govoni, P; Malberti, M; Malvezzi, S; Martelli, A; Menasce, D; Miccio, V; Moroni, L; Negri, P; Paganoni, M; Pedrini, D; Pullia, A; Ragazzi, S; Redaelli, N; Sala, S; Salerno, R; Tabarelli de Fatis, T; Tancini, V; Taroni, S; Buontempo, S; Cavallo, N; Cimmino, A; De Gruttola, M; Fabozzi, F; Iorio, A O M; Lista, L; Lomidze, D; Noli, P; Paolucci, P; Sciacca, C; Azzi, P; Bacchetta, N; Barcellan, L; Bellan, P; Bellato, M; Benettoni, M; Biasotto, M; Bisello, D; Borsato, E; Branca, A; Carlin, R; Castellani, L; Checchia, P; Conti, E; Dal Corso, F; De Mattia, M; Dorigo, T; Dosselli, U; Fanzago, F; Gasparini, F; Gasparini, U; Giubilato, P; Gonella, F; Gresele, A; Gulmini, M; Kaminskiy, A; Lacaprara, S; Lazzizzera, I; Margoni, M; Maron, G; Mattiazzo, S; Mazzucato, M; Meneghelli, M; Meneguzzo, A T; Michelotto, M; Montecassiano, F; Nespolo, M; 
Passaseo, M; Pegoraro, M; Perrozzi, L; Pozzobon, N; Ronchese, P; Simonetto, F; Toniolo, N; Torassa, E; Tosi, M; Triossi, A; Vanini, S; Ventura, S; Zotto, P; Zumerle, G; Baesso, P; Berzano, U; Bricola, S; Necchi, M M; Pagano, D; Ratti, S P; Riccardi, C; Torre, P; Vicini, A; Vitulo, P; Viviani, C; Aisa, D; Aisa, S; Babucci, E; Biasini, M; Bilei, G M; Caponeri, B; Checcucci, B; Dinu, N; Fanò, L; Farnesini, L; Lariccia, P; Lucaroni, A; Mantovani, G; Nappi, A; Piluso, A; Postolache, V; Santocchia, A; Servoli, L; Tonoiu, D; Vedaee, A; Volpe, R; Azzurri, P; Bagliesi, G; Bernardini, J; Berretta, L; Boccali, T; Bocci, A; Borrello, L; Bosi, F; Calzolari, F; Castaldi, R; Dell'Orso, R; Fiori, F; Foà, L; Gennai, S; Giassi, A; Kraan, A; Ligabue, F; Lomtadze, T; Mariani, F; Martini, L; Massa, M; Messineo, A; Moggi, A; Palla, F; Palmonari, F; Petragnani, G; Petrucciani, G; Raffaelli, F; Sarkar, S; Segneri, G; Serban, A T; Spagnolo, P; Tenchini, R; Tolaini, S; Tonelli, G; Venturi, A; Verdini, P G; Baccaro, S; Barone, L; Bartoloni, A; Cavallari, F; Dafinei, I; Del Re, D; Di Marco, E; Diemoz, M; Franci, D; Longo, E; Organtini, G; Palma, A; Pandolfi, F; Paramatti, R; Pellegrino, F; Rahatlou, S; Rovelli, C; Alampi, G; Amapane, N; Arcidiacono, R; Argiro, S; Arneodo, M; Biino, C; Borgia, M A; Botta, C; Cartiglia, N; Castello, R; Cerminara, G; Costa, M; Dattola, D; Dellacasa, G; Demaria, N; Dughera, G; Dumitrache, F; Graziano, A; Mariotti, C; Marone, M; Maselli, S; Migliore, E; Mila, G; Monaco, V; Musich, M; Nervo, M; Obertino, M M; Oggero, S; Panero, R; Pastrone, N; Pelliccioni, M; Romero, A; Ruspa, M; Sacchi, R; Solano, A; Staiano, A; Trapani, P P; Trocino, D; Vilela Pereira, A; Visca, L; Zampieri, A; Ambroglini, F; Belforte, S; Cossutti, F; Della Ricca, G; Gobbo, B; Penzo, A; Chang, S; Chung, J; Kim, D H; Kim, G N; Kong, D J; Park, H; Son, D C; Bahk, S Y; Song, S; Jung, S Y; Hong, B; Kim, H; Kim, J H; Lee, K S; Moon, D H; Park, S K; Rhee, H B; Sim, K S; Kim, J; Choi, M; Hahn, G; Park, 
I C; Choi, S; Choi, Y; Goh, J; Jeong, H; Kim, T J; Lee, J; Lee, S; Janulis, M; Martisiute, D; Petrov, P; Sabonis, T; Castilla Valdez, H; Sánchez Hernández, A; Carrillo Moreno, S; Morelos Pineda, A; Allfrey, P; Gray, R N C; Krofcheck, D; Bernardino Rodrigues, N; Butler, P H; Signal, T; Williams, J C; Ahmad, M; Ahmed, I; Ahmed, W; Asghar, M I; Awan, M I M; Hoorani, H R; Hussain, I; Khan, W A; Khurshid, T; Muhammad, S; Qazi, S; Shahzad, H; Cwiok, M; Dabrowski, R; Dominik, W; Doroba, K; Konecki, M; Krolikowski, J; Pozniak, K; Romaniuk, Ryszard; Zabolotny, W; Zych, P; Frueboes, T; Gokieli, R; Goscilo, L; Górski, M; Kazana, M; Nawrocki, K; Szleper, M; Wrochna, G; Zalewski, P; Almeida, N; Antunes Pedro, L; Bargassa, P; David, A; Faccioli, P; Ferreira Parracho, P G; Freitas Ferreira, M; Gallinaro, M; Guerra Jordao, M; Martins, P; Mini, G; Musella, P; Pela, J; Raposo, L; Ribeiro, P Q; Sampaio, S; Seixas, J; Silva, J; Silva, P; Soares, D; Sousa, M; Varela, J; Wöhri, H K; Altsybeev, I; Belotelov, I; Bunin, P; Ershov, Y; Filozova, I; Finger, M; Finger, M., Jr.; Golunov, A; Golutvin, I; Gorbounov, N; Kalagin, V; Kamenev, A; Karjavin, V; Konoplyanikov, V; Korenkov, V; Kozlov, G; Kurenkov, A; Lanev, A; Makankin, A; Mitsyn, V V; Moisenz, P; Nikonov, E; Oleynik, D; Palichik, V; Perelygin, V; Petrosyan, A; Semenov, R; Shmatov, S; Smirnov, V; Smolin, D; Tikhonenko, E; Vasil'ev, S; Vishnevskiy, A; Volodko, A; Zarubin, A; Zhiltsov, V; Bondar, N; Chtchipounov, L; Denisov, A; Gavrikov, Y; Gavrilov, G; Golovtsov, V; Ivanov, Y; Kim, V; Kozlov, V; Levchenko, P; Obrant, G; Orishchin, E; Petrunin, A; Shcheglov, Y; Shchetkovskiy, A; Sknar, V; Smirnov, I; Sulimov, V; Tarakanov, V; Uvarov, L; Vavilov, S; Velichko, G; Volkov, S; Vorobyev, A; Andreev, Yu; Anisimov, A; Antipov, P; Dermenev, A; Gninenko, S; Golubev, N; Kirsanov, M; Krasnikov, N; Matveev, V; Pashenkov, A; Postoev, V E; Solovey, A; Toropin, A; Troitsky, S; Baud, A; Epshteyn, V; Gavrilov, V; Ilina, N; Kaftanov, V; Kolosov, V; Kossov, 
M; Krokhotin, A; Kuleshov, S; Oulianov, A; Safronov, G; Semenov, S; Shreyber, I; Stolin, V; Vlasov, E; Zhokin, A; Boos, E; Dubinin, M; Dudko, L; Ershov, A; Gribushin, A; Klyukhin, V; Kodolova, O; Lokhtin, I; Petrushanko, S; Sarycheva, L; Savrin, V; Snigirev, A; Vardanyan, I; Dremin, I; Kirakosyan, M; Konovalova, N; Rusakov, S V; Vinogradov, A; Akimenko, S; Artamonov, A; Azhgirey, I; Bitioukov, S; Burtovoy, V; Grishin, V; Kachanov, V; Konstantinov, D; Krychkine, V; Levine, A; Lobov, I; Lukanin, V; Mel'nik, Y; Petrov, V; Ryutin, R; Slabospitsky, S; Sobol, A; Sytine, A; Tourtchanovitch, L; Troshin, S; Tyurin, N; Uzunian, A; Volkov, A; Adzic, P; Djordjevic, M; Jovanovic, D; Krpic, D; Maletic, D; Puzovic, J; Smiljkovic, N; Aguilar-Benitez, M; Alberdi, J; Alcaraz Maestre, J; Arce, P; Barcala, J M; Battilana, C; Burgos Lazaro, C; Caballero Bejar, J; Calvo, E; Cardenas Montes, M; Cepeda, M; Cerrada, M; Chamizo Llatas, M; Clemente, F; Colino, N; Daniel, M; De La Cruz, B; Delgado Peris, A; Diez Pardos, C; Fernandez Bedoya, C; Fernández Ramos, J P; Ferrando, A; Flix, J; Fouz, M C; Garcia-Abia, P; Garcia-Bonilla, A C; Gonzalez Lopez, O; Goy Lopez, S; Hernandez, J M; Josa, M I; Marin, J; Merino, G; Molina, J; Molinero, A; Navarrete, J J; Oller, J C; Puerta Pelayo, J; Romero, L; Santaolalla, J; Villanueva Munoz, C; Willmott, C; Yuste, C; Albajar, C; Blanco Otano, M; de Trocóniz, J F; Garcia Raboso, A; Lopez Berengueres, J O; Cuevas, J; Fernandez Menendez, J; Gonzalez Caballero, I; Lloret Iglesias, L; Naves Sordo, H; Vizan Garcia, J M; Cabrillo, I J; Calderon, A; Chuang, S H; Diaz Merino, I; Diez Gonzalez, C; Duarte Campderros, J; Fernandez, M; Gomez, G; Gonzalez Sanchez, J; Gonzalez Suarez, R; Jorda, C; Lobelle Pardo, P; Lopez Virto, A; Marco, J; Marco, R; Martinez Rivero, C; Martinez Ruiz del Arbol, P; Matorras, F; Rodrigo, T; Ruiz Jimeno, A; Scodellaro, L; Sobron Sanudo, M; Vila, I; Vilar Cortabitarte, R; Abbaneo, D; Albert, E; Alidra, M; Ashby, S; Auffray, E; Baechler, J; 
Baillon, P; Ball, A H; Bally, S L; Barney, D; Beaudette, F; Bellan, R; Benedetti, D; Benelli, G; Bernet, C; Bloch, P; Bolognesi, S; Bona, M; Bos, J; Bourgeois, N; Bourrel, T; Breuker, H; Bunkowski, K; Campi, D; Camporesi, T; Cano, E; Cattai, A; Chatelain, J P; Chauvey, M; Christiansen, T; Coarasa Perez, J A; Conde Garcia, A; Covarelli, R; Curé, B; De Roeck, A; Delachenal, V; Deyrail, D; Di Vincenzo, S; Dos Santos, S; Dupont, T; Edera, L M; Elliott-Peisert, A; Eppard, M; Favre, M; Frank, N; Funk, W; Gaddi, A; Gastal, M; Gateau, M; Gerwig, H; Gigi, D; Gill, K; Giordano, D; Girod, J P; Glege, F; Gomez-Reino Garrido, R; Goudard, R; Gowdy, S; Guida, R; Guiducci, L; Gutleber, J; Hansen, M; Hartl, C; Harvey, J; Hegner, B; Hoffmann, H F; Holzner, A; Honma, A; Huhtinen, M; Innocente, V; Janot, P; Le Godec, G; Lecoq, P; Leonidopoulos, C; Loos, R; Lourenço, C; Lyonnet, A; Macpherson, A; Magini, N; Maillefaud, J D; Maire, G; Mäki, T; Malgeri, L; Mannelli, M; Masetti, L; Meijers, F; Meridiani, P; Mersi, S; Meschi, E; Meynet Cordonnier, A; Moser, R; Mulders, M; Mulon, J; Noy, M; Oh, A; Olesen, G; Onnela, A; Orimoto, T; Orsini, L; Perez, E; Perinic, G; Pernot, J F; Petagna, P; Petiot, P; Petrilli, A; Pfeiffer, A; Pierini, M; Pimiä, M; Pintus, R; Pirollet, B; Postema, H; Racz, A; Ravat, S; Rew, S B; Rodrigues Antunes, J; Rolandi, G; Rovere, M; Ryjov, V; Sakulin, H; Samyn, D; Sauce, H; Schäfer, C; Schlatter, W D; Schröder, M; Schwick, C; Sciaba, A; Segoni, I; Sharma, A; Siegrist, N; Siegrist, P; Sinanis, N; Sobrier, T; Sphicas, P; Spiga, D; Spiropulu, M; Stöckli, F; Traczyk, P; Tropea, P; Troska, J; Tsirou, A; Veillet, L; Veres, G I; Voutilainen, M; Wertelaers, P; Zanetti, M; Bertl, W; Deiters, K; Erdmann, W; Gabathuler, K; Horisberger, R; Ingram, Q; Kaestli, H C; König, S; Kotlinski, D; Langenegger, U; Meier, F; Renker, D; Rohe, T; Sibille, J; Starodumov, A; Betev, B; Caminada, L; Chen, Z; Cittolin, S; Da Silva Di Calafiori, D R; Dambach, S; Dissertori, G; Dittmar, M; Eggel, C; 
Eugster, J; Faber, G; Freudenreich, K; Grab, C; Hervé, A; Hintz, W; Lecomte, P; Luckey, P D; Lustermann, W; Marchica, C; Milenovic, P; Moortgat, F; Nardulli, A; Nessi-Tedaldi, F; Pape, L; Pauss, F; Punz, T; Rizzi, A; Ronga, F J; Sala, L; Sanchez, A K; Sawley, M C; Sordini, V; Stieger, B; Tauscher, L; Thea, A; Theofilatos, K; Treille, D; Trüb, P; Weber, M; Wehrli, L; Weng, J; Zelepoukine, S; Amsler, C; Chiochia, V; De Visscher, S; Regenfus, C; Robmann, P; Rommerskirchen, T; Schmidt, A; Tsirigkas, D; Wilke, L; Chang, Y H; Chen, E A; Chen, W T; Go, A; Kuo, C M; Li, S W; Lin, W; Bartalini, P; Chang, P; Chao, Y; Chen, K F; Hou, W S; Hsiung, Y; Lei, Y J; Lin, S W; Lu, R S; Schümann, J; Shiu, J G; Tzeng, Y M; Ueno, K; Velikzhanin, Y; Wang, C C; Wang, M; Adiguzel, A; Ayhan, A; Azman Gokce, A; Bakirci, M N; Cerci, S; Dumanoglu, I; Eskut, E; Girgis, S; Gurpinar, E; Hos, I; Karaman, T; Kayis Topaksu, A; Kurt, P; Önengüt, G; Önengüt Gökbulut, G; Ozdemir, K; Ozturk, S; Polatöz, A; Sogut, K; Tali, B; Topakli, H; Uzun, D; Vergili, L N; Vergili, M; Akin, I V; Aliev, T; Bilmis, S; Deniz, M; Gamsizkan, H; Guler, A M; Öcalan, K; Serin, M; Sever, R; Surat, U E; Zeyrek, M; Deliomeroglu, M; Demir, D; Gülmez, E; Halu, A; Isildak, B; Kaya, M; Kaya, O; Ozkorucuklu, S; Sonmez, N; Levchuk, L; Lukyanenko, S; Soroka, D; Zub, S; Bostock, F; Brooke, J J; Cheng, T L; Cussans, D; Frazier, R; Goldstein, J; Grant, N; Hansen, M; Heath, G P; Heath, H F; Hill, C; Huckvale, B; Jackson, J; Mackay, C K; Metson, S; Newbold, D M; Nirunpong, K; Smith, V J; Velthuis, J; Walton, R; Bell, K W; Brew, C; Brown, R M; Camanzi, B; Cockerill, D J A; Coughlan, J A; Geddes, N I; Harder, K; Harper, S; Kennedy, B W; Murray, P; Shepherd-Themistocleous, C H; Tomalin, I R; Williams, J H; Womersley, W J; Worm, S D; Bainbridge, R; Ball, G; Ballin, J; Beuselinck, R; Buchmuller, O; Colling, D; Cripps, N; Davies, G; Della Negra, M; Foudas, C; Fulcher, J; Futyan, D; Hall, G; Hays, J; Iles, G; Karapostoli, G; MacEvoy, B C; Magnan, 
A M; Marrouche, J; Nash, J; Nikitenko, A; Papageorgiou, A; Pesaresi, M; Petridis, K; Pioppi, M; Raymond, D M; Rompotis, N; Rose, A; Ryan, M J; Seez, C; Sharp, P; Sidiropoulos, G; Stettler, M; Stoye, M; Takahashi, M; Tapper, A; Timlin, C; Tourneur, S; Vazquez Acosta, M; Virdee, T; Wakefield, S; Wardrope, D; Whyntie, T; Wingham, M; Cole, J E; Goitom, I; Hobson, P R; Khan, A; Kyberd, P; Leslie, D; Munro, C; Reid, I D; Siamitros, C; Taylor, R; Teodorescu, L; Yaselli, I; Bose, T; Carleton, M; Hazen, E; Heering, A H; Heister, A; John, J St; Lawson, P; Lazic, D; Osborne, D; Rohlf, J; Sulak, L; Wu, S; Andrea, J; Avetisyan, A; Bhattacharya, S; Chou, J P; Cutts, D; Esen, S; Kukartsev, G; Landsberg, G; Narain, M; Nguyen, D; Speer, T; Tsang, K V; Breedon, R; Calderon De La Barca Sanchez, M; Case, M; Cebra, D; Chertok, M; Conway, J; Cox, P T; Dolen, J; Erbacher, R; Friis, E; Ko, W; Kopecky, A; Lander, R; Lister, A; Liu, H; Maruyama, S; Miceli, T; Nikolic, M; Pellett, D; Robles, J; Searle, M; Smith, J; Squires, M; Stilley, J; Tripathi, M; Vasquez Sierra, R; Veelken, C; Andreev, V; Arisaka, K; Cline, D; Cousins, R; Erhan, S; Hauser, J; Ignatenko, M; Jarvis, C; Mumford, J; Plager, C; Rakness, G; Schlein, P; Tucker, J; Valuev, V; Wallny, R; Yang, X; Babb, J; Bose, M; Chandra, A; Clare, R; Ellison, J A; Gary, J W; Hanson, G; Jeng, G Y; Kao, S C; Liu, F; Liu, H; Luthra, A; Nguyen, H; Pasztor, G; Satpathy, A; Shen, B C; Stringer, R; Sturdy, J; Sytnik, V; Wilken, R; Wimpenny, S; Branson, J G; Dusinberre, E; Evans, D; Golf, F; Kelley, R; Lebourgeois, M; Letts, J; Lipeles, E; Mangano, B; Muelmenstaedt, J; Norman, M; Padhi, S; Petrucci, A; Pi, H; Pieri, M; Ranieri, R; Sani, M; Sharma, V; Simon, S; Würthwein, F; Yagil, A; Campagnari, C; D'Alfonso, M; Danielson, T; Garberson, J; Incandela, J; Justus, C; Kalavase, P; Koay, S A; Kovalskyi, D; Krutelyov, V; Lamb, J; Lowette, S; Pavlunin, V; Rebassoo, F; Ribnik, J; Richman, J; Rossin, R; Stuart, D; To, W; Vlimant, J R; Witherell, M; Apresyan, 
A; Bornheim, A; Bunn, J; Chiorboli, M; Gataullin, M; Kcira, D; Litvine, V; Ma, Y; Newman, H B; Rogan, C; Timciuc, V; Veverka, J; Wilkinson, R; Yang, Y; Zhang, L; Zhu, K; Zhu, R Y; Akgun, B; Carroll, R; Ferguson, T; Jang, D W; Jun, S Y; Paulini, M; Russ, J; Terentyev, N; Vogel, H; Vorobiev, I; Cumalat, J P; Dinardo, M E; Drell, B R; Ford, W T; Heyburn, B; Luiggi Lopez, E; Nauenberg, U; Stenson, K; Ulmer, K; Wagner, S R; Zang, S L; Agostino, L; Alexander, J; Blekman, F; Cassel, D; Chatterjee, A; Das, S; Gibbons, L K; Heltsley, B; Hopkins, W; Khukhunaishvili, A; Kreis, B; Kuznetsov, V; Patterson, J R; Puigh, D; Ryd, A; Shi, X; Stroiney, S; Sun, W; Teo, W D; Thom, J; Vaughan, J; Weng, Y; Wittich, P; Beetz, C P; Cirino, G; Sanzeni, C; Winn, D; Abdullin, S; Afaq, M A; Albrow, M; Ananthan, B; Apollinari, G; Atac, M; Badgett, W; Bagby, L; Bakken, J A; Baldin, B; Banerjee, S; Banicz, K; Bauerdick, L A T; Beretvas, A; Berryhill, J; Bhat, P C; Biery, K; Binkley, M; Bloch, I; Borcherding, F; Brett, A M; Burkett, K; Butler, J N; Chetluru, V; Cheung, H W K; Chlebana, F; Churin, I; Cihangir, S; Crawford, M; Dagenhart, W; Demarteau, M; Derylo, G; Dykstra, D; Eartly, D P; Elias, J E; Elvira, V D; Evans, D; Feng, L; Fischler, M; Fisk, I; Foulkes, S; Freeman, J; Gartung, P; Gottschalk, E; Grassi, T; Green, D; Guo, Y; Gutsche, O; Hahn, A; Hanlon, J; Harris, R M; Holzman, B; Howell, J; Hufnagel, D; James, E; Jensen, H; Johnson, M; Jones, C D; Joshi, U; Juska, E; Kaiser, J; Klima, B; Kossiakov, S; Kousouris, K; Kwan, S; Lei, C M; Limon, P; Lopez Perez, J A; Los, S; Lueking, L; Lukhanin, G; Lusin, S; Lykken, J; Maeshima, K; Marraffino, J M; Mason, D; McBride, P; Miao, T; Mishra, K; Moccia, S; Mommsen, R; Mrenna, S; Muhammad, A S; Newman-Holmes, C; Noeding, C; O'Dell, V; Prokofyev, O; Rivera, R; Rivetta, C H; Ronzhin, A; Rossman, P; Ryu, S; Sekhri, V; Sexton-Kennedy, E; Sfiligoi, I; Sharma, S; Shaw, T M; Shpakov, D; Skup, E; Smith, R P; Soha, A; Spalding, W J; Spiegel, L; Suzuki, I; Tan, 
P; Tanenbaum, W; Tkaczyk, S; Trentadue, R; Uplegger, L; Vaandering, E W; Vidal, R; Whitmore, J; Wicklund, E; Wu, W; Yarba, J; Yumiceva, F; Yun, J C; Acosta, D; Avery, P; Barashko, V; Bourilkov, D; Chen, M; Di Giovanni, G P; Dobur, D; Drozdetskiy, A; Field, R D; Fu, Y; Furic, I K; Gartner, J; Holmes, D; Kim, B; Klimenko, S; Konigsberg, J; Korytov, A; Kotov, K; Kropivnitskaya, A; Kypreos, T; Madorsky, A; Matchev, K; Mitselmakher, G; Pakhotin, Y; Piedra Gomez, J; Prescott, C; Rapsevicius, V; Remington, R; Schmitt, M; Scurlock, B; Wang, D; Yelton, J; Ceron, C; Gaultney, V; Kramer, L; Lebolo, L M; Linn, S; Markowitz, P; Martinez, G; Rodriguez, J L; Adams, T; Askew, A; Baer, H; Bertoldi, M; Chen, J; Dharmaratna, W G D; Gleyzer, S V; Haas, J; Hagopian, S; Hagopian, V; Jenkins, M; Johnson, K F; Prettner, E; Prosper, H; Sekmen, S; Baarmand, M M; Guragain, S; Hohlmann, M; Kalakhety, H; Mermerkaya, H; Ralich, R; Vodopiyanov, I; Abelev, B; Adams, M R; Anghel, I M; Apanasevich, L; Bazterra, V E; Betts, R R; Callner, J; Castro, M A; Cavanaugh, R; Dragoiu, C; Garcia-Solis, E J; Gerber, C E; Hofman, D J; Khalatian, S; Mironov, C; Shabalina, E; Smoron, A; Varelas, N; Akgun, U; Albayrak, E A; Ayan, A S; Bilki, B; Briggs, R; Cankocak, K; Chung, K; Clarida, W; Debbins, P; Duru, F; Ingram, F D; Lae, C K; McCliment, E; Merlo, J P; Mestvirishvili, A; Miller, M J; Moeller, A; Nachtman, J; Newsom, C R; Norbeck, E; Olson, J; Onel, Y; Ozok, F; Parsons, J; Schmidt, I; Sen, S; Wetzel, J; Yetkin, T; Yi, K; Barnett, B A; Blumenfeld, B; Bonato, A; Chien, C Y; Fehling, D; Giurgiu, G; Gritsan, A V; Guo, Z J; Maksimovic, P; Rappoccio, S; Swartz, M; Tran, N V; Zhang, Y; Baringer, P; Bean, A; Grachov, O; Murray, M; Radicci, V; Sanders, S; Wood, J S; Zhukova, V; Bandurin, D; Bolton, T; Kaadze, K; Liu, A; Maravin, Y; Onoprienko, D; Svintradze, I; Wan, Z; Gronberg, J; Hollar, J; Lange, D; Wright, D; Baden, D; Bard, R; Boutemeur, M; Eno, S C; Ferencek, D; Hadley, N J; Kellogg, R G; Kirn, M; Kunori, S; 
Rossato, K; Rumerio, P; Santanastasio, F; Skuja, A; Temple, J; Tonjes, M B; Tonwar, S C; Toole, T; Twedt, E; Alver, B; Bauer, G; Bendavid, J; Busza, W; Butz, E; Cali, I A; Chan, M; D'Enterria, D; Everaerts, P; Gomez Ceballos, G; Hahn, K A; Harris, P; Jaditz, S; Kim, Y; Klute, M; Lee, Y J; Li, W; Loizides, C; Ma, T; Miller, M; Nahn, S; Paus, C; Roland, C; Roland, G; Rudolph, M; Stephans, G; Sumorok, K; Sung, K; Vaurynovich, S; Wenger, E A; Wyslouch, B; Xie, S; Yilmaz, Y; Yoon, A S; Bailleux, D; Cooper, S I; Cushman, P; Dahmes, B; De Benedetti, A; Dolgopolov, A; Dudero, P R; Egeland, R; Franzoni, G; Haupt, J; Inyakin, A; Klapoetke, K; Kubota, Y; Mans, J; Mirman, N; Petyt, D; Rekovic, V; Rusack, R; Schroeder, M; Singovsky, A; Zhang, J; Cremaldi, L M; Godang, R; Kroeger, R; Perera, L; Rahmat, R; Sanders, D A; Sonnek, P; Summers, D; Bloom, K; Bockelman, B; Bose, S; Butt, J; Claes, D R; Dominguez, A; Eads, M; Keller, J; Kelly, T; Kravchenko, I; Lazo-Flores, J; Lundstedt, C; Malbouisson, H; Malik, S; Snow, G R; Baur, U; Iashvili, I; Kharchilava, A; Kumar, A; Smith, K; Strang, M; Alverson, G; Barberis, E; Boeriu, O; Eulisse, G; Govi, G; McCauley, T; Musienko, Y; Muzaffar, S; Osborne, I; Paul, T; Reucroft, S; Swain, J; Taylor, L; Tuura, L; Anastassov, A; Gobbi, B; Kubik, A; Ofierzynski, R A; Pozdnyakov, A; Schmitt, M; Stoynev, S; Velasco, M; Won, S; Antonelli, L; Berry, D; Hildreth, M; Jessop, C; Karmgard, D J; Kolberg, T; Lannon, K; Lynch, S; Marinelli, N; Morse, D M; Ruchti, R; Slaunwhite, J; Warchol, J; Wayne, M; Bylsma, B; Durkin, L S; Gilmore, J; Gu, J; Killewald, P; Ling, T Y; Williams, G; Adam, N; Berry, E; Elmer, P; Garmash, A; Gerbaudo, D; Halyo, V; Hunt, A; Jones, J; Laird, E; Marlow, D; Medvedeva, T; Mooney, M; Olsen, J; Piroué, P; Stickland, D; Tully, C; Werner, J S; Wildish, T; Xie, Z; Zuranski, A; Acosta, J G; Bonnett Del Alamo, M; Huang, X T; Lopez, A; Mendez, H; Oliveros, S; Ramirez Vargas, J E; Santacruz, N; Zatzerklyany, A; Alagoz, E; Antillon, E; Barnes, 
V E; Bolla, G; Bortoletto, D; Everett, A; Garfinkel, A F; Gecse, Z; Gutay, L; Ippolito, N; Jones, M; Koybasi, O; Laasanen, A T; Leonardo, N; Liu, C; Maroussov, V; Merkel, P; Miller, D H; Neumeister, N; Sedov, A; Shipsey, I; Yoo, H D; Zheng, Y; Jindal, P; Parashar, N; Cuplov, V; Ecklund, K M; Geurts, F J M; Liu, J H; Maronde, D; Matveev, M; Padley, B P; Redjimi, R; Roberts, J; Sabbatini, L; Tumanov, A; Betchart, B; Bodek, A; Budd, H; Chung, Y S; de Barbaro, P; Demina, R; Flacher, H; Gotra, Y; Harel, A; Korjenevski, S; Miner, D C; Orbaker, D; Petrillo, G; Vishnevskiy, D; Zielinski, M; Bhatti, A; Demortier, L; Goulianos, K; Hatakeyama, K; Lungu, G; Mesropian, C; Yan, M; Atramentov, O; Bartz, E; Gershtein, Y; Halkiadakis, E; Hits, D; Lath, A; Rose, K; Schnetzer, S; Somalwar, S; Stone, R; Thomas, S; Watts, T L; Cerizza, G; Hollingsworth, M; Spanier, S; Yang, Z C; York, A; Asaadi, J; Aurisano, A; Eusebi, R; Golyash, A; Gurrola, A; Kamon, T; Nguyen, C N; Pivarski, J; Safonov, A; Sengupta, S; Toback, D; Weinberger, M; Akchurin, N; Berntzon, L; Gumus, K; Jeong, C; Kim, H; Lee, S W; Popescu, S; Roh, Y; Sill, A; Volobouev, I; Washington, E; Wigmans, R; Yazgan, E; Engh, D; Florez, C; Johns, W; Pathak, S; Sheldon, P; Andelin, D; Arenton, M W; Balazs, M; Boutle, S; Buehler, M; Conetti, S; Cox, B; Hirosky, R; Ledovskoy, A; Neu, C; Phillips II, D; Ronquest, M; Yohay, R; Gollapinni, S; Gunthoti, K; Harr, R; Karchin, P E; Mattson, M; Sakharov, A; Anderson, M; Bachtis, M; Bellinger, J N; Carlsmith, D; Crotty, I; Dasu, S; Dutta, S; Efron, J; Feyzi, F; Flood, K; Gray, L; Grogg, K S; Grothe, M; Hall-Wilton, R; Jaworski, M; Klabbers, P; Klukas, J; Lanaro, A; Lazaridis, C; Leonard, J; Loveless, R; Magrans de Abril, M; Mohapatra, A; Ott, G; Polese, G; Reeder, D; Savin, A; Smith, W H; Sourkov, A; Swanson, J; Weinberg, M; Wenman, D; Wensveen, M; White, A

    2010-01-01

    The alignment system for the muon spectrometer of the CMS detector comprises three independent subsystems of optical and analog position sensors. It aligns muon chambers with respect to each other and to the central silicon tracker. System commissioning at full magnetic field began in 2008 during an extended cosmic ray run. The system succeeded in tracking muon detector movements of up to 18 mm and rotations of several milliradians under magnetic forces. Depending on coordinate and subsystem, the system achieved chamber alignment precisions of 140-350 microns and 30-200 microradians. Systematic errors on displacements are estimated to be 340-590 microns based on comparisons with independent photogrammetry measurements.

  19. Speakers' choice of frame in binary choice

    Directory of Open Access Journals (Sweden)

    Marc van Buiten

    2009-02-01

Full Text Available A distinction is proposed between 'recommending for' preferred choice options and 'recommending against' non-preferred choice options. In binary choice, both recommendation modes are logically, though not psychologically, equivalent. We report empirical evidence showing that speakers recommending for preferred options predominantly select positive frames, which are less common when speakers recommend against non-preferred options. In addition, option attractiveness is shown to affect speakers' choice of frame and adoption of recommendation mode. The results are interpreted in terms of three compatibility effects: (i) recommendation mode–valence framing compatibility: speakers' preference for positive framing is enhanced under 'recommending for' and diminished under 'recommending against' instructions; (ii) option attractiveness–valence framing compatibility: speakers' preference for positive framing is more pronounced for attractive than for unattractive options; and (iii) recommendation mode–option attractiveness compatibility: speakers are more likely to adopt a 'recommending for' approach for attractive than for unattractive binary choice pairs.

  20. Türkçede Ön Seste Y / Initial Phoneme Y in Turkish

    Directory of Open Access Journals (Sweden)

    Sertan ALİBEKİROĞLU

    2013-03-01

    Full Text Available The Turkish language, which has an ancient history, has spread across a wide geography of the old continents. In classifying Turkish, Turcologists use different measures, because this spread, together with the natural quality of languages as living things, is the main reason Turkish has branched into many dialects and accents. Studies by Turcologists who use dialect for the classification of Turkish show that there are equivalences between the West and East branches of First/Proto-Turkish on "l-ş" and "r-z" in internal and final phonemes, and on y-s/ş in initial phonemes. Between Ancient and Modern Turkish, many phoneme changes emerged, depending especially on time and place, and phoneme equivalences formed between dialects. These equivalences allow us to understand the current situation of the language and to anticipate its future, while shedding light on the evolution of Turkish over its long historical journey. This study aims to show the changes of the initial phoneme y, which is used in classification based on dialects. These changes are shown by, first, determining whether during First Turkish the consonant "y-" occurred primarily in the initial phonemes of words, and establishing the position of "y-" before the written era of Turkish; and second, presenting the positions of the initial phoneme y- during the written era of Turkish. Accordingly, the study considers only the initial phoneme "y-" and does not cover changes in internal and final phonemes.

  1. 'Compromise position' image alignment to accommodate independent motion of multiple clinical target volumes during radiotherapy: A high risk prostate cancer example.

    Science.gov (United States)

    Rosewall, Tara; Yan, Jing; Alasti, Hamideh; Cerase, Carla; Bayley, Andrew

    2017-04-01

    Inclusion of multiple independently moving clinical target volumes (CTVs) in the irradiated volume causes an image guidance conundrum. The purpose of this research was to use high risk prostate cancer as a clinical example to evaluate a 'compromise' image alignment strategy. The daily pre-treatment orthogonal EPIs for 14 consecutive patients were included in this analysis. Image matching was performed by aligning to the prostate only, to the bony pelvis only, and using the 'compromise' strategy. Residual CTV surrogate displacements were quantified for each of the alignment strategies. Analysis of the 388 daily fractions indicated surrogate displacements were well-correlated in all directions (r² = 0.95 (LR), 0.67 (AP) and 0.59 (SI)). Differences between the surrogate displacements (95% range) were -0.4 to 1.8 mm (LR), -1.2 to 5.2 mm (SI) and -1.2 to 5.2 mm (AP). The distribution of the residual displacements was significantly smaller using the 'compromise' strategy compared to the other strategies (p < 0.005). The 'compromise' strategy ensured the CTV was encompassed by the PTV in all fractions, compared to 47 PTV violations when aligned to the prostate only. This study demonstrated the feasibility of a compromise position image guidance strategy to accommodate simultaneous displacements of two independently moving CTVs. Application of this strategy was facilitated by correlation between the CTV displacements and resulted in no geometric excursions of the CTVs beyond standard sized PTVs. This simple image guidance strategy may also be applicable to other disease sites where multiple CTVs are irradiated concurrently, such as head and neck, lung and cervix cancer. © 2016 The Royal Australian and New Zealand College of Radiologists.
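The abstract does not specify the weighting behind the 'compromise' alignment, but the idea can be sketched numerically: when two correlated surrogate displacements must both be accommodated, shifting the couch to an intermediate position bounds the residual of both targets. The displacement values below are simulated stand-ins (not the study's data), and the midpoint weighting is an illustrative assumption.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated daily AP displacements (mm) of two CTV surrogates over 388 fractions;
# the prostate moves with the bony pelvis plus extra independent motion
# (values and correlation are illustrative, not the study's data)
bone = rng.normal(0.0, 1.5, 388)
prostate = bone + rng.normal(1.0, 1.5, 388)

strategies = {
    "prostate only": prostate,                    # couch shift removes prostate error
    "bone only": bone,                            # couch shift removes bone error
    "compromise (midpoint)": 0.5 * (prostate + bone),
}

for name, shift in strategies.items():
    # residual displacement of each surrogate after applying the couch shift
    worst = np.maximum(np.abs(prostate - shift), np.abs(bone - shift))
    print(f"{name:>22}: 95th-percentile worst-case residual = "
          f"{np.percentile(worst, 95):.1f} mm")
```

Because the midpoint halves the inter-surrogate difference seen by either target, its worst-case residual is exactly half that of aligning to either surrogate alone, which is what lets both CTVs stay inside standard-sized PTVs.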

  2. Neural Correlates in the Processing of Phoneme-Level Complexity in Vowel Production

    Science.gov (United States)

    Park, Haeil; Iverson, Gregory K.; Park, Hae-Jeong

    2011-01-01

    We investigated how articulatory complexity at the phoneme level is manifested neurobiologically in an overt production task. fMRI images were acquired from young Korean-speaking adults as they pronounced bisyllabic pseudowords in which we manipulated phonological complexity defined in terms of vowel duration and instability (viz., COMPLEX:…

  3. Morpho-phonemic analysis boosts word reading for adult struggling readers

    OpenAIRE

    Gray, Susan H.; Ehri, Linnea C.; Locke, John L.

    2017-01-01

    A randomized control trial compared the effects of two kinds of vocabulary instruction on component reading skills of adult struggling readers. Participants seeking alternative high school diplomas received 8 h of scripted tutoring to learn forty academic vocabulary words embedded within a civics curriculum. They were matched for language background and reading levels, then randomly assigned to either morpho-phonemic analysis teaching word origins, morpheme and syllable structures, or traditi...

  4. Polish Phoneme Statistics Obtained On Large Set Of Written Texts

    Directory of Open Access Journals (Sweden)

    Bartosz Ziółko

    2009-01-01

    Full Text Available The phonetic statistics were collected from several Polish corpora. The paper is a summary of the data, which are phoneme n-grams, and of some phenomena observed in the statistics. Triphone statistics concern context-dependent speech units, which play an important role in speech recognition systems and had never been calculated for a large set of Polish written texts. The standard phonetic alphabet for Polish, SAMPA, and methods of producing phonetic transcriptions are described.
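As a sketch of how such phoneme n-gram (including triphone) statistics are gathered, the sliding-window count below operates on a SAMPA-style phoneme sequence. The transcription here is a made-up example; producing real Polish transcriptions requires the grapheme-to-phoneme rules the paper describes.

```python
from collections import Counter

def phoneme_ngrams(phonemes, n):
    """Count phoneme n-grams with a sliding window over one transcription."""
    return Counter(tuple(phonemes[i:i + n]) for i in range(len(phonemes) - n + 1))

# Made-up SAMPA-style transcription of a short Polish word sequence
transcription = ["v", "a", "r", "S", "a", "v", "a", "v", "a", "r", "S", "a"]

unigrams = phoneme_ngrams(transcription, 1)
triphones = phoneme_ngrams(transcription, 3)
print(triphones.most_common(3))
```

Corpus-level statistics are then just the sum of these counters over all transcriptions, from which n-gram probabilities can be normalized.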

  5. Speaker-dependent Dictionary-based Speech Enhancement for Text-Dependent Speaker Verification

    DEFF Research Database (Denmark)

    Thomsen, Nicolai Bæk; Thomsen, Dennis Alexander Lehmann; Tan, Zheng-Hua

    2016-01-01

    The problem of text-dependent speaker verification under noisy conditions is becoming ever more relevant, due to increased usage for authentication in real-world applications. Classical methods for noise reduction such as spectral subtraction and Wiener filtering introduce distortion and do not perform well in this setting. In this work we compare the performance of different noise reduction methods under different noise conditions in terms of speaker verification when the text is known and the system is trained on clean data (mis-matched conditions). We furthermore propose a new approach based on dictionary-based noise reduction and compare it to the baseline methods.
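
Spectral subtraction, one of the classical baselines mentioned above, can be sketched as follows. This is a minimal illustration only (fixed frame length, no overlap-add, noise spectrum estimated from a single frame), not the authors' implementation:

```python
import numpy as np

def spectral_subtraction(noisy, noise_est, frame_len=256, floor=0.01):
    """Classical magnitude spectral subtraction: subtract an estimated
    noise magnitude spectrum from each frame's magnitude spectrum,
    flooring the result, and resynthesize with the noisy phase."""
    noise_mag = np.abs(np.fft.rfft(noise_est[:frame_len]))
    out = np.zeros_like(noisy)
    for start in range(0, len(noisy) - frame_len + 1, frame_len):
        spec = np.fft.rfft(noisy[start:start + frame_len])
        mag = np.maximum(np.abs(spec) - noise_mag, floor * np.abs(spec))
        out[start:start + frame_len] = np.fft.irfft(
            mag * np.exp(1j * np.angle(spec)), n=frame_len)
    return out

# Toy signal: a 440 Hz tone in white noise (8 kHz sampling assumed).
rng = np.random.default_rng(0)
t = np.arange(1024) / 8000.0
clean = np.sin(2 * np.pi * 440 * t)
noise = 0.3 * rng.standard_normal(1024)
enhanced = spectral_subtraction(clean + noise, noise)
```

Because subtracted magnitudes never exceed the originals, the output energy is strictly below the noisy input's, but the magnitude flooring and noisy phase are exactly the distortions the abstract alludes to.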

  6. Apology Strategy in English By Native Speaker

    Directory of Open Access Journals (Sweden)

    Mezia Kemala Sari

    2016-05-01

    Full Text Available This research discussed apology strategies in English by native speakers. This descriptive study was presented within the framework of pragmatics, based on the forms of strategies in the coding manual of the CCSARP (Cross-Cultural Speech Act Realization Project). The goals of this study were to describe the apology strategies used in English by native speakers and to identify the factors influencing them. Data were collected through a questionnaire in the form of a Discourse Completion Test, which was distributed to 30 native speakers. Data were classified based on the degree of familiarity and the social distance between speaker and hearer, and then classified by the type of strategies in the coding manual. The results show that native speakers' apologies are brief, with the pattern IFID plus Offer of Repair plus Taking on Responsibility occurring most often, while Alerters, Explanation and Downgrading appear less frequently. The factors that influence apology utterances by native speakers are the social situation, the degree of familiarity and the degree of the offence; the more serious the offence, the more complex the utterances speakers tend to produce.

  7. Phoneme Awareness, Vocabulary and Word Decoding in Monolingual and Bilingual Dutch Children

    Science.gov (United States)

    Janssen, Marije; Bosman, Anna M. T.; Leseman, Paul P. M.

    2013-01-01

    The aim of this study was to investigate whether bilingually raised children in the Netherlands, who receive literacy instruction in their second language only, show an advantage on Dutch phoneme-awareness tasks compared with monolingual Dutch-speaking children. Language performance of a group of 47 immigrant first-grade children with various…

  8. Modeling phoneme perception. II: A model of stop consonant discrimination.

    Science.gov (United States)

    van Hessen, A J; Schouten, M E

    1992-10-01

    Combining elements from two existing theories of speech sound discrimination, dual process theory (DPT) and trace context theory (TCT), a new theory, called phoneme perception theory, is proposed, consisting of a long-term phoneme memory, a context-coding memory, and a trace memory, each with its own time constants. This theory is tested by means of stop-consonant discrimination data in which interstimulus interval (ISI; values of 100, 300, and 2000 ms) is an important variable. It is shown that discrimination in which labeling plays an important part (2IFC and AX between category) benefits from increased ISI, whereas discrimination in which only sensory traces are compared (AX within category) decreases with increasing ISI. The theory is also tested on speech discrimination data from the literature in which ISI is a variable [Pisoni, J. Acoust. Soc. Am. 36, 277-282 (1964); Cowan and Morse, J. Acoust. Soc. Am. 79, 500-507 (1986)]. It is concluded that the number of parameters in trace context theory is not sufficient to account for most speech-sound discrimination data and that a few additional assumptions are needed, such as a form of sublabeling, in which subjects encode the quality of a stimulus as a member of a category, and which requires processing time.

  9. The selective role of premotor cortex in speech perception: a contribution to phoneme judgements but not speech comprehension.

    Science.gov (United States)

    Krieger-Redwood, Katya; Gaskell, M Gareth; Lindsay, Shane; Jefferies, Elizabeth

    2013-12-01

    Several accounts of speech perception propose that the areas involved in producing language are also involved in perceiving it. In line with this view, neuroimaging studies show activation of premotor cortex (PMC) during phoneme judgment tasks; however, there is debate about whether speech perception necessarily involves motor processes, across all task contexts, or whether the contribution of PMC is restricted to tasks requiring explicit phoneme awareness. Some aspects of speech processing, such as mapping sounds onto meaning, may proceed without the involvement of motor speech areas if PMC specifically contributes to the manipulation and categorical perception of phonemes. We applied TMS to three sites-PMC, posterior superior temporal gyrus, and occipital pole-and for the first time within the TMS literature, directly contrasted two speech perception tasks that required explicit phoneme decisions and mapping of speech sounds onto semantic categories, respectively. TMS to PMC disrupted explicit phonological judgments but not access to meaning for the same speech stimuli. TMS to two further sites confirmed that this pattern was site specific and did not reflect a generic difference in the susceptibility of our experimental tasks to TMS: stimulation of pSTG, a site involved in auditory processing, disrupted performance in both language tasks, whereas stimulation of occipital pole had no effect on performance in either task. These findings demonstrate that, although PMC is important for explicit phonological judgments, crucially, PMC is not necessary for mapping speech onto meanings.

  10. Onomatopoeias: a new perspective around space, image schemas and phoneme clusters.

    Science.gov (United States)

    Catricalà, Maria; Guidi, Annarita

    2015-09-01

    Onomatopoeias resist a (grammatical) descriptive and explicative model, because the rules that constrain their processes of selection and construction remain idiosyncratic and variable (Dogana in Le parole dell'incanto. FrancoAngeli, Milano, 2002; Catricalà 2011). This article proposes a classification model based on spatial cognition criteria. The hypothesis (Catricalà 2011) is that onomatopoeias are related to image schemas (Johnson in The body in the mind. University Press, Chicago, 1987), i.e. to the visual mapping of a movement. We also refer to force dynamics (Talmy in Language typology and lexical description, pp 36-149, 1985; Jackendoff in Semantic structures. MIT Press, Cambridge, 1990) as a basic model of conceptual maps (Langacker in Grammar and conceptualization. Mouton de Gruyter, Berlin, 1999). Categories are related to the presence of specific phonemes and phoneme clusters, while visual patterns correspond to different image schemas. The association between specific categories of pseudo-onomatopoeias and specific spatial/movement patterns is also the object of an experiment focused on onomatopoeia interpretation. Most of the data confirm a correlation between image schemas such as CONTAINER/CONTAINMENT (crunch, plop) or SOURCE-PATH-GOAL (tattarrattat 'shots') and an occlusive consonant, while liquid and trill consonants correlate with PATH (vroom).

  11. Who spoke when? Audio-based speaker location estimation for diarization

    NARCIS (Netherlands)

    Dadvar, M.

    2011-01-01

    Speaker diarization is the process which detects active speakers and groups those speech signals which have been uttered by the same speaker. Generally, two main applications can be found for speaker diarization. Automatic Speech Recognition systems make use of the speaker-homogeneous clusters to adapt

  12. How Important Is Teaching Phonemic Awareness to Children Learning to Read in Spanish?

    Science.gov (United States)

    Goldenberg, Claude; Tolar, Tammy D.; Reese, Leslie; Francis, David J.; Bazán, Antonio Ray; Mejía-Arauz, Rebeca

    2014-01-01

    This comparative study examines relationships between phonemic awareness and Spanish reading skill acquisition among three groups of Spanish-speaking first and second graders: children in Mexico receiving reading instruction in Spanish and children in the United States receiving reading instruction in either Spanish or English. Children were…

  13. Machine Learning for Text-Independent Speaker Verification : How to Teach a Machine to Recognize Human Voices

    OpenAIRE

    Imoscopi, Stefano

    2016-01-01

    The aim of speaker recognition and verification is to identify people's identity from the characteristics of their voices (voice biometrics). Traditionally this technology has been employed mostly for security or authentication purposes, identification of employees/customers and criminal investigations. During the last decade the increasing popularity of hands-free and voice-controlled systems and the massive growth of media content generated on the internet have increased the need for technique...

  14. The CMS Silicon Tracker Alignment

    CERN Document Server

    Castello, R

    2008-01-01

    The alignment of the Strip and Pixel Tracker of the Compact Muon Solenoid experiment, with its large number of independent silicon sensors and its excellent spatial resolution, is a complex and challenging task. Besides high precision mounting, survey measurements and the Laser Alignment System, track-based alignment is needed to reach the envisaged precision. Three different algorithms for track-based alignment were successfully tested on a sample of cosmic-ray data collected at the Tracker Integration Facility, where 15% of the Tracker was tested. These results, together with those coming from the CMS global run, will provide the basis for the full-scale alignment of the Tracker, which will be carried out with the first p-p collisions.

  15. Unsupervised Speaker Change Detection for Broadcast News Segmentation

    DEFF Research Database (Denmark)

    Jørgensen, Kasper Winther; Mølgaard, Lasse Lohilahti; Hansen, Lars Kai

    2006-01-01

    This paper presents a speaker change detection system for news broadcast segmentation based on a vector quantization (VQ) approach. The system does not make any assumption about the number of speakers or speaker identity. The system uses mel frequency cepstral coefficients and change detection...
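
A VQ-based change detector of the kind described can be sketched by comparing codebook distortions of adjacent analysis windows: if features from the next window are poorly represented by the current window's codebook, a speaker change is likely. The tiny k-means and the synthetic two-"speaker" Gaussian feature clusters below are illustrative assumptions, not the paper's system:

```python
import numpy as np

def kmeans_codebook(X, k, iters=10, seed=0):
    """Train a small VQ codebook (k centroids) with plain k-means."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((X[:, None] - centers[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return centers

def vq_distortion(X, codebook):
    """Mean squared distance of each frame to its nearest codeword."""
    return ((X[:, None] - codebook[None]) ** 2).sum(-1).min(axis=1).mean()

def change_score(win_a, win_b, k=4):
    """Cross-window minus within-window distortion under window A's
    codebook; large values suggest a speaker change between windows."""
    cb_a = kmeans_codebook(win_a, k)
    return vq_distortion(win_b, cb_a) - vq_distortion(win_a, cb_a)

# Synthetic 12-dimensional "cepstral" frames for two distinct speakers.
rng = np.random.default_rng(1)
spk1 = rng.standard_normal((100, 12))
spk1_again = rng.standard_normal((100, 12))   # same speaker, new window
spk2 = rng.standard_normal((100, 12)) + 3.0   # different speaker
same_score = change_score(spk1, spk1_again)
diff_score = change_score(spk1, spk2)
```

In a running system the score would be computed for each sliding window boundary and thresholded; no assumption about the number of speakers is needed, matching the unsupervised setting above.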

  16. A New Database for Speaker Recognition

    DEFF Research Database (Denmark)

    Feng, Ling; Hansen, Lars Kai

    2005-01-01

    In this paper we discuss properties of speech databases used for speaker recognition research and evaluation, and we characterize some popular standard databases. The paper presents a new database called ELSDSR dedicated to speaker recognition applications. The main characteristics of this database...

  17. Speaker Segmentation and Clustering Using Gender Information

    Science.gov (United States)

    2006-02-01

    Report AFRL-HE-WP-TP-2006-0026, Air Force Research Laboratory, February 2006 (proceedings): Speaker Segmentation and Clustering Using Gender Information, Brian M. Ore, General Dynamics. The fragmentary abstract indicates that gender information is used in the first stages of segmentation and in the clustering of opposite-gender files for speaker diarization of news broadcasts.

  18. (En)countering native-speakerism global perspectives

    CERN Document Server

    Holliday, Adrian; Swan, Anne

    2015-01-01

    The book addresses the issue of native-speakerism, an ideology based on the assumption that 'native speakers' of English have a special claim to the language itself, through critical qualitative studies of the lived experiences of practising teachers and students in a range of scenarios.

  19. Robust speaker recognition in noisy environments

    CERN Document Server

    Rao, K Sreenivasa

    2014-01-01

    This book discusses speaker recognition methods that deal with realistic variable noisy environments. The text covers authentication systems for robust noisy background environments that function in real time and can be incorporated in mobile devices. The book focuses on different approaches to enhance the accuracy of speaker recognition in the presence of varying background environments. The authors examine: (a) Feature compensation using multiple background models, (b) Feature mapping using data-driven stochastic models, (c) Design of a super vector-based GMM-SVM framework for robust speaker recognition, (d) Total variability modeling (i-vectors) in a discriminative framework and (e) a Boosting method to fuse evidence from multiple SVM models.

  20. A system of automatic speaker recognition on a minicomputer

    International Nuclear Information System (INIS)

    El Chafei, Cherif

    1978-01-01

    This study describes a system of automatic speaker recognition using the pitch of the voice. The pre-treatment consists in extracting the speakers' discriminating characteristics from the pitch. The recognition programme first performs a preselection and then calculates the distance between the characteristics of the speaker to be recognized and those of the speakers already recorded. A recognition experiment was carried out with 15 speakers and included 566 tests spread over an intermittent period of four months. The discriminating characteristics used offer several interesting qualities. The algorithms concerning the measurement of the characteristics on the one hand, and the classification of speakers on the other, are simple. The results obtained in real time on a minicomputer are satisfactory. Furthermore, they could probably be improved by considering other discriminating characteristics of the speakers, but this was unfortunately not within our possibilities. (author) [fr
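
The preselection-then-distance scheme described can be illustrated with toy pitch statistics; the feature choice (mean F0 and F0 standard deviation), the 40 Hz preselection window, and the enrolled speaker models are assumptions for the sketch, not the system's actual characteristics:

```python
import math

def pitch_features(f0_track):
    """Toy discriminating characteristics from a pitch track (Hz):
    mean F0 and F0 standard deviation over voiced frames (F0 > 0)."""
    voiced = [f for f in f0_track if f > 0]
    mean = sum(voiced) / len(voiced)
    var = sum((f - mean) ** 2 for f in voiced) / len(voiced)
    return (mean, math.sqrt(var))

def recognize(test_feat, enrolled, preselect_hz=40.0):
    """Preselect enrolled speakers whose mean F0 lies within
    preselect_hz of the test utterance, then return the nearest
    candidate by Euclidean distance over the full feature vector."""
    candidates = {name: feat for name, feat in enrolled.items()
                  if abs(feat[0] - test_feat[0]) <= preselect_hz}
    return min(candidates, key=lambda n: math.dist(candidates[n], test_feat))

enrolled = {  # hypothetical speaker models: (mean F0, F0 std dev)
    "spk_low": (110.0, 12.0),
    "spk_mid": (160.0, 20.0),
    "spk_high": (230.0, 28.0),
}
test = pitch_features([0, 158, 162, 0, 165, 155])  # 0 = unvoiced frame
```

The preselection step keeps the distance computation cheap, which is what made real-time operation on a 1970s minicomputer plausible.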

  1. Methods in ALFA Alignment

    CERN Document Server

    Melendez, Jordan

    2014-01-01

    This note presents two model-independent methods for use in the alignment of the ALFA forward detectors. Using a Monte Carlo simulated LHC run at β = 90 m and √s = 7 TeV, the Kinematic Peak alignment method is utilized to reconstruct the Mandelstam momentum transfer variable t for single-diffractive protons. The Hot Spot method uses fluctuations in the hitmap density to pinpoint particular regions in the detector that could signal a misalignment. Another method uses an error function fit to find the detector edge. With this information, the vertical alignment can be determined.

  2. Speaker diarization system using HXLPS and deep neural network

    Directory of Open Access Journals (Sweden)

    V. Subba Ramaiah

    2018-03-01

    Full Text Available In general, speaker diarization is defined as the process of segmenting the input speech signal and grouping the homogeneous regions with regard to speaker identity. The main idea behind this system is that it is able to discriminate the speaker signals by assigning a label to each speaker signal. Due to the rapid growth of broadcasting and meeting recordings, speaker diarization is needed to enhance the readability of the speech transcription. In order to address this issue, Holoentropy with the eXtended Linear Prediction using autocorrelation Snapshot (HXLPS) and a deep neural network (DNN) are proposed for the speaker diarization system. The HXLPS extraction method is newly developed by incorporating the Holoentropy with the XLPS. Once the features are attained, the speech and non-speech signals are detected by the Voice Activity Detection (VAD) method. Then, an i-vector representation of every segmented signal is obtained using the Universal Background Model (UBM). Consequently, the DNN is utilized to assign the label for each speaker signal, which is then clustered according to the speaker label. The performance is analysed using evaluation metrics such as tracking distance, false alarm rate and diarization error rate. The outcome of the proposed method ensures better diarization performance by achieving a lower DER of 1.36% based on the lambda value and a DER of 2.23% based on the frame length. Keywords: Speaker diarization, HXLPS feature extraction, Voice activity detection, Deep neural network, Speaker clustering, Diarization Error Rate (DER)
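
The VAD stage of such a pipeline can be illustrated with a minimal energy-threshold detector; this simple sketch stands in for whatever VAD method the paper actually uses, and the signal, frame length and threshold ratio are illustrative assumptions:

```python
import numpy as np

def energy_vad(signal, frame_len=160, threshold_ratio=0.1):
    """Minimal energy-based voice activity detection: a frame is
    flagged as speech when its energy exceeds a fixed fraction of
    the maximum frame energy in the recording."""
    n = len(signal) // frame_len
    frames = signal[:n * frame_len].reshape(n, frame_len)
    energy = (frames ** 2).sum(axis=1)
    return energy > threshold_ratio * energy.max()

# Toy recording: silence, a loud "speech" tone, silence (8 kHz assumed).
rng = np.random.default_rng(0)
sig = np.concatenate([
    0.01 * rng.standard_normal(800),                   # low-level noise
    np.sin(2 * np.pi * 200 * np.arange(800) / 8000),   # stand-in for speech
    0.01 * rng.standard_normal(800),                   # low-level noise
])
flags = energy_vad(sig)   # one boolean per 20 ms frame
```

In the full system the speech-flagged segments would then be passed on to feature extraction, i-vector computation and DNN labeling.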

  3. A nanoradio utilizing the mechanical resonance of a vertically aligned nanopillar array

    KAUST Repository

    Lee, Chang Hwa; Lee, Seok Woo; Lee, Seung S.

    2014-01-01

    A nanoradio based on the mechanical resonance of a nanomaterial has promising applications in terms of size reduction of an antenna and integrity of all components of a radio except a speaker. In this letter, a nanopillar array radio utilizing the mechanical resonance of a vertically aligned nanopillar array is realized by a reliable top-down method. By exploiting the field emission phenomenon, it was found that the nanopillar array functions as a radio with a demodulator without any electrical circuitry. The array of vertically aligned nanopillars increases the demodulated current and signal to noise ratio, and this fabrication method makes manipulation and positioning of nanostructures possible intrinsically for industrial applications. © 2014 The Royal Society of Chemistry.

  4. Insight into the Attitudes of Speakers of Urban Meccan Hijazi Arabic towards their Dialect

    Directory of Open Access Journals (Sweden)

    Sameeha D. Alahmadi

    2016-04-01

    Full Text Available The current study mainly aims to examine the attitudes of speakers of Urban Meccan Hijazi Arabic (UMHA) towards their dialect, which is spoken in Mecca, Saudi Arabia. It also investigates whether the participants' age, sex and educational level have any impact on their perception of their dialect. To this end, I designed a 5-point Likert-scale questionnaire, requiring participants to rate their attitudes towards their dialect. I asked 80 participants, whose first language is UMHA, to fill out the questionnaire. On the basis of the three independent variables, namely age, sex and educational level, the participants were divided into three groups: old and young speakers, male and female speakers, and educated and uneducated speakers. The results reveal that, in general, all the groups (young and old, male and female, and educated and uneducated participants) have a sense of responsibility towards their dialect, making their attitudes towards their dialect positive. However, differences exist between the three groups. For instance, old speakers tend to express their pride in their dialect more than young speakers. The same pattern is observed in the male and female groups. The results show that females may feel embarrassed to provide answers that may imply that they are not proud of their own dialect, since the majority of women in the Arab world, in general, are under more pressure to conform to the overt norms of the society than males. Therefore, I argue that most Arab women may not have the same freedom to express their opinions and feelings about various issues. Based on the results, the study concludes with some recommendations for further research. Keywords: sociolinguistics, language attitudes, dialectology, social variables, Urban Meccan Hijazi Arabic

  5. Assessing the efficacy of hearing-aid amplification using a phoneme test

    DEFF Research Database (Denmark)

    Scheidiger, Christoph; Allen, Jont B; Dau, Torsten

    2017-01-01

    Consonant-vowel (CV) perception experiments provide valuable insights into how humans process speech. Here, two CV identification experiments were conducted in a group of hearing-impaired (HI) listeners, using 14 consonants followed by the vowel /ɑ/. The CVs were presented in quiet and with added......, in combination with a well-controlled phoneme speech test, may be used to assess the impact of hearing-aid signal processing on speech intelligibility....

  6. Dutch-Cantonese Bilinguals Show Segmental Processing during Sinitic Language Production

    Directory of Open Access Journals (Sweden)

    Kalinka Timmer

    2017-07-01

    Full Text Available This study addressed the debate on the primacy of the syllable vs. the segment (i.e., phoneme) as a functional unit of phonological encoding in syllabic languages by investigating both behavioral and neural responses of Dutch-Cantonese (DC) bilinguals in a color-object picture naming task. Specifically, we investigated whether DC bilinguals exhibit the phonemic processing strategy, evident in monolingual Dutch speakers, during planning of their Cantonese speech production. Participants named the color of colored line-drawings in Cantonese faster when color and object matched in the first segment than when they were mismatched (e.g., 藍駱駝, /laam4/ /lok3to4/, “blue camel;” 紅駱駝, /hung4/ /lok3to4/, “red camel”). This is in contrast to previous studies in Sinitic languages that did not reveal such phoneme-only facilitation. Phonemic overlap also modulated the event-related potentials (ERPs) in the 125–175, 200–300, and 300–400 ms time windows, suggesting earlier ERP modulations than in previous studies with monolingual Sinitic speakers or unbalanced Sinitic-Germanic bilinguals. Conjointly, our results suggest that, while the syllable may be considered the primary unit of phonological encoding in Sinitic languages, the phoneme can serve as the primary unit of phonological encoding, both behaviorally and neurally, for DC bilinguals. The presence/absence of a segment onset effect in Sinitic languages may be related to the proficiency in the Germanic language of bilinguals.

  7. Learning speaker-specific characteristics with a deep neural architecture.

    Science.gov (United States)

    Chen, Ke; Salman, Ahmad

    2011-11-01

    Speech signals convey various yet mixed information ranging from linguistic to speaker-specific information. However, most acoustic representations characterize all the different kinds of information as a whole, which could hinder either a speech or a speaker recognition (SR) system from producing a better performance. In this paper, we propose a novel deep neural architecture (DNA) especially for learning speaker-specific characteristics from mel-frequency cepstral coefficients, an acoustic representation commonly used in both speech recognition and SR, which results in a speaker-specific overcomplete representation. In order to learn intrinsic speaker-specific characteristics, we come up with an objective function consisting of contrastive losses in terms of speaker similarity/dissimilarity and data reconstruction losses used as regularization to normalize the interference of non-speaker-related information. Moreover, we employ a hybrid learning strategy for learning parameters of the deep neural networks: i.e., local yet greedy layerwise unsupervised pretraining for initialization and global supervised learning for the ultimate discriminative goal. With four Linguistic Data Consortium (LDC) benchmarks and two non-English corpora, we demonstrate that our overcomplete representation is robust in characterizing various speakers, no matter whether their utterances have been used in training our DNA, and highly insensitive to text and languages spoken. Extensive comparative studies suggest that our approach yields favorable results in speaker verification and segmentation. Finally, we discuss several issues concerning our proposed approach.
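
The objective described, contrastive losses regularized by reconstruction losses, can be sketched for a single pair of embeddings. The margin form, the weighting `alpha`, and all numeric values below are illustrative assumptions, not the authors' exact formulation:

```python
import numpy as np

def pair_loss(h1, h2, x1, r1, same_speaker, margin=1.0, alpha=0.5):
    """Toy contrastive objective with reconstruction regularization:
    pull same-speaker embeddings (h1, h2) together, push different-
    speaker embeddings at least `margin` apart, and penalize poor
    reconstruction r1 of the input features x1."""
    d = np.linalg.norm(h1 - h2)
    if same_speaker:
        contrastive = d ** 2                     # similarity term
    else:
        contrastive = max(0.0, margin - d) ** 2  # dissimilarity (hinge) term
    reconstruction = np.mean((x1 - r1) ** 2)     # regularizer
    return contrastive + alpha * reconstruction

# Hand-picked embeddings: distance 5 apart, perfect reconstruction.
h_anchor = np.array([0.0, 0.0])
h_other = np.array([3.0, 4.0])
x = np.array([1.0, 1.0])
loss_same = pair_loss(h_anchor, h_other, x, x, True)    # penalized: far apart
loss_diff = pair_loss(h_anchor, h_other, x, x, False)   # fine: beyond margin
```

The reconstruction term plays the regularizing role described above: it keeps the embedding from discarding all non-discriminative structure during the supervised phase.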

  8. Robustness-related issues in speaker recognition

    CERN Document Server

    Zheng, Thomas Fang

    2017-01-01

    This book presents an overview of speaker recognition technologies with an emphasis on dealing with robustness issues. Firstly, the book gives an overview of speaker recognition, such as the basic system framework, categories under different criteria, performance evaluation and its development history. Secondly, with regard to robustness issues, the book presents three categories, including environment-related issues, speaker-related issues and application-oriented issues. For each category, the book describes the current hot topics, existing technologies, and potential research focuses in the future. The book is a useful reference and self-learning guide for early-career researchers working in the field of robust speaker recognition.

  9. Reference-Frame-Independent and Measurement-Device-Independent Quantum Key Distribution Using One Single Source

    Science.gov (United States)

    Li, Qian; Zhu, Changhua; Ma, Shuquan; Wei, Kejin; Pei, Changxing

    2018-04-01

    Measurement-device-independent quantum key distribution (MDI-QKD) is immune to all detector side-channel attacks. However, practical implementations of MDI-QKD, which require two-photon interferences from separated independent single-photon sources and a nontrivial reference alignment procedure, are still challenging with current technologies. Here, we propose a scheme that significantly reduces the experimental complexity of two-photon interferences and eliminates reference frame alignment by the combination of plug-and-play and reference frame independent MDI-QKD. Simulation results show that the secure communication distance can be up to 219 km in the finite-data case and the scheme has good potential for practical MDI-QKD systems.

  10. Audio-Visual Speaker Diarization Based on Spatiotemporal Bayesian Fusion.

    Science.gov (United States)

    Gebru, Israel D; Ba, Sileye; Li, Xiaofei; Horaud, Radu

    2018-05-01

    Speaker diarization consists of assigning speech signals to people engaged in a dialogue. An audio-visual spatiotemporal diarization model is proposed. The model is well suited for challenging scenarios that consist of several participants engaged in multi-party interaction while they move around and turn their heads towards the other participants rather than facing the cameras and the microphones. Multiple-person visual tracking is combined with multiple speech-source localization in order to tackle the speech-to-person association problem. The latter is solved within a novel audio-visual fusion method on the following grounds: binaural spectral features are first extracted from a microphone pair, then a supervised audio-visual alignment technique maps these features onto an image, and finally a semi-supervised clustering method assigns binaural spectral features to visible persons. The main advantage of this method over previous work is that it processes in a principled way speech signals uttered simultaneously by multiple persons. The diarization itself is cast into a latent-variable temporal graphical model that infers speaker identities and speech turns, based on the output of an audio-visual association process, executed at each time slice, and on the dynamics of the diarization variable itself. The proposed formulation yields an efficient exact inference procedure. A novel dataset, that contains audio-visual training data as well as a number of scenarios involving several participants engaged in formal and informal dialogue, is introduced. The proposed method is thoroughly tested and benchmarked with respect to several state-of-the art diarization algorithms.

  11. Real Time Recognition Of Speakers From Internet Audio Stream

    Directory of Open Access Journals (Sweden)

    Weychan Radoslaw

    2015-09-01

    Full Text Available In this paper we present an automatic speaker recognition technique with the use of Internet radio lossy (encoded) speech signal streams. We show the influence of the audio encoder (e.g., bitrate) on the speaker model quality. The model of each speaker was calculated with the use of the Gaussian mixture model (GMM) approach. Both the speaker recognition and the further analysis were realized with the use of short utterances to facilitate real time processing. The neighborhoods of the speaker models were analyzed with the use of the ISOMAP algorithm. The experiments were based on four 1-hour public debates with 7–8 speakers (including the moderator), acquired from Polish radio Internet services. The presented software was developed in the MATLAB environment.
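
GMM-based recognition of short utterances, as used above, reduces to evaluating the average frame log-likelihood under each enrolled speaker's mixture and picking the best-scoring model. A minimal diagonal-covariance sketch with hand-set toy models rather than trained GMMs:

```python
import numpy as np

def gmm_loglik(X, weights, means, variances):
    """Average per-frame log-likelihood of feature frames X (N x d)
    under a diagonal-covariance Gaussian mixture; `means` and
    `variances` hold one row per mixture component."""
    log_probs = []
    for w, m, v in zip(weights, means, variances):
        lp = -0.5 * (np.log(2 * np.pi * v) + (X - m) ** 2 / v).sum(axis=1)
        log_probs.append(np.log(w) + lp)
    L = np.vstack(log_probs)
    mx = L.max(axis=0)                 # log-sum-exp over components,
    return float((mx + np.log(np.exp(L - mx).sum(axis=0))).mean())

# Two hypothetical single-component speaker models in 2-D feature space.
X = np.zeros((10, 2))  # test frames near the origin
model_a = (np.array([1.0]), np.array([[0.0, 0.0]]), np.array([[1.0, 1.0]]))
model_b = (np.array([1.0]), np.array([[5.0, 5.0]]), np.array([[1.0, 1.0]]))
score_a = gmm_loglik(X, *model_a)
score_b = gmm_loglik(X, *model_b)
```

Averaging per frame makes scores from utterances of different lengths comparable, which matters when only short segments are available for real-time decisions.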

  12. 'Compromise position' image alignment to accommodate independent motion of multiple clinical target volumes during radiotherapy: A high risk prostate cancer example

    International Nuclear Information System (INIS)

    Rosewall, Tara; Alasti, Hamideh; Bayley, Andrew; Yan, Jing

    2017-01-01

    Inclusion of multiple independently moving clinical target volumes (CTVs) in the irradiated volume causes an image guidance conundrum. The purpose of this research was to use high risk prostate cancer as a clinical example to evaluate a 'compromise' image alignment strategy. The daily pre-treatment orthogonal EPIs for 14 consecutive patients were included in this analysis. Image matching was performed by aligning to the prostate only, to the bony pelvis only and using the 'compromise' strategy. Residual CTV surrogate displacements were quantified for each of the alignment strategies. Analysis of the 388 daily fractions indicated surrogate displacements were well-correlated in all directions (r² = 0.95 (LR), 0.67 (AP) and 0.59 (SI)). Differences between the surrogate displacements (95% range) were −0.4 to 1.8 mm (LR), −1.2 to 5.2 mm (SI) and −1.2 to 5.2 mm (AP). The distribution of the residual displacements was significantly smaller using the 'compromise' strategy, compared to the other strategies (p < 0.005). The 'compromise' strategy ensured the CTV was encompassed by the PTV in all fractions, compared to 47 PTV violations when aligned to prostate only. This study demonstrated the feasibility of a compromise position image guidance strategy to accommodate simultaneous displacements of two independently moving CTVs. Application of this strategy was facilitated by correlation between the CTV displacements and resulted in no geometric excursions of the CTVs beyond standard sized PTVs. This simple image guidance strategy may also be applicable to other disease sites that concurrently irradiate multiple CTVs, such as head and neck, lung and cervix cancer.
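
The arithmetic behind a 'compromise' alignment can be illustrated numerically: if the couch shift splits the difference between the two surrogate displacements, each surrogate's residual is half their difference, so neither target drifts as far as it would when the other is used for alignment. The synthetic displacement data below are illustrative, not the study's measurements, and the midpoint rule is one simple choice of compromise:

```python
import numpy as np

def residuals(prostate, bone, strategy):
    """Residual displacement (mm) of each CTV surrogate along one axis
    after a couch correction. The shift equals the displacement of the
    aligned target, or the midpoint for the 'compromise' strategy."""
    if strategy == "prostate":
        shift = prostate
    elif strategy == "bone":
        shift = bone
    else:  # 'compromise': split the difference between the surrogates
        shift = (prostate + bone) / 2.0
    return prostate - shift, bone - shift

# Synthetic, correlated per-fraction displacements for 388 fractions.
rng = np.random.default_rng(0)
bone = rng.normal(0.0, 2.0, 388)
prostate = bone + rng.normal(1.0, 1.5, 388)
res_p, res_b = residuals(prostate, bone, "compromise")
worst = np.maximum(np.abs(res_p), np.abs(res_b))
rng95 = np.percentile(prostate - bone, [2.5, 97.5])  # 95% range of differences
```

Because the midpoint shift makes the two residuals equal and opposite, the worst-case residual of either surrogate is exactly half the surrogate difference, which is why correlated displacements keep both CTVs inside standard-sized PTVs.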

  13. Accent Attribution in Speakers with Foreign Accent Syndrome

    Science.gov (United States)

    Verhoeven, Jo; De Pauw, Guy; Pettinato, Michele; Hirson, Allen; Van Borsel, John; Marien, Peter

    2013-01-01

    Purpose: The main aim of this experiment was to investigate the perception of Foreign Accent Syndrome in comparison to speakers with an authentic foreign accent. Method: Three groups of listeners attributed accents to conversational speech samples of 5 FAS speakers which were embedded amongst those of 5 speakers with a real foreign accent and 5…

  14. Comparison of Diarization Tools for Building Speaker Database

    Directory of Open Access Journals (Sweden)

    Eva Kiktova

    2015-01-01

    Full Text Available This paper compares open source diarization toolkits (LIUM, DiarTK, ALIZE-Lia_Ral), which were designed for extraction of speaker identity from audio records without any prior information about the analysed data. The comparative study of the diarization tools was performed for different types of analysed data (broadcast news (BN) and TV shows). Corresponding values of the achieved DER measure are presented here. The automatic speaker diarization system developed by LIUM was able to identify speech segments belonging to speakers at a very good level. Its segmentation outputs can be used to build a speaker database.
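
The DER measure used for the comparison can be sketched as a frame-level error count. Real DER is time-weighted, applies an optimal mapping between hypothesis and reference speaker labels, and usually uses a forgiveness collar, so this is a simplified illustration with labels assumed already mapped:

```python
def diarization_error_rate(reference, hypothesis):
    """Simplified frame-level DER: missed speech, false alarms and
    speaker confusions, divided by the amount of reference speech.
    `None` marks non-speech frames; labels are assumed pre-mapped."""
    assert len(reference) == len(hypothesis)
    errors = 0
    scored = 0
    for ref, hyp in zip(reference, hypothesis):
        if ref is not None:
            scored += 1          # denominator: reference speech frames
        if ref != hyp:
            errors += 1          # confusion, miss or false alarm
    return errors / scored

# Toy 8-frame example: one speaker confusion and one false alarm.
ref = ["A", "A", "A", "B", "B", None, "B", "A"]
hyp = ["A", "A", "B", "B", "B", "B",  "B", "A"]
der = diarization_error_rate(ref, hyp)
```

Here 2 of 7 reference speech frames are in error, giving a DER of about 0.286; toolkit comparisons like the one above report the same quantity as a percentage over whole recordings.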

  15. Beyond the language given: the neural correlates of inferring speaker meaning.

    Science.gov (United States)

    Bašnáková, Jana; Weber, Kirsten; Petersson, Karl Magnus; van Berkum, Jos; Hagoort, Peter

    2014-10-01

    Even though language allows us to say exactly what we mean, we often use language to say things indirectly, in a way that depends on the specific communicative context. For example, we can use an apparently straightforward sentence like "It is hard to give a good presentation" to convey deeper meanings, like "Your talk was a mess!" One of the big puzzles in language science is how listeners work out what speakers really mean, which is a skill absolutely central to communication. However, most neuroimaging studies of language comprehension have focused on the arguably much simpler, context-independent process of understanding direct utterances. To examine the neural systems involved in getting at contextually constrained indirect meaning, we used functional magnetic resonance imaging as people listened to indirect replies in spoken dialog. Relative to direct control utterances, indirect replies engaged dorsomedial prefrontal cortex, right temporo-parietal junction and insula, as well as bilateral inferior frontal gyrus and right medial temporal gyrus. This suggests that listeners take the speaker's perspective on both cognitive (theory of mind) and affective (empathy-like) levels. In line with classic pragmatic theories, our results also indicate that currently popular "simulationist" accounts of language comprehension fail to explain how listeners understand the speaker's intended message. © The Author 2013. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.

  16. The relationship between maternal education and the neural substrates of phoneme perception in children: Interactions between socioeconomic status and proficiency level.

    Science.gov (United States)

    Conant, Lisa L; Liebenthal, Einat; Desai, Anjali; Binder, Jeffrey R

    2017-08-01

    Relationships between maternal education (ME) and both behavioral performances and brain activation during the discrimination of phonemic and nonphonemic sounds were examined using fMRI in children with different levels of phoneme categorization proficiency (CP). Significant relationships were found between ME and intellectual functioning and vocabulary, with a trend for phonological awareness. A significant interaction between CP and ME was seen for nonverbal reasoning abilities. In addition, fMRI analyses revealed a significant interaction between CP and ME for phonemic discrimination in left prefrontal cortex. Thus, ME was associated with differential patterns of both neuropsychological performance and brain activation contingent on the level of CP. These results highlight the importance of examining SES effects at different proficiency levels. The pattern of results may suggest the presence of neurobiological differences in the children with low CP that affect the nature of relationships with ME. Copyright © 2017 Elsevier Inc. All rights reserved.

  17. Computer-assisted training of phoneme-grapheme correspondence for children who are deaf and hard of hearing: effects on phonological processing skills.

    Science.gov (United States)

    Nakeva von Mentzer, Cecilia; Lyxell, Björn; Sahlén, Birgitta; Wass, Malin; Lindgren, Magnus; Ors, Marianne; Kallioinen, Petter; Uhlén, Inger

    2013-12-01

    Examine deaf and hard of hearing (DHH) children's phonological processing skills in relation to a reference group of children with normal hearing (NH) at two baselines pre intervention. Study the effects of computer-assisted phoneme-grapheme correspondence training in the children. Specifically analyze possible effects on DHH children's phonological processing skills. The study included 48 children who participated in a computer-assisted intervention study, which focuses on phoneme-grapheme correspondence. Children were 5, 6, and 7 years of age. There were 32 DHH children using cochlear implants (CI) or hearing aids (HA), or both in combination, and 16 children with NH. The study had a quasi-experimental design with three test occasions separated in time by four weeks; baseline 1 and 2 pre intervention, and 3 post intervention. Children performed tasks measuring lexical access, phonological processing, and letter knowledge. All children were asked to practice ten minutes per day at home supported by their parents. NH children outperformed DHH children on the majority of tasks. All children improved their accuracy in phoneme-grapheme correspondence and output phonology as a function of the computer-assisted intervention. For the whole group of children, and specifically for children with CI, a lower initial phonological composite score was associated with a larger phonological change between baseline 2 and post intervention. Finally, 18 DHH children, whereof 11 children with CI, showed specific intervention effects on their phonological processing skills, and strong effect sizes for their improved accuracy of phoneme-grapheme correspondence. For some DHH children phonological processing skills are boosted relatively more by phoneme-grapheme correspondence training. This reflects the reciprocal relationship between phonological change and exposure to and manipulations of letters. Copyright © 2013 Elsevier Ireland Ltd. All rights reserved.

  18. Rapid Naming and Phonemic Awareness in Children with or without Reading Disabilities and/or ADHD

    NARCIS (Netherlands)

    de Groot, Barry J.A.; van den Bos, Kees P.; van der Meulen, Bieuwe F.; Minnaert, Alexander E.M.G.

    2017-01-01

    Employing a large sample of children from Dutch regular elementary schools, this study assessed the contributing and discriminating values of reading disability (RD) and attention-deficit/hyperactivity disorder (ADHD) to two types of phonological processing skills, phonemic awareness (PA) and rapid

  19. Role of Speaker Cues in Attention Inference

    Directory of Open Access Journals (Sweden)

    Jin Joo Lee

    2017-10-01

    Full Text Available Current state-of-the-art approaches to emotion recognition primarily focus on modeling the nonverbal expressions of the sole individual without reference to contextual elements such as the co-presence of the partner. In this paper, we demonstrate that the accurate inference of listeners’ social-emotional state of attention depends on accounting for the nonverbal behaviors of their storytelling partner, namely their speaker cues. To gain a deeper understanding of the role of speaker cues in attention inference, we conduct investigations into real-world interactions of children (5–6 years old) storytelling with their peers. Through in-depth analysis of human–human interaction data, we first identify nonverbal speaker cues (i.e., backchannel-inviting cues) and listener responses (i.e., backchannel feedback). We then demonstrate how speaker cues can modify the interpretation of attention-related backchannels as well as serve as a means to regulate the responsiveness of listeners. We discuss the design implications of our findings toward our primary goal of developing attention recognition models for storytelling robots, and we argue that social robots can proactively use speaker cues to form more accurate inferences about the attentive state of their human partners.

  20. Error Variability and the Differentiation between Apraxia of Speech and Aphasia with Phonemic Paraphasia

    Science.gov (United States)

    Haley, Katarina L.; Jacks, Adam; Cunningham, Kevin T.

    2013-01-01

    Purpose: This study was conducted to evaluate the clinical utility of error variability for differentiating between apraxia of speech (AOS) and aphasia with phonemic paraphasia. Method: Participants were 32 individuals with aphasia after left cerebral injury. Diagnostic groups were formed on the basis of operationalized measures of recognized…

  1. Speaker and Observer Perceptions of Physical Tension during Stuttering.

    Science.gov (United States)

    Tichenor, Seth; Leslie, Paula; Shaiman, Susan; Yaruss, J Scott

    2017-01-01

    Speech-language pathologists routinely assess physical tension during evaluation of those who stutter. If speakers experience tension that is not visible to clinicians, then judgments of severity may be inaccurate. This study addressed this potential discrepancy by comparing judgments of tension by people who stutter and expert clinicians to determine if clinicians could accurately identify the speakers' experience of physical tension. Ten adults who stutter were audio-video recorded in two speaking samples. Two board-certified specialists in fluency evaluated the samples using the Stuttering Severity Instrument-4 and a checklist adapted for this study. Speakers rated their tension using the same forms, and then discussed their experiences in a qualitative interview so that themes related to physical tension could be identified. The degree of tension reported by speakers was higher than that observed by specialists. Tension in parts of the body that were less visible to the observer (chest, abdomen, throat) was reported more by speakers than by specialists. The thematic analysis revealed that speakers' experience of tension changes over time and that these changes may be related to speakers' acceptance of stuttering. The lack of agreement between speaker and specialist perceptions of tension suggests that using self-reports is a necessary component for supporting the accurate diagnosis of tension in stuttering. © 2018 S. Karger AG, Basel.

  2. Forensic speaker recognition

    NARCIS (Netherlands)

    Meuwly, Didier

    2013-01-01

    The aim of forensic speaker recognition is to establish links between individuals and criminal activities, through audio speech recordings. This field is multidisciplinary, combining predominantly phonetics, linguistics, speech signal processing, and forensic statistics. On these bases, expert-based

  3. Inferring speaker attributes in adductor spasmodic dysphonia: ratings from unfamiliar listeners.

    Science.gov (United States)

    Isetti, Derek; Xuereb, Linnea; Eadie, Tanya L

    2014-05-01

    To determine whether unfamiliar listeners' perceptions of speakers with adductor spasmodic dysphonia (ADSD) differ from control speakers on the parameters of relative age, confidence, tearfulness, and vocal effort and are related to speaker-rated vocal effort or voice-specific quality of life. Twenty speakers with ADSD (including 6 speakers with ADSD plus tremor) and 20 age- and sex-matched controls provided speech recordings, completed a voice-specific quality-of-life instrument (Voice Handicap Index; Jacobson et al., 1997), and rated their own vocal effort. Twenty listeners evaluated speech samples for relative age, confidence, tearfulness, and vocal effort using rating scales. Listeners judged speakers with ADSD as sounding significantly older, less confident, more tearful, and more effortful than control speakers (p < .01). Increased vocal effort was strongly associated with decreased speaker confidence (rs = .88-.89) and sounding more tearful (rs = .83-.85). Self-rated speaker effort was moderately related (rs = .45-.52) to listener impressions. Listeners' perceptions of confidence and tearfulness were also moderately associated with higher Voice Handicap Index scores (rs = .65-.70). Unfamiliar listeners judge speakers with ADSD more negatively than control speakers, with judgments extending beyond typical clinical measures. The results have implications for counseling and understanding the psychosocial effects of ADSD.

  4. Speaker-dependent Multipitch Tracking Using Deep Neural Networks

    Science.gov (United States)

    2015-01-01

    sentences spoken by each of 34 speakers (18 male, 16 female). Two male and two female speakers (No. 1, 2, 18, 20, same as [30]), denoted as MA1, MA2 ... Speaker pairs: MA1-MA2, MA1-FE1, MA1-FE2, MA2-FE1, MA2-FE2, FE1-FE2 [bar-chart axis residue omitted] ... Figure 6: Multipitch tracking results on a test mixture (pbbv6n and priv3n) for the MA1-MA2 speaker pair. (a) Groundtruth and estimated pitch tracks.

  5. Automatic Speaker Recognition for Mobile Forensic Applications

    Directory of Open Access Journals (Sweden)

    Mohammed Algabri

    2017-01-01

    Full Text Available Presently, lawyers, law enforcement agencies, and judges in courts use speech and other biometric features to recognize suspects. In general, speaker recognition is used for discriminating people based on their voices. The process of determining whether a suspected speaker is the source of a trace is called forensic speaker recognition. In such applications, the voice samples are most probably noisy, the recording sessions might mismatch each other, the sessions might not contain sufficient recording for recognition purposes, and the suspect voices are recorded through a mobile channel. The identification of a person through his voice within a forensic quality context is challenging. In this paper, we propose a method for forensic speaker recognition for the Arabic language; the King Saud University Arabic Speech Database is used for obtaining experimental results. The advantage of this database is that each speaker’s voice is recorded in both clean and noisy environments, through a microphone and a mobile channel. This diversity facilitates its usage in forensic experimentation. Mel-Frequency Cepstral Coefficients are used for feature extraction and the Gaussian mixture model-universal background model is used for speaker modeling. Our approach has shown low equal error rates (EER), within noisy environments and with very short test samples.
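    The GMM-UBM approach named above ultimately scores a test utterance by the average log-likelihood ratio between the claimed speaker's model and the universal background model. A minimal one-dimensional sketch of that scoring step, with invented toy parameters (real systems use multivariate MFCC feature vectors and many mixture components):

```python
# Illustrative GMM-UBM scoring with made-up 1-D toy models; not the paper's
# actual features or parameters.
import math

def gmm_loglik(x, weights, means, variances):
    """Log-likelihood of scalar x under a 1-D Gaussian mixture."""
    total = 0.0
    for w, m, v in zip(weights, means, variances):
        total += w * math.exp(-(x - m) ** 2 / (2 * v)) / math.sqrt(2 * math.pi * v)
    return math.log(total)

def llr_score(features, speaker_gmm, ubm_gmm):
    """Average log-likelihood ratio: positive favors the claimed speaker."""
    return sum(gmm_loglik(x, *speaker_gmm) - gmm_loglik(x, *ubm_gmm)
               for x in features) / len(features)

# Toy models: the 'speaker' is concentrated near 1.0; the UBM is a broad
# distribution around 0.0 standing in for the general population.
speaker = ([1.0], [1.0], [0.5])
ubm = ([1.0], [0.0], [4.0])
```

    Frames close to the speaker model's mean push the score positive; frames far from it push it negative, which is how a decision threshold (and hence the EER) is obtained.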

  6. Request Strategies in Everyday Interactions of Persian and English Speakers

    Directory of Open Access Journals (Sweden)

    Shiler Yazdanfar

    2016-12-01

    Full Text Available Cross-cultural studies of speech acts in different linguistic contexts might have interesting implications for language researchers and practitioners. Drawing on the Speech Act Theory, the present study aimed at conducting a comparative study of the request speech act in Persian and English. Specifically, the study endeavored to explore the request strategies used in daily interactions of Persian and English speakers based on directness level and supportive moves. To this end, English and Persian TV series were observed and requestive utterances were transcribed. The utterances were then categorized based on Blum-Kulka and Olshtain’s Cross-Cultural Study of Speech Act Realization Pattern (CCSARP) for directness level and internal and external mitigation devices. According to the results, although speakers of both languages opted for the direct level as their most frequently used strategy in their daily interactions, the English speakers used more conventionally indirect strategies than the Persian speakers did, and the Persian speakers used more non-conventionally indirect strategies than the English speakers did. Furthermore, the analyzed data revealed the fact that American English speakers use more mitigation devices in their daily interactions with friends and family members than Persian speakers.

  7. Development and Evaluation of a Speech Recognition Test for Persian Speaking Adults

    Directory of Open Access Journals (Sweden)

    Mohammad Mosleh

    2001-05-01

    Full Text Available Method and Materials: This research was carried out to develop and evaluate 25 phonemically balanced word lists for Persian-speaking adults in two separate stages: development and evaluation. In the first stage, in order to balance the lists phonemically, the frequency of occurrence of each of the 29 phonemes (6 vowels and 23 consonants) of the Persian language in adult speech was determined. This section showed some significant differences between some phonemes' frequencies. Then, all Persian monosyllabic words were extracted from the Mo'in Persian dictionary. The semantically difficult words were excluded and the appropriate words were chosen according to the judgment of 5 adult native speakers of Persian with a high school diploma. 12 open-set 25-word lists were prepared. The lists were recorded on magnetic tape in an audio studio by a professional speaker of IRIB. In the second stage, in order to evaluate the test's validity and reliability, 60 normal-hearing adults (30 male, 30 female) were randomly selected and evaluated in a test-retest design. Findings: 1- Normal-hearing adults obtained scores of 92-100 for each list at their MCL through test-retest. 2- No significant difference was observed a) in test-retest scores for each list (P>0.05), b) between the lists at test or retest (P>0.05), or c) between sexes (P>0.05). Conclusion: The test is reliable and valid; the lists are phonemically balanced, equal in difficulty, and valuable for evaluating Persian-speaking adults' speech recognition.

  8. Poor Phonemic Discrimination Does Not Underlie Poor Verbal Short-Term Memory in Down Syndrome

    Science.gov (United States)

    Purser, Harry R. M.; Jarrold, Christopher

    2013-01-01

    Individuals with Down syndrome tend to have a marked impairment of verbal short-term memory. The chief aim of this study was to investigate whether phonemic discrimination contributes to this deficit. The secondary aim was to investigate whether phonological representations are degraded in verbal short-term memory in people with Down syndrome…

  9. Learning of a Formation Principle for the Secondary Phonemic Function of a Syllabic Orthography

    Science.gov (United States)

    Fletcher-Flinn, Claire M.; Thompson, G. Brian; Yamada, Megumi; Meissel, Kane

    2014-01-01

    It has been observed in Japanese children learning to read that there is an early and rapid shift from exclusive reading of hiragana as syllabograms to the dual-use convention in which some hiragana also represent phonemic elements. Such rapid initial learning appears contrary to the standard theories of reading acquisition that require…

  10. Pronounceability: a measure of language samples based on children's mastery of the phonemes employed in them.

    Science.gov (United States)

    Whissell, Cynthia

    2003-06-01

    56 samples (n > half a million phonemes) of names (e.g., men's, women's, jets'), song lyrics (e.g., Paul Simon's, rap, Beatles'), poems (frequently anthologized English poems), and children's materials (books directed at children ages 3-10 years) were used to study a proposed new measure of English language samples--Pronounceability--based on children's mastery of some phonemes in advance of others. This measure was provisionally equated with greater "youthfulness" and "playfulness" in language samples and with less "maturity." Findings include the facts that women's names were less pronounceable than men's and that poetry was less pronounceable than song lyrics or children's materials. In a supplementary study, 13 university student volunteers' assessments of the youth of randomly constructed names were linearly related to how pronounceable each name was (eta = .8), providing construct validity for the interpretation of Pronounceability as a measure of Youthfulness.
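    The abstract describes Pronounceability only qualitatively (a score based on early-mastered phonemes). One plausible formalization, sketched here purely for illustration, is the proportion of a sample's phonemes drawn from an early-mastered set; the set below is invented, not taken from the article:

```python
# Hypothetical Pronounceability-style score. EARLY is an invented stand-in
# for phonemes children typically master first; the article's actual
# inventory and weighting are not given in the abstract.
EARLY = {"m", "b", "p", "d", "n", "w", "h"}

def pronounceability(phonemes):
    """Fraction of the sample's phonemes that are early-mastered."""
    return sum(1 for p in phonemes if p in EARLY) / len(phonemes)
```

    Under this reading, a name composed entirely of early-mastered phonemes scores 1.0 and would be judged more "youthful" than one full of late-acquired phonemes.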

  11. FENICIA: a generic plasma simulation code using a flux-independent field-aligned coordinate approach

    International Nuclear Information System (INIS)

    Hariri, Farah

    2013-01-01

    The primary thrust of this work is the development and implementation of a new approach to the problem of field-aligned coordinates in magnetized plasma turbulence simulations called the FCI approach (Flux-Coordinate Independent). The method exploits the elongated nature of micro-instability driven turbulence which typically has perpendicular scales on the order of a few ion gyro-radii, and parallel scales on the order of the machine size. Mathematically speaking, it relies on local transformations that align a suitable coordinate to the magnetic field to allow efficient computation of the parallel derivative. However, it does not rely on flux coordinates, which permits discretizing any given field on a regular grid in the natural coordinates such as (x, y, z) in the cylindrical limit. The new method has a number of advantages over methods constructed starting from flux coordinates, allowing for more flexible coding in a variety of situations including X-point configurations. In light of these findings, a plasma simulation code FENICIA has been developed based on the FCI approach with the ability to tackle a wide class of physical models. The code has been verified on several 3D test models. The accuracy of the approach is tested in particular with respect to the question of spurious radial transport. Tests on 3D models of the drift wave propagation and of the Ion Temperature Gradient (ITG) instability in cylindrical geometry in the linear regime demonstrate again the high quality of the numerical method. Finally, the FCI approach is shown to be able to deal with an X-point configuration such as one with a magnetic island with good convergence and conservation properties. (author)

  12. Speaker Reliability Guides Children's Inductive Inferences about Novel Properties

    Science.gov (United States)

    Kim, Sunae; Kalish, Charles W.; Harris, Paul L.

    2012-01-01

    Prior work shows that children can make inductive inferences about objects based on their labels rather than their appearance (Gelman, 2003). A separate line of research shows that children's trust in a speaker's label is selective. Children accept labels from a reliable speaker over an unreliable speaker (e.g., Koenig & Harris, 2005). In the…

  13. Guest Speakers in School-Based Sexuality Education

    Science.gov (United States)

    McRee, Annie-Laurie; Madsen, Nikki; Eisenberg, Marla E.

    2014-01-01

    This study, using data from a statewide survey (n = 332), examined teachers' practices regarding the inclusion of guest speakers to cover sexuality content. More than half of teachers (58%) included guest speakers. In multivariate analyses, teachers who taught high school, had professional preparation in health education, or who received…

  14. From Hearing Sounds to Recognizing Phonemes: Primary Auditory Cortex is A Truly Perceptual Language Area

    Directory of Open Access Journals (Sweden)

    Byron Bernal

    2016-11-01

    Full Text Available The aim of this article is to present a systematic review about the anatomy, function, connectivity, and functional activation of the primary auditory cortex (PAC; Brodmann areas 41/42) when involved in language paradigms. PAC activates with a plethora of diverse basic stimuli including but not limited to tones, chords, natural sounds, consonants, and speech. Nonetheless, the PAC shows specific sensitivity to speech. Damage in the PAC is associated with so-called “pure word-deafness” (“auditory verbal agnosia”). BA41, and to a lesser extent BA42, are involved in early stages of phonological processing (phoneme recognition). Phonological processing may take place in either the right or left side, but customarily the left exerts an inhibitory tone over the right, gaining dominance in function. BA41/42 are primary auditory cortices harboring complex phoneme perception functions with asymmetrical expression, making it possible to include them as core language processing areas (Wernicke’s area).

  15. The Predictive Power of Phonemic Awareness and Naming Speed for Early Dutch Word Recognition

    Science.gov (United States)

    Verhagen, Wim G. M.; Aarnoutse, Cor A. J.; van Leeuwe, Jan F. J.

    2009-01-01

    Effects of phonemic awareness and naming speed on the speed and accuracy of Dutch children's word recognition were investigated in a longitudinal study. Both the speed and accuracy of word recognition at the end of Grade 2 were predicted by naming speed from both kindergarten and Grade 1, after control for autoregressive relations, kindergarten…

  16. The relation between working memory and language comprehension in signers and speakers.

    Science.gov (United States)

    Emmorey, Karen; Giezen, Marcel R; Petrich, Jennifer A F; Spurgeon, Erin; O'Grady Farnady, Lucinda

    2017-06-01

    This study investigated the relation between linguistic and spatial working memory (WM) resources and language comprehension for signed compared to spoken language. Sign languages are both linguistic and visual-spatial, and therefore provide a unique window on modality-specific versus modality-independent contributions of WM resources to language processing. Deaf users of American Sign Language (ASL), hearing monolingual English speakers, and hearing ASL-English bilinguals completed several spatial and linguistic serial recall tasks. Additionally, their comprehension of spatial and non-spatial information in ASL and spoken English narratives was assessed. Results from the linguistic serial recall tasks revealed that the often reported advantage for speakers on linguistic short-term memory tasks does not extend to complex WM tasks with a serial recall component. For English, linguistic WM predicted retention of non-spatial information, and both linguistic and spatial WM predicted retention of spatial information. For ASL, spatial WM predicted retention of spatial (but not non-spatial) information, and linguistic WM did not predict retention of either spatial or non-spatial information. Overall, our findings argue against strong assumptions of independent domain-specific subsystems for the storage and processing of linguistic and spatial information and furthermore suggest a less important role for serial encoding in signed than spoken language comprehension. Copyright © 2017 Elsevier B.V. All rights reserved.

  17. The Communication of Public Speaking Anxiety: Perceptions of Asian and American Speakers.

    Science.gov (United States)

    Martini, Marianne; And Others

    1992-01-01

    Finds that U.S. audiences perceive Asian speakers to have more speech anxiety than U.S. speakers, even though Asian speakers do not self-report higher anxiety levels. Confirms that speech state anxiety is not communicated effectively between speakers and audiences for Asian or U.S. speakers. (SR)

  18. Pareto optimal pairwise sequence alignment.

    Science.gov (United States)

    DeRonne, Kevin W; Karypis, George

    2013-01-01

    Sequence alignment using evolutionary profiles is a commonly employed tool when investigating a protein. Many profile-profile scoring functions have been developed for use in such alignments, but there has not yet been a comprehensive study of Pareto optimal pairwise alignments for combining multiple such functions. We show that the problem of generating Pareto optimal pairwise alignments has an optimal substructure property, and develop an efficient algorithm for generating Pareto optimal frontiers of pairwise alignments. All possible sets of two, three, and four profile scoring functions are used from a pool of 11 functions and applied to 588 pairs of proteins in the ce_ref data set. The performance of the best objective combinations on ce_ref is also evaluated on an independent set of 913 protein pairs extracted from the BAliBASE RV11 data set. Our dynamic-programming-based heuristic approach produces approximated Pareto optimal frontiers of pairwise alignments that contain comparable alignments to those on the exact frontier, but on average in less than 1/58th the time in the case of four objectives. Our results show that the Pareto frontiers contain alignments whose quality is better than the alignments obtained by single objectives. However, the task of identifying a single high-quality alignment among those in the Pareto frontier remains challenging.
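    The Pareto frontier described above rests on a dominance test over multi-objective score vectors: an alignment belongs to the frontier if no other alignment is at least as good on every scoring function and strictly better on one. A minimal sketch of that filtering step (the score tuples are fabricated; all objectives are maximized):

```python
# Pareto-dominance filtering over candidate alignments, each represented
# here only by its tuple of objective scores (invented example values).

def dominates(a, b):
    """a dominates b if a >= b on every objective and > on at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_frontier(candidates):
    """Keep the candidates not dominated by any other candidate."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates if other != c)]

# Four alignments scored by two profile-profile objectives: (1, 1) is
# dominated by (2, 2) and drops out; the other three are incomparable.
scores = [(3, 1), (1, 3), (2, 2), (1, 1)]
```

    The paper's contribution is generating this frontier efficiently via an optimal-substructure dynamic program rather than by enumerating all alignments, but the dominance relation being optimized is the one shown here.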

  19. The role of beliefs in lexical alignment: evidence from dialogs with humans and computers.

    Science.gov (United States)

    Branigan, Holly P; Pickering, Martin J; Pearson, Jamie; McLean, Janet F; Brown, Ash

    2011-10-01

    Five experiments examined the extent to which speakers' alignment (i.e., convergence) on words in dialog is mediated by beliefs about their interlocutor. To do this, we told participants that they were interacting with another person or a computer in a task in which they alternated between selecting pictures that matched their 'partner's' descriptions and naming pictures themselves (though in reality all responses were scripted). In both text- and speech-based dialog, participants tended to repeat their partner's choice of referring expression. However, they showed a stronger tendency to align with 'computer' than with 'human' partners, and with computers that were presented as less capable than with computers that were presented as more capable. The tendency to align therefore appears to be mediated by beliefs, with the relevant beliefs relating to an interlocutor's perceived communicative capacity. Copyright © 2011 Elsevier B.V. All rights reserved.

  20. Exploring Phonetic Realization in Danish by Transformation-Based Learning

    DEFF Research Database (Denmark)

    Uneson, Marcus; Schachtenhaufen, Ruben

    2011-01-01

    We align phonemic and semi-narrow phonetic transcriptions in the DanPASS corpus and extend the phonemic description with sound classes and with traditional phonetic features. From this representation, we induce rules for phonetic realization by Transformation-Based Learning (TBL). The rules thus ...
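    Transformation-Based Learning, as used above to induce phonetic-realization rules, starts from a baseline labeling and greedily adds the single rewrite rule with the best net error reduction against the gold transcription. A toy version with context-free phoneme-to-phone rules (the data and rule template are invented, not from DanPASS):

```python
# Toy TBL: baseline = identity realization (each phoneme realized as itself);
# each round adds the phoneme -> phone rewrite whose gain (errors fixed)
# exceeds its loss (correct positions broken). Symbols are invented.
from collections import Counter

def learn_tbl_rules(pairs, max_rules=5):
    """pairs: list of (phoneme, gold_phone). Returns ordered rewrite rules."""
    current = [p for p, _ in pairs]   # baseline realization
    gold = [g for _, g in pairs]
    rules = []
    for _ in range(max_rules):
        # For each error, the candidate rule is "rewrite current -> gold".
        gains = Counter((c, g) for c, g in zip(current, gold) if c != g)
        if not gains:
            break
        (src, dst), gain = gains.most_common(1)[0]
        # Applying src -> dst also breaks positions where src was correct.
        loss = sum(1 for c, g in zip(current, gold) if c == src and c == g)
        if gain <= loss:
            break
        rules.append((src, dst))
        current = [dst if c == src else c for c in current]
    return rules

# Invented sample: /d/ is usually lenited to [ð], /t/ stays [t].
data = [("d", "ð"), ("d", "ð"), ("d", "d"), ("t", "t")]
```

    On this sample the learner adopts the rule d → ð (it fixes two errors and breaks one position) and then stops, since the only remaining candidate would break more than it fixes; real TBL systems use richer, context-sensitive rule templates.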

  1. Application of Native Speaker Models for Identifying Deviations in Rhetorical Moves in Non-Native Speaker Manuscripts

    Directory of Open Access Journals (Sweden)

    Assef Khalili

    2016-06-01

    Full Text Available Introduction: Explicit teaching of generic conventions of a text genre, usually extracted from native-speaker (NS) manuscripts, has long been emphasized in the teaching of Academic Writing in English for Specific Purposes (henceforth ESP) classes, both in theory and practice. While consciousness-raising about rhetorical structure can be instrumental to non-native speakers (NNS), it has to be admitted that most works done in the field of ESP have tended to focus almost exclusively on native-speaker (NS) productions, giving scant attention to non-native speaker (NNS) manuscripts. That is, having outlined established norms for good writing on the basis of NS productions, few have been inclined to provide a descriptive account of NNS attempts at trying to produce a research article (RA) in English. That is what we have tried to do in the present research. Methods: We randomly selected 20 RAs in dentistry and used two well-established models for results and discussion sections to try to describe the move structure of these articles and show the points of divergence from the established norms. Results: The results pointed to significant divergences that could seriously compromise the quality of an RA. Conclusion: It is believed that the insights gained on the deviations in NNS manuscripts could prove very useful in designing syllabi for ESP classes.

  2. Speaker Clustering for a Mixture of Singing and Reading (Preprint)

    Science.gov (United States)

    2012-03-01

    diarization [2, 3] which answers the question of "who spoke when?" is a combination of speaker segmentation and clustering. Although it is possible to...focuses on speaker clustering, the techniques developed here can be applied to speaker diarization. For the remainder of this paper, the term "speech...and retrieval," Proceedings of the IEEE, vol. 88, 2000. [2] S. Tranter and D. Reynolds, "An overview of automatic speaker diarization systems," IEEE

  3. Human and automatic speaker recognition over telecommunication channels

    CERN Document Server

    Fernández Gallardo, Laura

    2016-01-01

    This work addresses the evaluation of the human and the automatic speaker recognition performances under different channel distortions caused by bandwidth limitation, codecs, and electro-acoustic user interfaces, among other impairments. Its main contribution is the demonstration of the benefits of communication channels of extended bandwidth, together with an insight into how speaker-specific characteristics of speech are preserved through different transmissions. It provides sufficient motivation for considering speaker recognition as a criterion for the migration from narrowband to enhanced bandwidths, such as wideband and super-wideband.

  4. Electrophysiology of subject-verb agreement mediated by speakers' gender.

    Science.gov (United States)

    Hanulíková, Adriana; Carreiras, Manuel

    2015-01-01

    An important property of speech is that it explicitly conveys features of a speaker's identity such as age or gender. This event-related potential (ERP) study examined the effects of social information provided by a speaker's gender, i.e., the conceptual representation of gender, on subject-verb agreement. Despite numerous studies on agreement, little is known about syntactic computations generated by speaker characteristics extracted from the acoustic signal. Slovak is well suited to investigate this issue because it is a morphologically rich language in which agreement involves features for number, case, and gender. Grammaticality of a sentence can be evaluated by checking a speaker's gender as conveyed by his/her voice. We examined how conceptual information about speaker gender, which is not syntactic but rather social and pragmatic in nature, is interpreted for the computation of agreement patterns. ERP responses to verbs disagreeing with the speaker's gender (e.g., a sentence including a masculine verbal inflection spoken by a female person, 'the neighbors were upset because I *stole[MASC] plums') elicited a larger early posterior negativity compared to correct sentences. When the agreement was purely syntactic and did not depend on the speaker's gender, a disagreement between a formally marked subject and the verb inflection (e.g., the woman[FEM] *stole[MASC] plums) resulted in a larger P600 preceded by a larger anterior negativity compared to the control sentences. This result is in line with proposals according to which the recruitment of non-syntactic information such as the gender of the speaker results in N400-like effects, while formally marked syntactic features lead to structural integration as reflected in a LAN/P600 complex.

  5. Molecular-alignment dependence in the transfer excitation of H2

    International Nuclear Information System (INIS)

    Wang, Y.D.; McGuire, J.H.; Weaver, O.L.; Corchs, S.E.; Rivarola, R.D.

    1993-01-01

    Molecular-alignment effects in the transfer excitation of H2 by high-velocity heavy ions are studied using a two-step mechanism with amplitudes evaluated from first-order perturbation theory. Two-electron transfer excitation is treated as a result of two independent collision processes (excitation and electron transfer). Cross sections for each one-electron subprocess as well as the combined two-electron process are calculated as functions of the molecular-alignment angle. Within the independent-electron approximation, the dynamic roles of electron excitation and transfer in conjunction with molecular alignment are explored. While both excitation and transfer cross sections may strongly depend on molecular alignment, it is electron transfer that is largely responsible for the molecular-alignment dependence in the transfer excitation process. Interpretation of some experimental observations based on this model is also discussed.

  6. Speaker-sensitive emotion recognition via ranking: Studies on acted and spontaneous speech☆

    Science.gov (United States)

    Cao, Houwei; Verma, Ragini; Nenkova, Ani

    2015-01-01

    We introduce a ranking approach for emotion recognition which naturally incorporates information about the general expressivity of speakers. We demonstrate that our approach leads to substantial gains in accuracy compared to conventional approaches. We train ranking SVMs for individual emotions, treating the data from each speaker as a separate query, and combine the predictions from all rankers to perform multi-class prediction. The ranking method provides two natural benefits. It captures speaker-specific information even in speaker-independent training/testing conditions. It also incorporates the intuition that each utterance can express a mix of possible emotions and that considering the degree to which each emotion is expressed can be productively exploited to identify the dominant emotion. We compare the performance of the rankers and their combination to standard SVM classification approaches on two publicly available datasets of acted emotional speech, Berlin and LDC, as well as on spontaneous emotional data from the FAU Aibo dataset. On acted data, ranking approaches exhibit significantly better performance compared to SVM classification both in distinguishing a specific emotion from all others and in multi-class prediction. On the spontaneous data, which contains mostly neutral utterances with a relatively small portion of less intense emotional utterances, ranking-based classifiers again achieve much higher precision in identifying emotional utterances than conventional SVM classifiers. In addition, we discuss the complementarity of conventional SVM and ranking-based classifiers. On all three datasets we find dramatically higher accuracy for the test items on whose prediction the two methods agree compared to the accuracy of individual methods. Furthermore, on the spontaneous data the ranking and standard classification are complementary, and we obtain marked improvement when we combine the two classifiers by late-stage fusion.
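
    The query structure described above (one ranking query per speaker, one ranker per emotion, argmax over ranker scores for multi-class prediction) can be sketched with a minimal pairwise ranker. The paper trains ranking SVMs; the perceptron-style update, toy features, and emotion names below are illustrative assumptions, not the authors' implementation.

    ```python
    import numpy as np

    def train_ranker(X, relevant, queries, epochs=50, lr=0.1):
        """Pairwise perceptron ranker: learn w so that, within each query
        (= one speaker's utterances), items of the target emotion outscore
        the rest. A stand-in for the per-emotion ranking SVMs."""
        w = np.zeros(X.shape[1])
        for _ in range(epochs):
            for q in np.unique(queries):
                idx = np.flatnonzero(queries == q)
                for i in idx[relevant[idx] == 1]:
                    for j in idx[relevant[idx] == 0]:
                        if w @ (X[i] - X[j]) <= 0:  # ranking constraint violated
                            w += lr * (X[i] - X[j])
        return w

    # toy data: 4 speakers x 3 utterances; emotion k is carried by feature k
    emotions = ["angry", "happy", "sad"]
    X = np.tile(np.eye(3) * 2.0, (4, 1))   # 12 utterances, 3 features
    y = np.tile(np.arange(3), 4)           # emotion label per utterance
    speakers = np.repeat(np.arange(4), 3)  # query id = speaker id

    rankers = {name: train_ranker(X, (y == k).astype(int), speakers)
               for k, name in enumerate(emotions)}

    def predict(x):
        # combine the per-emotion rankers: the highest-scoring emotion wins
        return max(emotions, key=lambda name: rankers[name] @ x)
    ```

    The key point the sketch preserves is that pairwise constraints are formed only within a speaker's own utterances, which is how the method absorbs each speaker's general expressivity.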

  7. Speakers of different languages process the visual world differently.

    Science.gov (United States)

    Chabal, Sarah; Marian, Viorica

    2015-06-01

    Language and vision are highly interactive. Here we show that people activate language when they perceive the visual world, and that this language information impacts how speakers of different languages focus their attention. For example, when searching for an item (e.g., clock) in the same visual display, English and Spanish speakers look at different objects. Whereas English speakers searching for the clock also look at a cloud, Spanish speakers searching for the clock also look at a gift, because the Spanish names for gift (regalo) and clock (reloj) overlap phonologically. These different looking patterns emerge despite an absence of direct language input, showing that linguistic information is automatically activated by visual scene processing. We conclude that the varying linguistic information available to speakers of different languages affects visual perception, leading to differences in how the visual world is processed. (c) 2015 APA, all rights reserved.

  8. Multimodal Speaker Diarization

    NARCIS (Netherlands)

    Noulas, A.; Englebienne, G.; Kröse, B.J.A.

    2012-01-01

    We present a novel probabilistic framework that fuses information coming from the audio and video modality to perform speaker diarization. The proposed framework is a Dynamic Bayesian Network (DBN) that is an extension of a factorial Hidden Markov Model (fHMM) and models the people appearing in an

  9. Effects of single-channel phonemic compression schemes on the understanding of speech by hearing-impaired listeners

    NARCIS (Netherlands)

    Goedegebure, A.; Hulshof, M.; Maas, R. J.; Dreschler, W. A.; Verschuure, H.

    2001-01-01

    The effect of digital processing on speech intelligibility was studied in hearing-impaired listeners with moderate to severe high-frequency losses. The amount of smoothed phonemic compression in a high-frequency channel was varied using wide-band control. Two alternative systems were tested to

  10. Evaluation of phoneme compression schemes designed to compensate for temporal and spectral masking in background noise

    NARCIS (Netherlands)

    Goedegebure, A.; Goedegebure-Hulshof, M.; Dreschler, W. A.; Verschuure, J.

    2005-01-01

    The effect of phonemic compression has been studied on speech intelligibility in background noise in hearing-impaired listeners with moderate-to-severe high-frequency losses. One configuration, anti-upward-spread-of-masking (anti-USOM) focuses on a release from spectral masking of high-frequency

  11. A spatial mechanism for pilot laser alignment with four independently controlled degrees of freedom

    NARCIS (Netherlands)

    Kreutz, Ernst-Wolfgang; Meijer, J.; Quenzer, A.; Schuöcker, Dieter

    1987-01-01

    Alignment mechanisms for optical components, such as mirrors for manipulating laser beams, frequently require four degrees of freedom: two translations and two rotations, i.e. a four-axis system. When the adjustment of one axis influences the others, as will often be the case, alignment procedures

  12. Studies on inter-speaker variability in speech and its application in ...

    Indian Academy of Sciences (India)

    tic representation of vowel realizations by different speakers. ... in regional background, education level and gender of speaker. A more ...... formal maps such as bilinear transform and its generalizations for speaker normalization. Since.

  13. Content-specific coordination of listeners' to speakers' EEG during communication.

    Science.gov (United States)

    Kuhlen, Anna K; Allefeld, Carsten; Haynes, John-Dylan

    2012-01-01

    Cognitive neuroscience has recently begun to extend its focus from the isolated individual mind to two or more individuals coordinating with each other. In this study we uncover a coordination of neural activity between the ongoing electroencephalogram (EEG) of two people: a person speaking and a person listening. The EEG of one set of twelve participants ("speakers") was recorded while they were narrating short stories. The EEG of another set of twelve participants ("listeners") was recorded while watching audiovisual recordings of these stories. Specifically, listeners watched the superimposed videos of two speakers simultaneously and were instructed to attend either to one or the other speaker. This allowed us to isolate neural coordination due to processing the communicated content from the effects of sensory input. We find several neural signatures of communication: First, the EEG is more similar among listeners attending to the same speaker than among listeners attending to different speakers, indicating that listeners' EEG reflects content-specific information. Secondly, listeners' EEG activity correlates with the attended speakers' EEG, peaking at a time delay of about 12.5 s. This correlation takes place not only between homologous, but also between non-homologous brain areas in speakers and listeners. A semantic analysis of the stories suggests that listeners coordinate with speakers at the level of complex semantic representations, so-called "situation models". With this study we link a coordination of neural activity between individuals directly to verbally communicated information.
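
    The peak-at-a-delay analysis described above amounts to correlating the listener's signal against the speaker's signal over a range of positive lags and locating the maximum. A simplified single-channel sketch (the sampling rate, signal lengths, and noise level below are invented for illustration; only the ~12.5 s delay figure comes from the abstract):

    ```python
    import numpy as np

    def peak_lag_correlation(speaker, listener, max_lag, fs):
        """Correlate the listener signal against the speaker signal at each
        positive lag (listener trailing speaker); return the delay in seconds
        at which the correlation peaks, plus the full correlation curve."""
        corrs = []
        for lag in range(max_lag + 1):
            s = speaker[:len(speaker) - lag] if lag else speaker
            l = listener[lag:]
            corrs.append(np.corrcoef(s, l)[0, 1])
        best = int(np.argmax(corrs))
        return best / fs, np.array(corrs)

    # synthetic check: the "listener" follows the "speaker" by 12.5 s at fs = 10 Hz
    rng = np.random.default_rng(1)
    fs, delay_samples = 10, 125
    speaker_eeg = rng.normal(size=3000)
    listener_eeg = np.concatenate([np.zeros(delay_samples),
                                   speaker_eeg[:-delay_samples]])
    listener_eeg = listener_eeg + 0.2 * rng.normal(size=3000)

    delay, _ = peak_lag_correlation(speaker_eeg, listener_eeg, 200, fs)
    ```

    On this synthetic pair the correlation curve is flat except for a sharp peak at lag 125 samples, i.e. a 12.5 s delay.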

  14. Forensic Speaker Recognition Law Enforcement and Counter-Terrorism

    CERN Document Server

    Patil, Hemant

    2012-01-01

    Forensic Speaker Recognition: Law Enforcement and Counter-Terrorism is an anthology of the research findings of 35 speaker recognition experts from around the world. The volume provides a multidimensional view of the complex science involved in determining whether a suspect’s voice truly matches forensic speech samples, collected by law enforcement and counter-terrorism agencies, that are associated with the commission of a terrorist act or other crimes. While addressing such topics as the challenges of forensic case work, handling speech signal degradation, analyzing features of speaker recognition to optimize voice verification system performance, and designing voice applications that meet the practical needs of law enforcement and counter-terrorism agencies, this material all sounds a common theme: how the rigors of forensic utility are demanding new levels of excellence in all aspects of speaker recognition. The contributors are among the most eminent scientists in speech engineering and signal process...

  15. NIPS Workshop: Advances in Acoustic Models

    DEFF Research Database (Denmark)

    Feng, Ling; Hansen, Lars Kai

    We discuss the cognitive components of speech at different time scales. We investigate cognitive features of speech including phoneme, gender, height, speaker identity. Integration by feature stacking based on short time MFCCs. Our hypothesis is basically ecological: we assume that features...

  16. What's so funny about ha-ħa?

    DEFF Research Database (Denmark)

    Nielsen, Andreas Højlund; Spyrou, Loukianos; Sadakata, Makiko

    In this paper we present electrophysiological and behavioral results from a study on non-native speakers listening to, identifying and discriminating an Arabic phonemic contrast, [h] versus [ħ] (emphatic "h"). The non-native speakers were native Dutch speakers with no knowledge of Arabic. Arabic...... or decreasing glottal frication, unlike native Arabic speakers. And we further hypothesized that the native Dutch speakers would therefore show mismatch negativities (MMNs) significantly lower in amplitude to the Arabic than to a larger native Dutch contrast between [h] and [f], potentially reflecting...... a difference in categorical perception due to native language tuning. We did not find support for this hypothesis. Instead, we saw MMN responses in the native Dutch speakers that were stronger to the Arabic contrast ([h] vs. [ħ]) than to the native Dutch contrast ([h] vs. [f])....

  17. Gricean Semantics and Vague Speaker-Meaning

    OpenAIRE

    Schiffer, Stephen

    2017-01-01

    Presentations of Gricean semantics, including Stephen Neale's in "Silent Reference," totally ignore vagueness, even though virtually every utterance is vague. I ask how Gricean semantics might be adjusted to accommodate vague speaker-meaning. My answer is that it can't accommodate it: the Gricean program collapses in the face of vague speaker-meaning. The Gricean might, however, find some solace in knowing that every other extant meta-semantic and semantic program is in the same boat.

  18. Effect of lisping on audience evaluation of male speakers.

    Science.gov (United States)

    Mowrer, D E; Wahl, P; Doolan, S J

    1978-05-01

    The social consequences of adult listeners' first impression of lisping were evaluated in two studies. Five adult speakers were rated by adult listeners with regard to speaking ability, intelligence, education, masculinity, and friendship. Results from both studies indicate that listeners rate adult speakers who demonstrate frontal lisping lower than nonlispers in all five categories investigated. Efforts to correct frontal lisping are justifiable on the basis of the poor impression lisping speakers make on the listener.

  19. Effect of speech-intrinsic variations on human and automatic recognition of spoken phonemes.

    Science.gov (United States)

    Meyer, Bernd T; Brand, Thomas; Kollmeier, Birger

    2011-01-01

    The aim of this study is to quantify the gap between the recognition performance of human listeners and an automatic speech recognition (ASR) system with special focus on intrinsic variations of speech, such as speaking rate and effort, altered pitch, and the presence of dialect and accent. Second, it is investigated if the most common ASR features contain all information required to recognize speech in noisy environments by using resynthesized ASR features in listening experiments. For the phoneme recognition task, the ASR system achieved the human performance level only when the signal-to-noise ratio (SNR) was increased by 15 dB, which is an estimate for the human-machine gap in terms of the SNR. The major part of this gap is attributed to the feature extraction stage, since human listeners achieve comparable recognition scores when the SNR difference between unaltered and resynthesized utterances is 10 dB. Intrinsic variabilities result in strong increases of error rates, both in human speech recognition (HSR) and ASR (with a relative increase of up to 120%). An analysis of phoneme duration and recognition rates indicates that human listeners are better able to identify temporal cues than the machine at low SNRs, which suggests incorporating information about the temporal dynamics of speech into ASR systems.

  20. Consistency between verbal and non-verbal affective cues: a clue to speaker credibility.

    Science.gov (United States)

    Gillis, Randall L; Nilsen, Elizabeth S

    2017-06-01

    Listeners are exposed to inconsistencies in communication; for example, when speakers' words (i.e. verbal) are discrepant with their demonstrated emotions (i.e. non-verbal). Such inconsistencies introduce ambiguity, which may render a speaker a less credible source of information. Two experiments examined whether children make credibility discriminations based on the consistency of speakers' affect cues. In Experiment 1, school-age children (7- to 8-year-olds) preferred to solicit information from consistent speakers (e.g. those who provided a negative statement with negative affect), over novel speakers, to a greater extent than they preferred to solicit information from inconsistent speakers (e.g. those who provided a negative statement with positive affect) over novel speakers. Preschoolers (4- to 5-year-olds) did not demonstrate this preference. Experiment 2 showed that school-age children's ratings of speakers were influenced by speakers' affect consistency when the attribute being judged was related to information acquisition (speakers' believability, "weird" speech), but not general characteristics (speakers' friendliness, likeability). Together, findings suggest that school-age children are sensitive to, and use, the congruency of affect cues to determine whether individuals are credible sources of information.

  1. Cost-Sensitive Learning for Emotion Robust Speaker Recognition

    Directory of Open Access Journals (Sweden)

    Dongdong Li

    2014-01-01

    Full Text Available In the field of information security, voice is one of the most important parts of biometrics. Especially with the development of voice communication through the Internet or telephone system, huge voice data resources are accessed. In speaker recognition, a voiceprint can be applied as the unique password for the user to prove his/her identity. However, speech with various emotions can cause an unacceptably high error rate and aggravate the performance of a speaker recognition system. This paper deals with this problem by introducing a cost-sensitive learning technology to reweight the probability of test affective utterances at the pitch envelope level, which can enhance the robustness of emotion-dependent speaker recognition effectively. Based on that technology, a new architecture of the recognition system as well as its components is proposed in this paper. The experiment conducted on the Mandarin Affective Speech Corpus shows that an improvement of 8% in identification rate over traditional speaker recognition is achieved.
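
    The reweighting idea, i.e. biasing the speaker decision by the estimated emotional condition of the test utterance and by how costly errors are under each emotion, can be sketched as follows. The matrices, the posterior/cost combination rule, and all numbers are illustrative assumptions, not the paper's exact formulation.

    ```python
    import numpy as np

    def cost_sensitive_speaker(likelihoods, emotion_posterior, emotion_cost):
        """likelihoods: (n_speakers, n_emotions) array of p(x | speaker, emotion);
        emotion_posterior: p(emotion | x) for the test utterance;
        emotion_cost: per-emotion error cost. Emotions under which decisions
        are expensive/unreliable are down-weighted before scoring speakers."""
        weights = emotion_posterior / np.asarray(emotion_cost, dtype=float)
        weights /= weights.sum()
        scores = likelihoods @ weights
        return int(np.argmax(scores)), scores

    L = np.array([[0.9, 0.1],    # speaker 0: well modelled when neutral, not when angry
                  [0.2, 0.8]])   # speaker 1: the reverse
    post = np.array([0.3, 0.7])  # the test utterance sounds mostly "angry"

    who_flat, _ = cost_sensitive_speaker(L, post, [1.0, 1.0])   # no cost skew
    who_cost, _ = cost_sensitive_speaker(L, post, [1.0, 10.0])  # distrust "angry" scores
    ```

    With uniform costs the angry-speech model dominates and speaker 1 wins; penalizing decisions made under the emotional condition flips the identification to speaker 0, which is the kind of robustness gain the abstract reports.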

  2. Cost-sensitive learning for emotion robust speaker recognition.

    Science.gov (United States)

    Li, Dongdong; Yang, Yingchun; Dai, Weihui

    2014-01-01

    In the field of information security, voice is one of the most important parts of biometrics. Especially with the development of voice communication through the Internet or telephone system, huge voice data resources are accessed. In speaker recognition, a voiceprint can be applied as the unique password for the user to prove his/her identity. However, speech with various emotions can cause an unacceptably high error rate and aggravate the performance of a speaker recognition system. This paper deals with this problem by introducing a cost-sensitive learning technology to reweight the probability of test affective utterances at the pitch envelope level, which can enhance the robustness of emotion-dependent speaker recognition effectively. Based on that technology, a new architecture of the recognition system as well as its components is proposed in this paper. The experiment conducted on the Mandarin Affective Speech Corpus shows that an improvement of 8% in identification rate over traditional speaker recognition is achieved.

  3. Young Children's Sensitivity to Speaker Gender When Learning from Others

    Science.gov (United States)

    Ma, Lili; Woolley, Jacqueline D.

    2013-01-01

    This research explores whether young children are sensitive to speaker gender when learning novel information from others. Four- and 6-year-olds ("N" = 144) chose between conflicting statements from a male versus a female speaker (Studies 1 and 3) or decided which speaker (male or female) they would ask (Study 2) when learning about the functions…

  4. Fluency profile: comparison between Brazilian and European Portuguese speakers.

    Science.gov (United States)

    Castro, Blenda Stephanie Alves e; Martins-Reis, Vanessa de Oliveira; Baptista, Ana Catarina; Celeste, Letícia Correa

    2014-01-01

    The purpose of the study was to compare the speech fluency of Brazilian Portuguese speakers with that of European Portuguese speakers. The study participants were 76 individuals of any ethnicity or skin color aged 18-29 years. Of the participants, 38 lived in Brazil and 38 in Portugal. Speech samples from all participants were obtained and analyzed according to the variables of typology and frequency of speech disruptions and speech rate. Descriptive and inferential statistical analyses were performed to assess the association between the fluency profile and linguistic variant variables. We found that the speech rate of European Portuguese speakers was higher than the speech rate of Brazilian Portuguese speakers in words per minute (p=0.004). The qualitative distribution of the typology of common dysfluencies also differed between the two groups. Because a normative speech fluency assessment for European Portuguese speakers is not available, speech therapists in Portugal can use the same speech fluency assessment as has been used in Brazil to establish a diagnosis of stuttering, especially in regard to typical and stuttering dysfluencies, with care taken when evaluating the speech rate.

  5. Direct Speaker Gaze Promotes Trust in Truth-Ambiguous Statements.

    Science.gov (United States)

    Kreysa, Helene; Kessler, Luise; Schweinberger, Stefan R

    2016-01-01

    A speaker's gaze behaviour can provide perceivers with a multitude of cues which are relevant for communication, thus constituting an important non-verbal interaction channel. The present study investigated whether direct eye gaze of a speaker affects the likelihood of listeners believing truth-ambiguous statements. Participants were presented with videos in which a speaker produced such statements with either direct or averted gaze. The statements were selected through a rating study to ensure that participants were unlikely to know a priori whether they were true or not (e.g., "sniffer dogs cannot smell the difference between identical twins"). Participants indicated in a forced-choice task whether or not they believed each statement. We found that participants were more likely to believe statements by a speaker looking at them directly, compared to a speaker with averted gaze. Moreover, when participants disagreed with a statement, they were slower to do so when the statement was uttered with direct (compared to averted) gaze, suggesting that the process of rejecting a statement as untrue may be inhibited when that statement is accompanied by direct gaze.

  6. Phonemic verbal fluency task in adults with high-level literacy.

    Science.gov (United States)

    Opasso, Patrícia Romano; Barreto, Simone Dos Santos; Ortiz, Karin Zazo

    2016-01-01

    To establish normative parameters for the F-A-S form of the phonemic verbal fluency test, in a population of Brazilian Portuguese-speaking adults with high-level literacy. The sample comprised 40 male and female volunteers aged 19 to 59 years with at least 8 years of formal education. Volunteers were first submitted to the Mini-Mental State Examination and the Clock Drawing cognitive screening tests, then to the F-A-S Verbal Phonemic Fluency Test; in this test, examinees were given 60 seconds to generate as many words as possible beginning with each of the three test letters. The means for the number of words beginning with the letters F, A, and S, and for the total number of words generated per minute, were 15.3, 14.4, 13.9, and 43.5, respectively. Reference values obtained from young adults with high levels of literacy submitted to the F-A-S Verbal Phonemic Fluency Test in this study were similar to those reported in the international literature. These reference values can be used for the clinical assessment of language disorders and for neuropsychological evaluation.
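
    Scoring the F-A-S test is essentially counting admissible words per 60-second trial. A minimal scorer is sketched below; the de-duplication and letter-match rules shown are simplifying assumptions (clinical scoring also excludes proper nouns and morphological variants of already-produced words).

    ```python
    def fas_score(responses):
        """responses: dict mapping each test letter ('F', 'A', 'S') to the list
        of words the examinee produced in that letter's 60-second trial.
        Returns per-letter counts of distinct words that actually start with
        the target letter, plus the total across letters."""
        counts = {}
        for letter, words in responses.items():
            valid = {w.strip().lower() for w in words
                     if w.strip().lower().startswith(letter.lower())}
            counts[letter] = len(valid)
        counts["total"] = sum(counts[letter] for letter in responses)
        return counts

    # "fog" repeated and "moon" (wrong letter) are not counted
    score = fas_score({"F": ["fish", "fog", "fog", "fan"],
                       "A": ["ant", "apple"],
                       "S": ["sun", "sand", "moon"]})
    ```

    The per-letter counts map directly onto the normative means reported above (15.3, 14.4, and 13.9 words per minute for F, A, and S).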

  7. A hybrid generative-discriminative approach to speaker diarization

    NARCIS (Netherlands)

    Noulas, A.K.; van Kasteren, T.; Kröse, B.J.A.

    2008-01-01

    In this paper we present a sound probabilistic approach to speaker diarization. We use a hybrid framework where a distribution over the number of speakers at each point of a multimodal stream is estimated with a discriminative model. The output of this process is used as input in a generative model

  8. Noise Reduction with Microphone Arrays for Speaker Identification

    Energy Technology Data Exchange (ETDEWEB)

    Cohen, Z

    2011-12-22

    Reducing acoustic noise in audio recordings is an ongoing problem that plagues many applications. This noise is hard to reduce because of interfering sources and the non-stationary behavior of the overall background noise. Many single-channel noise reduction algorithms exist but are limited in that the more the noise is reduced, the more the signal of interest is distorted, because the signal and noise overlap in frequency. Specifically, acoustic background noise causes problems in the area of speaker identification. Recording a speaker in the presence of acoustic noise ultimately limits the performance and confidence of speaker identification algorithms. In situations where it is impossible to control the environment where the speech sample is taken, noise reduction filtering algorithms need to be developed to clean the recorded speech of background noise. Because single-channel noise reduction algorithms would distort the speech signal, the overall challenge of this project was to see if spatial information provided by microphone arrays could be exploited to aid in speaker identification. The goals are: (1) Test the feasibility of using microphone arrays to reduce background noise in speech recordings; (2) Characterize and compare different multichannel noise reduction algorithms; (3) Provide recommendations for using these multichannel algorithms; and (4) Ultimately answer the question: can the use of microphone arrays aid in speaker identification?
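
    A baseline for the kind of multichannel algorithm compared in such a project is delay-and-sum beamforming: align each microphone channel to the source and average, so the coherent speech adds up while uncorrelated noise averages down. A minimal sketch with known integer sample delays (estimating the delays, e.g. by cross-correlation, is the harder part and is assumed away here; the signals and noise level are synthetic):

    ```python
    import numpy as np

    def delay_and_sum(channels, delays):
        """Shift each channel back by its known delay (in samples) and average."""
        n = min(len(c) - d for c, d in zip(channels, delays))
        aligned = np.stack([c[d:d + n] for c, d in zip(channels, delays)])
        return aligned.mean(axis=0)

    rng = np.random.default_rng(0)
    clean = np.sin(2 * np.pi * 440 * np.arange(8000) / 16000)  # 440 Hz stand-in for speech
    delays = [0, 3, 5, 7]                                      # arrival delay per microphone
    channels = [np.concatenate([np.zeros(d), clean])
                + 0.5 * rng.normal(size=len(clean) + d)
                for d in delays]

    out = delay_and_sum(channels, delays)

    def residual_power(x):
        """Mean squared error against the clean reference."""
        return float(np.mean((x - clean[:len(x)]) ** 2))
    ```

    With four microphones carrying independent noise, averaging the aligned channels cuts the residual noise power by roughly a factor of the channel count, which is the mechanism by which an array can help a downstream speaker identification system.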

  9. Understanding speaker attitudes from prosody by adults with Parkinson's disease.

    Science.gov (United States)

    Monetta, Laura; Cheang, Henry S; Pell, Marc D

    2008-09-01

    The ability to interpret vocal (prosodic) cues during social interactions can be disrupted by Parkinson's disease, with notable effects on how emotions are understood from speech. This study investigated whether PD patients who have emotional prosody deficits exhibit further difficulties decoding the attitude of a speaker from prosody. Vocally inflected but semantically nonsensical 'pseudo-utterances' were presented to listener groups with and without PD in two separate rating tasks. Task 1 required participants to rate how confident a speaker sounded from their voice, and Task 2 required listeners to rate how polite the speaker sounded for a comparable set of pseudo-utterances. The results showed that PD patients were significantly less able than HC participants to use prosodic cues to differentiate intended levels of speaker confidence in speech, although the patients could accurately detect the polite/impolite attitude of the speaker from prosody in most cases. Our data suggest that many PD patients fail to use vocal cues to effectively infer a speaker's emotions as well as certain attitudes in speech such as confidence, consistent with the idea that the basal ganglia play a role in the meaningful processing of prosodic sequences in spoken language (Pell & Leonard, 2003).

  10. Benefits of phoneme discrimination training in a randomized controlled trial of 50- to 74-year-olds with mild hearing loss.

    Science.gov (United States)

    Ferguson, Melanie A; Henshaw, Helen; Clark, Daniel P A; Moore, David R

    2014-01-01

    The aims of this study were to (i) evaluate the efficacy of phoneme discrimination training for hearing and cognitive abilities of adults aged 50 to 74 years with mild sensorineural hearing loss who were not users of hearing aids, and to (ii) determine participant compliance with a self-administered, computer-delivered, home- and game-based auditory training program. This study was a randomized controlled trial with repeated measures and crossover design. Participants were trained and tested over an 8- to 12-week period. One group (Immediate Training) trained during weeks 1 and 4. A second waitlist group (Delayed Training) did no training during weeks 1 and 4, but then trained during weeks 5 and 8. On-task (phoneme discrimination) and transferable outcome measures (speech perception, cognition, self-report of hearing disability) for both groups were obtained during weeks 0, 4, and 8, and for the Delayed Training group only at week 12. Robust phoneme discrimination learning was found for both groups, with the largest improvements in threshold shown for those with the poorest initial thresholds. Between weeks 1 and 4, the Immediate Training group showed moderate, significant improvements on self-report of hearing disability, divided attention, and working memory, specifically for conditions or situations that were more complex and therefore more challenging. Training did not result in consistent improvements in speech perception in noise. There was no evidence of any test-retest effects between weeks 1 and 4 for the Delayed Training group. Retention of benefit at 4 weeks post-training was shown for phoneme discrimination, divided attention, working memory, and self-report of hearing disability. Improved divided attention and reduced self-reported hearing difficulties were highly correlated. It was observed that phoneme discrimination training benefits some but not all people with mild hearing loss. Evidence presented here, together with that of other studies that

  11. A Method to Integrate GMM, SVM and DTW for Speaker Recognition

    Directory of Open Access Journals (Sweden)

    Ing-Jr Ding

    2014-01-01

    Full Text Available This paper develops an effective and efficient scheme to integrate the Gaussian mixture model (GMM), support vector machine (SVM), and dynamic time warping (DTW) for automatic speaker recognition. GMM and SVM are two popular classifiers for speaker recognition applications. DTW is a fast and simple template-matching method frequently seen in applications of speech recognition. In this work, DTW does not perform speech recognition; it is employed as a verifier for the verification of valid speakers. The proposed combination scheme of GMM, SVM, and DTW, called SVMGMM-DTW, is a two-phase verification process: GMM-SVM verification in the first phase and DTW verification in the second phase. By providing a double check to verify the identity of a speaker, it becomes difficult for impostors to pass the security protection; therefore, the safety of speaker recognition systems is largely increased. A series of experiments designed on door access control applications demonstrated the superiority of the developed SVMGMM-DTW in speaker recognition accuracy.
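
    The second-phase DTW verifier can be sketched in a few lines: warp the test utterance onto the claimed speaker's stored template and accept only if the warped distance is below a threshold. The classic DTW recurrence below is standard; the feature representation (e.g., MFCC frames), the template, and the threshold are placeholders, not the paper's configuration.

    ```python
    import numpy as np

    def dtw_distance(a, b):
        """Classic dynamic time warping between two feature sequences
        (frames x dims), with Euclidean local cost and unit step weights."""
        n, m = len(a), len(b)
        D = np.full((n + 1, m + 1), np.inf)
        D[0, 0] = 0.0
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                cost = np.linalg.norm(a[i - 1] - b[j - 1])
                D[i, j] = cost + min(D[i - 1, j],      # insertion
                                     D[i, j - 1],      # deletion
                                     D[i - 1, j - 1])  # match
        return float(D[n, m])

    def dtw_verify(template, utterance, threshold):
        """Second-phase check: accept the claimed identity only if the test
        utterance warps onto that speaker's template closely enough."""
        return dtw_distance(template, utterance) <= threshold

    ref = np.sin(np.linspace(0, 3, 20))[:, None]  # stored template (toy features)
    same = np.repeat(ref, 2, axis=0)              # same content, spoken twice as slowly
    ```

    Because DTW is invariant to uniform time-stretching, the slowed-down repetition of the template still verifies, while a shifted impostor sequence does not; this is what makes it a cheap double check after the GMM-SVM phase.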

  12. Dynamic-range reduction by peak clipping or compression and its effects on phoneme perception in hearing-impaired listeners

    NARCIS (Netherlands)

    Dreschler, W. A.

    1988-01-01

    In this study, differences between dynamic-range reduction by peak clipping and single-channel compression for phoneme perception through conventional hearing aids have been investigated. The results from 16 hearing-impaired listeners show that compression limiting yields significantly better

  13. Direct Speaker Gaze Promotes Trust in Truth-Ambiguous Statements.

    Directory of Open Access Journals (Sweden)

    Helene Kreysa

    Full Text Available A speaker's gaze behaviour can provide perceivers with a multitude of cues which are relevant for communication, thus constituting an important non-verbal interaction channel. The present study investigated whether direct eye gaze of a speaker affects the likelihood of listeners believing truth-ambiguous statements. Participants were presented with videos in which a speaker produced such statements with either direct or averted gaze. The statements were selected through a rating study to ensure that participants were unlikely to know a priori whether they were true or not (e.g., "sniffer dogs cannot smell the difference between identical twins"). Participants indicated in a forced-choice task whether or not they believed each statement. We found that participants were more likely to believe statements by a speaker looking at them directly, compared to a speaker with averted gaze. Moreover, when participants disagreed with a statement, they were slower to do so when the statement was uttered with direct (compared to averted) gaze, suggesting that the process of rejecting a statement as untrue may be inhibited when that statement is accompanied by direct gaze.

  14. Visual speaker gender affects vowel identification in Danish

    DEFF Research Database (Denmark)

    Larsen, Charlotte; Tøndering, John

    2013-01-01

    The experiment examined the effect of visual speaker gender on the vowel perception of 20 native Danish-speaking subjects. Auditory stimuli consisting of a continuum between /muːlə/ ‘muzzle’ and /moːlə/ ‘pier’ generated using TANDEM-STRAIGHT matched with video clips of a female and a male speaker...

  15. Bilingual and Monolingual Children Prefer Native-Accented Speakers

    Directory of Open Access Journals (Sweden)

    André L. Souza

    2013-12-01

    Full Text Available Adults and young children prefer to affiliate with some individuals rather than others. Studies have shown that monolingual children show in-group biases for individuals who speak their native language without a foreign accent (Kinzler, Dupoux, & Spelke, 2007). Some studies have suggested that bilingual children are less influenced than monolinguals by language variety when attributing personality traits to different speakers (Anisfeld & Lambert, 1964), which could indicate that bilinguals have fewer in-group biases and perhaps greater social flexibility. However, no previous studies have compared monolingual and bilingual children’s reactions to speakers with unfamiliar foreign accents. In the present study, we investigated the social preferences of 5-year-old English and French monolinguals and English-French bilinguals. Contrary to our predictions, both monolingual and bilingual preschoolers preferred to be friends with native-accented speakers over speakers who spoke their dominant language with an unfamiliar foreign accent. This result suggests that both monolingual and bilingual children have strong preferences for in-group members who use a familiar language variety, and that bilingualism does not lead to generalized social flexibility.

  16. Bilingual and monolingual children prefer native-accented speakers.

    Science.gov (United States)

    Souza, André L; Byers-Heinlein, Krista; Poulin-Dubois, Diane

    2013-01-01

    Adults and young children prefer to affiliate with some individuals rather than others. Studies have shown that monolingual children show in-group biases for individuals who speak their native language without a foreign accent (Kinzler et al., 2007). Some studies have suggested that bilingual children are less influenced than monolinguals by language variety when attributing personality traits to different speakers (Anisfeld and Lambert, 1964), which could indicate that bilinguals have fewer in-group biases and perhaps greater social flexibility. However, no previous studies have compared monolingual and bilingual children's reactions to speakers with unfamiliar foreign accents. In the present study, we investigated the social preferences of 5-year-old English and French monolinguals and English-French bilinguals. Contrary to our predictions, both monolingual and bilingual preschoolers preferred to be friends with native-accented speakers over speakers who spoke their dominant language with an unfamiliar foreign accent. This result suggests that both monolingual and bilingual children have strong preferences for in-group members who use a familiar language variety, and that bilingualism does not lead to generalized social flexibility.

  17. Differences in Sickness Allowance Receipt between Swedish Speakers and Finnish Speakers in Finland

    Directory of Open Access Journals (Sweden)

    Kaarina S. Reini

    2017-12-01

    Full Text Available Previous research has documented lower disability retirement and mortality rates for Swedish speakers as compared with Finnish speakers in Finland. This paper is the first to compare the two language groups with regard to the receipt of sickness allowance, an objective health measure that captures less severe ill health. Register-based data covering the years 1988-2011 are used. We estimate logistic regression models with generalized estimating equations to account for repeated observations at the individual level. We find that Swedish-speaking men have approximately 30 percent lower odds of receiving sickness allowance than Finnish-speaking men, whereas the difference in women is about 15 percent. In correspondence with previous research on all-cause mortality at working ages, we find no language-group difference in sickness allowance receipt in the socially most successful subgroup of the population.

  18. On the improvement of speaker diarization by detecting overlapped speech

    OpenAIRE

    Hernando Pericás, Francisco Javier

    2010-01-01

    Simultaneous speech in meeting environments is responsible for a certain amount of the errors made by standard speaker diarization systems. We present an overlap detection system for far-field data based on spectral and spatial features, where the spatial features obtained on different microphone pairs are fused by means of principal component analysis. Detected overlap segments are applied for speaker diarization in order to increase the purity of speaker clusters an...

  19. Comprehending non-native speakers: theory and evidence for adjustment in manner of processing.

    Science.gov (United States)

    Lev-Ari, Shiri

    2014-01-01

    Non-native speakers have lower linguistic competence than native speakers, which renders their language less reliable in conveying their intentions. We suggest that expectations of lower competence lead listeners to adapt their manner of processing when they listen to non-native speakers. We propose that listeners use cognitive resources to adjust by increasing their reliance on top-down processes and extracting less information from the language of the non-native speaker. An eye-tracking study supports our proposal by showing that when following instructions by a non-native speaker, listeners make more contextually-induced interpretations. Those with relatively high working memory also increase their reliance on context to anticipate the speaker's upcoming reference, and are less likely to notice lexical errors in the non-native speech, indicating that they take less information from the speaker's language. These results contribute to our understanding of the flexibility in language processing and have implications for interactions between native and non-native speakers.

  20. Role of Speaker Cues in Attention Inference

    OpenAIRE

    Jin Joo Lee; Cynthia Breazeal; David DeSteno

    2017-01-01

    Current state-of-the-art approaches to emotion recognition primarily focus on modeling the nonverbal expressions of the sole individual without reference to contextual elements such as the co-presence of the partner. In this paper, we demonstrate that the accurate inference of listeners’ social-emotional state of attention depends on accounting for the nonverbal behaviors of their storytelling partner, namely their speaker cues. To gain a deeper understanding of the role of speaker cues in at...

  1. From phonemes to images : levels of representation in a recurrent neural model of visually-grounded language learning

    NARCIS (Netherlands)

    Gelderloos, L.J.; Chrupala, Grzegorz

    2016-01-01

    We present a model of visually-grounded language learning based on stacked gated recurrent neural networks which learns to predict visual features given an image description in the form of a sequence of phonemes. The learning task resembles that faced by human language learners who need to discover

  2. Race in Conflict with Heritage: "Black" Heritage Language Speaker of Japanese

    Science.gov (United States)

    Doerr, Neriko Musha; Kumagai, Yuri

    2014-01-01

    "Heritage language speaker" is a relatively new term to denote minority language speakers who grew up in a household where the language was used or those who have a family, ancestral, or racial connection to the minority language. In research on heritage language speakers, overlap between these 2 definitions is often assumed--that is,…

  3. Are Cantonese-speakers really descriptivists? Revisiting cross-cultural semantics.

    Science.gov (United States)

    Lam, Barry

    2010-05-01

    In an article in Cognition [Machery, E., Mallon, R., Nichols, S., & Stich, S. (2004). Semantics cross-cultural style. Cognition, 92, B1-B12], the authors present data which purport to show that East Asian Cantonese-speakers tend to have descriptivist intuitions about the referents of proper names, while Western English-speakers tend to have causal-historical intuitions about proper names. Machery et al. take this finding to support the view that some intuitions, whose universality they claim is central to philosophical theories, vary according to cultural background. Machery et al. conclude from their findings that the philosophical methodology of consulting intuitions about hypothetical cases is flawed vis-à-vis the goal of determining truths about some philosophical domains like philosophical semantics. In the following study, three new vignettes in English were given to Western native English-speakers, and Cantonese translations were given to native Cantonese-speaking immigrants from a Cantonese community in Southern California. For all three vignettes, questions were given to elicit intuitions about the referent of a proper name and the truth-value of an uttered sentence containing a proper name. The results from this study reveal that East Asian Cantonese-speakers do not differ from Western English-speakers in ways that support Machery et al.'s conclusions. This new data concerning the intuitions of Cantonese-speakers raises questions about whether cross-cultural variation in answers to questions on certain vignettes reveals genuine differences in intuitions, or whether such differences stem from non-intuitional differences, such as differences in linguistic competence. Copyright 2009 Elsevier B.V. All rights reserved.

  4. Robust Digital Speech Watermarking For Online Speaker Recognition

    Directory of Open Access Journals (Sweden)

    Mohammad Ali Nematollahi

    2015-01-01

    Full Text Available A robust and blind digital speech watermarking technique is proposed for online speaker recognition systems, based on the Discrete Wavelet Packet Transform (DWPT) and multiplication to embed the watermark in the amplitudes of the wavelet subbands. In order to minimize the degradation effect of the watermark, subbands are selected where less speaker-specific information is available (500 Hz–3500 Hz and 6000 Hz–7000 Hz). Experimental results on the Texas Instruments Massachusetts Institute of Technology (TIMIT), Massachusetts Institute of Technology (MIT), and Mobile Biometry (MOBIO) corpora show that the degradation for speaker verification and identification is 1.16% and 2.52%, respectively. Furthermore, the proposed watermark technique provides sufficient robustness against different signal processing attacks.
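    The embedding idea, multiplying wavelet-subband amplitudes by a factor close to 1, can be illustrated with a single-level Haar transform. This is a deliberately simplified sketch: the paper uses a multi-level DWPT and selects specific frequency bands, and `alpha` here is a made-up embedding strength.

    ```python
    def haar_step(signal):
        """One analysis level of a Haar wavelet transform: (approximation, detail)."""
        approx = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
        detail = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
        return approx, detail

    def inverse_haar(approx, detail):
        """Perfectly reconstruct the signal from one Haar level."""
        out = []
        for a, d in zip(approx, detail):
            out.extend([a + d, a - d])
        return out

    def embed_bit(signal, bit, alpha=0.01):
        """Encode one watermark bit by scaling the detail-subband amplitudes.

        Illustrative only: the paper embeds into selected DWPT subbands
        (500-3500 Hz and 6000-7000 Hz); alpha is a hypothetical strength.
        """
        approx, detail = haar_step(signal)
        factor = 1 + alpha if bit else 1 - alpha
        return inverse_haar(approx, [d * factor for d in detail])
    ```

    A detector would compare subband energy against a reference to recover the bit; the paper's scheme is blind, which this sketch does not attempt to reproduce.
    
    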

  5. The Impact of Teaching Phonemic Awareness by Means of Direct Instruction on Reading Achievement of Students with Reading Disorder

    Directory of Open Access Journals (Sweden)

    Ahmad Sharifi

    2012-03-01

    Full Text Available Background: Phonemic awareness is one of the most important predictors of reading skill and has been taught by different procedures, one of which is direct instruction. The current study is among the few in Iran to investigate the impact of direct instruction in phonemic awareness on the reading achievement of students with reading disorder. Case: Three male second-grade elementary students with reading disorder in a regular school in district six of the office of education in Tehran were selected. A multiple-baseline-across-subjects research design was used. The following tests served as diagnostic criteria: a reading and dyslexia test and the Wechsler Intelligence Scale for Children-Revised. Moreover, a reading inventory consisting of 100 words was developed by the researchers to assess the reading ability of the subjects. Data were collected in three phases: baseline, intervention, and follow-up. The intervention strategies were applied only during the intervention phase; during baseline and follow-up, data were collected without any intervention. Comparing the three phases of the study, we may conclude that the intervention package consisting of direct instruction of phonological awareness was an effective strategy for the reading achievement of all three students. In addition, follow-up data indicated that the effects of the intervention were stable across time. Conclusion: Direct instruction of phonological awareness was effective in improving the reading achievement of elementary students with reading disorder and in increasing their reading abilities.

  6. Framewise phoneme classification with bidirectional LSTM and other neural network architectures.

    Science.gov (United States)

    Graves, Alex; Schmidhuber, Jürgen

    2005-01-01

    In this paper, we present bidirectional Long Short Term Memory (LSTM) networks, and a modified, full gradient version of the LSTM learning algorithm. We evaluate Bidirectional LSTM (BLSTM) and several other network architectures on the benchmark task of framewise phoneme classification, using the TIMIT database. Our main findings are that bidirectional networks outperform unidirectional ones, and Long Short Term Memory (LSTM) is much faster and also more accurate than both standard Recurrent Neural Nets (RNNs) and time-windowed Multilayer Perceptrons (MLPs). Our results support the view that contextual information is crucial to speech processing, and suggest that BLSTM is an effective architecture with which to exploit it.

  7. A Joint Approach for Single-Channel Speaker Identification and Speech Separation

    DEFF Research Database (Denmark)

    Mowlaee, Pejman; Saeidi, Rahim; Christensen, Mads Græsbøll

    2012-01-01

    In this paper, we present a novel system for joint speaker identification and speech separation. For speaker identification, a single-channel speaker identification algorithm is proposed which provides an estimate of signal-to-signal ratio (SSR) as a by-product. For speech separation, we propose a sinusoidal model-based algorithm. The speech separation algorithm consists of a double-talk/single-talk detector followed by a minimum mean square error estimator of sinusoidal parameters for finding optimal codevectors from pre-trained speaker codebooks. In evaluating the proposed system, we start from... accuracy; here, we report the objective and subjective results as well. The results show that the proposed system performs as well as the best of the state-of-the-art in terms of perceived quality while its performance in terms of speaker identification and automatic speech recognition results...

  8. A Study on Metadiscoursive Interaction in the MA Theses of the Native Speakers of English and the Turkish Speakers of English

    Science.gov (United States)

    Köroglu, Zehra; Tüm, Gülden

    2017-01-01

    This study has been conducted to evaluate the TM usage in the MA theses written by the native speakers (NSs) of English and the Turkish speakers (TSs) of English. The purpose is to compare the TM usage in the introduction, results and discussion, and conclusion sections by both groups' randomly selected MA theses in the field of ELT between the…

  9. Native listeners

    NARCIS (Netherlands)

    Cutler, A.

    2002-01-01

    Becoming a native listener is the necessary precursor to becoming a native speaker. Babies in the first year of life undertake a remarkable amount of work; by the time they begin to speak, they have perceptually mastered the phonological repertoire and phoneme co-occurrence probabilities of the

  10. The predictability of name pronunciation errors in four South African languages

    CSIR Research Space (South Africa)

    Kgampe, M

    2011-11-01

    Full Text Available of the typical errors made by speakers from four South African languages (Setswana, English, isiZulu) when producing names from the same four languages. We compare these results with the pronunciations generated by four language-specific grapheme-to-phoneme (G2P...

  11. Exploring Phonetic and Phonological Variation: RP and the Nigerian ...

    African Journals Online (AJOL)

    ... Nigerian university, a hierarchy of the intelligibility of RP vowel phonemes is established. This not only provides evidence that intelligibility is a phenomenon which may be examined from a non-native speaker perspective, it also identifies specific features of RP segmental phonology which presents problems to Nigerians.

  12. Speaker Recognition from Emotional Speech Using I-vector Approach

    Directory of Open Access Journals (Sweden)

    MACKOVÁ Lenka

    2014-05-01

    Full Text Available In recent years the concept of i-vectors has become very popular and successful in the field of speaker verification. The basic principle of i-vectors is that each utterance is represented by a fixed-length feature vector of low dimension. In the literature, various recordings obtained from telephones or microphones have been used for the purpose of speaker verification. The aim of this experiment was to perform speaker verification using a speaker model trained with emotional recordings on an i-vector basis. Mel Frequency Cepstral Coefficients (MFCC), log energy, and their delta and acceleration coefficients were used in the feature extraction process. As the classification method of the verification system, the Mahalanobis distance metric in combination with Eigen Factor Radial normalization was used; in a second approach, the Cosine Distance Scoring (CSS) metric with Within-class Covariance Normalization as channel compensation was employed. The verification system used emotional recordings of male subjects from the freely available German emotional database (Emo-DB).
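    Cosine distance scoring compares two i-vectors by the angle between them. A minimal sketch of the plain score, without the Within-class Covariance Normalization or Eigen Factor Radial normalization the paper applies; the vectors and threshold below are hypothetical:

    ```python
    import math

    def cosine_score(w_target, w_test):
        """Plain cosine similarity between two i-vectors (fixed-length
        utterance representations); 1.0 means identical direction."""
        dot = sum(a * b for a, b in zip(w_target, w_test))
        norm = math.sqrt(sum(a * a for a in w_target)) * math.sqrt(sum(b * b for b in w_test))
        return dot / norm

    # Verification: accept if the score exceeds a tuned threshold.
    same = cosine_score([0.2, 0.5, 0.1], [0.21, 0.48, 0.12])   # same speaker
    diff = cosine_score([0.2, 0.5, 0.1], [-0.4, 0.1, 0.9])     # different speaker
    assert same > diff
    ```

    In a full system the i-vectors would first be length-normalized and channel-compensated, which is exactly what the normalizations named in the abstract provide.
    
    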

  13. Segmentation of the Speaker's Face Region with Audiovisual Correlation

    Science.gov (United States)

    Liu, Yuyu; Sato, Yoichi

    The ability to find the speaker's face region in a video is useful for various applications. In this work, we develop a novel technique to find this region within different time windows, which is robust against changes of view, scale, and background. The main thrust of our technique is to integrate audiovisual correlation analysis into a video segmentation framework. We analyze the audiovisual correlation locally by computing quadratic mutual information between our audiovisual features. The computation of quadratic mutual information is based on probability density functions estimated by kernel density estimation with adaptive kernel bandwidth. The results of this audiovisual correlation analysis are incorporated into graph cut-based video segmentation to obtain a globally optimal extraction of the speaker's face region. The setting of any heuristic threshold in this segmentation is avoided by learning the correlation distributions of speaker and background by expectation maximization. Experimental results demonstrate that our method can detect the speaker's face region accurately and robustly for different views, scales, and backgrounds.

  14. The TNO speaker diarization system for NIST RT05s meeting data

    NARCIS (Netherlands)

    Leeuwen, D.A. van

    2006-01-01

    The TNO speaker diarization system is based on a standard BIC segmentation and clustering algorithm. Since correct speech detection appears to be essential for the NIST Rich Transcription speaker diarization evaluation measure, we have developed a speech activity detector (SAD) as well.
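    BIC segmentation decides whether a candidate boundary separates two speakers by comparing a one-Gaussian model of the whole window against two Gaussians for the halves. A one-dimensional sketch of the ΔBIC criterion (real systems use multivariate cepstral features and a tuned penalty weight λ; the segments below are toy data):

    ```python
    import math

    def delta_bic(x, y, lam=1.0):
        """ΔBIC speaker-change criterion for two 1-D feature segments,
        modelling each segment (and their union) as a single Gaussian.
        A positive value favours placing a speaker change at the boundary."""
        def var(seg):
            m = sum(seg) / len(seg)
            return sum((v - m) ** 2 for v in seg) / len(seg)
        n1, n2 = len(x), len(y)
        n = n1 + n2
        v_all, v1, v2 = var(x + y), var(x), var(y)
        # model-complexity penalty for d = 1 (one extra mean and variance)
        penalty = lam * math.log(n)
        return 0.5 * (n * math.log(v_all)
                      - n1 * math.log(v1)
                      - n2 * math.log(v2)) - penalty

    # Clearly different segments should yield a positive ΔBIC (change detected):
    a = [0.0, 0.1, -0.1, 0.05, -0.05] * 4
    b = [5.0, 5.1, 4.9, 5.05, 4.95] * 4
    assert delta_bic(a, b) > 0
    ```

    Clustering then runs the same criterion in reverse, merging the pair of clusters with the most negative ΔBIC until none remains.
    
    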

  15. A fundamental residue pitch perception bias for tone language speakers

    Science.gov (United States)

    Petitti, Elizabeth

    A complex tone composed of only higher-order harmonics typically elicits a pitch percept equivalent to the tone's missing fundamental frequency (f0). When judging the direction of residue pitch change between two such tones, however, listeners may have completely opposite perceptual experiences depending on whether they are biased to perceive changes based on the overall spectrum or the missing f0 (harmonic spacing). Individual differences in residue pitch change judgments are reliable and have been associated with musical experience and functional neuroanatomy. Tone languages put greater pitch processing demands on their speakers than non-tone languages, and we investigated whether these lifelong differences in linguistic pitch processing affect listeners' bias for residue pitch. We asked native tone language speakers and native English speakers to perform a pitch judgment task for two tones with missing fundamental frequencies. Given tone pairs with ambiguous pitch changes, listeners were asked to judge the direction of pitch change, where the direction of their response indicated whether they attended to the overall spectrum (exhibiting a spectral bias) or the missing f0 (exhibiting a fundamental bias). We found that tone language speakers are significantly more likely to perceive pitch changes based on the missing f0 than English speakers. These results suggest that tone-language speakers' privileged experience with linguistic pitch fundamentally tunes their basic auditory processing.
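    In the idealised integer case, the missing fundamental of a harmonic complex is the greatest common divisor of its component frequencies, which also shows why spectrally shifted tones dissociate the "spectral" and "fundamental" percepts (frequencies below are illustrative):

    ```python
    from functools import reduce
    from math import gcd

    def missing_f0(harmonics_hz):
        """Idealised residue pitch of a complex tone containing only higher
        harmonics: the greatest common divisor of the component frequencies."""
        return reduce(gcd, harmonics_hz)

    # Harmonics 6-8 of a 200 Hz fundamental are heard at the missing f0:
    assert missing_f0([1200, 1400, 1600]) == 200
    # Shifting every component up by 50 Hz keeps the 200 Hz spacing but moves
    # the implied f0, so spectrally biased and f0-biased listeners can report
    # opposite directions of pitch change:
    assert missing_f0([1250, 1450, 1650]) == 50
    ```

    Human residue pitch is of course not literal GCD computation, but the arithmetic captures why the stimuli in such tasks are ambiguous by design.
    
    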

  16. Automated Intelligibility Assessment of Pathological Speech Using Phonological Features

    Directory of Open Access Journals (Sweden)

    Catherine Middag

    2009-01-01

    Full Text Available It is commonly acknowledged that word or phoneme intelligibility is an important criterion in the assessment of the communication efficiency of a pathological speaker. People have therefore put a lot of effort in the design of perceptual intelligibility rating tests. These tests usually have the drawback that they employ unnatural speech material (e.g., nonsense words and that they cannot fully exclude errors due to listener bias. Therefore, there is a growing interest in the application of objective automatic speech recognition technology to automate the intelligibility assessment. Current research is headed towards the design of automated methods which can be shown to produce ratings that correspond well with those emerging from a well-designed and well-performed perceptual test. In this paper, a novel methodology that is built on previous work (Middag et al., 2008 is presented. It utilizes phonological features, automatic speech alignment based on acoustic models that were trained on normal speech, context-dependent speaker feature extraction, and intelligibility prediction based on a small model that can be trained on pathological speech samples. The experimental evaluation of the new system reveals that the root mean squared error of the discrepancies between perceived and computed intelligibilities can be as low as 8 on a scale of 0 to 100.

  17. Effects of metric hierarchy and rhyme predictability on word duration in The Cat in the Hat.

    Science.gov (United States)

    Breen, Mara

    2018-05-01

    Word durations convey many types of linguistic information, including intrinsic lexical features like length and frequency and contextual features like syntactic and semantic structure. The current study was designed to investigate whether hierarchical metric structure and rhyme predictability account for durational variation over and above other features in productions of a rhyming, metrically-regular children's book: The Cat in the Hat (Dr. Seuss, 1957). One-syllable word durations and inter-onset intervals were modeled as functions of segment number, lexical frequency, word class, syntactic structure, repetition, and font emphasis. Consistent with prior work, factors predicting longer word durations and inter-onset intervals included more phonemes, lower frequency, first mention, alignment with a syntactic boundary, and capitalization. A model parameter corresponding to metric grid height improved model fit of word durations and inter-onset intervals. Specifically, speakers realized five levels of metric hierarchy with inter-onset intervals such that interval duration increased linearly with increased height in the metric hierarchy. Conversely, speakers realized only three levels of metric hierarchy with word duration, demonstrating that they shortened the highly predictable rhyme resolutions. These results further understanding of the factors that affect spoken word duration, and demonstrate the myriad cues that children receive about linguistic structure from nursery rhymes. Copyright © 2018 Elsevier B.V. All rights reserved.
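    The reported linear increase of inter-onset interval with metric grid height can be illustrated with a plain least-squares fit. The numbers below are hypothetical, not the study's data:

    ```python
    def fit_line(xs, ys):
        """Ordinary least squares for y = slope * x + intercept."""
        n = len(xs)
        mx, my = sum(xs) / n, sum(ys) / n
        slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
                 / sum((x - mx) ** 2 for x in xs))
        return slope, my - slope * mx

    # Hypothetical mean inter-onset intervals (ms) at metric grid heights 1-5;
    # the study's finding is a positive, roughly linear slope across heights:
    heights = [1, 2, 3, 4, 5]
    iois = [180.0, 210.0, 245.0, 270.0, 305.0]
    slope, intercept = fit_line(heights, iois)
    assert slope > 0  # duration grows with height in the metric hierarchy
    ```

    The study itself uses regression models with many covariates (segment count, frequency, word class, syntax); this sketch isolates only the metric-height term.
    
    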

  18. High-precision correlative fluorescence and electron cryo microscopy using two independent alignment markers

    International Nuclear Information System (INIS)

    Schellenberger, Pascale; Kaufmann, Rainer; Siebert, C. Alistair; Hagen, Christoph; Wodrich, Harald; Grünewald, Kay

    2014-01-01

    Correlative light and electron microscopy (CLEM) is an emerging technique which combines functional information provided by fluorescence microscopy (FM) with the high-resolution structural information of electron microscopy (EM). So far, correlative cryo microscopy of frozen-hydrated samples has not reached better than micrometre range accuracy. Here, a method is presented that enables the correlation between fluorescently tagged proteins and electron cryo tomography (cryoET) data with nanometre range precision. Specifically, thin areas of vitrified whole cells are examined by correlative fluorescence cryo microscopy (cryoFM) and cryoET. Novel aspects of the presented cryoCLEM workflow include not only the implementation of two independent electron-dense fluorescent markers to improve the precision of the alignment, but also the ability to obtain an estimate of the correlation accuracy for each individual object of interest. The correlative workflow from plunge-freezing to cryoET is detailed step-by-step for the example of locating fluorescence-labelled adenovirus particles trafficking inside a cell. - Highlights: • Vitrified mammalian cells were imaged by fluorescence and electron cryo microscopy. • TetraSpeck fluorescence markers were added to correct shifts between cryo fluorescence channels. • FluoSpheres fiducials were used as reference points to assign new coordinates to cryoEM images. • Adenovirus particles were localised with an average correlation precision of 63 nm.
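    Registering the FM and EM coordinate frames from fiducial markers amounts to fitting a similarity transform. A minimal 2-D sketch using just two marker positions, which determine the transform exactly (the actual workflow uses more markers, a least-squares fit, and per-object error estimates; all coordinates below are toy numbers):

    ```python
    def similarity_from_two_markers(src, dst):
        """Fit the 2-D similarity transform (scale, rotation, translation)
        mapping two marker positions in one image onto their positions in
        the other. Complex arithmetic keeps the algebra short; the two
        source markers must be distinct."""
        s0, s1 = complex(*src[0]), complex(*src[1])
        d0, d1 = complex(*dst[0]), complex(*dst[1])
        a = (d1 - d0) / (s1 - s0)   # combined scale and rotation
        b = d0 - a * s0             # translation
        def transform(p):
            q = a * complex(*p) + b
            return (q.real, q.imag)
        return transform

    # Map a fluorescence-channel coordinate into the cryoEM frame:
    t = similarity_from_two_markers([(0, 0), (1, 0)], [(10, 10), (10, 12)])
    assert t((0.5, 0.0)) == (10.0, 11.0)  # midpoint of the markers maps to midpoint
    ```

    With more than two markers one would solve the same model in a least-squares sense, and the residuals at the markers give exactly the kind of per-object accuracy estimate the paper highlights.
    
    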

  19. High-precision correlative fluorescence and electron cryo microscopy using two independent alignment markers

    Energy Technology Data Exchange (ETDEWEB)

    Schellenberger, Pascale [Oxford Particle Imaging Centre, Division of Structural Biology, Wellcome Trust Centre for Human Genetics, University of Oxford, Roosevelt Drive, Oxford OX3 7BN (United Kingdom); Kaufmann, Rainer [Oxford Particle Imaging Centre, Division of Structural Biology, Wellcome Trust Centre for Human Genetics, University of Oxford, Roosevelt Drive, Oxford OX3 7BN (United Kingdom); Department of Biochemistry, University of Oxford, South Parks Road, Oxford OX1 3QU (United Kingdom); Siebert, C. Alistair; Hagen, Christoph [Oxford Particle Imaging Centre, Division of Structural Biology, Wellcome Trust Centre for Human Genetics, University of Oxford, Roosevelt Drive, Oxford OX3 7BN (United Kingdom); Wodrich, Harald [Microbiologie Fondamentale et Pathogénicité, MFP CNRS UMR 5234, University of Bordeaux SEGALEN, 146 rue Leo Seignat, 33076 Bordeaux (France); Grünewald, Kay, E-mail: kay@strubi.ox.ac.uk [Oxford Particle Imaging Centre, Division of Structural Biology, Wellcome Trust Centre for Human Genetics, University of Oxford, Roosevelt Drive, Oxford OX3 7BN (United Kingdom)

    2014-08-01

    Correlative light and electron microscopy (CLEM) is an emerging technique which combines functional information provided by fluorescence microscopy (FM) with the high-resolution structural information of electron microscopy (EM). So far, correlative cryo microscopy of frozen-hydrated samples has not reached better than micrometre range accuracy. Here, a method is presented that enables the correlation between fluorescently tagged proteins and electron cryo tomography (cryoET) data with nanometre range precision. Specifically, thin areas of vitrified whole cells are examined by correlative fluorescence cryo microscopy (cryoFM) and cryoET. Novel aspects of the presented cryoCLEM workflow include not only the implementation of two independent electron-dense fluorescent markers to improve the precision of the alignment, but also the ability to obtain an estimate of the correlation accuracy for each individual object of interest. The correlative workflow from plunge-freezing to cryoET is detailed step-by-step for the example of locating fluorescence-labelled adenovirus particles trafficking inside a cell. - Highlights: • Vitrified mammalian cells were imaged by fluorescence and electron cryo microscopy. • TetraSpeck fluorescence markers were added to correct shifts between cryo fluorescence channels. • FluoSpheres fiducials were used as reference points to assign new coordinates to cryoEM images. • Adenovirus particles were localised with an average correlation precision of 63 nm.

  20. Internal request modification by first and second language speakers ...

    African Journals Online (AJOL)

    This study focuses on the question of whether Luganda English speakers would negatively transfer into their English speech the use of syntactic and lexical down graders resulting in pragmatic failure. Data were collected from Luganda and Luganda English speakers by means of a Discourse Completion Test (DCT) ...

  1. Nigeria's Policy of Non-Alignment and Voting in the United Nations ...

    African Journals Online (AJOL)

    alignment in Nigeria's foreign policy between 1960 and 1965. The tradition that dates from the early 1960s concludes that Nigeria's foreign policy towards the Cold War was independent and non-aligned, and the post-war tradition is that Nigeria ...

  2. Does verbatim sentence recall underestimate the language competence of near-native speakers?

    Directory of Open Access Journals (Sweden)

    Judith eSchweppe

    2015-02-01

    Full Text Available Verbatim sentence recall is widely used to test the language competence of native and non-native speakers, since it involves comprehension and production of connected speech. However, we assume that, to maintain surface information, sentence recall relies particularly on attentional resources, which differentially affects native and non-native speakers. Since language processing is less automatized even in near-native speakers than in natives, processing a sentence in a foreign language while also retaining its surface form may result in cognitive overload. We contrasted the sentence recall performance of German native speakers with that of highly proficient non-natives. Non-natives recalled the sentences significantly more poorly than the natives, but performed equally well on a cloze test. This implies that sentence recall underestimates the language competence of good non-native speakers in mixed groups with native speakers. The findings also suggest that theories of sentence recall need to consider both its linguistic and its attentional aspects.

  3. DIALIGN P: Fast pair-wise and multiple sequence alignment using parallel processors

    Directory of Open Access Journals (Sweden)

    Kaufmann Michael

    2004-09-01

    Full Text Available Abstract Background Parallel computing is frequently used to speed up computationally expensive tasks in Bioinformatics. Results Herein, a parallel version of the multi-alignment program DIALIGN is introduced. We propose two ways of dividing the program into independent sub-routines that can be run on different processors: (a) pair-wise sequence alignments that are used as a first step to multiple alignment account for most of the CPU time in DIALIGN. Since alignments of different sequence pairs are completely independent of each other, they can be distributed to multiple processors without any effect on the resulting output alignments. (b) For alignments of large genomic sequences, we use a heuristic by splitting up sequences into sub-sequences based on a previously introduced anchored alignment procedure. For our test sequences, this combined approach reduces the program running time of DIALIGN by up to 97%. Conclusions By distributing sub-routines to multiple processors, the running time of DIALIGN can be crucially improved. With these improvements, it is possible to apply the program in large-scale genomics and proteomics projects that were previously beyond its scope.
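
    The independence property exploited in step (a), that alignments of different sequence pairs do not affect one another, can be sketched as follows. This is a toy illustration only, using a plain Needleman-Wunsch score and a Python worker pool rather than DIALIGN's segment-based alignment on multiple processors; the function names here are our own.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations

def nw_score(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch global alignment score, linear-space DP."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i, ca in enumerate(a, 1):
        cur = [i * gap]
        for j, cb in enumerate(b, 1):
            cur.append(max(prev[j - 1] + (match if ca == cb else mismatch),
                           prev[j] + gap,      # ca aligned to a gap in b
                           cur[-1] + gap))     # cb aligned to a gap in a
        prev = cur
    return prev[-1]

def all_pairs_parallel(seqs, workers=4):
    """Score every sequence pair; the pairs are mutually independent,
    so they can be dispatched to a worker pool in any order."""
    pairs = list(combinations(range(len(seqs)), 2))
    with ThreadPoolExecutor(max_workers=workers) as ex:
        scores = ex.map(lambda p: nw_score(seqs[p[0]], seqs[p[1]]), pairs)
    return dict(zip(pairs, scores))

scores = all_pairs_parallel(["GATTACA", "GATTTACA", "ACGT"])
```

    Because each pair is scored independently, the result is identical regardless of the number of workers, which is exactly why this kind of parallelization cannot change the resulting output alignments.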

  4. Are Cantonese-Speakers Really Descriptivists? Revisiting Cross-Cultural Semantics

    Science.gov (United States)

    Lam, Barry

    2010-01-01

    In an article in "Cognition" [Machery, E., Mallon, R., Nichols, S., & Stich, S. (2004). "Semantics cross-cultural style." "Cognition, 92", B1-B12], the authors present data which purport to show that East Asian Cantonese-speakers tend to have descriptivist intuitions about the referents of proper names, while Western English-speakers tend to have…

  5. Effects of Dual Coded Multimedia Instruction Employing Image Morphing on Learning a Logographic Language

    Science.gov (United States)

    Wang, Ling; Blackwell, Aleka Akoyunoglou

    2015-01-01

    Native speakers of alphabetic languages, which use letters governed by grapheme-phoneme correspondence rules, often find it particularly challenging to learn a logographic language whose writing system employs symbols with no direct sound-to-spelling connection but links to the visual and semantic information. The visuospatial properties of…

  6. The abstract representations in speech processing.

    Science.gov (United States)

    Cutler, Anne

    2008-11-01

    Speech processing by human listeners derives meaning from acoustic input via intermediate steps involving abstract representations of what has been heard. Recent results from several lines of research are here brought together to shed light on the nature and role of these representations. In spoken-word recognition, representations of phonological form and of conceptual content are dissociable. This follows from the independence of patterns of priming for a word's form and its meaning. The nature of the phonological-form representations is determined not only by acoustic-phonetic input but also by other sources of information, including metalinguistic knowledge. This follows from evidence that listeners can store two forms as different without showing any evidence of being able to detect the difference in question when they listen to speech. The lexical representations are in turn separate from prelexical representations, which are also abstract in nature. This follows from evidence that perceptual learning about speaker-specific phoneme realization, induced on the basis of a few words, generalizes across the whole lexicon to inform the recognition of all words containing the same phoneme. The efficiency of human speech processing has its basis in the rapid execution of operations over abstract representations.

  7. Functional activity and white matter microstructure reveal the independent effects of age of acquisition and proficiency on second-language learning.

    Science.gov (United States)

    Nichols, Emily S; Joanisse, Marc F

    2016-12-01

    Two key factors govern how bilingual speakers neurally maintain two languages: the speakers' second language age of acquisition (AoA) and their subsequent proficiency. However, the relative roles of these two factors have been difficult to disentangle given that the two can be closely correlated, and most prior studies have examined the two factors in isolation. Here, we combine functional magnetic resonance imaging with diffusion tensor imaging to identify specific brain areas that are independently modulated by AoA and proficiency in second language speakers. First-language Mandarin Chinese speakers who are second language speakers of English were scanned as they performed a picture-word matching task in either language. In the same session we also acquired diffusion-weighted scans to assess white matter microstructure, along with behavioural measures of language proficiency prior to entering the scanner. Results reveal gray- and white-matter networks involving both the left and right hemisphere that independently vary as a function of a second-language speaker's AoA and proficiency, focused on the superior temporal gyrus, middle and inferior frontal gyrus, parahippocampal gyrus, and the basal ganglia. These results indicate that proficiency and AoA explain separate functional and structural networks in the bilingual brain, which we interpret as suggesting distinct types of plasticity for age-dependent effects (i.e., AoA) versus experience and/or predisposition (i.e., proficiency). Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.

  8. Utilising Tree-Based Ensemble Learning for Speaker Segmentation

    DEFF Research Database (Denmark)

    Abou-Zleikha, Mohamed; Tan, Zheng-Hua; Christensen, Mads Græsbøll

    2014-01-01

    In audio and speech processing, accurate detection of the changing points between multiple speakers in speech segments is an important stage for several applications such as speaker identification and tracking. Bayesian Information Criteria (BIC)-based approaches are the most traditionally used...... for a certain condition, the model becomes biased to the data used for training limiting the model’s generalisation ability. In this paper, we propose a BIC-based tuning-free approach for speaker segmentation through the use of ensemble-based learning. A forest of segmentation trees is constructed in which each...... tree is trained using a sampled version of the speech segment. During the tree construction process, a set of randomly selected points in the input sequence is examined as potential segmentation points. The point that yields the highest ΔBIC is chosen and the same process is repeated for the resultant...
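
    The ΔBIC criterion referred to above scores a candidate change point by asking whether two separate Gaussians fit the flanking segments better than a single pooled Gaussian does, after penalizing the extra parameters. The sketch below shows only the generic full-covariance ΔBIC computation on synthetic features; it is not the tree-ensemble sampling scheme proposed in the paper.

```python
import numpy as np

def delta_bic(features, t, lam=1.0):
    """Delta-BIC for a candidate change point t in an (n, d) feature
    sequence; positive values favour a split, i.e. a speaker change."""
    n, d = features.shape

    def logdet(x):
        # log-determinant of the sample covariance of segment x
        _, ld = np.linalg.slogdet(np.cov(x, rowvar=False))
        return ld

    r = 0.5 * (n * logdet(features)
               - t * logdet(features[:t])
               - (n - t) * logdet(features[t:]))
    penalty = 0.5 * (d + 0.5 * d * (d + 1)) * np.log(n)
    return r - lam * penalty

rng = np.random.default_rng(0)
# synthetic 5-dimensional features: two "speakers" with shifted means
seq = np.vstack([rng.normal(0.0, 1.0, (200, 5)),
                 rng.normal(3.0, 1.0, (200, 5))])
# pick the candidate point with the highest delta-BIC
best = max(range(50, 350), key=lambda t: delta_bic(seq, t))
```

    On this synthetic sequence the maximising candidate lands near the true change at index 200; in the ensemble approach described above, many such candidate points are examined within each tree of the segmentation forest.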

  9. Interpreting Mini-Mental State Examination Performance in Highly Proficient Bilingual Spanish-English and Asian Indian-English Speakers: Demographic Adjustments, Item Analyses, and Supplemental Measures.

    Science.gov (United States)

    Milman, Lisa H; Faroqi-Shah, Yasmeen; Corcoran, Chris D; Damele, Deanna M

    2018-04-17

    Performance on the Mini-Mental State Examination (MMSE), among the most widely used global screens of adult cognitive status, is affected by demographic variables including age, education, and ethnicity. This study extends prior research by examining the specific effects of bilingualism on MMSE performance. Sixty independent community-dwelling monolingual and bilingual adults were recruited from eastern and western regions of the United States in this cross-sectional group study. Independent sample t tests were used to compare 2 bilingual groups (Spanish-English and Asian Indian-English) with matched monolingual speakers on the MMSE, demographically adjusted MMSE scores, MMSE item scores, and a nonverbal cognitive measure. Regression analyses were also performed to determine whether language proficiency predicted MMSE performance in both groups of bilingual speakers. Group differences were evident on the MMSE, on demographically adjusted MMSE scores, and on a small subset of individual MMSE items. Scores on a standardized screen of language proficiency predicted a significant proportion of the variance in the MMSE scores of both bilingual groups. Bilingual speakers demonstrated distinct performance profiles on the MMSE. Results suggest that supplementing the MMSE with a language screen, administering a nonverbal measure, and/or evaluating item-based patterns of performance may assist with test interpretation for this population.

  10. Multiple sequence alignment accuracy and phylogenetic inference.

    Science.gov (United States)

    Ogden, T Heath; Rosenberg, Michael S

    2006-04-01

    Phylogenies are often thought to be more dependent upon the specifics of the sequence alignment than on the method of reconstruction. Simulation of sequences containing insertion and deletion events was performed in order to determine the role that alignment accuracy plays during phylogenetic inference. Data sets were simulated for pectinate, balanced, and random tree shapes under different conditions (ultrametric equal branch length, ultrametric random branch length, nonultrametric random branch length). Comparisons between hypothesized alignments and true alignments enabled determination of two measures of alignment accuracy, that of the total data set and that of individual branches. In general, our results indicate that as alignment error increases, topological accuracy decreases. This trend was much more pronounced for data sets derived from more pectinate topologies. In contrast, for balanced, ultrametric, equal branch length tree shapes, alignment inaccuracy had little average effect on tree reconstruction. These conclusions are based on average trends of many analyses under different conditions, and any one specific analysis, independent of the alignment accuracy, may recover very accurate or inaccurate topologies. Maximum likelihood and Bayesian inference, in general, outperformed neighbor joining and maximum parsimony in terms of tree reconstruction accuracy. Results also indicated that as the length of the branch and of the neighboring branches increase, alignment accuracy decreases, and the length of the neighboring branches is the major factor in topological accuracy. Thus, multiple-sequence alignment can be an important factor in downstream effects on topological reconstruction.
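
    One common way of scoring a hypothesized alignment against the true alignment, in the spirit of the accuracy measures above, is the sum-of-pairs score: the fraction of residue pairs that the true alignment places in the same column and that the hypothesized alignment also recovers. The sketch below is our own illustration; the abstract does not commit to this exact metric.

```python
def aligned_pairs(msa):
    """Residue pairs placed in the same column of an alignment.
    A residue is identified by (sequence index, ungapped position);
    '-' marks a gap."""
    counters = [0] * len(msa)  # ungapped position per sequence
    pairs = set()
    for col in zip(*msa):
        present = []
        for s, ch in enumerate(col):
            if ch != '-':
                present.append((s, counters[s]))
                counters[s] += 1
        for i in range(len(present)):
            for j in range(i + 1, len(present)):
                pairs.add((present[i], present[j]))
    return pairs

def sp_accuracy(true_msa, test_msa):
    """Fraction of the true alignment's residue pairs recovered."""
    true_pairs = aligned_pairs(true_msa)
    return len(true_pairs & aligned_pairs(test_msa)) / len(true_pairs)

true_aln = ["AC-GT", "ACAGT"]   # gap placed after C in sequence 0
test_aln = ["ACG-T", "ACAGT"]   # hypothesized alignment shifts the gap
accuracy = sp_accuracy(true_aln, test_aln)  # 3 of 4 true pairs recovered
```

    Averaging this score over whole data sets, or restricting it to residues near a given branch, yields the total and per-branch accuracy measures of the kind the study compares against topological accuracy.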

  11. Teaching Portuguese to Spanish Speakers: A Case for Trilingualism

    Science.gov (United States)

    Carvalho, Ana M.; Freire, Juliana Luna; da Silva, Antonio J. B.

    2010-01-01

    Portuguese is the sixth-most-spoken native language in the world, with approximately 240,000,000 speakers. Within the United States, there is a growing demand for K-12 language programs to engage the community of Portuguese heritage speakers. According to the 2000 U.S. census, 85,000 school-age children speak Portuguese at home. As a result, more…

  12. Electrophysiological correlates of grapheme-phoneme conversion.

    Science.gov (United States)

    Huang, Koongliang; Itoh, Kosuke; Suwazono, Shugo; Nakada, Tsutomu

    2004-08-19

    The cortical processes underlying grapheme-phoneme conversion were investigated by event-related potentials (ERPs). The task consisted of silent reading or vowel-matching of three Japanese hiragana characters, each representing a consonant-vowel syllable. At earlier latencies, typical components of the visual ERP, namely P1 (110 ms), N1 (170 ms) and P2 (300 ms), were elicited in the temporo-occipital area for both tasks as well as in the control task (observing the orthographic shapes of three Korean characters). Following these earlier components, two sustained negativities were identified. The earlier sustained negativity, referred to here as SN1, was found in both the silent-reading and vowel-matching tasks but not in the control task. The scalp distribution of SN1 was over the left occipito-temporal area, with maximum amplitude over O1. The amplitude of SN1 was larger in the vowel-matching task than in the silent-reading task, consistent with previous reports that ERP amplitude correlates with task difficulty. SN2, the later sustained negativity, was observed only in the vowel-matching task. The scalp distribution of SN2 was over the midsagittal centro-parietal area, with maximum amplitude over Cz. Elicitation of SN2 in the vowel-matching task suggests that this task requires a wider range of neural activities, extending beyond the established conventional areas of language processing.

  13. English as a Foreign Language Spelling: Comparisons between Good and Poor Spellers

    Science.gov (United States)

    Russak, Susie; Kahn-Horwitz, Janina

    2015-01-01

    This study examined English as a foreign language (EFL) spelling development amongst 233 fifth-grade, eighth-grade and 10th-grade Hebrew first-language speakers to examine effects of English orthographic exposure on spelling. Good and poor speller differences were examined regarding the acquisition of novel phonemes (/ae/, /?/ and /?/) and…

  14. Speaker Introductions at Internal Medicine Grand Rounds: Forms of Address Reveal Gender Bias.

    Science.gov (United States)

    Files, Julia A; Mayer, Anita P; Ko, Marcia G; Friedrich, Patricia; Jenkins, Marjorie; Bryan, Michael J; Vegunta, Suneela; Wittich, Christopher M; Lyle, Melissa A; Melikian, Ryan; Duston, Trevor; Chang, Yu-Hui H; Hayes, Sharonne N

    2017-05-01

    Gender bias has been identified as one of the drivers of gender disparity in academic medicine. Bias may be reinforced by gender subordinating language or differential use of formality in forms of address. Professional titles may influence the perceived expertise and authority of the referenced individual. The objective of this study is to examine how professional titles were used in the same and mixed-gender speaker introductions at Internal Medicine Grand Rounds (IMGR). A retrospective observational study of video-archived speaker introductions at consecutive IMGR was conducted at two different locations (Arizona, Minnesota) of an academic medical center. Introducers and speakers at IMGR were physician and scientist peers holding MD, PhD, or MD/PhD degrees. The primary outcome was whether or not a speaker's professional title was used during the first form of address during speaker introductions at IMGR. As secondary outcomes, we evaluated whether or not the speakers professional title was used in any form of address during the introduction. Three hundred twenty-one forms of address were analyzed. Female introducers were more likely to use professional titles when introducing any speaker during the first form of address compared with male introducers (96.2% [102/106] vs. 65.6% [141/215]; p form of address 97.8% (45/46) compared with male dyads who utilized a formal title 72.4% (110/152) of the time (p = 0.007). In mixed-gender dyads, where the introducer was female and speaker male, formal titles were used 95.0% (57/60) of the time. Male introducers of female speakers utilized professional titles 49.2% (31/63) of the time (p addressed by professional title than were men introduced by men. Differential formality in speaker introductions may amplify isolation, marginalization, and professional discomfiture expressed by women faculty in academic medicine.

  15. Shhh… I Need Quiet! Children's Understanding of American, British, and Japanese-accented English Speakers.

    Science.gov (United States)

    Bent, Tessa; Holt, Rachael Frush

    2018-02-01

    Children's ability to understand speakers with a wide range of dialects and accents is essential for efficient language development and communication in a global society. Here, the impact of regional dialect and foreign-accent variability on children's speech understanding was evaluated in both quiet and noisy conditions. Five- to seven-year-old children ( n = 90) and adults ( n = 96) repeated sentences produced by three speakers with different accents-American English, British English, and Japanese-accented English-in quiet or noisy conditions. Adults had no difficulty understanding any speaker in quiet conditions. Their performance declined for the nonnative speaker with a moderate amount of noise; their performance only substantially declined for the British English speaker (i.e., below 93% correct) when their understanding of the American English speaker was also impeded. In contrast, although children showed accurate word recognition for the American and British English speakers in quiet conditions, they had difficulty understanding the nonnative speaker even under ideal listening conditions. With a moderate amount of noise, their perception of British English speech declined substantially and their ability to understand the nonnative speaker was particularly poor. These results suggest that although school-aged children can understand unfamiliar native dialects under ideal listening conditions, their ability to recognize words in these dialects may be highly susceptible to the influence of environmental degradation. Fully adult-like word identification for speakers with unfamiliar accents and dialects may exhibit a protracted developmental trajectory.

  16. Automated whole-genome multiple alignment of rat, mouse, and human

    Energy Technology Data Exchange (ETDEWEB)

    Brudno, Michael; Poliakov, Alexander; Salamov, Asaf; Cooper, Gregory M.; Sidow, Arend; Rubin, Edward M.; Solovyev, Victor; Batzoglou, Serafim; Dubchak, Inna

    2004-07-04

    We have built a whole genome multiple alignment of the three currently available mammalian genomes using a fully automated pipeline which combines the local/global approach of the Berkeley Genome Pipeline and the LAGAN program. The strategy is based on progressive alignment, and consists of two main steps: (1) alignment of the mouse and rat genomes; and (2) alignment of human to either the mouse-rat alignments from step 1, or the remaining unaligned mouse and rat sequences. The resulting alignments demonstrate high sensitivity, with 87% of all human gene-coding areas aligned in both mouse and rat. The specificity is also high: <7% of the rat contigs are aligned to multiple places in human and 97% of all alignments with human sequence > 100kb agree with a three-way synteny map built independently using predicted exons in the three genomes. At the nucleotide level <1% of the rat nucleotides are mapped to multiple places in the human sequence in the alignment; and 96.5% of human nucleotides within all alignments agree with the synteny map. The alignments are publicly available online, with visualization through the novel Multi-VISTA browser that we also present.

  17. A general auditory bias for handling speaker variability in speech? Evidence in humans and songbirds

    Directory of Open Access Journals (Sweden)

    Buddhamas eKriengwatana

    2015-08-01

    Full Text Available Different speakers produce the same speech sound differently, yet listeners are still able to reliably identify the speech sound. How listeners can adjust their perception to compensate for speaker differences in speech, and whether these compensatory processes are unique to humans, is still not fully understood. In this study we compare the ability of humans and zebra finches to categorize vowels despite speaker variation in speech, in order to test the hypothesis that accommodating speaker and gender differences in isolated vowels can be achieved without prior experience with speaker-related variability. Using a behavioural Go/No-go task and identical stimuli, we compared Australian English adults' (naïve to Dutch) and zebra finches' (naïve to human speech) ability to categorize /ɪ/ and /ɛ/ vowels of a novel Dutch speaker after learning to discriminate those vowels from only one other speaker. Experiments 1 and 2 presented vowels of two speakers interspersed or blocked, respectively. Results demonstrate that categorization of vowels is possible without prior exposure to speaker-related variability in speech for zebra finches, and in non-native vowel categories for humans. Therefore, this study is the first to provide evidence for what might be a species-shared auditory bias that may supersede speaker-related information during vowel categorization. It additionally provides behavioural evidence contradicting a prior hypothesis that accommodation of speaker differences is achieved via the use of formant ratios. Therefore, investigations of alternative accounts of vowel normalization that incorporate the possibility of an auditory bias for disregarding inter-speaker variability are warranted.

  18. The Main Concept Analysis: Validation and sensitivity in differentiating discourse produced by unimpaired English speakers from individuals with aphasia and dementia of Alzheimer type.

    Science.gov (United States)

    Kong, Anthony Pak-Hin; Whiteside, Janet; Bargmann, Peggy

    2016-10-01

    Discourse from speakers with dementia and aphasia is associated with comparable but not identical deficits, necessitating appropriate methods to differentiate them. The current study aims to validate the Main Concept Analysis (MCA) for eliciting and quantifying discourse among typical native English speakers and to establish its norms, and to investigate the validity and sensitivity of the MCA in comparing discourse produced by individuals with fluent aphasia, non-fluent aphasia, or dementia of Alzheimer's type (DAT), and unimpaired elderly speakers. Discourse elicited through a sequential picture description task was collected from 60 unimpaired participants to determine the MCA scoring criteria; 12 speakers with fluent aphasia, 12 with non-fluent aphasia, 13 with DAT, and 20 elderly participants from the healthy group were compared on the finalized MCA. Results of MANOVA revealed significant univariate omnibus effects of speaker group as an independent variable on each main concept index. MCA profiles differed significantly between all participant groups except dementia versus fluent aphasia. Correlations between the MCA performances and the Western Aphasia Battery and Cognitive Linguistic Quick Test were found to be statistically significant among the clinical groups. The MCA was found to be appropriate for use among native speakers of English. The results also provided further empirical evidence of discourse deficits in aphasia and dementia. Practitioners can use the MCA to evaluate discourse production systematically and objectively.

  19. Presenting and processing information in background noise: A combined speaker-listener perspective.

    Science.gov (United States)

    Bockstael, Annelies; Samyn, Laurie; Corthals, Paul; Botteldooren, Dick

    2018-01-01

    Transferring information orally in background noise is challenging, for both speaker and listener. Successful transfer depends on complex interaction between characteristics related to listener, speaker, task, background noise, and context. To fully assess the underlying real-life mechanisms, experimental design has to mimic this complex reality. In the current study, the effects of different types of background noise have been studied in an ecologically valid test design. Documentary-style information had to be presented by the speaker and simultaneously acquired by the listener in four conditions: quiet, unintelligible multitalker babble, fluctuating city street noise, and little varying highway noise. For both speaker and listener, the primary task was to focus on the content that had to be transferred. In addition, for the speakers, the occurrence of hesitation phenomena was assessed. The listener had to perform an additional secondary task to address listening effort. For the listener the condition with the most eventful background noise, i.e., fluctuating city street noise, appeared to be the most difficult with markedly longer duration of the secondary task. In the same fluctuating background noise, speech appeared to be less disfluent, suggesting a higher level of concentration from the speaker's side.

  20. Key-note speaker: Predictors of weight loss after preventive Health consultations

    DEFF Research Database (Denmark)

    Lous, Jørgen; Freund, Kirsten S

    2018-01-01

    Invited key-note speaker at the conference: Preventive Medicine and Public Health Conference 2018, July 16-17, London.

  1. ON THE IMPORTANCE OF THE PHONEME THEORY FONEM TEORİSİNİN ÖNEMİ HAKKINDA

    Directory of Open Access Journals (Sweden)

    Kerim DEMİRCİ

    2011-06-01

    Full Text Available Analyzing the structure of language primarily has to do with understanding the sounds that make up language. Saussure's idea that 'language is a system of mutually defining entities' [there is nothing but differences in language] has been the main incentive behind the phoneme theory. In its modern form, the theory was shaped especially by Baudouin de Courtenay and by linguists such as Nikolai Trubetzkoy, Paul Passy, Roman Jakobson, and Sergey Karçevski. These scholars made a tremendous contribution to linguistics by establishing a system that made it possible to differentiate speech sounds and phonemes from all other unprocessed (raw) ones. This system laid the foundation of modern phonetics and phonology. This study examines the main features of the phoneme theory, which analyzes the distinctive features of speech sounds through a systematic approach.

  2. Automaticity and stability of adaptation to a foreign-accented speaker

    NARCIS (Netherlands)

    Witteman, M.J.; Bardhan, N.P.; Weber, A.C.; McQueen, J.M.

    2015-01-01

    In three cross-modal priming experiments we asked whether adaptation to a foreign-accented speaker is automatic, and whether adaptation can be seen after a long delay between initial exposure and test. Dutch listeners were exposed to a Hebrew-accented Dutch speaker with two types of Dutch words:

  3. Dysprosody and Stimulus Effects in Cantonese Speakers with Parkinson's Disease

    Science.gov (United States)

    Ma, Joan K.-Y.; Whitehill, Tara; Cheung, Katherine S.-K.

    2010-01-01

    Background: Dysprosody is a common feature in speakers with hypokinetic dysarthria. However, speech prosody varies across different types of speech materials. This raises the question of what is the most appropriate speech material for the evaluation of dysprosody. Aims: To characterize the prosodic impairment in Cantonese speakers with…

  4. Profiles of an Acquisition Generation: Nontraditional Heritage Speakers of Spanish

    Science.gov (United States)

    DeFeo, Dayna Jean

    2018-01-01

    Though definitions vary, the literature on heritage speakers of Spanish identifies two primary attributes: a linguistic and cultural connection to the language. This article profiles four Anglo college students who grew up in bilingual or Spanish-dominant communities in the Southwest who self-identified as Spanish heritage speakers, citing…

  5. THE HUMOROUS SPEAKER: THE CONSTRUCTION OF ETHOS IN COMEDY

    Directory of Open Access Journals (Sweden)

    Maria Flávia Figueiredo

    2016-07-01

    Full Text Available Rhetoric is guided by three dimensions: logos, pathos, and ethos. Logos is the speech itself; pathos comprises the passions that the speaker, through logos, awakens in the audience; and ethos is the image that the speaker creates of himself, also through logos, before an audience. The rhetorical genres are three: deliberative (which drives the audience or the judge to think about future events, characterizing them as convenient or harmful), judiciary (in which the audience considers past events in order to classify them as fair or unfair), and epideictic (in which the audience judges a fact, or even a person's character, as beautiful or not). Following Figueiredo (2014) and based on Eggs (2005), we advocate that ethos is a mark left by the speaker not only in the rhetorical genres but in any textual genre, since, as products of human activity, even the simplest choices in textual construction can reproduce something closely linked to the speaker, thus demarcating his/her ethos. To verify this assumption, we selected a video performance by the comedian Danilo Gentili, which is examined in the light of Rhetoric and Textual Linguistics. Our objective is to find, in the stand-up comedy genre, marks left by the speaker in the speech that characterize his/her ethos. The analysis results show that ethos, discursive genre, and communicational purpose amalgamate in an indissoluble complex in which the success of each depends on how the others were built.

  6. Proceedings of the 7. Independent Power Producers' Society of Alberta annual conference

    International Nuclear Information System (INIS)

    2001-01-01

    This conference provided delegates from across North America with a forum in which a wide array of perspectives on Alberta's new electricity marketplace could be discussed. Speakers covered a lot of ground in their examination of the deregulation of the electricity market in Alberta and the impacts felt by consumers and producers alike. The recent events that led to the deregulation were reviewed, and emphasis was also placed on the successful development of power generation projects, wholesale pricing options, and independent retail strategies. Open energy markets were discussed in a series of speaker panels in which representatives from private organizations added their views on the topic. The conference was divided into seven sessions entitled: (1) the operation of Alberta's market, (2) panel discussion: defending the market, (3) competitive hurdles to successful development, (4) alternative energy solutions, (5) mechanics of retail choice, (6) wholesale pricing options, and (7) independent retailer strategies. refs., tabs., figs

  7. French Phonology for Teachers: A Programmed Introduction.

    Science.gov (United States)

    Green, Jerald R.; Poulin, Norman A.

    This programmed, self-instructional course has the following terminal objectives: (1) to present some notions of the science of linguistics and the major branches of linguistics, (2) to teach the segmental and suprasegmental phonemes of French, (3) to identify the major articulatory problems of French for the native speaker of English, (4) to…

  8. Phonetic perspectives on modelling information in the speech signal

    Indian Academy of Sciences (India)

    Centre for Music and Science, Faculty of Music, University of Cambridge,. Cambridge .... However, to develop systems that can han- .... 1.2a Phonemes are not clearly identifiable in movement or in the acoustic speech signal: As ..... while the speaker role-played the part of a mother at a child's athletics meeting where the.

  9. Mechanical alignment of substrates to a mask

    Science.gov (United States)

    Webb, Aaron P.; Carlson, Charles T.; Honan, Michael; Amato, Luigi G.; Grant, Christopher Neil; Strassner, James D.

    2016-11-08

    A plurality of masks is attached to the underside of a mask frame. This attachment is made such that each mask can independently move relative to the mask frame in three directions. This relative movement allows each mask to adjust its position to align with respective alignment pins disposed on a working surface. In one embodiment, each mask is attached to the mask frame using fasteners, where the fasteners have a shaft with a diameter smaller than the diameter of the mounting hole disposed on the mask. A bias element may be used to allow relative movement between the mask and the mask frame in the vertical direction. Each mask may also have kinematic features to mate with the respective alignment pins on the working surface.

  10. Quantile Acoustic Vectors vs. MFCC Applied to Speaker Verification

    Directory of Open Access Journals (Sweden)

    Mayorga-Ortiz Pedro

    2014-02-01

    Full Text Available In this paper we describe experiments on speaker and command recognition using quantile vectors and Gaussian Mixture Modelling (GMM). Over the past several years, GMM and MFCC have become two of the dominant approaches for modelling speaker and speech recognition applications. However, memory and computational costs are important drawbacks, because autonomous systems suffer processing and power consumption constraints; thus, a good trade-off between accuracy and computational requirements is mandatory. We therefore explored another approach (quantile vectors) in several tasks and compared it with MFCC. Quantile acoustic vectors are proposed for speaker verification and command recognition tasks, and the results showed very good recognition efficiency. This method offered a good trade-off between computation time, feature vector complexity and overall efficiency.
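The quantile-vector idea can be sketched as follows (a minimal illustration under assumed parameters, not the authors' implementation: per-band frame energies are summarized by a handful of quantiles instead of frame-by-frame cepstral coefficients, giving a small fixed-size vector):

```python
import numpy as np

def quantile_vector(frames, quantiles=(0.1, 0.25, 0.5, 0.75, 0.9)):
    """Summarize each band's frame-wise energy distribution by its quantiles,
    yielding a compact, cheap acoustic vector (illustrative variant of the
    quantile-vector idea; band count and quantile levels are assumptions)."""
    # frames: (n_frames, n_bands) matrix of per-frame band energies
    return np.concatenate([np.quantile(frames[:, b], quantiles)
                           for b in range(frames.shape[1])])

# Toy example: two bands with clearly different energy distributions.
rng = np.random.default_rng(0)
frames = np.column_stack([rng.normal(0.0, 1.0, 500),
                          rng.normal(5.0, 0.5, 500)])
qv = quantile_vector(frames)
print(qv.shape)  # (10,) = 5 quantiles x 2 bands
```

Such vectors can then feed the same GMM back end that would otherwise model MFCC frames, at a fraction of the per-utterance data volume.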

  11. A Dynamic Alignment System for the Final Focus Test Beam

    International Nuclear Information System (INIS)

    Ruland, R.E.; Bressler, V.E.; Fischer, G.; Plouffe, D.; SLAC

    2005-01-01

    The Final Focus Test Beam (FFTB) was conceived as a technological stepping stone on the way to the next linear collider. Nowhere is this more evident than with the alignment subsystems. Alignment tolerances for components prior to beam turn-on are almost an order of magnitude smaller than for previous projects at SLAC. Position monitoring systems which operate independently of the beam are employed to monitor motions of the components locally and globally with unprecedented precision. An overview of the FFTB alignment system is presented herein.

  12. Defining "Native Speaker" in Multilingual Settings: English as a Native Language in Asia

    Science.gov (United States)

    Hansen Edwards, Jette G.

    2017-01-01

    The current study examines how and why speakers of English from multilingual contexts in Asia are identifying as native speakers of English. Eighteen participants from different contexts in Asia, including Singapore, Malaysia, India, Taiwan, and The Philippines, who self-identified as native speakers of English participated in hour-long interviews…

  13. Using Avatars for Improving Speaker Identification in Captioning

    Science.gov (United States)

    Vy, Quoc V.; Fels, Deborah I.

    Captioning is the main method for accessing television and film content by people who are deaf or hard-of-hearing. One major difficulty consistently identified by the community is that of knowing who is speaking, particularly for an off-screen narrator. A captioning system was created using a participatory design method to improve speaker identification. The final prototype contained avatars and a coloured border for identifying specific speakers. Evaluation results were very positive; however, participants also wanted to customize various components such as caption and avatar location.

  14. Sensitivity to phonological context in L2 spelling: evidence from Russian ESL speakers

    DEFF Research Database (Denmark)

    Dich, Nadya

    2010-01-01

    The study attempts to investigate factors underlying the development of spellers’ sensitivity to phonological context in English. Native English speakers and Russian speakers of English as a second language (ESL) were tested on their ability to use information about the coda to predict the spelling...... on the information about the coda when spelling vowels in nonwords. In both native and non-native speakers, context sensitivity was predicted by English word spelling; in Russian ESL speakers this relationship was mediated by English proficiency. L1 spelling proficiency did not facilitate L2 context sensitivity...

  15. An analysis of topics and vocabulary in Chinese oral narratives by normal speakers and speakers with fluent aphasia.

    Science.gov (United States)

    Law, Sam-Po; Kong, Anthony Pak-Hin; Lai, Christy

    2018-01-01

    This study analysed the topic and vocabulary of Chinese speakers based on language samples of personal recounts in a large spoken Chinese database recently made available in the public domain, i.e. Cantonese AphasiaBank ( http://www.speech.hku.hk/caphbank/search/ ). The goal of the analysis is to offer clinicians a rich source for selecting ecologically valid training materials for rehabilitating Chinese-speaking people with aphasia (PWA) in the design and planning of culturally and linguistically appropriate treatments. Discourse production of 65 Chinese-speaking PWA of fluent types (henceforth, PWFA) and their non-aphasic controls narrating an important event in their life were extracted from Cantonese AphasiaBank. Analyses of topics and vocabularies in terms of part-of-speech, word frequency, lexical semantics, and diversity were conducted. There was significant overlap in topics between the two groups. While the vocabulary was larger for controls than that of PWFA as expected, they were similar in distribution across parts-of-speech, frequency of occurrence, and the ratio of concrete to abstract items in major open word classes. Moreover, proportionately more different verbs than nouns were employed at the individual level for both speaker groups. The findings provide important implications for guiding directions of aphasia rehabilitation not only of fluent but also non-fluent Chinese aphasic speakers.
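The frequency and diversity measures mentioned can be sketched minimally (type-token ratio as a simple diversity index; the study's own analyses, including part-of-speech and lexical-semantic breakdowns, are richer):

```python
from collections import Counter

def lexical_profile(tokens):
    """Word-frequency table plus type-token ratio (TTR), a basic index of
    lexical diversity of the kind compared across speaker groups."""
    freq = Counter(tokens)
    ttr = len(freq) / len(tokens)  # distinct words / total words
    return freq, ttr

tokens = "the cat saw the dog and the dog saw the cat".split()
freq, ttr = lexical_profile(tokens)
print(freq.most_common(1))  # [('the', 4)]
print(round(ttr, 3))
```

In practice TTR is sensitive to sample length, which is one reason clinical studies often supplement it with length-corrected diversity measures.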

  16. Revisiting vocal perception in non-human animals: a review of vowel discrimination, speaker voice recognition, and speaker normalization

    Directory of Open Access Journals (Sweden)

    Buddhamas Kriengwatana

    2015-01-01

    Full Text Available The extent to which human speech perception evolved by taking advantage of predispositions and pre-existing features of vertebrate auditory and cognitive systems remains a central question in the evolution of speech. This paper reviews asymmetries in vowel perception, speaker voice recognition, and speaker normalization in non-human animals – topics that have not been thoroughly discussed in relation to the abilities of non-human animals, but are nonetheless important aspects of vocal perception. Throughout this paper we demonstrate that addressing these issues in non-human animals is relevant and worthwhile because many non-human animals must deal with similar issues in their natural environment. That is, they must also discriminate between similar-sounding vocalizations, determine signaler identity from vocalizations, and resolve signaler-dependent variation in vocalizations from conspecifics. Overall, we find that, although plausible, the current evidence is insufficiently strong to conclude that directional asymmetries in vowel perception are specific to humans, or that non-human animals can use voice characteristics to recognize human individuals. However, we do find some indication that non-human animals can normalize speaker differences. Accordingly, we identify avenues for future research that would greatly improve and advance our understanding of these topics.

  17. The Status of Native Speaker Intuitions in a Polylectal Grammar.

    Science.gov (United States)

    Debose, Charles E.

    A study of one speaker's intuitions about and performance in Black English is presented with relation to Saussure's "langue-parole" dichotomy. Native speakers of a language have intuitions about the static synchronic entities although the data of their speaking is variable and panchronic. These entities are in a diglossic relationship to each…

  18. Progress in the AMIDA speaker diarization system for meeting data

    NARCIS (Netherlands)

    Leeuwen, D.A. van; Konečný, M.

    2008-01-01

    In this paper we describe the AMIDA speaker diarization system as it was submitted to the NIST Rich Transcription evaluation 2007 for conference room data. This is done in the context of the history of this system and other speaker diarization systems. One of the goals of our system is to have as

  19. The Relationship between Articulatory Control and Improved Phonemic Accuracy in Childhood Apraxia of Speech: A Longitudinal Case Study

    Science.gov (United States)

    Grigos, Maria I.; Kolenda, Nicole

    2010-01-01

    Jaw movement patterns were examined longitudinally in a 3-year-old male with childhood apraxia of speech (CAS) and compared with a typically developing control group. The child with CAS was followed for 8 months, until he began accurately and consistently producing the bilabial phonemes /p/, /b/, and /m/. A movement tracking system was used to…

  20. Speaker and Accent Variation Are Handled Differently: Evidence in Native and Non-Native Listeners

    Science.gov (United States)

    Kriengwatana, Buddhamas; Terry, Josephine; Chládková, Kateřina; Escudero, Paola

    2016-01-01

    Listeners are able to cope with between-speaker variability in speech that stems from anatomical sources (i.e. individual and sex differences in vocal tract size) and sociolinguistic sources (i.e. accents). We hypothesized that listeners adapt to these two types of variation differently because prior work indicates that adapting to speaker/sex variability may occur pre-lexically while adapting to accent variability may require learning from attention to explicit cues (i.e. feedback). In Experiment 1, we tested our hypothesis by training native Dutch listeners and Australian-English (AusE) listeners without any experience with Dutch or Flemish to discriminate between the Dutch vowels /I/ and /ε/ from a single speaker. We then tested their ability to classify /I/ and /ε/ vowels of a novel Dutch speaker (i.e. speaker or sex change only), or vowels of a novel Flemish speaker (i.e. speaker or sex change plus accent change). We found that both Dutch and AusE listeners could successfully categorize vowels if the change involved a speaker/sex change, but not if the change involved an accent change. When AusE listeners were given feedback on their categorization responses to the novel speaker in Experiment 2, they were able to successfully categorize vowels involving an accent change. These results suggest that adapting to accents may be a two-step process, whereby the first step involves adapting to speaker differences at a pre-lexical level, and the second step involves adapting to accent differences at a contextual level, where listeners have access to word meaning or are given feedback that allows them to appropriately adjust their perceptual category boundaries. PMID:27309889

  1. Face Alignment via Regressing Local Binary Features.

    Science.gov (United States)

    Ren, Shaoqing; Cao, Xudong; Wei, Yichen; Sun, Jian

    2016-03-01

    This paper presents a highly efficient and accurate regression approach for face alignment. Our approach has two novel components: 1) a set of local binary features and 2) a locality principle for learning those features. The locality principle guides us to learn a set of highly discriminative local binary features for each facial landmark independently. The obtained local binary features are used to jointly learn a linear regression for the final output. This approach achieves state-of-the-art results when tested on the most challenging benchmarks to date. Furthermore, because extracting and regressing local binary features is computationally very cheap, our system is much faster than previous methods. It achieves over 3000 frames per second (FPS) on a desktop or 300 FPS on a mobile phone for locating a few dozen landmarks. We also study a key issue that is important but has received little attention in previous research: the face detector used to initialize alignment. We investigate several face detectors and perform a quantitative evaluation of how they affect alignment accuracy. We find that an alignment-friendly detector can further greatly boost the accuracy of our alignment method, reducing the error by up to 16% relative. To facilitate practical usage of face detection/alignment methods, we also propose a convenient metric to measure how good a detector is for alignment initialization.
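A heavily simplified sketch of the regression step (random binary features stand in for the learned ones, and a single global linear stage replaces the paper's cascade; all shapes and data are toy assumptions):

```python
import numpy as np

rng = np.random.default_rng(1)

# Binary features extracted around current landmark estimates are mapped to
# shape increments by a linear regressor. Here the binary features are random
# and the targets are noise-free, so least squares recovers the map exactly.
n_samples, n_feats, n_landmarks = 200, 64, 5
X = rng.integers(0, 2, size=(n_samples, n_feats)).astype(float)  # binary features
W_true = rng.normal(size=(n_feats, 2 * n_landmarks))             # (x, y) per landmark
Y = X @ W_true                                                   # target shape increments

W, *_ = np.linalg.lstsq(X, Y, rcond=None)  # global linear regression
pred = X @ W
print(np.allclose(pred, Y, atol=1e-6))  # True: exact recovery in the noise-free toy
```

A real cascade repeats this step several times, re-extracting features at the updated landmark positions before each new regression.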

  2. Children's Understanding That Utterances Emanate from Minds: Using Speaker Belief To Aid Interpretation.

    Science.gov (United States)

    Mitchell, Peter; Robinson, Elizabeth J.; Thompson, Doreen E.

    1999-01-01

    Three experiments examined 3- to 6-year olds' ability to use a speaker's utterance based on false belief to identify which of several referents was intended. Found that many 4- to 5-year olds performed correctly only when it was unnecessary to consider the speaker's belief. When the speaker gave an ambiguous utterance, many 3- to 6-year olds…

  3. Comparative Analysis of Speech Parameters for the Design of Speaker Verification Systems

    National Research Council Canada - National Science Library

    Souza, A

    2001-01-01

    Speaker verification systems are basically composed of three stages: feature extraction, feature processing and comparison of the modified features from speaker voice and from the voice that should be...

  4. The (TNO) Speaker Diarization System for NIST Rich Transcription Evaluation 2005 for meeting data

    NARCIS (Netherlands)

    Leeuwen, D.A. van

    2005-01-01

    The TNO speaker diarization system is based on a standard BIC segmentation and clustering algorithm. Since correct speech detection appears to be essential for the NIST Rich Transcription speaker diarization evaluation measure, we have developed a speech activity detector (SAD) as
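The BIC segmentation step that such systems build on can be sketched as follows (standard delta-BIC with full-covariance Gaussians; the penalty weight and toy 2-D features are illustrative stand-ins for real acoustic features):

```python
import numpy as np

def delta_bic(x, y, lam=1.0):
    """Delta-BIC speaker-change criterion between segments x and y (rows are
    frames): positive values favour placing a change point between them.
    lam is the usual tunable penalty weight."""
    z = np.vstack([x, y])
    n1, n2, n = len(x), len(y), len(x) + len(y)
    d = z.shape[1]
    logdet = lambda a: np.linalg.slogdet(np.cov(a, rowvar=False, bias=True))[1]
    # Penalty: extra parameters of the two-model hypothesis (mean + covariance).
    penalty = 0.5 * lam * (d + 0.5 * d * (d + 1)) * np.log(n)
    return 0.5 * (n * logdet(z) - n1 * logdet(x) - n2 * logdet(y)) - penalty

rng = np.random.default_rng(2)
same = delta_bic(rng.normal(0, 1, (300, 2)), rng.normal(0, 1, (300, 2)))
diff = delta_bic(rng.normal(0, 1, (300, 2)), rng.normal(4, 1, (300, 2)))
print(round(same, 1), round(diff, 1))  # negative for same source, positive for different
```

Sliding this criterion along an audio stream and cutting where it peaks above zero yields the initial segments that agglomerative clustering then merges per speaker.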

  5. Alignment-independent comparison of binding sites based on DrugScore potential fields encoded by 3D Zernike descriptors.

    Science.gov (United States)

    Nisius, Britta; Gohlke, Holger

    2012-09-24

    Analyzing protein binding sites provides detailed insights into the biological processes proteins are involved in, e.g., into drug-target interactions, and so is of crucial importance in drug discovery. Herein, we present novel alignment-independent binding site descriptors based on DrugScore potential fields. The potential fields are transformed to a set of information-rich descriptors using a series expansion in 3D Zernike polynomials. The resulting Zernike descriptors show a promising performance in detecting similarities among proteins with low pairwise sequence identities that bind identical ligands, as well as within subfamilies of one target class. Furthermore, the Zernike descriptors are robust against structural variations among protein binding sites. Finally, the Zernike descriptors show a high data compression power, and computing similarities between binding sites based on these descriptors is highly efficient. Consequently, the Zernike descriptors are a useful tool for computational binding site analysis, e.g., to predict the function of novel proteins, off-targets for drug candidates, or novel targets for known drugs.
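Once each site is encoded as an invariant descriptor vector, comparison is alignment-free vector arithmetic; a minimal sketch with made-up descriptor vectors (the vectors here are random stand-ins, not actual DrugScore/Zernike output):

```python
import numpy as np

rng = np.random.default_rng(3)

# Each binding site is assumed to be already encoded as a fixed-length
# rotation/translation-invariant descriptor vector (e.g. 3D Zernike invariants).
site_a = rng.normal(size=121)                         # query site
site_b = site_a + rng.normal(scale=0.05, size=121)    # structurally similar site
site_c = rng.normal(size=121)                         # unrelated site

def similarity(u, v):
    """Cosine similarity between descriptor vectors; no superposition needed."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

print(similarity(site_a, site_b) > similarity(site_a, site_c))  # True
```

This is what makes the approach fast: all-against-all screening reduces to pairwise vector operations over precomputed descriptors.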

  6. The Effect of Tier 2 Intervention for Phonemic Awareness in a Response-to-Intervention Model in Low-Income Preschool Classrooms

    Science.gov (United States)

    Koutsoftas, Anthony D.; Harmon, Mary Towle; Gray, Shelley

    2009-01-01

    Purpose: This study assessed the effectiveness of a Tier 2 intervention that was designed to increase the phonemic awareness skills of low-income preschoolers who were enrolled in Early Reading First classrooms. Method: Thirty-four preschoolers participated in a multiple baseline across participants treatment design. Tier 2 intervention for…

  7. Using timing information in speaker verification

    CSIR Research Space (South Africa)

    Van Heerden, CJ

    2005-11-01

    Full Text Available This paper presents an analysis of temporal information as a feature for use in speaker verification systems. The relevance of temporal information in a speaker’s utterances is investigated, both with regard to improving the robustness of modern...

  8. Speaker Linking and Applications using Non-Parametric Hashing Methods

    Science.gov (United States)

    2016-09-08

    Sturim, Douglas; Campbell, William M. (MIT Lincoln Laboratory, Lexington, MA). … with many approaches [1, 2]. For this paper, we focus on using i-vectors [2], but the methods apply to any embedding. For the task of speaker QBE and

  9. A simple optical method for measuring the vibration amplitude of a speaker

    OpenAIRE

    UEDA, Masahiro; YAMAGUCHI, Toshihiko; KAKIUCHI, Hiroki; SUGA, Hiroshi

    1999-01-01

    A simple optical method has been proposed for measuring the vibration amplitude of a speaker vibrating with a frequency of approximately 10 kHz. The method is based on a multiple reflection between a vibrating speaker plane and a mirror parallel to that speaker plane. The multiple reflection can magnify a dispersion of the laser beam caused by the vibration, and easily make a measurement of the amplitude. The measuring sensitivity ranges between sub-microns and 1 mm. A preliminary experim...
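The magnification principle can be put in numbers under a simplified geometry (an assumption-laden sketch, not the paper's exact optical layout): each bounce off the vibrating plane, tilted by a small angle theta, adds about 2*theta of deflection, so N bounces deflect the beam by roughly 2*N*theta.

```python
import math

theta = 1e-5   # instantaneous plane tilt from the vibration (rad), illustrative
L = 0.5        # distance from the last reflection to the screen (m), illustrative

for n_bounces in (1, 5, 10):
    deflection = 2 * n_bounces * theta      # small-angle accumulation over bounces
    spot_shift = L * math.tan(deflection)   # lateral displacement of the spot
    print(n_bounces, f"{spot_shift * 1e6:.1f} um")
# prints 10.0, 50.0, 100.0 um for N = 1, 5, 10: an N-fold magnification
```

The N-fold gain in spot displacement is what lets a simple screen-and-ruler readout resolve sub-micron vibration amplitudes.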

  10. Pedagogical and didactical rationale of phonemic stimulation process in pre-school age children

    Directory of Open Access Journals (Sweden)

    López, Yudenia

    2010-01-01

    Full Text Available The paper describes the main results of a regional research problem dealing with education at pre-school age. It examines the effectiveness of the didactic conception of the process of phonemic stimulation in children from 3 to 5 years old. The pedagogical and didactic rationale of the process, viewed from an evolutionary, ontogenetic and systemic perspective, is explained. Likewise, possible scaffolding is illustrated. The suggested procedures focus on providing support through systematic and purposeful practice, which involves first the discrimination of non-verbal sounds and later the discrimination of verbal sounds, aiming at the creation of phonological consciousness.

  11. Coronal View Ultrasound Imaging of Movement in Different Segments of the Tongue during Paced Recital: Findings from Four Normal Speakers and a Speaker with Partial Glossectomy

    Science.gov (United States)

    Bressmann, Tim; Flowers, Heather; Wong, Willy; Irish, Jonathan C.

    2010-01-01

    The goal of this study was to quantitatively describe aspects of coronal tongue movement in different anatomical regions of the tongue. Four normal speakers and a speaker with partial glossectomy read four repetitions of a metronome-paced poem. Their tongue movement was recorded in four coronal planes using two-dimensional B-mode ultrasound…

  12. Independent Component Analysis and Time-Frequency Masking for Speech Recognition in Multitalker Conditions

    Directory of Open Access Journals (Sweden)

    Reinhold Orglmeister

    2010-01-01

    Full Text Available When a number of speakers are simultaneously active, for example in meetings or noisy public places, the sources of interest need to be separated from interfering speakers and from each other in order to be robustly recognized. Independent component analysis (ICA has proven a valuable tool for this purpose. However, ICA outputs can still contain strong residual components of the interfering speakers whenever noise or reverberation is high. In such cases, nonlinear postprocessing can be applied to the ICA outputs, for the purpose of reducing remaining interferences. In order to improve robustness to the artefacts and loss of information caused by this process, recognition can be greatly enhanced by considering the processed speech feature vector as a random variable with time-varying uncertainty, rather than as deterministic. The aim of this paper is to show the potential to improve recognition of multiple overlapping speech signals through nonlinear postprocessing together with uncertainty-based decoding techniques.
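Full ICA is beyond a short sketch, but the time-frequency masking stage can be illustrated in isolation (an oracle binary mask on toy tones stands in for the mask a real system derives from the ICA outputs):

```python
import numpy as np

fs = 8000
t = np.arange(fs) / fs
s1 = np.sin(2 * np.pi * 440 * t)    # "speaker" 1 (toy tone)
s2 = np.sin(2 * np.pi * 1600 * t)   # interfering "speaker" 2 (toy tone)
mix = s1 + s2

# Binary time-frequency mask: keep only bins where source 1 is assumed
# dominant. Here the mask is oracle-given; a real system estimates it from
# the relative energy of the ICA output channels per bin.
spec = np.fft.rfft(mix)
freqs = np.fft.rfftfreq(len(mix), 1 / fs)
mask = freqs < 1000
rec1 = np.fft.irfft(spec * mask, n=len(mix))

err = np.sqrt(np.mean((rec1 - s1) ** 2)) / np.sqrt(np.mean(s1 ** 2))
print(f"relative error: {err:.3f}")  # relative error: 0.000
```

The hard zeroing is exactly the kind of nonlinear post-processing that creates artefacts in real speech, which is why the paper couples it with uncertainty-aware decoding rather than treating the masked features as exact.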

  13. Multidimensional Approach to the Development of a Mandarin Chinese-Oriented Sound Test

    Science.gov (United States)

    Hung, Yu-Chen; Lin, Chun-Yi; Tsai, Li-Chiun; Lee, Ya-Jung

    2016-01-01

    Purpose: Because the Ling six-sound test is based on American English phonemes, it can yield unreliable results when administered to non-English speakers. In this study, we aimed to improve specifically the diagnostic palette for Mandarin Chinese users by developing an adapted version of the Ling six-sound test. Method: To determine the set of…

  14. Intelligibility of Standard German and Low German to Speakers of Dutch

    NARCIS (Netherlands)

    Gooskens, C.S.; Kürschner, Sebastian; van Bezooijen, R.

    2011-01-01

    This paper reports on the intelligibility of spoken Low German and Standard German for speakers of Dutch. Two aspects are considered. First, the relative potential for intelligibility of the Low German variety of Bremen and the High German variety of Modern Standard German for speakers of Dutch is

  15. Speaker detection for conversational robots using synchrony between audio and video

    NARCIS (Netherlands)

    Noulas, A.; Englebienne, G.; Terwijn, B.; Kröse, B.; Hanheide, M.; Zender, H.

    2010-01-01

    This paper compares different methods for detecting the speaking person when multiple persons are interacting with a robot. We evaluate the state-of-the-art speaker detection methods on the iCat robot. These methods use the synchrony between audio and video to locate the most probable speaker. We

  16. Perception of English palatal codas by Korean speakers of English

    Science.gov (United States)

    Yeon, Sang-Hee

    2003-04-01

    This study aimed at looking at perception of English palatal codas by Korean speakers of English to determine if perception problems are the source of production problems. In particular, first, this study looked at the possible first language effect on the perception of English palatal codas. Second, a possible perceptual source of vowel epenthesis after English palatal codas was investigated. In addition, individual factors, such as length of residence, TOEFL score, gender and academic status, were compared to determine if those affected the varying degree of the perception accuracy. Eleven adult Korean speakers of English as well as three native speakers of English participated in the study. Three sets of a perception test including identification of minimally different English pseudo- or real words were carried out. The results showed that, first, the Korean speakers perceived the English codas significantly worse than the Americans. Second, the study supported the idea that Koreans perceived an extra /i/ after the final affricates due to final release. Finally, none of the individual factors explained the varying degree of the perceptional accuracy. In particular, TOEFL scores and the perception test scores did not have any statistically significant association.

  17. Evaluating acoustic speaker normalization algorithms: evidence from longitudinal child data.

    Science.gov (United States)

    Kohn, Mary Elizabeth; Farrington, Charlie

    2012-03-01

    Speaker vowel formant normalization, a technique that controls for variation introduced by physical differences between speakers, is necessary in variationist studies to compare speakers of different ages, genders, and physiological makeup in order to understand non-physiological variation patterns within populations. Many algorithms have been established to reduce variation introduced into vocalic data from physiological sources. The lack of real-time studies tracking the effectiveness of these normalization algorithms from childhood through adolescence inhibits exploration of child participation in vowel shifts. This analysis compares normalization techniques applied to data collected from ten African American children across five time points. Linear regressions compare the reduction in variation attributable to age and gender for each speaker for the vowels BEET, BAT, BOT, BUT, and BOAR. A normalization technique is successful if it maintains variation attributable to a reference sociolinguistic variable, while reducing variation attributable to age. Results indicate that normalization techniques which rely on both a measure of central tendency and range of the vowel space perform best at reducing variation attributable to age, although some variation attributable to age persists after normalization for some sections of the vowel space. © 2012 Acoustical Society of America
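One classic algorithm in the family evaluated here is Lobanov's z-score normalization, sketched below (named as a well-known example; the abstract does not specify which algorithms were compared):

```python
import numpy as np

def lobanov(f1, f2):
    """Lobanov z-score normalization: express each speaker's formants in units
    of that speaker's own mean and standard deviation, removing vocal-tract
    (age/sex) differences while preserving relative vowel positions."""
    z = lambda x: (x - np.mean(x)) / np.std(x)
    return z(np.asarray(f1, float)), z(np.asarray(f2, float))

# Toy data: a child-like speaker whose F1 values are a scaled and shifted
# version of an adult-like speaker's. The z-scores are identical afterwards.
adult_f1 = [300, 500, 700]
child_f1 = [1.3 * f + 100 for f in adult_f1]
za, _ = lobanov(adult_f1, [900, 1500, 2100])
zc, _ = lobanov(child_f1, [900, 1500, 2100])
print(np.allclose(za, zc))  # True: affine speaker differences are removed
```

The longitudinal problem the paper raises is visible in this sketch: because the per-speaker mean and range are re-estimated at each time point, age-driven physiological change and genuine sociolinguistic change can be confounded if the normalization is imperfect.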

  18. Do children go for the nice guys? The influence of speaker benevolence and certainty on selective word learning.

    Science.gov (United States)

    Bergstra, Myrthe; de Mulder, Hannah N. M.; Coopmans, Peter

    2018-04-06

    This study investigated how speaker certainty (a rational cue) and speaker benevolence (an emotional cue) influence children's willingness to learn words in a selective learning paradigm. In two experiments four- to six-year-olds learnt novel labels from two speakers and, after a week, their memory for these labels was reassessed. Results demonstrated that children retained the label-object pairings for at least a week. Furthermore, children preferred to learn from certain over uncertain speakers, but they had no significant preference for nice over nasty speakers. When the cues were combined, children followed certain speakers, even if they were nasty. However, children did prefer to learn from nice and certain speakers over nasty and certain speakers. These results suggest that rational cues regarding a speaker's linguistic competence trump emotional cues regarding a speaker's affective status in word learning. However, emotional cues were found to have a subtle influence on this process.

  19. Effects of Language Background on Gaze Behavior: A Crosslinguistic Comparison Between Korean and German Speakers

    Science.gov (United States)

    Goller, Florian; Lee, Donghoon; Ansorge, Ulrich; Choi, Soonja

    2017-01-01

    Languages differ in how they categorize spatial relations: While German differentiates between containment (in) and support (auf) with distinct spatial words—(a) den Kuli IN die Kappe stecken (”put pen in cap”); (b) die Kappe AUF den Kuli stecken (”put cap on pen”)—Korean uses a single spatial word (kkita) collapsing (a) and (b) into one semantic category, particularly when the spatial enclosure is tight-fit. Korean uses a different word (i.e., netha) for loose-fits (e.g., apple in bowl). We tested whether these differences influence the attention of the speaker. In a crosslinguistic study, we compared native German speakers with native Korean speakers. Participants rated the similarity of two successive video clips of several scenes where two objects were joined or nested (either in a tight or loose manner). The rating data show that Korean speakers base their rating of similarity more on tight- versus loose-fit, whereas German speakers base their rating more on containment versus support (in vs. auf). Throughout the experiment, we also measured the participants’ eye movements. Korean speakers looked equally long at the moving Figure object and at the stationary Ground object, whereas German speakers were more biased to look at the Ground object. Additionally, Korean speakers also looked more at the region where the two objects touched than did German speakers. We discuss our data in the light of crosslinguistic semantics and the extent of their influence on spatial cognition and perception. PMID:29362644

  20. The Sound of Voice: Voice-Based Categorization of Speakers' Sexual Orientation within and across Languages.

    Directory of Open Access Journals (Sweden)

    Simone Sulpizio

    Full Text Available Empirical research had initially shown that English listeners are able to identify the speakers' sexual orientation based on voice cues alone. However, the accuracy of this voice-based categorization, as well as its generalizability to other languages (language-dependency and to non-native speakers (language-specificity, has been questioned recently. Consequently, we address these open issues in 5 experiments: First, we tested whether Italian and German listeners are able to correctly identify sexual orientation of same-language male speakers. Then, participants of both nationalities listened to voice samples and rated the sexual orientation of both Italian and German male speakers. We found that listeners were unable to identify the speakers' sexual orientation correctly. However, speakers were consistently categorized as either heterosexual or gay on the basis of how they sounded. Moreover, a similar pattern of results emerged when listeners judged the sexual orientation of speakers of their own and of the foreign language. Overall, this research suggests that voice-based categorization of sexual orientation reflects the listeners' expectations of how gay voices sound rather than being an accurate detector of the speakers' actual sexual identity. Results are discussed with regard to accuracy, acoustic features of voices, language dependency and language specificity.

  1. Popular Public Discourse at Speakers' Corner: Negotiating Cultural Identities in Interaction

    DEFF Research Database (Denmark)

    McIlvenny, Paul

    1996-01-01

    In this paper I examine how cultural identities are actively negotiated in popular debate at a multicultural public setting in London. Speakers at Speakers' Corner manage the local construction of group affiliation, audience response and argument in and through talk, within the context of ethnic...... in which participant 'citizens' in the public sphere can actively struggle over cultural representation and identities. Using transcribed examples of video data recorded at Speakers' Corner my paper will examine how cultural identity is invoked in the management of active participation. Audiences...... and their affiliations are regulated and made accountable through the routines of membership categorisation and the policing of cultural identities and their imaginary borders....

  2. Proficiency in English sentence stress production by Cantonese speakers who speak English as a second language (ESL).

    Science.gov (United States)

    Ng, Manwa L; Chen, Yang

    2011-12-01

    The present study examined English sentence stress produced by native Cantonese speakers who were speaking English as a second language (ESL). Cantonese ESL speakers' proficiency in English stress production, as perceived by English-speaking listeners, was also studied. Acoustical parameters associated with sentence stress, including fundamental frequency (F0), vowel duration, and intensity, were measured from the English sentences produced by 40 Cantonese ESL speakers. Data were compared with those obtained from 40 native speakers of American English. The speech samples were also judged for placement, degree, and naturalness of stress by eight listeners who were native speakers of American English. Results showed that Cantonese ESL speakers were able to use F0, vowel duration, and intensity to differentiate sentence stress patterns. Yet, both female and male Cantonese ESL speakers exhibited consistently higher F0 in stressed words than English speakers. Overall, Cantonese ESL speakers were found to be proficient in using duration and intensity to signal sentence stress, in a way comparable with English speakers. In addition, F0 and intensity were found to correlate closely with perceptual judgement and the degree of stress with the naturalness of stress.
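    As an aside on the acoustic measures involved, a toy autocorrelation-based F0 estimator can be sketched as follows (illustrative only; a study like this would presumably use dedicated phonetics software rather than code like this):

```python
import math

def estimate_f0(signal, fs, fmin=75.0, fmax=400.0):
    """Toy pitch estimator: pick the lag with the largest
    autocorrelation inside the plausible pitch-period range."""
    lo, hi = int(fs / fmax), int(fs / fmin)
    best_lag, best_r = lo, float("-inf")
    for lag in range(lo, min(hi, len(signal) - 1) + 1):
        # Unnormalized autocorrelation at this lag.
        r = sum(signal[i] * signal[i + lag] for i in range(len(signal) - lag))
        if r > best_r:
            best_r, best_lag = r, lag
    return fs / best_lag

# 0.1 s of a synthetic 200 Hz vowel-like tone sampled at 8 kHz:
tone = [math.sin(2 * math.pi * 200 * i / 8000) for i in range(800)]
```

    On this clean tone the estimator returns 200 Hz; real speech would require windowing and voicing decisions that are omitted here.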

  3. Articulatory Movements during Vowels in Speakers with Dysarthria and Healthy Controls

    Science.gov (United States)

    Yunusova, Yana; Weismer, Gary; Westbury, John R.; Lindstrom, Mary J.

    2008-01-01

    Purpose: This study compared movement characteristics of markers attached to the jaw, lower lip, tongue blade, and dorsum during production of selected English vowels by normal speakers and speakers with dysarthria due to amyotrophic lateral sclerosis (ALS) or Parkinson disease (PD). The study asked the following questions: (a) Are movement…

  4. Speaker gender identification based on majority vote classifiers

    Science.gov (United States)

    Mezghani, Eya; Charfeddine, Maha; Nicolas, Henri; Ben Amar, Chokri

    2017-03-01

    Speaker gender identification is considered among the most important tools in several multimedia applications, namely automatic speech recognition, interactive voice response systems and audio browsing systems. The performance of gender identification systems is closely linked to the selected feature set and the employed classification model. Typical techniques are based on selecting the best-performing classification method or searching for the optimum tuning of one classifier's parameters through experimentation. In this paper, we consider a relevant and rich set of features involving pitch and MFCCs as well as other temporal and frequency-domain descriptors. Five classification models, including decision tree, discriminant analysis, naïve Bayes, support vector machine and k-nearest neighbor, were evaluated. The three best-performing classifiers among the five contribute by majority voting on their scores. Experiments were performed on three different datasets spoken in three languages, English, German and Arabic, in order to validate the language independency of the proposed scheme. Results confirm that the presented system reaches a satisfying accuracy rate and promising classification performance thanks to the discriminating abilities and diversity of the used features combined with mid-level statistics.
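    The top-3 selection and majority-vote fusion described above can be sketched as follows (a minimal stdlib-only illustration; the classifier names and validation accuracies are hypothetical, not taken from the paper):

```python
from collections import Counter

def top_k_by_accuracy(scores, k=3):
    """Return the names of the k best-scoring classifiers.
    `scores` maps classifier name -> validation accuracy."""
    return sorted(scores, key=scores.get, reverse=True)[:k]

def majority_vote(votes):
    """Fuse per-classifier gender labels by majority vote."""
    return Counter(votes).most_common(1)[0][0]

# Hypothetical validation accuracies for the five candidate models:
val_acc = {"tree": 0.88, "lda": 0.90, "nb": 0.85, "svm": 0.93, "knn": 0.91}
best3 = top_k_by_accuracy(val_acc)

# Hypothetical per-classifier decisions for one test utterance:
decisions = {"svm": "female", "knn": "male", "lda": "female"}
label = majority_vote([decisions[m] for m in best3])
```

    With these made-up numbers the selected trio is svm/knn/lda and the fused label is "female"; the paper's actual fusion operates on classifier scores rather than hard labels.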

  5. Use of the BAT with a Cantonese-Putonghua Speaker with Aphasia

    Science.gov (United States)

    Kong, Anthony Pak-Hin; Weekes, Brendan Stuart

    2011-01-01

    The aim of this article is to illustrate the use of the Bilingual Aphasia Test (BAT) with a Cantonese-Putonghua speaker. We describe G, who is a relatively young Chinese bilingual speaker with aphasia. G's communication abilities in his L2, Putonghua, were impaired following brain damage. This impairment caused specific difficulties in…

  6. Processing advantage for emotional words in bilingual speakers.

    Science.gov (United States)

    Ponari, Marta; Rodríguez-Cuadrado, Sara; Vinson, David; Fox, Neil; Costa, Albert; Vigliocco, Gabriella

    2015-10-01

    Effects of emotion on word processing are well established in monolingual speakers. However, studies that have assessed whether affective features of words undergo the same processing in a native and nonnative language have provided mixed results: Studies that have found differences between native language (L1) and second language (L2) processing attributed the difference to the fact that L2 learned late in life would not be processed affectively, because affective associations are established during childhood. Other studies suggest that adult learners show similar effects of emotional features in L1 and L2. Differences in affective processing of L2 words can be linked to age and context of learning, proficiency, language dominance, and degree of similarity between L2 and L1. Here, in a lexical decision task on tightly matched negative, positive, and neutral words, highly proficient English speakers from typologically different L1s showed the same facilitation in processing emotionally valenced words as native English speakers, regardless of their L1, the age of English acquisition, or the frequency and context of English use. (c) 2015 APA, all rights reserved.

  7. Aligning the unalignable: bacteriophage whole genome alignments.

    Science.gov (United States)

    Bérard, Sèverine; Chateau, Annie; Pompidor, Nicolas; Guertin, Paul; Bergeron, Anne; Swenson, Krister M

    2016-01-13

    In recent years, many studies focused on the description and comparison of large sets of related bacteriophage genomes. Due to the peculiar mosaic structure of these genomes, few informative approaches for comparing whole genomes exist: dot plots diagrams give a mostly qualitative assessment of the similarity/dissimilarity between two or more genomes, and clustering techniques are used to classify genomes. Multiple alignments are conspicuously absent from this scene. Indeed, whole genome aligners interpret lack of similarity between sequences as an indication of rearrangements, insertions, or losses. This behavior makes them ill-prepared to align bacteriophage genomes, where even closely related strains can accomplish the same biological function with highly dissimilar sequences. In this paper, we propose a multiple alignment strategy that exploits functional collinearity shared by related strains of bacteriophages, and uses partial orders to capture mosaicism of sets of genomes. As classical alignments do, the computed alignments can be used to predict that genes have the same biological function, even in the absence of detectable similarity. The Alpha aligner implements these ideas in visual interactive displays, and is used to compute several examples of alignments of Staphylococcus aureus and Mycobacterium bacteriophages, involving up to 29 genomes. Using these datasets, we prove that Alpha alignments are at least as good as those computed by standard aligners. Comparison with the progressive Mauve aligner - which implements a partial order strategy, but whose alignments are linearized - shows a greatly improved interactive graphic display, while avoiding misalignments. Multiple alignments of whole bacteriophage genomes work, and will become an important conceptual and visual tool in comparative genomics of sets of related strains. A Python implementation of Alpha, along with installation instructions for Ubuntu and OSX, is available on bitbucket (https://bitbucket.org/thekswenson/alpha).

  8. Halo Intrinsic Alignment: Dependence on Mass, Formation Time, and Environment

    Energy Technology Data Exchange (ETDEWEB)

    Xia, Qianli; Kang, Xi; Wang, Peng; Luo, Yu [Purple Mountain Observatory, the Partner Group of MPI für Astronomie, 2 West Beijing Road, Nanjing 210008 (China); Yang, Xiaohu; Jing, Yipeng [Center for Astronomy and Astrophysics, Shanghai Jiao Tong University, Shanghai 200240 (China); Wang, Huiyuan [Key Laboratory for Research in Galaxies and Cosmology, Department of Astronomy, University of Science and Technology of China, Hefei, Anhui 230026 (China); Mo, Houjun, E-mail: kangxi@pmo.ac.cn [Astronomy Department and Center for Astrophysics, Tsinghua University, Beijing 10084 (China)

    2017-10-10

    In this paper we use high-resolution cosmological simulations to study halo intrinsic alignment and its dependence on mass, formation time, and large-scale environment. In agreement with previous studies using N-body simulations, it is found that massive halos have stronger alignment. For the first time, we find that for a given halo mass older halos have stronger alignment and halos in cluster regions also have stronger alignment than those in filaments. To model these dependencies, we extend the linear alignment model with inclusion of halo bias and find that the halo alignment with its mass and formation time dependence can be explained by halo bias. However, the model cannot account for the environment dependence, as it is found that halo bias is lower in clusters and higher in filaments. Our results suggest that halo bias and environment are independent factors in determining halo alignment. We also study the halo alignment correlation function and find that halos are strongly clustered along their major axes and less clustered along the minor axes. The correlated halo alignment can extend to scales as large as 100 h⁻¹ Mpc, where its feature is mainly driven by the baryon acoustic oscillation effect.

  9. Multi-spectrometer calibration transfer based on independent component analysis.

    Science.gov (United States)

    Liu, Yan; Xu, Hao; Xia, Zhenzhen; Gong, Zhiyong

    2018-02-26

    Calibration transfer is indispensable for practical applications of near infrared (NIR) spectroscopy due to the need for precise and consistent measurements across different spectrometers. In this work, a method for multi-spectrometer calibration transfer is described based on independent component analysis (ICA). A spectral matrix is first obtained by aligning the spectra measured on different spectrometers. Then, by using independent component analysis, the aligned spectral matrix is decomposed into the mixing matrix and the independent components of different spectrometers. These differing measurements between spectrometers can then be standardized by correcting the coefficients within the independent components. Two NIR datasets of corn and edible oil samples measured with three and four spectrometers, respectively, were used to test the reliability of this method. The results of both datasets reveal that spectra measured on different spectrometers can be transferred simultaneously and that partial least squares (PLS) models built with the measurements from one spectrometer can correctly predict samples whose spectra were transferred from another.
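    A minimal numpy-only sketch of the ICA decomposition step follows (a toy symmetric FastICA on synthetic signals; the signals, mixing matrices and all numbers are hypothetical, and the paper's actual implementation and coefficient-correction step are not reproduced):

```python
import numpy as np

def fastica(X, n_components, n_iter=200, seed=0):
    """Minimal symmetric FastICA with a tanh nonlinearity.
    X: (n_samples, n_features). Returns estimated sources S
    such that X is approximately S times a mixing matrix."""
    X = X - X.mean(axis=0)
    # Whiten via SVD so projected data has identity covariance.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    K = (Vt[:n_components] / s[:n_components, None]) * np.sqrt(len(X))
    Xw = X @ K.T
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((n_components, n_components))
    for _ in range(n_iter):
        g = np.tanh(Xw @ W.T)
        W_new = (g.T @ Xw) / len(X) - np.diag((1 - g**2).mean(axis=0)) @ W
        u, _, vt = np.linalg.svd(W_new)   # symmetric decorrelation
        W = u @ vt
    return Xw @ W.T

# Demo: two non-Gaussian sources observed through a hypothetical
# 3-channel mixing matrix, then unmixed again.
t = np.linspace(0, 8, 1000)
S_true = np.c_[np.sign(np.sin(3 * t)), t % 1.0]      # square + sawtooth
A = np.array([[1.0, 0.5], [0.4, 1.2], [0.3, 0.8]])
X = S_true @ A.T
S_est = fastica(X, n_components=2)
# Each true source should be recovered up to sign/permutation/scale.
corrs = np.abs(np.corrcoef(S_true.T, S_est.T)[:2, 2:])
```

    In the calibration-transfer setting, the recovered components play the role of the shared spectral factors, while the per-spectrometer differences live in the mixing coefficients that the method corrects.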

  10. Methods of Speakers\\' Effects on the Audience

    Directory of Open Access Journals (Sweden)

    فریبا حسینی

    2010-09-01

    Full Text Available Methods of Speakers' Effects on the Audience    Nasrollah Shameli *   Fariba Hosayni **     Abstract   This article addresses four issues. The first concerns the speaker's external appearance, including the beauty of the face, the power of the voice, gestures with the hand, stick and eyebrows, and height; such characteristics can have an important effect on the audience. The second concerns the internal features of the speaker, including the preacher's ethics, piety and intention, as well as personality, habits and emotions, knowledge and culture, and speed of learning. The third concerns the form of the lecture: words and phrases should be clear and correct, and interwoven with Quranic verses, poetry and proverbs appropriate to their meaning. The final issue concerns the content: the subject of the talk should match the audience's level of understanding and be new and interesting to them, since an innovative theme affects the audience all the more.     Key words: Oratory, Preacher, Audience, Influence of speech     * Associate Professor, Department of Arabic Language and Literature, University of Isfahan E-mail: Dr-Nasrolla Shameli@Yahoo.com   ** M.A. in Arabic Language and Literature from Isfahan University E-mail: faribahosayni@yahoo.com

  11. Gender parity trends for invited speakers at four prominent virology conference series.

    Science.gov (United States)

    Kalejta, Robert F; Palmenberg, Ann C

    2017-06-07

    Scientific conferences are most beneficial to participants when they showcase significant new experimental developments, accurately summarize the current state of the field, and provide strong opportunities for collaborative networking. A top-notch slate of invited speakers, assembled by conference organizers or committees, is key to achieving these goals. The perceived underrepresentation of female speakers at prominent scientific meetings is currently a popular topic for discussion, but one that often lacks supportive data. We compiled the full rosters of invited speakers over the last 35 years for four prominent international virology conferences, the American Society for Virology Annual Meeting (ASV), the International Herpesvirus Workshop (IHW), the Positive-Strand RNA Virus Symposium (PSR), and the Gordon Research Conference on Viruses & Cells (GRC). The rosters were cross-indexed by unique names, gender, year, and repeat invitations. When plotted as gender-dependent trends over time, all four conferences showed a clear proclivity for male-dominated invited speaker lists. Encouragingly, shifts toward parity are emerging within all units, but at different rates. Not surprisingly, both selection of a larger percentage of first-time participants and the presence of a woman on the speaker selection committee correlated with improved parity. Session chair information was also collected for the IHW and GRC. These visible positions also displayed a strong male dominance over time that is eroding slowly. We offer our personal interpretation of these data to aid future organizers achieve improved equity among the limited number of available positions for session moderators and invited speakers. IMPORTANCE Politicians and media members have a tendency to cite anecdotes as conclusions without any supporting data. This happens so frequently now that a name for it has emerged: fake news. Good science proceeds otherwise. The underrepresentation of women as invited

  12. Towards an Intelligent Acoustic Front End for Automatic Speech Recognition: Built-in Speaker Normalization

    Directory of Open Access Journals (Sweden)

    Umit H. Yapanel

    2008-08-01

    Full Text Available A proven method for achieving effective automatic speech recognition (ASR) in the face of speaker differences is to perform acoustic feature speaker normalization. More effective speaker normalization methods are needed which require limited computing resources for real-time performance. The most popular speaker normalization technique is vocal-tract length normalization (VTLN), despite the fact that it is computationally expensive. In this study, we propose a novel online VTLN algorithm entitled built-in speaker normalization (BISN), where normalization is performed on-the-fly within a newly proposed PMVDR acoustic front end. The novel aspect of the algorithm is that conventional front-end processing with PMVDR and VTLN requires two separate warping phases, while the proposed BISN method uses only a single speaker-dependent warp to achieve both the PMVDR perceptual warp and the VTLN warp simultaneously. This improved integration unifies the nonlinear warping performed in the front end and reduces computational requirements, thereby offering advantages for real-time ASR systems. Evaluations are performed for (i) an in-car extended digit recognition task, where an on-the-fly BISN implementation reduces the relative word error rate (WER) by 24%, and (ii) a diverse noisy speech task (SPINE 2), where the relative WER improvement was 9%, both relative to the baseline speaker normalization method.
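    As a rough illustration of the kind of frequency warping VTLN performs, here is one common piecewise-linear warp formulation (a sketch only; the paper's combined PMVDR/BISN warp is not reproduced here):

```python
def vtln_warp(freq, alpha, f_max, breakpoint=0.85):
    """Piecewise-linear VTLN frequency warp (a common formulation).

    Frequencies below a breakpoint are scaled by the warp factor
    alpha; the remaining band is mapped linearly so that f_max is
    preserved and the warp stays continuous and invertible.
    """
    f0 = breakpoint * f_max if alpha <= 1 else breakpoint * f_max / alpha
    if freq <= f0:
        return alpha * freq
    # Linear segment joining (f0, alpha*f0) to (f_max, f_max).
    return alpha * f0 + (f_max - alpha * f0) * (freq - f0) / (f_max - f0)
```

    With alpha = 1 the warp is the identity, and for any alpha the Nyquist frequency maps to itself, which is what makes such warps usable inside a fixed filterbank front end.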

  13. Towards an Intelligent Acoustic Front End for Automatic Speech Recognition: Built-in Speaker Normalization

    Directory of Open Access Journals (Sweden)

    Yapanel UmitH

    2008-01-01

    Full Text Available A proven method for achieving effective automatic speech recognition (ASR) in the face of speaker differences is to perform acoustic feature speaker normalization. More effective speaker normalization methods are needed which require limited computing resources for real-time performance. The most popular speaker normalization technique is vocal-tract length normalization (VTLN), despite the fact that it is computationally expensive. In this study, we propose a novel online VTLN algorithm entitled built-in speaker normalization (BISN), where normalization is performed on-the-fly within a newly proposed PMVDR acoustic front end. The novel aspect of the algorithm is that conventional front-end processing with PMVDR and VTLN requires two separate warping phases, while the proposed BISN method uses only a single speaker-dependent warp to achieve both the PMVDR perceptual warp and the VTLN warp simultaneously. This improved integration unifies the nonlinear warping performed in the front end and reduces computational requirements, thereby offering advantages for real-time ASR systems. Evaluations are performed for (i) an in-car extended digit recognition task, where an on-the-fly BISN implementation reduces the relative word error rate (WER) by 24%, and (ii) a diverse noisy speech task (SPINE 2), where the relative WER improvement was 9%, both relative to the baseline speaker normalization method.

  14. Language control in different contexts: the behavioural ecology of bilingual speakers

    Directory of Open Access Journals (Sweden)

    David William Green

    2011-05-01

    Full Text Available This paper proposes that different experimental contexts (single or dual language contexts permit different neural loci at which words in the target language can be selected. However, in order to develop a fuller understanding of the neural circuit mediating language control we need to consider the community context in which bilingual speakers typically use their two languages (the behavioural ecology of bilingual speakers. The contrast between speakers from code-switching and non-code switching communities offers a way to increase our understanding of the cortical, subcortical and, in particular, cerebellar structures involved in language control. It will also help us identify the non-verbal behavioural correlates associated with these control processes.

  15. Artificially intelligent recognition of Arabic speaker using voice print-based local features

    Science.gov (United States)

    Mahmood, Awais; Alsulaiman, Mansour; Muhammad, Ghulam; Akram, Sheeraz

    2016-11-01

    Local features for any pattern recognition system are based on information extracted locally. In this paper, a local feature extraction technique was developed. This feature was extracted in the time-frequency plane by taking the moving average along the diagonal directions of the time-frequency plane. This feature captures the time-frequency events, producing a unique pattern for each speaker that can be viewed as a voice print of the speaker. Hence, we refer to this technique as a voice print-based local feature. The proposed feature was compared to other features, including the mel-frequency cepstral coefficient (MFCC), for speaker recognition using two different databases. One of the databases used in the comparison is a subset of an LDC database consisting of two short sentences uttered by 182 speakers. The proposed feature attained a 98.35% recognition rate, compared to 96.7% for MFCC, using the LDC subset.
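    The diagonal moving-average idea can be sketched as follows (an illustrative re-implementation on a toy matrix, not the authors' code; window length and the handling of short diagonals are assumptions):

```python
import numpy as np

def diagonal_moving_average(tf, win=3):
    """Slide an averaging window along each diagonal of a
    time-frequency matrix and concatenate the results, yielding a
    'voice print'-style local feature vector."""
    rows, cols = tf.shape
    kernel = np.ones(win) / win
    feats = []
    for offset in range(-(rows - 1), cols):
        diag = np.diagonal(tf, offset=offset)
        if len(diag) >= win:   # skip diagonals shorter than the window
            feats.append(np.convolve(diag, kernel, mode="valid"))
    return np.concatenate(feats)

# Tiny 4x4 stand-in "spectrogram": each output value averages three
# consecutive cells along one diagonal.
demo = diagonal_moving_average(np.arange(16.0).reshape(4, 4))
```

    On real data `tf` would be a spectrogram (time frames by frequency bins), so each averaged diagonal tracks a joint time-frequency trajectory.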

  16. Objective eye-gaze behaviour during face-to-face communication with proficient alaryngeal speakers: a preliminary study.

    Science.gov (United States)

    Evitts, Paul; Gallop, Robert

    2011-01-01

    There is a large body of research demonstrating the impact of visual information on speaker intelligibility in both normal and disordered speaker populations. However, there is minimal information on which specific visual features listeners find salient during conversational discourse. To investigate listeners' eye-gaze behaviour during face-to-face conversation with normal, laryngeal and proficient alaryngeal speakers. Sixty participants individually participated in a 10-min conversation with one of four speakers (typical laryngeal, tracheoesophageal, oesophageal, electrolaryngeal; 15 participants randomly assigned to one mode of speech). All speakers were > 85% intelligible and were judged to be 'proficient' by two certified speech-language pathologists. Participants were fitted with a head-mounted eye-gaze tracking device (Mobile Eye, ASL) that calculated the region of interest and mean duration of eye-gaze. Self-reported gaze behaviour was also obtained following the conversation using a 10 cm visual analogue scale. While listening, participants viewed the lower facial region of the oesophageal speaker more than the normal or tracheoesophageal speaker. Results of non-hierarchical cluster analyses showed that while listening, the pattern of eye-gaze was predominantly directed at the lower face of the oesophageal and electrolaryngeal speaker and more evenly dispersed among the background, lower face, and eyes of the normal and tracheoesophageal speakers. Finally, results show a low correlation between self-reported eye-gaze behaviour and objective regions of interest data. Overall, results suggest similar eye-gaze behaviour when healthy controls converse with normal and tracheoesophageal speakers and that participants had significantly different eye-gaze patterns when conversing with an oesophageal speaker. Results are discussed in terms of existing eye-gaze data and its potential implications on auditory-visual speech perception. 
© 2011 Royal College of Speech

  17. Speaker Prediction based on Head Orientations

    NARCIS (Netherlands)

    Rienks, R.J.; Poppe, Ronald Walter; van Otterlo, M.; Poel, Mannes; Poel, M.; Nijholt, A.; Nijholt, Antinus

    2005-01-01

    To gain insight into gaze behavior in meetings, this paper compares the results from a Naive Bayes classifier, Neural Networks and humans on speaker prediction in four-person meetings given solely the azimuth head angles. The Naive Bayes classifier scored 69.4% correctly, Neural Networks 62.3% and
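    A Gaussian naive Bayes speaker predictor over azimuth head angles can be sketched as follows (stdlib-only; the angles and labels below are synthetic stand-ins, not the meeting data used in the paper):

```python
import math

class TinyGaussianNB:
    """Minimal Gaussian naive Bayes classifier (illustrative only)."""

    def fit(self, X, y):
        self.classes = sorted(set(y))
        self.priors, self.stats = {}, {}
        for c in self.classes:
            rows = [x for x, label in zip(X, y) if label == c]
            n = len(rows)
            self.priors[c] = n / len(X)
            means = [sum(col) / n for col in zip(*rows)]
            variances = [sum((v - m) ** 2 for v in col) / n + 1e-6
                         for col, m in zip(zip(*rows), means)]
            self.stats[c] = (means, variances)
        return self

    def predict(self, x):
        def log_posterior(c):
            means, variances = self.stats[c]
            lp = math.log(self.priors[c])
            for xi, m, v in zip(x, means, variances):
                lp += -0.5 * math.log(2 * math.pi * v) - (xi - m) ** 2 / (2 * v)
            return lp
        return max(self.classes, key=log_posterior)

# Hypothetical azimuth head angles (degrees) of the four meeting
# participants; the label is the person currently speaking.
X_train = [[2, -3, 1, 0], [1, 2, -1, 3], [-2, 0, 2, 1],
           [88, 92, 90, 91], [91, 89, 92, 88], [90, 90, 87, 93]]
y_train = ["A", "A", "A", "B", "B", "B"]
model = TinyGaussianNB().fit(X_train, y_train)
```

    Given a new frame of head angles, `model.predict` returns the speaker whose class-conditional Gaussians best explain the observed azimuths.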

  18. The AMI speaker diarization system for NIST RT06s meeting data

    NARCIS (Netherlands)

    Leeuwen, D.A. van; Huijbregts, Marijn

    2006-01-01

    We describe the systems submitted to the NIST RT06s evaluation for the Speech Activity Detection (SAD) and Speaker Diarization (SPKR) tasks. For speech activity detection, a new analysis methodology is presented that generalizes the Detection Error Tradeoff analysis commonly used in speaker

  19. The AMI speaker diarization system for NIST RT06s meeting data

    NARCIS (Netherlands)

    van Leeuwen, David A.; Huijbregts, M.A.H.

    2007-01-01

    We describe the systems submitted to the NIST RT06s evaluation for the Speech Activity Detection (SAD) and Speaker Diarization (SPKR) tasks. For speech activity detection, a new analysis methodology is presented that generalizes the Detection Error Tradeoff analysis commonly used in speaker detection

  20. An acoustic analysis of English vowels produced by speakers of seven different native-language backgrounds

    NARCIS (Netherlands)

    Heuven, van V.J.J.P.; Gooskens, C.

    2017-01-01

    We measured F1, F2 and duration of ten English monophthongs produced by American native speakers and by Danish, Norwegian, Swedish, Dutch, Hungarian and Chinese L2 speakers. We hypothesized that (i) L2 speakers would approximate the English vowels more closely as the phonological distance between

  1. Three-dimensional hindfoot alignment measurements based on biplanar radiographs: comparison with standard radiographic measurements

    International Nuclear Information System (INIS)

    Sutter, Reto; Pfirrmann, Christian W.A.; Buck, Florian M.; Espinosa, Norman

    2013-01-01

    To establish a hindfoot alignment measurement technique based on low-dose biplanar radiographs and compare with hindfoot alignment measurements on long axial view radiographs, which is the current reference standard. Long axial view radiographs and low-dose biplanar radiographs of a phantom consisting of a human foot skeleton embedded in acrylic glass (phantom A) and a plastic model of a human foot in three different hindfoot positions (phantoms B1-B3) were imaged in different foot positions (20° internal to 20° external rotation). Two independent readers measured hindfoot alignment on long axial view radiographs and performed 3D hindfoot alignment measurements based on biplanar radiographs on two different occasions. Time for three-dimensional (3D) measurements was determined. Intraclass correlation coefficients (ICC) were calculated. Hindfoot alignment measurements on long axial view radiographs were characterized by a large positional variation, with a range of 14°/13° valgus to 22°/27° varus (reader 1/2 for phantom A), whereas the range of 3D hindfoot alignment measurements was 7.3°/6.0° to 9.0°/10.5° varus (reader 1/2 for phantom A), with a mean and standard deviation of 8.1° ± 0.6°/8.7° ± 1.4°, respectively. Interobserver agreement was high (ICC = 0.926 for phantom A, and ICC = 0.886 for phantoms B1-B3), and agreement between different readouts was high (ICC = 0.895-0.995 for reader 1, and ICC = 0.987-0.994 for reader 2) for 3D measurements. Mean duration of 3D measurements was 84 ± 15/113 ± 15 s for reader 1/2. Three-dimensional hindfoot alignment measurements based on biplanar radiographs were independent of foot positioning during image acquisition and reader independent. In this phantom study, the 3D measurements were substantially more precise than the standard radiographic measurements. (orig.)
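    Interobserver agreement of this kind is quantified with intraclass correlation coefficients; a minimal one-way ICC(1,1) sketch follows (one common variant; the paper does not state which ICC form was used, and the numbers below are illustrative, not the study's data):

```python
def icc_oneway(ratings):
    """One-way random-effects ICC(1,1).

    `ratings` is a list of targets, each holding the same number of
    rater measurements for that target.
    """
    n, k = len(ratings), len(ratings[0])
    grand = sum(sum(r) for r in ratings) / (n * k)
    target_means = [sum(r) / k for r in ratings]
    ms_between = k * sum((m - grand) ** 2 for m in target_means) / (n - 1)
    ms_within = sum((x - m) ** 2
                    for r, m in zip(ratings, target_means)
                    for x in r) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)

# Two readers measuring alignment angles (degrees) on three feet:
perfect = icc_oneway([[8.1, 8.1], [7.3, 7.3], [9.0, 9.0]])  # identical readers
noisy = icc_oneway([[8.1, 8.7], [7.3, 6.0], [9.0, 10.5]])   # disagreeing readers
```

    Identical readers give an ICC of exactly 1.0, while rater disagreement pushes the coefficient toward 0.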

  2. "Non-Vocalization": A Phonological Error Process in the Speech of Severely and Profoundly Hearing Impaired Adults, from the Point of View of the Theory of Phonology as Human Behaviour

    Science.gov (United States)

    Halpern, Orly; Tobin, Yishai

    2008-01-01

    "Non-vocalization" (N-V) is a newly described phonological error process in hearing impaired speakers. In N-V the hearing impaired person actually articulates the phoneme but without producing a voice. The result is an error process looking as if it is produced but sounding as if it is omitted. N-V was discovered by video recording the speech of…

  3. The Acquisition of English Focus Marking by Non-Native Speakers

    Science.gov (United States)

    Baker, Rachel Elizabeth

    This dissertation examines Mandarin and Korean speakers' acquisition of English focus marking, which is realized by accenting particular words within a focused constituent. It is important for non-native speakers to learn how accent placement relates to focus in English because appropriate accent placement and realization makes a learner's English more native-like and easier to understand. Such knowledge may also improve their English comprehension skills. In this study, 20 native English speakers, 20 native Mandarin speakers, and 20 native Korean speakers participated in four experiments: (1) a production experiment, in which they were recorded reading the answers to questions, (2) a perception experiment, in which they were asked to determine which word in a recording was the last prominent word, (3) an understanding experiment, in which they were asked whether the answers in recorded question-answer pairs had context-appropriate prosody, and (4) an accent placement experiment, in which they were asked which word they would make prominent in a particular context. Finally, a new group of native English speakers listened to utterances produced in the production experiment, and determined whether the prosody of each utterance was appropriate for its context. The results of the five experiments support a novel predictive model for second language prosodic focus marking acquisition. This model holds that both transfer of linguistic features from a learner's native language (L1) and features of their second language (L2) affect learners' acquisition of prosodic focus marking. As a result, the model includes two complementary components: the Transfer Component and the L2 Challenge Component. The Transfer Component predicts that prosodic structures in the L2 will be more easily acquired by language learners that have similar structures in their L1 than those who do not, even if there are differences between the L1 and L2 in how the structures are realized. The L2

  4. Speaker transfer in children's peer conversation: completing communication-aid-mediated contributions.

    Science.gov (United States)

    Clarke, Michael; Bloch, Steven; Wilkinson, Ray

    2013-03-01

    Managing the exchange of speakers from one person to another effectively is a key issue for participants in everyday conversational interaction. Speakers use a range of resources to indicate, in advance, when their turn will come to an end, and listeners attend to such signals in order to know when they might legitimately speak. Using the principles and findings from conversation analysis, this paper examines features of speaker transfer in a conversation between a boy with cerebral palsy who has been provided with a voice-output communication aid (VOCA), and a peer without physical or communication difficulties. Specifically, the analysis focuses on turn exchange where a VOCA-mediated contribution approaches completion and the child without communication needs is due to speak next.

  5. Comparing headphone and speaker effects on simulated driving.

    Science.gov (United States)

    Nelson, T M; Nilsson, T H

    1990-12-01

    Twelve persons drove for three hours in an automobile simulator while listening to music at sound level 63 dB over stereo headphones during one session and from a dashboard speaker during another session. They were required to steer a mountain highway, maintain a certain indicated speed, shift gears, and respond to occasional hazards. Steering and speed control were dependent on visual cues. The need to shift and the hazards were indicated by sound and vibration effects. With the headphones, the driver's average reaction time for the most complex task presented--shifting gears--was about one-third second longer than with the speaker. The use of headphones did not delay the development of subjective fatigue.

  6. Finding optimal interaction interface alignments between biological complexes

    KAUST Repository

    Cui, Xuefeng

    2015-06-13

    Motivation: Biological molecules perform their functions through interactions with other molecules. Structure alignment of interaction interfaces between biological complexes is an indispensable step in detecting their structural similarities, which are keys to understanding their evolutionary histories and functions. Although various structure alignment methods have been developed to successfully access the similarities of protein structures or certain types of interaction interfaces, existing alignment tools cannot directly align arbitrary types of interfaces formed by protein, DNA or RNA molecules. Specifically, they require a 'blackbox preprocessing' to standardize interface types and chain identifiers. Yet their performance is limited and sometimes unsatisfactory. Results: Here we introduce a novel method, PROSTA-inter, that automatically determines and aligns interaction interfaces between two arbitrary types of complex structures. Our method uses sequentially remote fragments to search for the optimal superimposition. The optimal residue matching problem is then formulated as a maximum weighted bipartite matching problem to detect the optimal sequence order-independent alignment. Benchmark evaluation on all non-redundant protein-DNA complexes in PDB shows significant performance improvement of our method over TM-align and iAlign (with the 'blackbox preprocessing'). Two case studies where our method discovers, for the first time, structural similarities between two pairs of functionally related protein-DNA complexes are presented. We further demonstrate the power of our method on detecting structural similarities between a protein-protein complex and a protein-RNA complex, which is biologically known as a protein-RNA mimicry case. © The Author 2015. Published by Oxford University Press.
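    The optimal residue matching step is formulated above as maximum weighted bipartite matching; for a tiny instance it can be sketched by exhaustive search (illustrative only: the weights are hypothetical, and a real aligner would use an efficient polynomial-time algorithm such as the Hungarian method to find the same optimum):

```python
from itertools import permutations

def max_weight_matching(weights):
    """Exhaustive maximum-weight bipartite matching for tiny instances.

    weights[i][j] is a hypothetical similarity score between residue i
    of one interface and residue j of the other (requires
    len(weights) <= len(weights[0])).
    """
    n_a, n_b = len(weights), len(weights[0])
    best_score, best_pairs = float("-inf"), None
    for perm in permutations(range(n_b), n_a):
        score = sum(weights[i][j] for i, j in enumerate(perm))
        if score > best_score:
            best_score, best_pairs = score, list(enumerate(perm))
    return best_score, best_pairs

score, pairs = max_weight_matching([[3, 1, 2], [2, 4, 6], [5, 9, 7]])
```

    For this toy weight matrix the optimal matching pairs residue 0 with 0, 1 with 2, and 2 with 1, for a total weight of 18; brute force is factorial-time and only viable for demonstration.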

  7. Speaker information affects false recognition of unstudied lexical-semantic associates.

    Science.gov (United States)

    Luthra, Sahil; Fox, Neal P; Blumstein, Sheila E

    2018-05-01

    Recognition of and memory for a spoken word can be facilitated by a prior presentation of that word spoken by the same talker. However, it is less clear whether this speaker congruency advantage generalizes to facilitate recognition of unheard related words. The present investigation employed a false memory paradigm to examine whether information about a speaker's identity in items heard by listeners could influence the recognition of novel items (critical intruders) phonologically or semantically related to the studied items. In Experiment 1, false recognition of semantically associated critical intruders was sensitive to speaker information, though only when subjects attended to talker identity during encoding. Results from Experiment 2 also provide some evidence that talker information affects the false recognition of critical intruders. Taken together, the present findings indicate that indexical information is able to contact the lexical-semantic network to affect the processing of unheard words.

  8. Study of audio speakers containing ferrofluid

    Energy Technology Data Exchange (ETDEWEB)

    Rosensweig, R E [34 Gloucester Road, Summit, NJ 07901 (United States); Hirota, Y; Tsuda, S [Ferrotec, 1-4-14 Kyobashi, chuo-Ku, Tokyo 104-0031 (Japan); Raj, K [Ferrotec, 33 Constitution Drive, Bedford, NH 03110 (United States)

    2008-05-21

    This work validates a method for increasing the radial restoring force on the voice coil in audio speakers containing ferrofluid. In addition, a study is made of factors influencing splash loss of the ferrofluid due to shock. Ferrohydrodynamic analysis is employed throughout to model behavior, and predictions are compared to experimental data.

  9. Designing, Modeling, Constructing, and Testing a Flat Panel Speaker and Sound Diffuser for a Simulator

    Science.gov (United States)

    Dillon, Christina

    2013-01-01

    The goal of this project was to design, model, build, and test a flat panel speaker and frame for a spherical dome structure being made into a simulator. The simulator will be a test bed for evaluating an immersive environment for human interfaces. This project focused on the loudspeakers and a sound diffuser for the dome. The rest of the team worked on an Ambisonics 3D sound system, a video projection system, and a multi-direction treadmill to create the most realistic scene possible. The main programs utilized in this project were Pro-E and COMSOL. Pro-E was used for creating detailed figures for the fabrication of a frame that held a flat panel loudspeaker. The loudspeaker was made from a thin sheet of Plexiglas and 4 acoustic exciters. COMSOL, a multiphysics finite element analysis simulator, was used to model and evaluate all stages of the loudspeaker, frame, and sound diffuser. Acoustical testing measurements were used to create polar plots from the working prototype, which were then compared to the COMSOL simulations to select the optimal design for the dome. The final goal of the project was to install the flat panel loudspeaker design, in addition to a sound diffuser, on the wall of the dome. After running tests in COMSOL on various speaker configurations, including a warped Plexiglas version, the optimal speaker design included a flat piece of Plexiglas with a rounded frame to match the curvature of the dome. Eight of these loudspeakers will be mounted into an inch and a half of high-performance acoustic insulation, or Thinsulate, that will cover the inside of the dome. The following technical paper discusses these projects and explains the engineering processes used, knowledge gained, and the projected future goals of this project.

  10. Culture independent PCR: an alternative enzyme discovery strategy

    DEFF Research Database (Denmark)

    Jacobsen, Jonas; Lydolph, Magnus; Lange, Lene

    2005-01-01

    Degenerate primers were designed for use in a culture-independent PCR screening of DNA from composite fungal communities inhabiting residues of corn stover and leaves. According to similarity searches and alignments, amplified clone sequences affiliated with glycosyl hydrolase family 7 and glyco...... the value of culture-independent PCR in microbial diversity studies and could add to development of a new enzyme screening technology....

  11. During Threaded Discussions Are Non-Native English Speakers Always at a Disadvantage?

    Science.gov (United States)

    Shafer Willner, Lynn

    2014-01-01

    When participating in threaded discussions, under what conditions might non-native speakers of English (NNSE) be at a comparative disadvantage to their classmates who are native speakers of English (NSE)? This study compares the threaded discussion perspectives of closely-matched NNSE and NSE adult students having different levels of threaded…

  12. Evaluation of Speakers at a National Continuing Medical Education (CME Course

    Directory of Open Access Journals (Sweden)

    Jannette Collins, MD, MEd, FCCP

    2002-12-01

    Purpose: Evaluations of a national radiology continuing medical education (CME) course in thoracic imaging were analyzed to determine what constitutes effective and ineffective lecturing. Methods and Materials: Evaluations of sessions and individual speakers participating in a five-day course jointly sponsored by the Society of Thoracic Radiology (STR) and the Radiological Society of North America (RSNA) were tallied by the RSNA Department of Data Management and three members of the STR Training Committee. Comments were collated and analyzed to determine the number of positive and negative comments and common themes related to ineffective lecturing. Results: Twenty-two sessions were evaluated by 234 (75.7%) of 309 professional registrants. Eighty-one speakers were evaluated by an average of 153 registrants (range, 2–313). Mean ratings for 10 items evaluating sessions ranged from 1.28–2.05 (1 = most positive, 4 = least positive; SD .451–.902). The average speaker rating was 5.7 (1 = very poor, 7 = outstanding; SD 0.94; range 4.3–6.4). The total number of comments analyzed was 862, with 505 (58.6%) considered positive and 404 (46.9%) considered negative (the totals exceed 862 because a "comment" could consist of both positive and negative statements). Poor content was mentioned most frequently, making up 107 (26.5%) of 404 negative comments, and applied to 51 (63%) of 81 speakers. Other negative comments, in order of decreasing frequency, were related to delivery, image slides, command of the English language, text slides, and handouts. Conclusions: Individual evaluations of speakers at a national CME course provided information regarding the quality of lectures that was not provided by evaluations of grouped presentations. Systematic review of speaker evaluations provided specific information related to the types and frequency of features related to ineffective lecturing. This information can be used to design CME course evaluations, design future CME

  13. AlignMe—a membrane protein sequence alignment web server

    Science.gov (United States)

    Stamm, Marcus; Staritzbichler, René; Khafizov, Kamil; Forrest, Lucy R.

    2014-01-01

    We present a web server for pair-wise alignment of membrane protein sequences, using the program AlignMe. The server makes available two operational modes of AlignMe: (i) sequence to sequence alignment, taking two sequences in fasta format as input, combining information about each sequence from multiple sources and producing a pair-wise alignment (PW mode); and (ii) alignment of two multiple sequence alignments to create family-averaged hydropathy profile alignments (HP mode). For the PW sequence alignment mode, four different optimized parameter sets are provided, each suited to pairs of sequences with a specific similarity level. These settings utilize different types of inputs: (position-specific) substitution matrices, secondary structure predictions and transmembrane propensities from transmembrane predictions or hydrophobicity scales. In the second (HP) mode, each input multiple sequence alignment is converted into a hydrophobicity profile averaged over the provided set of sequence homologs; the two profiles are then aligned. The HP mode enables qualitative comparison of transmembrane topologies (and therefore potentially of 3D folds) of two membrane proteins, which can be useful if the proteins have low sequence similarity. In summary, the AlignMe web server provides user-friendly access to a set of tools for analysis and comparison of membrane protein sequences. Access is available at http://www.bioinfo.mpg.de/AlignMe PMID:24753425
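    The HP mode's first step, converting a multiple sequence alignment into a family-averaged hydrophobicity profile, can be sketched as follows. This is a toy version, not AlignMe's code: it averages Kyte-Doolittle hydropathy values down each alignment column, skipping gaps, and the tiny gapped MSA is invented:

```python
# Kyte-Doolittle hydropathy scale (one value per amino acid).
KD = {
    "I": 4.5, "V": 4.2, "L": 3.8, "F": 2.8, "C": 2.5, "M": 1.9, "A": 1.8,
    "G": -0.4, "T": -0.7, "S": -0.8, "W": -0.9, "Y": -1.3, "P": -1.6,
    "H": -3.2, "E": -3.5, "Q": -3.5, "D": -3.5, "N": -3.5, "K": -3.9,
    "R": -4.5,
}

def averaged_hydropathy_profile(msa):
    """Per-column mean hydropathy of a gapped multiple sequence alignment.

    Gaps ('-') are skipped; all-gap columns score 0.0. A toy analogue of
    the family-averaged profile that AlignMe's HP mode builds before
    aligning two profiles against each other.
    """
    ncols = len(msa[0])
    profile = []
    for col in range(ncols):
        values = [KD[seq[col]] for seq in msa if seq[col] != "-"]
        profile.append(sum(values) / len(values) if values else 0.0)
    return profile

# Invented 3-sequence, 5-column alignment of a hydrophobic stretch.
msa = ["LIVF-", "LIMFA", "LV-FA"]
print(averaged_hydropathy_profile(msa))
```

    Aligning two such profiles (e.g. with dynamic programming over profile-value differences) is what enables the qualitative comparison of transmembrane topologies described above, even at low sequence similarity.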

  14. Pre-Service Teachers' Knowledge of Phonemic Awareness: Relationship to Perceived Knowledge, Self-Efficacy Beliefs, and Exposure to a Multimedia-Enhanced Lecture

    Science.gov (United States)

    Martinussen, Rhonda; Ferrari, Julia; Aitken, Madison; Willows, Dale

    2015-01-01

    This study examined the relations among perceived and actual knowledge of phonemic awareness (PA), exposure to PA instruction during practicum, and self-efficacy for teaching PA in a sample of 54 teacher candidates (TCs) enrolled in a 1-year Bachelor of Education program in a Canadian university. It also assessed the effects of a brief…

  15. Direct and Indirect Effects of Stimulating Phoneme Awareness vs. Other Linguistic Skills in Preschoolers with Co-Occurring Speech and Language Impairments

    Science.gov (United States)

    Tyler, Ann A.; Gillon, Gail; Macrae, Toby; Johnson, Roberta L.

    2011-01-01

    Aim: The purpose of this study was to examine the effects of an integrated phoneme awareness/speech intervention in comparison to an alternating speech/morphosyntax intervention for specific areas targeted by the different interventions, as well as the extent of indirect gains in nontargeted areas. Method: A total of 30 children with co-occurring…

  16. Teste de figuras para discriminação fonêmica: uma proposta Phoneme Discrimination Picture Test: a proposal

    Directory of Open Access Journals (Sweden)

    Beatriz dos Santos-Carvalho

    2008-01-01

    PURPOSE: To propose a test to evaluate phonemic discrimination using minimal pairs, comprising all Brazilian Portuguese phonemes and using them in words which can be easily represented by pictures. The test is intended to contribute to the diagnosis of speech-language disorders and to scientific research, and was designed to be easy to administer in any setting where speech-language pathologists work. METHODS: Minimal pairs were selected that oppose phonemes with respect to the binary value of each distinctive feature and to the possible combinations among the place features ([labial], [coronal], [dorsal]), as well as through oppositions of syllable structures. Pictures representing the words of each pair were created. RESULTS: The Phoneme Discrimination Picture Test was developed to assess phonemic discrimination in children aged four to eight years. The test contains 40 presentations, of which 30 are minimal pairs and ten are pairs of identical words. Across the 30 minimal-pair presentations, the distinctive features [+/- sonorant], [+/- approximant], [+/- continuant], [+/- voice] and [coronal +/- anterior] were opposed, along with the place oppositions [labial] x [coronal], [dorsal] x [coronal] and [labial] x [dorsal]. The syllable structures V x CV, CV x CCV and CV x CVC were also opposed. CONCLUSION: The objectives of this work were successfully met, as the Phoneme Discrimination Picture Test covers everything it set out to include. The test should first be applied in a pilot study to verify that the words and pictures are appropriate for the age range, and subsequently administered in several regions of the country for proper standardization.

  17. 7 CFR 247.13 - Provisions for non-English or limited-English speakers.

    Science.gov (United States)

    2010-01-01

    § 247.13 Provisions for non-English or limited-English speakers. (a) What must State and local agencies do to ensure that non-English or limited-English speaking persons are aware of their rights and...

  18. FACT. Normalized and asynchronous mirror alignment for Cherenkov telescopes

    Energy Technology Data Exchange (ETDEWEB)

    Mueller, Sebastian Achim [ETH Zurich (Switzerland); Buss, Jens [TU Dortmund (Germany)

    2016-07-01

    Imaging Atmospheric Cherenkov Telescopes (IACTs) need fast and large imaging optics to map the faint Cherenkov light emitted in cosmic-ray air showers onto their image sensors. Segmented reflectors are inexpensive, lightweight and offer good image quality. However, alignment of the mirror facets remains a challenge. Good alignment is crucial in IACT observations to separate gamma rays from hadronic cosmic rays. We present a star-tracking alignment method which is not restricted to clear nights. It normalizes the mirror facet reflections to be independent of the reference star and the cloud coverage. It records asynchronously from the telescope drive, which makes the method easy to integrate into existing telescopes. It can be combined with remote facet actuation, but it does not require it. Furthermore, it can reconstruct all individual mirror facet point spread functions. We present the method and alignment results on the First Geiger-mode Avalanche Photodiode Cherenkov Telescope (FACT) on the Canary Island of La Palma, Spain.

  19. Analysis of Feature Extraction Methods for Speaker Dependent Speech Recognition

    Directory of Open Access Journals (Sweden)

    Gurpreet Kaur

    2017-02-01

    Speech recognition is about what is being said, irrespective of who is saying it. Speech recognition is a growing field, and major progress is taking place in the technology of automatic speech recognition (ASR). Still, there are many barriers in this field in terms of recognition rate, background noise, speaker variability, speaking rate, accent, etc. The speech recognition rate mainly depends on the selection of features and feature extraction methods. This paper outlines the feature extraction techniques for speaker-dependent speech recognition of isolated words. A brief survey of different feature extraction techniques, namely Mel-Frequency Cepstral Coefficients (MFCC), Linear Predictive Coding Coefficients (LPCC), Perceptual Linear Prediction (PLP) and Relative Spectral Perceptual Linear Prediction (RASTA-PLP) analysis, is presented and evaluated. Speech recognition has various applications, from daily use to commercial use. We have built a speaker-dependent system that can be useful in many areas, such as controlling a patient's vehicle using simple commands.
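    Of the techniques surveyed, MFCC rests on the mel scale, which spaces filterbank channels to mimic the ear's frequency resolution. A minimal sketch of the standard HTK-style Hz/mel conversion and the resulting triangular-filter center frequencies (the helper names are ours, not from the paper):

```python
import math

def hz_to_mel(f):
    """HTK-style mel scale: mel = 2595 * log10(1 + f / 700)."""
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filter_centers(n_filters, f_low, f_high):
    """Center frequencies (Hz) of triangular mel filters, equally spaced
    on the mel axis between f_low and f_high. In a full MFCC pipeline,
    log filterbank energies are then decorrelated with a DCT."""
    m_low, m_high = hz_to_mel(f_low), hz_to_mel(f_high)
    step = (m_high - m_low) / (n_filters + 1)
    return [mel_to_hz(m_low + step * (i + 1)) for i in range(n_filters)]

# Ten filters over 0-8000 Hz: note the spacing widens toward high frequencies.
centers = mel_filter_centers(10, 0.0, 8000.0)
print([round(c) for c in centers])
```

    The widening spacing at high frequencies is exactly what gives MFCC its perceptually motivated frequency resolution, in contrast to the uniform spacing of a plain FFT spectrum.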

  20. Communication Interface for Mexican Spanish Dysarthric Speakers

    Directory of Open Access Journals (Sweden)

    Gladys Bonilla-Enriquez

    2012-03-01

    Dysarthria is a motor speech disorder due to weakness or poor coordination of the speech muscles. This condition can be caused by a stroke, cerebral palsy, or by a traumatic brain injury. For Mexican people with this condition there are few, if any, assistive technologies to improve their social interaction skills. In this paper we present our advances towards the development of a communication interface for dysarthric speakers whose native language is Mexican Spanish. We propose a methodology that relies on (1) special design of a training normal-speech corpus with limited resources, (2) standard speaker adaptation, and (3) control of language model perplexity, to achieve high Automatic Speech Recognition (ASR) accuracy. The interface allows the user and therapist to perform tasks such as dynamic speaker adaptation, vocabulary adaptation, and text-to-speech synthesis. Live tests were performed with a user with mild dysarthria, achieving accuracies of 93%-95% for spontaneous speech.

  1. How African American English-Speaking First Graders Segment and Rhyme Words and Nonwords With Final Consonant Clusters.

    Science.gov (United States)

    Shollenbarger, Amy J; Robinson, Gregory C; Taran, Valentina; Choi, Seo-Eun

    2017-10-05

    This study explored how typically developing 1st grade African American English (AAE) speakers differ from mainstream American English (MAE) speakers in the completion of 2 common phonological awareness tasks (rhyming and phoneme segmentation) when the stimulus items were consonant-vowel-consonant-consonant (CVCC) words and nonwords. Forty-nine 1st graders met criteria for 2 dialect groups: AAE and MAE. Three conditions were tested in each rhyme and segmentation task: Real Words No Model, Real Words With a Model, and Nonwords With a Model. The AAE group had significantly more responses that rhymed CVCC words with consonant-vowel-consonant words and segmented CVCC words as consonant-vowel-consonant than the MAE group across all experimental conditions. In the rhyming task, the presence of a model in the real word condition elicited more reduced final cluster responses for both groups. In the segmentation task, the MAE group was at ceiling, so only the AAE group changed across the different stimulus presentations and reduced the final cluster less often when given a model. Rhyming and phoneme segmentation performance can be influenced by a child's dialect when CVCC words are used.

  2. The Relative Predictive Contribution and Causal Role of Phoneme Awareness, Rhyme Awareness, and Verbal Short-Term Memory in Reading Skills: A Review

    Science.gov (United States)

    Melby-Lervag, Monica

    2012-01-01

    The acknowledgement that educational achievement is highly dependent on successful reading development has led to extensive research on its underlying factors. A strong argument has been made for a causal relationship between reading and phoneme awareness; similarly, causal relations have been suggested for reading with short-term memory and rhyme…

  3. Vocal caricatures reveal signatures of speaker identity

    Science.gov (United States)

    López, Sabrina; Riera, Pablo; Assaneo, María Florencia; Eguía, Manuel; Sigman, Mariano; Trevisan, Marcos A.

    2013-12-01

    What are the features that impersonators select to elicit a speaker's identity? We built a voice database of public figures (targets) and imitations produced by professional impersonators. They produced one imitation based on their memory of the target (caricature) and another one after listening to the target audio (replica). A set of naive participants then judged identity and similarity of pairs of voices. Identity was better evoked by the caricatures, and replicas were perceived to be closer to the targets in terms of voice similarity. We used these data to map relevant acoustic dimensions for each task. Our results indicate that speaker identity is mainly associated with vocal tract features, while perception of voice similarity is related to vocal fold parameters. We therefore show the way in which acoustic caricatures emphasize identity features at the cost of losing similarity, which allows drawing an analogy with caricatures in the visual space.

  4. Within-category variance and lexical tone discrimination in native and non-native speakers

    NARCIS (Netherlands)

    Hoffmann, C.W.G.; Sadakata, M.; Chen, A.; Desain, P.W.M.; McQueen, J.M.; Gussenhove, C.; Chen, Y.; Dediu, D.

    2014-01-01

    In this paper, we show how acoustic variance within lexical tones in disyllabic Mandarin Chinese pseudowords affects discrimination abilities in both native and non-native speakers of Mandarin Chinese. Within-category acoustic variance did not hinder native speakers in discriminating between lexical tones.

  5. The Acquisition of Clitic Pronouns in the Spanish Interlanguage of Peruvian Quechua Speakers.

    Science.gov (United States)

    Klee, Carol A.

    1989-01-01

    Analysis of four adult Quechua speakers' acquisition of clitic pronouns in Spanish revealed that educational attainment and amount of contact with monolingual Spanish speakers were positively related to native-like norms of competence in the use of object pronouns in Spanish. (CB)

  6. GraphAlignment: Bayesian pairwise alignment of biological networks

    Directory of Open Access Journals (Sweden)

    Kolář Michal

    2012-11-01

    Background: With increased experimental availability and accuracy of bio-molecular networks, tools for their comparative and evolutionary analysis are needed. A key component for such studies is the alignment of networks. Results: We introduce the Bioconductor package GraphAlignment for pairwise alignment of bio-molecular networks. The alignment incorporates information both from network vertices and network edges and is based on an explicit evolutionary model, allowing inference of all scoring parameters directly from empirical data. We compare the performance of our algorithm to an alternative algorithm, Græmlin 2.0. On simulated data, GraphAlignment outperforms Græmlin 2.0 in several benchmarks except for computational complexity. When there is little or no noise in the data, GraphAlignment is slower than Græmlin 2.0. It is faster than Græmlin 2.0 when processing noisy data containing spurious vertex associations. Its typical-case complexity grows approximately as O(N^2.6). On empirical bacterial protein-protein interaction networks (PIN) and gene co-expression networks, GraphAlignment outperforms Græmlin 2.0 with respect to coverage and specificity, albeit by a small margin. On large eukaryotic PIN, Græmlin 2.0 outperforms GraphAlignment. Conclusions: The GraphAlignment algorithm is robust to spurious vertex associations, correctly resolves paralogs, and shows very good performance in identification of homologous vertices defined by high vertex and/or interaction similarity. The simplicity and generality of GraphAlignment edge scoring makes the algorithm an appropriate choice for global alignment of networks.
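    A toy illustration of a scoring scheme that, like GraphAlignment, combines vertex and edge evidence. This is not the package's Bayesian model; the networks, similarity values, and edge bonus are invented, and the exhaustive search works only for tiny graphs:

```python
from itertools import permutations

def alignment_score(mapping, vertex_sim, edges_a, edges_b, edge_bonus=1.0):
    """Score a vertex mapping between two networks.

    mapping[i] = j pairs vertex i of network A with vertex j of B.
    The score adds per-pair vertex similarity plus a bonus for every
    A-edge whose endpoints map onto a B-edge (a conserved interaction).
    """
    score = sum(vertex_sim[i][j] for i, j in enumerate(mapping))
    score += edge_bonus * sum(
        1 for (u, v) in edges_a
        if (mapping[u], mapping[v]) in edges_b
        or (mapping[v], mapping[u]) in edges_b
    )
    return score

def best_alignment(vertex_sim, edges_a, edges_b):
    """Exhaustively search all vertex mappings (tiny networks only)."""
    n = len(vertex_sim)
    return max(permutations(range(n)),
               key=lambda m: alignment_score(m, vertex_sim, edges_a, edges_b))

# Hypothetical 3-vertex networks: both A and B are the path 0-1-2.
sim = [[1.0, 0.1, 0.1], [0.1, 1.0, 0.1], [0.1, 0.1, 1.0]]
edges_a = {(0, 1), (1, 2)}
edges_b = {(0, 1), (1, 2)}
print(best_alignment(sim, edges_a, edges_b))  # (0, 1, 2)
```

    Because spurious vertex similarities add only a bounded amount to the score while conserved edges add a bonus per edge, a scheme of this shape degrades gracefully on noisy vertex associations, which is the robustness property the abstract emphasizes.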

  7. "I May Be a Native Speaker but I'm Not Monolingual": Reimagining "All" Teachers' Linguistic Identities in TESOL

    Science.gov (United States)

    Ellis, Elizabeth M.

    2016-01-01

    Teacher linguistic identity has so far mainly been researched in terms of whether a teacher identifies (or is identified by others) as a native speaker (NEST) or nonnative speaker (NNEST) (Moussu & Llurda, 2008; Reis, 2011). Native speakers are presumed to be monolingual, and nonnative speakers, although by definition bilingual, tend to be…

  8. Classroom acoustics design guidelines based on the optimization of speaker conditions

    DEFF Research Database (Denmark)

    Pelegrin Garcia, David; Brunskog, Jonas

    2012-01-01

    School teachers frequently suffer from voice problems due to the high vocal load that they experience and the not-always-ideal conditions under which they have to teach. Traditionally, the purpose of the acoustic design of classrooms has been to optimize speech intelligibility. New guidelines...... and noise level measurements in classrooms. Requirements of optimum vocal comfort, average A-weighted speech levels across the audience higher than 50 dB, and a physical volume higher than 6 m³/student are combined to extract optimum acoustic conditions, which depend on the number of students....... These conditions, which are independent of the position of the speaker, cannot be optimum for more than 50 students. For classrooms with 10 students, the reverberation time in occupied conditions shall be between 0.5 and 0.65 s, and the volume between 60 and 170 m³. For classrooms with 40 students...
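    The volume guideline above can be encoded directly. The following checker is a hypothetical helper that applies only the two numeric constraints stated in the abstract (at least 6 m³ per student, and no optimum conditions above 50 students); the full guidelines also involve reverberation time and speech level, which are not modeled here:

```python
def classroom_volume_ok(n_students, volume_m3):
    """Check the volume guideline from the abstract: at least 6 m^3 per
    student, and optimum speaker conditions are unattainable for more
    than 50 students. Reverberation-time limits are not checked."""
    if n_students > 50:
        return False
    return volume_m3 >= 6.0 * n_students

print(classroom_volume_ok(10, 100))  # True: 100 m^3 exceeds the 60 m^3 minimum
print(classroom_volume_ok(40, 200))  # False: 40 students need at least 240 m^3
```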

  9. Bridging Gaps in Common Ground: Speakers Design Their Gestures for Their Listeners

    Science.gov (United States)

    Hilliard, Caitlin; Cook, Susan Wagner

    2016-01-01

    Communication is shaped both by what we are trying to say and by whom we are saying it to. We examined whether and how shared information influences the gestures speakers produce along with their speech. Unlike prior work examining effects of common ground on speech and gesture, we examined a situation in which some speakers have the same amount…

  10. Pitch perception and production in congenital amusia: Evidence from Cantonese speakers.

    Science.gov (United States)

    Liu, Fang; Chan, Alice H D; Ciocca, Valter; Roquet, Catherine; Peretz, Isabelle; Wong, Patrick C M

    2016-07-01

    This study investigated pitch perception and production in speech and music in individuals with congenital amusia (a disorder of musical pitch processing) who are native speakers of Cantonese, a tone language with a highly complex tonal system. Sixteen Cantonese-speaking congenital amusics and 16 controls performed a set of lexical tone perception, production, singing, and psychophysical pitch threshold tasks. Their tone production accuracy and singing proficiency were subsequently judged by independent listeners, and subjected to acoustic analyses. Relative to controls, amusics showed impaired discrimination of lexical tones in both speech and non-speech conditions. They also received lower ratings for singing proficiency, producing larger pitch interval deviations and making more pitch interval errors compared to controls. Demonstrating higher pitch direction identification thresholds than controls for both speech syllables and piano tones, amusics nevertheless produced native lexical tones with comparable pitch trajectories and intelligibility as controls. Significant correlations were found between pitch threshold and lexical tone perception, music perception and production, but not between lexical tone perception and production for amusics. These findings provide further evidence that congenital amusia is a domain-general language-independent pitch-processing deficit that is associated with severely impaired music perception and production, mildly impaired speech perception, and largely intact speech production.

  11. It doesn't matter what you say: FMRI correlates of voice learning and recognition independent of speech content.

    Science.gov (United States)

    Zäske, Romi; Awwad Shiekh Hasan, Bashar; Belin, Pascal

    2017-09-01

    Listeners can recognize newly learned voices from previously unheard utterances, suggesting the acquisition of high-level speech-invariant voice representations during learning. Using functional magnetic resonance imaging (fMRI) we investigated the anatomical basis underlying the acquisition of voice representations for unfamiliar speakers independent of speech, and their subsequent recognition among novel voices. Specifically, listeners studied voices of unfamiliar speakers uttering short sentences and subsequently classified studied and novel voices as "old" or "new" in a recognition test. To investigate "pure" voice learning, i.e., independent of sentence meaning, we presented German sentence stimuli to non-German speaking listeners. To disentangle stimulus-invariant and stimulus-dependent learning, during the test phase we contrasted a "same sentence" condition in which listeners heard speakers repeating the sentences from the preceding study phase, with a "different sentence" condition. Voice recognition performance was above chance in both conditions although, as expected, performance was higher for same than for different sentences. During study phases activity in the left inferior frontal gyrus (IFG) was related to subsequent voice recognition performance and same versus different sentence condition, suggesting an involvement of the left IFG in the interactive processing of speaker and speech information during learning. Importantly, at test reduced activation for voices correctly classified as "old" compared to "new" emerged in a network of brain areas including temporal voice areas (TVAs) of the right posterior superior temporal gyrus (pSTG), as well as the right inferior/middle frontal gyrus (IFG/MFG), the right medial frontal gyrus, and the left caudate. This effect of voice novelty did not interact with sentence condition, suggesting a role of temporal voice-selective areas and extra-temporal areas in the explicit recognition of learned voice identity.

  12. Speaker-Sex Discrimination for Voiced and Whispered Vowels at Short Durations

    OpenAIRE

    Smith, David R. R.

    2016-01-01

    Whispered vowels, produced with no vocal fold vibration, lack the periodic temporal fine structure which in voiced vowels underlies the perceptual attribute of pitch (a salient auditory cue to speaker sex). Voiced vowels possess no temporal fine structure at very short durations (below two glottal cycles). The prediction was that speaker-sex discrimination performance for whispered and voiced vowels would be similar for very short durations but, as stimulus duration increases, voiced vowel pe...

  13. Identifying the nonlinear mechanical behaviour of micro-speakers from their quasi-linear electrical response

    Science.gov (United States)

    Zilletti, Michele; Marker, Arthur; Elliott, Stephen John; Holland, Keith

    2017-05-01

    In this study, model identification of the nonlinear dynamics of a micro-speaker is carried out using purely electrical measurements, avoiding any explicit vibration measurements. It is shown that a dynamic model of the micro-speaker, which takes into account the nonlinear damping characteristic of the device, can be identified by measuring the response between the voltage input and the current flowing into the coil. An analytical formulation of the quasi-linear model of the micro-speaker is first derived, and an optimisation method is then used to identify a polynomial function which describes the mechanical damping behaviour of the micro-speaker. The analytical results of the quasi-linear model are compared with numerical results. This study potentially opens up the possibility of efficiently implementing nonlinear echo cancellers.
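    The step of identifying a polynomial function by optimisation can be illustrated with ordinary least squares. The sketch below is generic polynomial fitting via the normal equations, not the authors' procedure, and the velocity/damping-force data are synthetic (generated from a known cubic so the fit can be checked):

```python
def polyfit_least_squares(x, y, degree):
    """Fit y ~ c0 + c1*x + ... + cd*x^d in the least-squares sense by
    solving the normal equations with Gaussian elimination. A toy
    stand-in for the optimisation step that identifies a polynomial
    damping characteristic from measured data."""
    n = degree + 1
    # Normal equations A^T A c = A^T y for the Vandermonde matrix A.
    ata = [[sum(xi ** (i + j) for xi in x) for j in range(n)] for i in range(n)]
    aty = [sum(yi * xi ** i for xi, yi in zip(x, y)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(ata[r][col]))
        ata[col], ata[pivot] = ata[pivot], ata[col]
        aty[col], aty[pivot] = aty[pivot], aty[col]
        for row in range(col + 1, n):
            factor = ata[row][col] / ata[col][col]
            for k in range(col, n):
                ata[row][k] -= factor * ata[col][k]
            aty[row] -= factor * aty[col]
    # Back substitution.
    coeffs = [0.0] * n
    for row in range(n - 1, -1, -1):
        s = sum(ata[row][k] * coeffs[k] for k in range(row + 1, n))
        coeffs[row] = (aty[row] - s) / ata[row][row]
    return coeffs

# Synthetic "velocity vs. damping force" samples from a known cubic.
xs = [-1.0, -0.5, 0.0, 0.5, 1.0, 1.5]
ys = [2.0 + 0.5 * x + 3.0 * x ** 3 for x in xs]
print([round(c, 6) for c in polyfit_least_squares(xs, ys, 3)])
```

    With real voltage/current measurements the data would be noisy and the recovered coefficients approximate; here the fit recovers the generating cubic because the data are noise-free.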

  14. Promoting Communities of Practice among Non-Native Speakers of English in Online Discussions

    Science.gov (United States)

    Kim, Hoe Kyeung

    2011-01-01

    An online discussion involving text-based computer-mediated communication has great potential for promoting equal participation among non-native speakers of English. Several studies claimed that online discussions could enhance the academic participation of non-native speakers of English. However, there is little research around participation…

  15. Dissociations between word and picture naming in Persian speakers with aphasia

    Directory of Open Access Journals (Sweden)

    Mehdi Bakhtiar

    2014-04-01

    Studies of patients with aphasia have found dissociations in their ability to read words and name pictures (Hillis & Caramazza, 1995; Hillis & Caramazza, 1991). Persian orthography is characterised by nearly regular orthography-phonology (OP) mappings; however, the omission of some vowels in the script makes the OP mapping of many words less predictable. The aim of this study was to compare the predictive lexico-semantic variables across reading and picture naming tasks in Persian aphasia while considering the variability across participants and items using mixed modeling. Methods and Results: A total of 21 brain-injured Persian-speaking patients suffering from aphasia were asked to name 200 normalized Snodgrass object pictures and words taken from Bakhtiar, Nilipour and Weekes (2013) in different sessions. The results showed that word naming performance was significantly better than object naming in Persian speakers with aphasia (p < 0.0001). Applying McNemar’s test to examine individual differences found that 18 patients showed significantly better performance in word reading compared to picture naming, 2 patients showed no difference between naming and reading (i.e. cases 1 and 10), and one patient (i.e. case 5) showed significantly better naming compared to reading, χ²(1) = 10.23, p < 0.01 (see also Figure 1). A mixed-effect logistic regression analysis revealed that the degree of spelling transparency (i.e. the number of letters in a word divided by the number of its phonemes) had an effect on word naming (along with frequency, age of acquisition (AoA), and imageability) and picture naming (along with image agreement, AoA, word length, frequency and name agreement), with a much stronger effect on the word naming task (b = 1.67, SE = 0.41, z = 4.05, p < 0.0001) compared to the picture naming task (b = -0.64, SE = 0.32, z = 2, p < 0.05). Conclusion: The dissociation between word naming and picture naming shown by many patients suggests at least two routes are available

  16. MUON DETECTORS: ALIGNMENT

    CERN Multimedia

    G.Gomez.

    Since June of 2009, the muon alignment group has focused on providing new alignment constants and on finalizing the hardware alignment reconstruction. Alignment constants for DTs and CSCs were provided for CRAFT09 data reprocessing. For DT chambers, the track-based alignment was repeated using CRAFT09 cosmic ray muons and validated using segment extrapolation and split cosmic tools. One difference with respect to the previous alignment is that only five degrees of freedom were aligned, leaving the rotation around the local x-axis to be better determined by the hardware system. Similarly, DT chambers poorly aligned by tracks (due to limited statistics) were aligned by a combination of photogrammetry and hardware-based alignment. For the CSC chambers, the hardware system provided alignment in global z and rotations about local x. Entire muon endcap rings were further corrected in the transverse plane (global x and y) by the track-based alignment. Single chamber track-based alignment suffers from poor statistic...

  17. Learning foreign labels from a foreign speaker: the role of (limited) exposure to a second language.

    Science.gov (United States)

    Akhtar, Nameera; Menjivar, Jennifer; Hoicka, Elena; Sabbagh, Mark A

    2012-11-01

    Three- and four-year-olds (N = 144) were introduced to novel labels by an English speaker and a foreign speaker (of Nordish, a made-up language), and were asked to endorse one of the speakers' labels. Monolingual English-speaking children were compared to bilingual children and English-speaking children who were regularly exposed to a language other than English. All children tended to endorse the English speaker's labels when asked 'What do you call this?', but when asked 'What do you call this in Nordish?', children with exposure to a second language were more likely to endorse the foreign label than monolingual and bilingual children. The findings suggest that, at this age, exposure to, but not necessarily immersion in, more than one language may promote the ability to learn foreign words from a foreign speaker.

  18. Is the superior verbal memory span of Mandarin speakers due to faster rehearsal?

    Science.gov (United States)

    Mattys, Sven L; Baddeley, Alan; Trenkic, Danijela

    2018-04-01

    It is well established that digit span in native Chinese speakers is atypically high. This is commonly attributed to a capacity for more rapid subvocal rehearsal for that group. We explored this hypothesis by testing a group of English-speaking native Mandarin speakers on digit span and word span in both Mandarin and English, together with a measure of speed of articulation for each. When compared to the performance of native English speakers, the Mandarin group proved to be superior on both digit and word spans while predictably having lower spans in English. This suggests that the Mandarin advantage is not limited to digits. Speed of rehearsal correlated with span performance across materials. However, this correlation was more pronounced for English speakers than for any of the Chinese measures. Further analysis suggested that speed of rehearsal did not provide an adequate account of differences between Mandarin and English spans or for the advantage of digits over words. Possible alternative explanations are discussed.

  19. DNA motif alignment by evolving a population of Markov chains.

    Science.gov (United States)

    Bi, Chengpeng

    2009-01-30

    Deciphering cis-regulatory elements or de novo motif-finding in genomes still remains elusive although much algorithmic effort has been expended. The Markov chain Monte Carlo (MCMC) method such as Gibbs motif samplers has been widely employed to solve the de novo motif-finding problem through sequence local alignment. Nonetheless, the MCMC-based motif samplers still suffer from local maxima like EM. Therefore, as a prerequisite for finding good local alignments, these motif algorithms are often independently run a multitude of times, but without information exchange between different chains. Hence a new algorithm design enabling such information exchange would be worthwhile. This paper presents a novel motif-finding algorithm by evolving a population of Markov chains with information exchange (PMC), each of which is initialized as a random alignment and run by the Metropolis-Hastings sampler (MHS). Each chain is progressively updated through a series of stochastically sampled local alignments. Explicitly, the PMC motif algorithm performs stochastic sampling as specified by a population-based proposal distribution rather than individual ones, and adaptively evolves the population as a whole towards a global maximum. The alignment information exchange is accomplished by taking advantage of the pooled motif site distributions. A distinct method for running multiple independent Markov chains (IMC) without information exchange, dubbed the IMC motif algorithm, is also devised to compare with its PMC counterpart. Experimental studies demonstrate that performance could be improved if pooled information were used to run a population of motif samplers. The new PMC algorithm was able to improve the convergence and outperformed other popular algorithms tested using simulated and biological motif sequences.
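
    The Metropolis-Hastings sampling that both the IMC and PMC algorithms build on can be illustrated with a toy single-chain motif sampler. Everything below (the planted motif, the column-majority score, and the uniform restart proposal) is an assumption for illustration, not the paper's implementation:

```python
import numpy as np

# Toy single-chain Metropolis-Hastings motif sampler over start positions.
rng = np.random.default_rng(1)
MOTIF = "TATAAT"
W = len(MOTIF)
seqs = []
for _ in range(20):                           # 20 sequences, one planted site each
    s = "".join(rng.choice(list("ACGT"), 30))
    pos = int(rng.integers(0, 30 - W + 1))
    seqs.append(s[:pos] + MOTIF + s[pos + W:])

def score(starts):
    # Column-wise majority count over the aligned windows: higher = more conserved.
    cols = zip(*(s[p:p + W] for s, p in zip(seqs, starts)))
    return sum(max(col.count(b) for b in "ACGT") for col in cols)

starts = [int(rng.integers(0, len(s) - W + 1)) for s in seqs]
init = current = best = score(starts)
for _ in range(4000):
    i = int(rng.integers(len(seqs)))          # pick one sequence...
    proposal = starts.copy()
    proposal[i] = int(rng.integers(0, len(seqs[i]) - W + 1))  # ...and move its site
    new = score(proposal)
    # Metropolis acceptance at unit temperature: always accept uphill moves,
    # accept downhill moves with probability exp(score difference).
    if new >= current or rng.random() < np.exp(new - current):
        starts, current = proposal, new
        best = max(best, current)
```

    The PMC algorithm differs by running many such chains and replacing the single-chain proposal with a population-based one built from the pooled motif-site distributions.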

  20. Variation among heritage speakers: Sequential vs. simultaneous bilinguals

    Directory of Open Access Journals (Sweden)

    Teresa Lee

    2013-08-01

    This study examines the differences in the grammatical knowledge of two types of heritage speakers of Korean. Early simultaneous bilinguals are exposed to both English and the heritage language from birth, whereas early sequential bilinguals are exposed to the heritage language first and then to English upon schooling. A listening comprehension task involving relative clauses was conducted with 51 beginning-level Korean heritage speakers. The results showed that the early sequential bilinguals exhibited much more accurate knowledge than the early simultaneous bilinguals, who lacked rudimentary knowledge of Korean relative clauses. Drawing on the findings of adult and child Korean L1 data on the acquisition of relative clauses, the performance of each group is discussed with respect to attrition and incomplete acquisition of the heritage language.

  1. Modeling Linguistic Variables With Regression Models: Addressing Non-Gaussian Distributions, Non-independent Observations, and Non-linear Predictors With Random Effects and Generalized Additive Models for Location, Scale, and Shape

    Directory of Open Access Journals (Sweden)

    Christophe Coupé

    2018-04-01

    As statistical approaches are getting increasingly used in linguistics, attention must be paid to the choice of methods and algorithms used. This is especially true since they require assumptions to be satisfied to provide valid results, and because scientific articles still often fall short of reporting whether such assumptions are met. Progress is being, however, made in various directions, one of them being the introduction of techniques able to model data that cannot be properly analyzed with simpler linear regression models. We report recent advances in statistical modeling in linguistics. We first describe linear mixed-effects regression models (LMM), which address grouping of observations, and generalized linear mixed-effects models (GLMM), which offer a family of distributions for the dependent variable. Generalized additive models (GAM) are then introduced, which allow modeling non-linear parametric or non-parametric relationships between the dependent variable and the predictors. We then highlight the possibilities offered by generalized additive models for location, scale, and shape (GAMLSS). We explain how they make it possible to go beyond common distributions, such as Gaussian or Poisson, and offer the appropriate inferential framework to account for ‘difficult’ variables such as count data with strong overdispersion. We also demonstrate how they offer interesting perspectives on data when not only the mean of the dependent variable is modeled, but also its variance, skewness, and kurtosis. As an illustration, the case of phonemic inventory size is analyzed throughout the article. For over 1,500 languages, we consider as predictors the number of speakers, the distance from Africa, an estimation of the intensity of language contact, and linguistic relationships. We discuss the use of random effects to account for genealogical relationships, the choice of appropriate distributions to model count data, and non-linear relationships

  2. Modeling Linguistic Variables With Regression Models: Addressing Non-Gaussian Distributions, Non-independent Observations, and Non-linear Predictors With Random Effects and Generalized Additive Models for Location, Scale, and Shape.

    Science.gov (United States)

    Coupé, Christophe

    2018-01-01

    As statistical approaches are getting increasingly used in linguistics, attention must be paid to the choice of methods and algorithms used. This is especially true since they require assumptions to be satisfied to provide valid results, and because scientific articles still often fall short of reporting whether such assumptions are met. Progress is being, however, made in various directions, one of them being the introduction of techniques able to model data that cannot be properly analyzed with simpler linear regression models. We report recent advances in statistical modeling in linguistics. We first describe linear mixed-effects regression models (LMM), which address grouping of observations, and generalized linear mixed-effects models (GLMM), which offer a family of distributions for the dependent variable. Generalized additive models (GAM) are then introduced, which allow modeling non-linear parametric or non-parametric relationships between the dependent variable and the predictors. We then highlight the possibilities offered by generalized additive models for location, scale, and shape (GAMLSS). We explain how they make it possible to go beyond common distributions, such as Gaussian or Poisson, and offer the appropriate inferential framework to account for 'difficult' variables such as count data with strong overdispersion. We also demonstrate how they offer interesting perspectives on data when not only the mean of the dependent variable is modeled, but also its variance, skewness, and kurtosis. As an illustration, the case of phonemic inventory size is analyzed throughout the article. For over 1,500 languages, we consider as predictors the number of speakers, the distance from Africa, an estimation of the intensity of language contact, and linguistic relationships. We discuss the use of random effects to account for genealogical relationships, the choice of appropriate distributions to model count data, and non-linear relationships. Relying on GAMLSS, we…
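
    The overdispersion point made above for count data such as phonemic inventory sizes can be illustrated with synthetic counts. The distributions and parameters below are assumptions for illustration, not the article's data:

```python
import numpy as np

# For a Poisson variable the variance equals the mean (dispersion ratio ~1);
# overdispersed counts have variance well above the mean, which is what
# motivates negative-binomial and GAMLSS-style models.
rng = np.random.default_rng(42)
poisson_counts = rng.poisson(lam=30, size=5000)
# Negative binomial with n=5 and p chosen so the mean is also ~30;
# its variance is then ~ mean + mean**2 / n = 30 + 900/5 = 210.
negbin_counts = rng.negative_binomial(n=5, p=5 / 35, size=5000)

print(poisson_counts.var() / poisson_counts.mean())  # dispersion ratio near 1
print(negbin_counts.var() / negbin_counts.mean())    # dispersion ratio near 7
```

    A Poisson model fit to data like the second sample would badly understate the variance; choosing a distribution family that models dispersion separately from the mean is exactly what GAMLSS provides.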

  3. The native-speaker fever in English language teaching (ELT): Pitting pedagogical competence against historical origin

    Directory of Open Access Journals (Sweden)

    Anchimbe, Eric A.

    2006-01-01

    This paper discusses English language teaching (ELT) around the world, and argues that as a profession, it should emphasise pedagogical competence rather than the native-speaker requirement in the recruitment of teachers in English as a foreign language (EFL) and English as a second language (ESL) contexts. It establishes that being a native speaker does not automatically make one a competent speaker or, for that matter, a competent teacher of the language. It observes that on many grounds, including physical, sociocultural, technological and economic changes in the world as well as the status of English as an official and national language in many post-colonial regions, the distinction between native and non-native speakers is no longer valid.

  4. Psychophysical Boundary for Categorization of Voiced-Voiceless Stop Consonants in Native Japanese Speakers

    Science.gov (United States)

    Tamura, Shunsuke; Ito, Kazuhito; Hirose, Nobuyuki; Mori, Shuji

    2018-01-01

    Purpose: The purpose of this study was to investigate the psychophysical boundary used for categorization of voiced-voiceless stop consonants in native Japanese speakers. Method: Twelve native Japanese speakers participated in the experiment. The stimuli were synthetic stop consonant-vowel stimuli varying in voice onset time (VOT) with…

  5. Does training make French speakers more able to identify lexical stress?

    OpenAIRE

    Schwab, Sandra; Llisterri, Joaquim

    2013-01-01

    This research takes the stress deafness hypothesis as a starting point (e.g. Dupoux et al., 2008), and, more specifically, the fact that French speakers present difficulties in perceiving lexical stress in a free-stress language. In this framework, we aim at determining whether a prosodic training could improve the ability of French speakers to identify the stressed syllable in Spanish words. Three groups of participants took part in this experiment. The Native group was composed of 16 speake...

  6. A Sociophonetic Study of Young Nigerian English Speakers

    African Journals Online (AJOL)

    Oladipupo

    between male and female speakers in boundary consonant deletion, (F(1, .... speech perception (Foulkes 2006, Clopper & Pisoni, 2005, Thomas 2002). ... in Nigeria, and had had the privilege of travelling to Europe and the Americas for the.

  7. Applicability of Alignment and Combination Rules to Burst Pressure Prediction of Multiple-flawed Steam Generator Tube

    Energy Technology Data Exchange (ETDEWEB)

    Lee, Myeong Woo; Kim, Ji Seok; Kim, Yun Jae [Korea University, Seoul (Korea, Republic of); Jeon, Jun Young [Doosan Heavy Industries and Consruction, Seoul (Korea, Republic of); Lee, Dong Min [Korea Plant Service and Engineering, Technical Research and Development Institute, Naju (Korea, Republic of)

    2016-05-15

    Alignment and combination rules are provided by various codes and standards. These rules are used to determine whether multiple flaws should be treated as non-aligned or coplanar, and as independent or combined flaws. Experimental results on steam generator (SG) tube specimens containing multiple axial part-through-wall (PTW) flaws at room temperature (RT) are compared with assessment results based on the alignment and combination rules of the codes and standards. In the case of axial collinear flaws, ASME, JSME, and BS7910 treated multiple flaws as independent flaws, whereas API 579, A16, and FKM treated them as a combined single flaw. Assessment results for combined flaws were conservative. In the case of axial non-aligned flaws, almost all flaws were aligned, and the assessment results correlated well with the experimental data. In the case of axial parallel flaws, the effective flaw lengths of aligned and separated flaws were the same because the individual flaw lengths were equal. This study investigates the applicability of alignment and combination rules for multiple flaws to the failure behavior of Alloy 690TT steam generator (SG) tubes, which are widely used in nuclear power plants. Experimental data from burst tests on Alloy 690TT tubes with single and multiple flaws, conducted at room temperature (RT) by Kim et al., are compared with the alignment rules of these codes and standards. Burst pressures of SG tubes with flaws are predicted using limit load solutions provided by the EPRI Handbook.
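
    A proximity-based combination rule of the kind the codes provide can be sketched as follows. The specific criterion used here (axial gap no larger than the shorter flaw length) is hypothetical; the actual alignment and combination criteria differ between ASME, JSME, BS7910, API 579, A16 and FKM, and must be taken from the relevant standard.

```python
# Each flaw is an axial (start, end) interval along the tube, e.g. in mm.
def combine_collinear(flaws, k=1.0):
    """Merge collinear flaws whose gap is at most k times the shorter flaw length."""
    flaws = sorted(flaws)
    merged = [list(flaws[0])]
    for start, end in flaws[1:]:
        prev_start, prev_end = merged[-1]
        gap = start - prev_end
        if gap <= k * min(end - start, prev_end - prev_start):
            merged[-1][1] = max(prev_end, end)   # treat as one combined flaw
        else:
            merged.append([start, end])          # treat as an independent flaw
    return [tuple(f) for f in merged]

# Two close flaws combine into one; the distant third stays independent.
print(combine_collinear([(0, 10), (12, 20), (50, 60)]))  # [(0, 20), (50, 60)]
```

    A combined flaw is then assessed with its total effective length, which is why combined-flaw assessments in the study came out conservative relative to treating the flaws independently.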

  8. Classifications of Vocalic Segments from Articulatory Kinematics: Healthy Controls and Speakers with Dysarthria

    Science.gov (United States)

    Yunusova, Yana; Weismer, Gary G.; Lindstrom, Mary J.

    2011-01-01

    Purpose: In this study, the authors classified vocalic segments produced by control speakers (C) and speakers with dysarthria due to amyotrophic lateral sclerosis (ALS) or Parkinson's disease (PD); classification was based on movement measures. The researchers asked the following questions: (a) Can vowels be classified on the basis of selected…

  9. THE PHONOLOGICAL SYSTEM OF SPANISH

    Directory of Open Access Journals (Sweden)

    Claudia S. Salcedo

    2010-10-01

    Spanish articulatory phonetics, the classification of sounds and the physiological mechanism used in the production of phonemes are discussed in this article. The process of learning a language consists of classifying sounds within the target language. Since the learner may be hearing the utterance in a different way than the native speaker, some objective criteria are needed to classify sounds. If these distinctions are not mastered, the learner may be perceived as sounding awkward. Other phonological processes are applied in informal situations due to sociolinguistic factors such as age, social class, and education. Sound deletions in particular phonological environments are not made randomly by the speaker, but by necessity to retain semantic comprehension. Allophonic choices within phonemes make up the dialect of a particular area.

  1. The effect on recognition memory of noise cancelling headphones in a noisy environment with native and nonnative speakers

    Directory of Open Access Journals (Sweden)

    Brett R C Molesworth

    2014-01-01

    Noise has the potential to impair cognitive performance. For nonnative speakers, the effect of noise on performance is more severe than for their native counterparts. What remains unknown is the effectiveness of countermeasures such as noise attenuating devices in such circumstances. Therefore, the main aim of the present research was to examine the effectiveness of active noise attenuating countermeasures in the presence of simulated aircraft noise for both native and nonnative English speakers. Thirty-two participants, half native English speakers and half native German speakers, completed four recognition (cued recall) tasks presented in English under four different audio conditions, all in the presence of simulated aircraft noise. The results of the research indicated that in simulated aircraft noise at 65 dB(A), performance of nonnative English speakers was poorer than for native English speakers. The beneficial effects of noise cancelling headphones in improving the signal to noise ratio led to an improved performance for nonnative speakers. These results have particular importance for organizations operating in a safety-critical environment such as aviation.

  2. Speaker-Sex Discrimination for Voiced and Whispered Vowels at Short Durations.

    Science.gov (United States)

    Smith, David R R

    2016-01-01

    Whispered vowels, produced with no vocal fold vibration, lack the periodic temporal fine structure which in voiced vowels underlies the perceptual attribute of pitch (a salient auditory cue to speaker sex). Voiced vowels possess no temporal fine structure at very short durations (below two glottal cycles). The prediction was that speaker-sex discrimination performance for whispered and voiced vowels would be similar for very short durations but, as stimulus duration increases, voiced vowel performance would improve relative to whispered vowel performance as pitch information becomes available. This pattern of results was shown for women's but not for men's voices. A whispered vowel needs to have a duration three times longer than a voiced vowel before listeners can reliably tell whether it's spoken by a man or woman (∼30 ms vs. ∼10 ms). Listeners were half as sensitive to information about speaker-sex when it is carried by whispered compared with voiced vowels.

  3. Infant sensitivity to speaker and language in learning a second label.

    Science.gov (United States)

    Bhagwat, Jui; Casasola, Marianella

    2014-02-01

    Two experiments examined when monolingual, English-learning 19-month-old infants learn a second object label. Two experimenters sat together. One labeled a novel object with one novel label, whereas the other labeled the same object with a different label in either the same or a different language. Infants were tested on their comprehension of each label immediately following its presentation. Infants mapped the first label at above chance levels, but they did so with the second label only when requested by the speaker who provided it (Experiment 1) or when the second experimenter labeled the object in a different language (Experiment 2). These results show that 19-month-olds learn second object labels but do not readily generalize them across speakers of the same language. The results highlight how speaker and language spoken guide infants' acceptance of second labels, supporting sociopragmatic views of word learning. Copyright © 2013 Elsevier Inc. All rights reserved.

  4. B Anand | Speakers | Indian Academy of Sciences

    Indian Academy of Sciences (India)

    However, the mechanism by which this protospacer fragment gets integrated in a directional fashion into the leader proximal end is elusive. The speaker's group identified that the leader region abutting the first CRISPR repeat localizes Integration Host Factor (IHF) and the Cas1-2 complex in Escherichia coli. IHF binding to the ...

  5. L2 speakers decompose morphologically complex verbs: fMRI evidence from priming of transparent derived verbs

    Directory of Open Access Journals (Sweden)

    Sophie De Grauwe

    2014-10-01

    In this fMRI long-lag priming study, we investigated the processing of Dutch semantically transparent, derived prefix verbs. In such words, the meaning of the word as a whole can be deduced from the meanings of its parts, e.g. wegleggen ‘put aside’. Many behavioral and some fMRI studies suggest that native (L1) speakers decompose transparent derived words. The brain region usually implicated in morphological decomposition is the left inferior frontal gyrus (LIFG). In non-native (L2) speakers, the processing of transparent derived words has hardly been investigated, especially in fMRI studies, and results are contradictory: some studies find more reliance on holistic (i.e. non-decompositional) processing by L2 speakers; some find no difference between L1 and L2 speakers. In this study, we wanted to find out whether Dutch transparent derived prefix verbs are decomposed or processed holistically by German L2 speakers of Dutch. Half of the derived verbs (e.g. omvallen ‘fall down’) were preceded by their stem (e.g. vallen ‘fall’) with a lag of 4 to 6 words (‘primed’); the other half (e.g. inslapen ‘fall asleep’) were not (‘unprimed’). L1 and L2 speakers of Dutch made lexical decisions on these visually presented verbs. Both ROI analyses and whole-brain analyses showed that there was a significant repetition suppression effect for primed compared to unprimed derived verbs in the LIFG. This was true both for the analyses over L2 speakers only and for the analyses over the two language groups together. The latter did not reveal any interaction with language group (L1 vs. L2) in the LIFG. Thus, L2 speakers show a clear priming effect in the LIFG, an area that has been associated with morphological decomposition. Our findings are consistent with the idea that L2 speakers engage in decomposition of transparent derived verbs rather than processing them holistically.

  6. 5 Tips for Creating Independent Activities Aligned with the Common Core State Standards

    Science.gov (United States)

    Fraser, Dawn W.

    2013-01-01

    Promoting independence in all students is one important part of education. It can be difficult for educators to identify meaningful tasks that students with severe disabilities can complete with full independence. By incorporating visual supports into a student's independent work, the teacher is providing the student with an opportunity to…

  7. Congenital Amusia in Speakers of a Tone Language: Association with Lexical Tone Agnosia

    Science.gov (United States)

    Nan, Yun; Sun, Yanan; Peretz, Isabelle

    2010-01-01

    Congenital amusia is a neurogenetic disorder that affects the processing of musical pitch in speakers of non-tonal languages like English and French. We assessed whether this musical disorder exists among speakers of Mandarin Chinese who use pitch to alter the meaning of words. Using the Montreal Battery of Evaluation of Amusia, we tested 117…

  8. Combining Behavioral and ERP Methodologies to Investigate the Differences Between McGurk Effects Demonstrated by Cantonese and Mandarin Speakers

    Directory of Open Access Journals (Sweden)

    Juan Zhang

    2018-05-01

    The present study investigated the impact of Chinese dialects on McGurk effect using behavioral and event-related potential (ERP) methodologies. Specifically, intra-language comparison of McGurk effect was conducted between Mandarin and Cantonese speakers. The behavioral results showed that Cantonese speakers exhibited a stronger McGurk effect in audiovisual speech perception compared to Mandarin speakers, although both groups performed equally in the auditory and visual conditions. ERP results revealed that Cantonese speakers were more sensitive to visual cues than Mandarin speakers, though this was not the case for the auditory cues. Taken together, the current findings suggest that the McGurk effect generated by Chinese speakers is mainly influenced by segmental phonology during audiovisual speech integration.

  9. Combining Behavioral and ERP Methodologies to Investigate the Differences Between McGurk Effects Demonstrated by Cantonese and Mandarin Speakers

    Science.gov (United States)

    Zhang, Juan; Meng, Yaxuan; McBride, Catherine; Fan, Xitao; Yuan, Zhen

    2018-01-01

    The present study investigated the impact of Chinese dialects on McGurk effect using behavioral and event-related potential (ERP) methodologies. Specifically, intra-language comparison of McGurk effect was conducted between Mandarin and Cantonese speakers. The behavioral results showed that Cantonese speakers exhibited a stronger McGurk effect in audiovisual speech perception compared to Mandarin speakers, although both groups performed equally in the auditory and visual conditions. ERP results revealed that Cantonese speakers were more sensitive to visual cues than Mandarin speakers, though this was not the case for the auditory cues. Taken together, the current findings suggest that the McGurk effect generated by Chinese speakers is mainly influenced by segmental phonology during audiovisual speech integration. PMID:29780312

  10. Communication‐related affective, behavioral, and cognitive reactions in speakers with spasmodic dysphonia

    Science.gov (United States)

    Vanryckeghem, Martine

    2017-01-01

    Objectives: To investigate the self-perceived affective, behavioral, and cognitive reactions associated with communication of speakers with spasmodic dysphonia as a function of employment status. Study Design: Prospective cross-sectional investigation. Methods: 148 participants with spasmodic dysphonia (SD) completed an adapted version of the Behavior Assessment Battery (BAB-Voice), a multidimensional assessment of self-perceived reactions to communication. The BAB-Voice consisted of four subtests: the Speech Situation Checklist for A) Emotional Reaction (SSC-ER) and B) Speech Disruption (SSC-SD), C) the Behavior Checklist (BCL), and D) the Communication Attitude Test for Adults (BigCAT). Participants were assigned to groups based on employment status (working versus retired). Results: Descriptive comparison of the BAB-Voice in speakers with SD to previously published non-dysphonic speaker data revealed substantially higher scores associated with SD across all four subtests. Multivariate Analysis of Variance (MANOVA) revealed no significantly different BAB-Voice subtest scores as a function of SD group status (working vs. retired). Conclusions: BAB-Voice scores revealed that speakers with SD experienced substantial impact of their voice disorder on communication attitude, coping behaviors, and affective reactions in speaking situations, as reflected in their high BAB scores. These impacts do not appear to be influenced by work status, as speakers with SD who were employed or retired experienced similar levels of affective and behavioral reactions in various speaking situations and cognitive responses. These findings are consistent with previously published pilot data. The specificity of items assessed by means of the BAB-Voice may inform the clinician of valid patient-centered treatment goals which target the impairment extending beyond the physiological dimension. Level of Evidence: 2b. PMID:29299525

  11. Communication-related affective, behavioral, and cognitive reactions in speakers with spasmodic dysphonia.

    Science.gov (United States)

    Watts, Christopher R; Vanryckeghem, Martine

    2017-12-01

    To investigate the self-perceived affective, behavioral, and cognitive reactions associated with communication of speakers with spasmodic dysphonia as a function of employment status. Prospective cross-sectional investigation. 148 participants with spasmodic dysphonia (SD) completed an adapted version of the Behavior Assessment Battery (BAB-Voice), a multidimensional assessment of self-perceived reactions to communication. The BAB-Voice consisted of four subtests: the Speech Situation Checklist for A) Emotional Reaction (SSC-ER) and B) Speech Disruption (SSC-SD), C) the Behavior Checklist (BCL), and D) the Communication Attitude Test for Adults (BigCAT). Participants were assigned to groups based on employment status (working versus retired). Descriptive comparison of the BAB-Voice in speakers with SD to previously published non-dysphonic speaker data revealed substantially higher scores associated with SD across all four subtests. Multivariate Analysis of Variance (MANOVA) revealed no significantly different BAB-Voice subtest scores as a function of SD group status (working vs. retired). BAB-Voice scores revealed that speakers with SD experienced substantial impact of their voice disorder on communication attitude, coping behaviors, and affective reactions in speaking situations, as reflected in their high BAB scores. These impacts do not appear to be influenced by work status, as speakers with SD who were employed or retired experienced similar levels of affective and behavioral reactions in various speaking situations and cognitive responses. These findings are consistent with previously published pilot data. The specificity of items assessed by means of the BAB-Voice may inform the clinician of valid patient-centered treatment goals which target the impairment extending beyond the physiological dimension. Level of evidence: 2b.

  12. Validation of Kalman Filter alignment algorithm with cosmic-ray data using a CMS silicon strip tracker endcap

    CERN Document Server

    Sprenger, D; Adolphi, R; Brauer, R; Feld, L; Klein, K; Ostaptchuk, A; Schael, S; Wittmer, B

    2010-01-01

    A Kalman Filter alignment algorithm has been applied to cosmic-ray data. We discuss the alignment algorithm and an experiment-independent implementation including outlier rejection and treatment of weakly determined parameters. Using this implementation, the algorithm has been applied to data recorded with one CMS silicon tracker endcap. Results are compared to both photogrammetry measurements and data obtained from a dedicated hardware alignment system, and good agreement is observed.
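
    As a rough illustration of the sequential update at the heart of such an algorithm (a sketch only: the function name, chi-square cut value, and matrix shapes are illustrative, not taken from the paper):

```python
import numpy as np

def kalman_alignment_update(a, C, H, r, V, chi2_cut=25.0):
    """One Kalman Filter alignment step: refine the alignment parameter
    vector `a` (covariance `C`) using one track's residuals `r`.
    `H` maps alignment parameters to predicted residuals and `V` is the
    residual measurement covariance. Tracks failing a chi-square cut are
    rejected as outliers, in the spirit of the outlier rejection the
    abstract describes."""
    S = H @ C @ H.T + V                      # innovation covariance
    S_inv = np.linalg.inv(S)
    innovation = r - H @ a
    if innovation @ S_inv @ innovation > chi2_cut:
        return a, C                          # outlier: leave parameters unchanged
    K = C @ H.T @ S_inv                      # Kalman gain
    a_new = a + K @ innovation
    C_new = (np.eye(len(a)) - K @ H) @ C     # covariance shrinks with each accepted track
    return a_new, C_new
```

    Processing tracks one at a time this way avoids inverting the full (and potentially huge) global alignment matrix, which is what makes the Kalman approach attractive for detectors with many alignable parameters.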

  13. Schizophrenia among Sesotho speakers in South Africa | Mosotho ...

    African Journals Online (AJOL)

    Results: Core symptoms of schizophrenia among Sesotho speakers do not differ significantly from other cultures. However, the content of psychological symptoms such as delusions and hallucinations is strongly affected by cultural variables. Somatic symptoms such as headaches, palpitations, dizziness and excessive ...

  14. Sentence comprehension in Swahili-English bilingual agrammatic speakers

    NARCIS (Netherlands)

    Abuom, Tom O.; Shah, Emmah; Bastiaanse, Roelien

    For this study, sentence comprehension was tested in Swahili-English bilingual agrammatic speakers. The sentences were controlled for four factors: (1) order of the arguments (base vs. derived); (2) embedding (declarative vs. relative sentences); (3) overt use of the relative pronoun "who"; (4)

  15. An evidence-based rehabilitation program for tracheoesophageal speakers

    NARCIS (Netherlands)

    Jongmans, P.; Rossum, M.; As-Brooks, C.; Hilgers, F.; Pols, L.; Hilgers, F.J.M.; Pols, L.C.W.; van Rossum, M.; van den Brekel, M.W.M.

    2008-01-01

    Objectives: to develop an evidence-based therapy program aimed at improving tracheoesophageal speech intelligibility. The therapy program is based on particular problems found for TE speakers in a previous study as performed by the authors. Patients/Materials and Methods: 9 male laryngectomized

  16. Aligning Participation with Authorship: Independent Transmedia Documentary Production in Norway

    NARCIS (Netherlands)

    Karlsen, Joakim

    2016-01-01

    The main contribution of this article is to describe how the concept of non-fiction transmedia has challenged the independent documentary film community in Norway. How the new possibilities afforded by web- and mobile media, with the potential of reconfiguring the current relation

  17. On the same wavelength: predictable language enhances speaker-listener brain-to-brain synchrony in posterior superior temporal gyrus.

    Science.gov (United States)

    Dikker, Suzanne; Silbert, Lauren J; Hasson, Uri; Zevin, Jason D

    2014-04-30

    Recent research has shown that the degree to which speakers and listeners exhibit similar brain activity patterns during human linguistic interaction is correlated with communicative success. Here, we used an intersubject correlation approach in fMRI to test the hypothesis that a listener's ability to predict a speaker's utterance increases such neural coupling between speakers and listeners. Nine subjects listened to recordings of a speaker describing visual scenes that varied in the degree to which they permitted specific linguistic predictions. In line with our hypothesis, the temporal profile of listeners' brain activity was significantly more synchronous with the speaker's brain activity for highly predictive contexts in left posterior superior temporal gyrus (pSTG), an area previously associated with predictive auditory language processing. In this region, predictability differentially affected the temporal profiles of brain responses in the speaker and listeners respectively, in turn affecting correlated activity between the two: whereas pSTG activation increased with predictability in the speaker, listeners' pSTG activity instead decreased for more predictable sentences. Listeners additionally showed stronger BOLD responses for predictive images before sentence onset, suggesting that highly predictable contexts lead comprehenders to preactivate predicted words.

  18. Student perceptions of native and non-native speaker language instructors: A comparison of ESL and Spanish

    Directory of Open Access Journals (Sweden)

    Laura Callahan

    2006-12-01

    Full Text Available The question of the native vs. non-native speaker status of second and foreign language instructors has been investigated chiefly from the perspective of the teacher. Anecdotal evidence suggests that students have strong opinions on the relative qualities of instruction by native and non-native speakers. Most research focuses on students of English as a foreign or second language. This paper reports on data gathered through a questionnaire administered to 55 university students: 31 students of Spanish as FL and 24 students of English as SL. Qualitative results show what strengths students believe each type of instructor has, and quantitative results confirm that any gap students may perceive between the abilities of native and non-native instructors is not so wide as one might expect based on popular notions of the issue. ESL students showed a stronger preference for native-speaker instructors overall, and were at variance with the SFL students' ratings of native-speaker instructors' performance on a number of aspects. There was a significant correlation in both groups between having a family member who is a native speaker of the target language and student preference for and self-identification with a native speaker as instructor. (English text

  19. Thermal Stresses Analysis and Optimized TTP Processes to Achieved CNT-Based Diaphragm for Thin Panel Speakers

    Directory of Open Access Journals (Sweden)

    Feng-Min Lai

    2016-01-01

    Full Text Available Industrial companies commonly use powder coating, classing, and thermal transfer printing (TTP) techniques to prevent oxidation on metallic surfaces and to stiffen speaker diaphragms. This study developed a TTP technique to fabricate a carbon nanotube (CNT) stiffened speaker diaphragm for thin panel speakers. The self-developed TTP stiffening technique does not require a high curing temperature, which would degrade the mechanical properties of the CNTs. In addition to increasing the stiffness of the diaphragm substrate, the technique alleviates middle- and high-frequency attenuation, smoothing the sound pressure curve of the thin panel speaker. The TTP technique has the advantage of being less harmful to the environment, but it causes thermal residual stresses and unstable connections between printed plates. This study therefore used numerical analysis software (ANSYS) to analyze the stresses and thermal behavior of the workpiece so that delamination problems do not occur at the transfer interface. The Taguchi quality engineering method was applied to identify the optimal manufacturing parameters, which were then employed to fabricate a CNT-based diaphragm that was assembled onto a speaker. The result indicated that the CNT-based diaphragm improved the smoothness of the speaker's sound pressure curve, producing a minimum high-frequency dip difference (ΔdB) value.

  20. The Space-Time Topography of English Speakers

    Science.gov (United States)

    Duman, Steve

    2016-01-01

    English speakers talk and think about Time in terms of physical space. The past is behind us, and the future is in front of us. In this way, we "map" space onto Time. This dissertation addresses the specificity of this physical space, or its topography. Inspired by languages like Yupno (Nunez, et al., 2012) and Bamileke-Dschang (Hyman,…

  1. Does dynamic information about the speaker's face contribute to semantic speech processing? ERP evidence.

    Science.gov (United States)

    Hernández-Gutiérrez, David; Abdel Rahman, Rasha; Martín-Loeches, Manuel; Muñoz, Francisco; Schacht, Annekathrin; Sommer, Werner

    2018-07-01

    Face-to-face interactions characterize communication in social contexts. These situations are typically multimodal, requiring the integration of linguistic auditory input with facial information from the speaker. In particular, eye gaze and visual speech provide the listener with social and linguistic information, respectively. Despite the importance of this context for an ecological study of language, research on audiovisual integration has mainly focused on the phonological level, leaving aside effects on semantic comprehension. Here we used event-related potentials (ERPs) to investigate the influence of dynamic facial information on semantic processing of connected speech. Participants were presented with either a video or a still picture of the speaker, concomitant with auditory sentences. Across three experiments, we manipulated the presence or absence of the speaker's dynamic facial features (mouth and eyes) and compared the amplitudes of the semantic N400 elicited by unexpected words. Contrary to our predictions, the N400 was not modulated by dynamic facial information; semantic processing therefore seems to be unaffected by the speaker's gaze and visual speech. However, during the processing of expected words, dynamic faces elicited a long-lasting late posterior positivity compared to the static condition. This effect was significantly reduced when the mouth of the speaker was covered. Our findings may indicate increased attentional processing in richer communicative contexts. They also demonstrate that in natural communicative face-to-face encounters, perceiving the face of a speaker in motion provides supplementary information that is taken into account by the listener, especially when auditory comprehension is undemanding.

  2. Infants Selectively Pay Attention to the Information They Receive from a Native Speaker of Their Language.

    Science.gov (United States)

    Marno, Hanna; Guellai, Bahia; Vidal, Yamil; Franzoi, Julia; Nespor, Marina; Mehler, Jacques

    2016-01-01

    From the first moments of their life, infants show a preference for their native language, as well as for speakers with whom they share a language. This preference appears to have broad consequences in various domains later on, supporting group affiliations and collaborative actions in children. Here, we propose that infants' preference for native speakers of their language also serves a further purpose, specifically allowing them to efficiently acquire culture-specific knowledge via social learning. By selectively attending to informants who are native speakers of their language, and who probably also share the same cultural background, young learners can maximize the possibility of acquiring cultural knowledge. To test whether infants preferentially attend to the information they receive from a speaker of their native language, we familiarized 12-month-old infants with a native and a foreign speaker, and then presented them with movies in which each speaker silently gazed toward unfamiliar objects. At test, infants' looking behavior toward the two objects alone was measured. Results revealed that infants looked longer at the object presented by the native speaker. Strikingly, the effect was also replicated with 5-month-old infants, indicating an early development of this preference. These findings provide evidence that young infants pay more attention to information presented by a person with whom they share a language. This selectivity can serve as a basis for efficient social learning by influencing how infants allocate attention between potential sources of information in their environment.

  3. MUON DETECTORS: ALIGNMENT

    CERN Multimedia

    G.Gomez

    2010-01-01

    Most of the work in muon alignment since December 2009 has focused on the geometry reconstruction from the optical systems and improvements in the internal alignment of the DT chambers. The barrel optical alignment system has progressively evolved from reconstruction of single active planes to super-planes (December 09) to a new, full barrel reconstruction. Initial validation studies comparing this full barrel alignment at 0T with photogrammetry provide promising results. In addition, the method has been applied to CRAFT09 data, and the resulting alignment at 3.8T yields residuals from tracks (extrapolated from the tracker) which look smooth, suggesting a good internal barrel alignment with a small overall offset with respect to the tracker. This is a significant improvement, which should allow the optical system to provide a start-up alignment for 2010. The end-cap optical alignment has made considerable progress in the analysis of transfer line data. The next set of alignment constants for CSCs will there...

  4. Towards PLDA-RBM based speaker recognition in mobile environment: Designing stacked/deep PLDA-RBM systems

    DEFF Research Database (Denmark)

    Nautsch, Andreas; Hao, Hong; Stafylakis, Themos

    2016-01-01

    recognition: two deep architectures are presented and examined, which aim at suppressing channel effects and recovering speaker-discriminative information on back-ends trained on a small dataset. Experiments are carried out on the MOBIO SRE'13 database, which is a challenging and publicly available dataset...... for mobile speaker recognition with limited amounts of training data. The experiments show that the proposed system outperforms the baseline i-vector/PLDA approach by relative gains of 31% on female and 9% on male speakers in terms of half total error rate....

  5. Neural bases of congenital amusia in tonal language speakers.

    Science.gov (United States)

    Zhang, Caicai; Peng, Gang; Shao, Jing; Wang, William S-Y

    2017-03-01

    Congenital amusia is a lifelong neurodevelopmental disorder of fine-grained pitch processing. In this fMRI study, we examined the neural bases of congenial amusia in speakers of a tonal language - Cantonese. Previous studies on non-tonal language speakers suggest that the neural deficits of congenital amusia lie in the music-selective neural circuitry in the right inferior frontal gyrus (IFG). However, it is unclear whether this finding can generalize to congenital amusics in tonal languages. Tonal language experience has been reported to shape the neural processing of pitch, which raises the question of how tonal language experience affects the neural bases of congenital amusia. To investigate this question, we examined the neural circuitries sub-serving the processing of relative pitch interval in pitch-matched Cantonese level tone and musical stimuli in 11 Cantonese-speaking amusics and 11 musically intact controls. Cantonese-speaking amusics exhibited abnormal brain activities in a widely distributed neural network during the processing of lexical tone and musical stimuli. Whereas the controls exhibited significant activation in the right superior temporal gyrus (STG) in the lexical tone condition and in the cerebellum regardless of the lexical tone and music conditions, no activation was found in the amusics in those regions, which likely reflects a dysfunctional neural mechanism of relative pitch processing in the amusics. Furthermore, the amusics showed abnormally strong activation of the right middle frontal gyrus and precuneus when the pitch stimuli were repeated, which presumably reflect deficits of attending to repeated pitch stimuli or encoding them into working memory. No significant group difference was found in the right IFG in either the whole-brain analysis or region-of-interest analysis. These findings imply that the neural deficits in tonal language speakers might differ from those in non-tonal language speakers, and overlap partly with the

  6. BFAST: an alignment tool for large scale genome resequencing.

    Directory of Open Access Journals (Sweden)

    Nils Homer

    2009-11-01

    Full Text Available The new generation of massively parallel DNA sequencers, combined with the challenge of whole human genome resequencing, results in the need for rapid and accurate alignment of billions of short DNA sequence reads to a large reference genome. Speed is obviously of great importance, but equally important is maintaining alignment accuracy of short reads, in the 25-100 base range, in the presence of errors and true biological variation. We introduce a new algorithm specifically optimized for this task, as well as a freely available implementation, BFAST, which can align data produced by any of the current sequencing platforms, allows for user-customizable levels of speed and accuracy, supports paired-end data, and provides for efficient parallel and multi-threaded computation on a computer cluster. The new method is based on creating flexible, efficient whole-genome indexes to rapidly map reads to candidate alignment locations, with arbitrarily many independent indexes allowed to achieve robustness against read errors and sequence variants. The final local alignment uses a Smith-Waterman method, with gaps to support the detection of small indels. We compare BFAST to a selection of large-scale alignment tools -- BLAT, MAQ, SHRiMP, and SOAP -- in terms of both speed and accuracy, using simulated and real-world datasets. We show BFAST can achieve substantially greater sensitivity of alignment in the context of errors and true variants, especially insertions and deletions, and minimize false mappings, while maintaining adequate speed compared to other current methods. We show BFAST can align the amount of data needed to fully resequence a human genome, one billion reads, with high sensitivity and accuracy, on a modest computer cluster in less than 24 hours. BFAST is available at http://bfast.sourceforge.net.
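
    The final Smith-Waterman stage can be illustrated with a minimal scoring-only sketch using a linear gap penalty (the scoring parameters and function name are illustrative; BFAST's production implementation is heavily optimized and also recovers the alignment itself, not just the score):

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Smith-Waterman local alignment score of strings a and b.
    Cells are floored at 0, so the best-scoring local region is found
    regardless of surrounding mismatches."""
    H = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    best = 0
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0,
                          H[i - 1][j - 1] + s,   # match/mismatch
                          H[i - 1][j] + gap,     # gap in b (deletion)
                          H[i][j - 1] + gap)     # gap in a (insertion)
            best = max(best, H[i][j])
    return best
```

    Because only candidate locations from the index lookup are scored this way, the quadratic cost of the dynamic program is paid per candidate rather than per genome position.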

  7. Microwave conductance properties of aligned multiwall carbon nanotube textile sheets

    Energy Technology Data Exchange (ETDEWEB)

    Brown, Brian L. [Univ. of Texas, Dallas, TX (United States); Martinez, Patricia [Univ. of Texas, Dallas, TX (United States); Zakhidov, Anvar A. [Univ. of Texas, Dallas, TX (United States); Shaner, Eric A. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Lee, Mark [Univ. of Texas, Dallas, TX (United States)

    2015-07-06

    Understanding the conductance properties of multi-walled carbon nanotube (MWNT) textile sheets in the microwave regime is essential for their potential use in high-speed and high-frequency applications. To expand current knowledge, complex high-frequency conductance measurements from 0.01 to 50 GHz, across temperatures from 4.2 K to 300 K and magnetic fields up to 2 T, were made on textile sheets of highly aligned MWNTs with strand alignment oriented both parallel and perpendicular to the microwave electric field polarization. Sheets were drawn from 329 and 520 μm high MWNT forests that resulted in different DC resistance anisotropy. For all samples, the microwave conductance can be modeled approximately by a shunt capacitance in parallel with a frequency-independent conductance, with no inductive contribution. This is consistent with diffusive Drude conduction as the primary transport mechanism up to 50 GHz. The microwave conductance is also found to be essentially independent of both temperature and magnetic field.
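
    The circuit model described, a frequency-independent conductance G in parallel with a shunt capacitance C, gives the complex admittance Y(ω) = G + iωC. A small sketch with illustrative component values (not fitted to the MWNT data):

```python
import numpy as np

def sheet_admittance(freq_hz, G=0.05, C=1e-14):
    """Complex admittance of a parallel G-C model: a frequency-independent
    conductance G (siemens) shunted by a capacitance C (farads).
    The default G and C are placeholders for illustration only."""
    omega = 2 * np.pi * np.asarray(freq_hz, dtype=float)
    return G + 1j * omega * C
```

    In such a model the real part stays flat with frequency while the imaginary part rises linearly, which is the signature the abstract associates with diffusive Drude conduction (no inductive roll-off up to 50 GHz).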

  8. Time-Contrastive Learning Based DNN Bottleneck Features for Text-Dependent Speaker Verification

    DEFF Research Database (Denmark)

    Sarkar, Achintya Kumar; Tan, Zheng-Hua

    2017-01-01

    In this paper, we present a time-contrastive learning (TCL) based bottleneck (BN) feature extraction method for speech signals with an application to text-dependent (TD) speaker verification (SV). It is well-known that speech signals exhibit quasi-stationary behavior in and only in a short interval......, and the TCL method aims to exploit this temporal structure. More specifically, it trains deep neural networks (DNNs) to discriminate temporal events obtained by uniformly segmenting speech signals, in contrast to existing DNN based BN feature extraction methods that train DNNs using labeled data...... to discriminate speakers or pass-phrases or phones or a combination of them. In the context of speaker verification, speech data of fixed pass-phrases are used for TCL-BN training, while the pass-phrases used for TCL-BN training are excluded from being used for SV, so that the learned features can be considered...
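
    The uniform segmentation that supplies TCL's training targets can be sketched as follows (labeling step only; the segment count and function name are illustrative, and the subsequent DNN training that yields the bottleneck features is omitted):

```python
import numpy as np

def tcl_labels(features, n_segments=10):
    """Time-contrastive learning targets: split a sequence of frame-level
    features into uniform temporal segments and label each frame with its
    segment index. A DNN trained to discriminate these segment classes
    needs no speaker or phone labels; a bottleneck layer then provides
    the features."""
    n_frames = len(features)
    edges = np.linspace(0, n_frames, n_segments + 1).astype(int)
    labels = np.zeros(n_frames, dtype=int)
    for k in range(n_segments):
        labels[edges[k]:edges[k + 1]] = k
    return labels
```

    The appeal of this scheme is that the labels come for free from the temporal structure of the signal itself, exploiting the quasi-stationarity of speech within short intervals.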

  9. Perceptual and acoustic analysis of lexical stress in Greek speakers with dysarthria.

    Science.gov (United States)

    Papakyritsis, Ioannis; Müller, Nicole

    2014-01-01

    The study reported in this paper investigated the abilities of Greek speakers with dysarthria to signal lexical stress at the single word level. Three speakers with dysarthria and two unimpaired control participants were recorded completing a repetition task of a list of words consisting of minimal pairs of Greek disyllabic words contrasted by lexical stress location only. Fourteen listeners were asked to determine the attempted stress location for each word pair. Acoustic analyses of duration and intensity ratios, both within and across words, were undertaken to identify possible acoustic correlates of the listeners' judgments concerning stress location. Acoustic and perceptual data indicate that while each participant with dysarthria in this study had some difficulty in signaling stress unambiguously, the pattern of difficulty was different for each speaker. Further, it was found that the relationship between the listeners' judgments of stress location and the acoustic data was not conclusive.

  10. Switches to English during French Service Encounters: Relationships with L2 French Speakers' Willingness to Communicate and Motivation

    Science.gov (United States)

    McNaughton, Stephanie; McDonough, Kim

    2015-01-01

    This exploratory study investigated second language (L2) French speakers' service encounters in the multilingual setting of Montreal, specifically whether switches to English during French service encounters were related to L2 speakers' willingness to communicate or motivation. Over a two-week period, 17 French L2 speakers in Montreal submitted…

  11. Musical Sophistication and the Effect of Complexity on Auditory Discrimination in Finnish Speakers

    Science.gov (United States)

    Dawson, Caitlin; Aalto, Daniel; Šimko, Juraj; Vainio, Martti; Tervaniemi, Mari

    2017-01-01

    Musical experiences and native language are both known to affect auditory processing. The present work aims to disentangle the influences of native language phonology and musicality on behavioral and subcortical sound feature processing in a population of musically diverse Finnish speakers as well as to investigate the specificity of enhancement from musical training. Finnish speakers are highly sensitive to duration cues since in Finnish, vowel and consonant duration determine word meaning. Using a correlational approach with a set of behavioral sound feature discrimination tasks, brainstem recordings, and a musical sophistication questionnaire, we find no evidence for an association between musical sophistication and more precise duration processing in Finnish speakers either in the auditory brainstem response or in behavioral tasks, but they do show an enhanced pitch discrimination compared to Finnish speakers with less musical experience and show greater duration modulation in a complex task. These results are consistent with a ceiling effect set for certain sound features which corresponds to the phonology of the native language, leaving an opportunity for music experience-based enhancement of sound features not explicitly encoded in the language (such as pitch, which is not explicitly encoded in Finnish). Finally, the pattern of duration modulation in more musically sophisticated Finnish speakers suggests integrated feature processing for greater efficiency in a real world musical situation. These results have implications for research into the specificity of plasticity in the auditory system as well as to the effects of interaction of specific language features with musical experiences. PMID:28450829

  12. Musical Sophistication and the Effect of Complexity on Auditory Discrimination in Finnish Speakers.

    Science.gov (United States)

    Dawson, Caitlin; Aalto, Daniel; Šimko, Juraj; Vainio, Martti; Tervaniemi, Mari

    2017-01-01

    Musical experiences and native language are both known to affect auditory processing. The present work aims to disentangle the influences of native language phonology and musicality on behavioral and subcortical sound feature processing in a population of musically diverse Finnish speakers as well as to investigate the specificity of enhancement from musical training. Finnish speakers are highly sensitive to duration cues since in Finnish, vowel and consonant duration determine word meaning. Using a correlational approach with a set of behavioral sound feature discrimination tasks, brainstem recordings, and a musical sophistication questionnaire, we find no evidence for an association between musical sophistication and more precise duration processing in Finnish speakers either in the auditory brainstem response or in behavioral tasks, but they do show an enhanced pitch discrimination compared to Finnish speakers with less musical experience and show greater duration modulation in a complex task. These results are consistent with a ceiling effect set for certain sound features which corresponds to the phonology of the native language, leaving an opportunity for music experience-based enhancement of sound features not explicitly encoded in the language (such as pitch, which is not explicitly encoded in Finnish). Finally, the pattern of duration modulation in more musically sophisticated Finnish speakers suggests integrated feature processing for greater efficiency in a real world musical situation. These results have implications for research into the specificity of plasticity in the auditory system as well as to the effects of interaction of specific language features with musical experiences.

  13. Working memory affects older adults' use of context in spoken-word recognition.

    Science.gov (United States)

    Janse, Esther; Jesse, Alexandra

    2014-01-01

    Many older listeners report difficulties in understanding speech in noisy situations. Working memory and other cognitive skills may modulate older listeners' ability to use context information to alleviate the effects of noise on spoken-word recognition. In the present study, we investigated whether verbal working memory predicts older adults' ability to immediately use context information in the recognition of words embedded in sentences, presented in different listening conditions. In a phoneme-monitoring task, older adults were asked to detect as fast and as accurately as possible target phonemes in sentences spoken by a target speaker. Target speech was presented without noise, with fluctuating speech-shaped noise, or with competing speech from a single distractor speaker. The gradient measure of contextual probability (derived from a separate offline rating study) affected the speed of recognition. Contextual facilitation was modulated by older listeners' verbal working memory (measured with a backward digit span task) and age across listening conditions. Working memory and age, as well as hearing loss, were also the most consistent predictors of overall listening performance. Older listeners' immediate benefit from context in spoken-word recognition thus relates to their ability to keep and update a semantic representation of the sentence content in working memory.

  14. Willing Learners yet Unwilling Speakers in ESL Classrooms

    Directory of Open Access Journals (Sweden)

    Zuraidah Ali

    2007-12-01

    Full Text Available To some of us, speech production in ESL has become so natural and integral that we seem to take it for granted. We often do not even remember how we struggled through the initial process of mastering English. Unfortunately, students who are still learning English face myriad problems that make them appear unwilling or reluctant ESL speakers. This study investigates this phenomenon, which is very common in the ESL classroom. Set against related research findings on the matter, a qualitative study was conducted among foreign students enrolled in the Intensive English Programme (IEP) at the Institute of Liberal Studies (IKAL), University Tenaga Nasional (UNITEN). The results show and discuss the extent of truth behind this perplexing phenomenon of willing learners yet unwilling speakers of ESL, in our effort to provide supportive learning cultures in second language acquisition (SLA) for this group of students.

  15. English exposed common mistakes made by Chinese speakers

    CERN Document Server

    Hart, Steve

    2017-01-01

    Having analysed the most common English errors made in over 600 academic papers written by Chinese undergraduates, postgraduates, and researchers, Steve Hart has written an essential, practical guide specifically for the native Chinese speaker on how to write good academic English. English Exposed: Common Mistakes Made by Chinese Speakers is divided into three main sections. The first section examines errors made with verbs, nouns, prepositions, and other grammatical classes of words. The second section focuses on problems of word choice. In addition to helping the reader find the right word, it provides instruction on selecting the right style too. The third section covers a variety of other areas essential for the academic writer, such as using punctuation, adding appropriate references, referring to tables and figures, and selecting among various English date and time phrases. Using English Exposed will allow a writer to produce material where content and ideas, not language mistakes, speak the loudest.

  16. Sensing Characteristics of A Precision Aligner Using Moire Gratings for Precision Alignment System

    Institute of Scientific and Technical Information of China (English)

    ZHOU Lizhong; Hideo Furuhashi; Yoshiyuki Uchida

    2001-01-01

    Sensing characteristics of a precision aligner using moire gratings for a precision alignment system have been investigated. A differential moire alignment system and a modified alignment system were used. The influence of the setting accuracy of the gap length and of the inclination of the gratings on the alignment accuracy has been studied experimentally and theoretically. A gap-length setting accuracy of better than 2.5 μm is required in the modified moire alignment, whereas the gap length has no influence on the alignment accuracy in the differential alignment system. Inclination affects the alignment accuracy in both the differential and the modified moire alignment systems.
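
    For context, a moire readout magnifies small pitch differences between superposed gratings. The textbook beat-period relation (a general formula, not specific to this paper's aligner) is P = p1*p2/|p1 - p2|:

```python
def moire_period(p1, p2):
    """Moire fringe (beat) period for two superposed line gratings of
    pitches p1 and p2 in the same units. A small pitch difference yields
    a large fringe period, which is what makes moire patterns sensitive
    readouts for alignment."""
    if p1 == p2:
        raise ValueError("identical pitches produce no beat fringe")
    return p1 * p2 / abs(p1 - p2)
```

    For example, gratings of 2.0 and 2.1 μm pitch produce a fringe roughly 20 times coarser than either grating, so sub-micron grating shifts become visible as large fringe displacements.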

  17. MUON DETECTORS: ALIGNMENT

    CERN Multimedia

    Z. Szillasi and G. Gomez.

    2013-01-01

    When CMS is opened up, major components of the Link and Barrel Alignment systems will be removed. This operation, besides allowing for maintenance of the detector underneath, is needed for making interventions that will reinforce the alignment measurements and make the operation of the alignment system more reliable. For that purpose and also for their general maintenance and recalibration, the alignment components will be transferred to the Alignment Lab situated in the ISR area. For the track-based alignment, attention is focused on the determination of systematic uncertainties, which have become dominant, since now there is a large statistics of muon tracks. This will allow for an improved Monte Carlo misalignment scenario and updated alignment position errors, crucial for high-momentum muon analysis such as Z′ searches.

  18. Neural decoding of attentional selection in multi-speaker environments without access to clean sources

    Science.gov (United States)

    O'Sullivan, James; Chen, Zhuo; Herrero, Jose; McKhann, Guy M.; Sheth, Sameer A.; Mehta, Ashesh D.; Mesgarani, Nima

    2017-10-01

    Objective. People who suffer from hearing impairments can find it difficult to follow a conversation in a multi-speaker environment. Current hearing aids can suppress background noise; however, there is little that can be done to help a user attend to a single conversation amongst many without knowing which speaker the user is attending to. Cognitively controlled hearing aids that use auditory attention decoding (AAD) methods are the next step in offering help. Translating the successes in AAD research to real-world applications poses a number of challenges, including the lack of access to the clean sound sources in the environment with which to compare with the neural signals. We propose a novel framework that combines single-channel speech separation algorithms with AAD. Approach. We present an end-to-end system that (1) receives a single audio channel containing a mixture of speakers that is heard by a listener along with the listener’s neural signals, (2) automatically separates the individual speakers in the mixture, (3) determines the attended speaker, and (4) amplifies the attended speaker’s voice to assist the listener. Main results. Using invasive electrophysiology recordings, we identified the regions of the auditory cortex that contribute to AAD. Given appropriate electrode locations, our system is able to decode the attention of subjects and amplify the attended speaker using only the mixed audio. Our quality assessment of the modified audio demonstrates a significant improvement in both subjective and objective speech quality measures. Significance. Our novel framework for AAD bridges the gap between the most recent advancements in speech processing technologies and speech prosthesis research and moves us closer to the development of cognitively controlled hearable devices for the hearing impaired.
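    The decoding step of such a system is commonly framed as stimulus reconstruction: a trained linear decoder maps the neural recordings to an estimate of the attended speech envelope, and the separated speaker whose envelope correlates best with that estimate is taken to be the attended one. A minimal sketch of this selection step (function names, the trivial averaging decoder, and the toy data are all illustrative assumptions, not the paper's actual pipeline):

    ```python
    import numpy as np

    def decode_attended_speaker(neural, speaker_envelopes, decoder):
        """Pick the attended speaker by stimulus reconstruction.

        A linear decoder maps multichannel neural data (time x channels)
        to an estimate of the attended speech envelope; the separated
        speaker whose envelope correlates best with the reconstruction
        is selected.  All names here are illustrative, not the paper's API.
        """
        reconstruction = neural @ decoder                # (time,) envelope estimate
        correlations = [np.corrcoef(reconstruction, env)[0, 1]
                        for env in speaker_envelopes]
        return int(np.argmax(correlations)), correlations

    # Toy demo: two "speakers"; the neural data is driven by speaker 0.
    rng = np.random.default_rng(0)
    env0 = rng.random(1000)
    env1 = rng.random(1000)
    neural = np.outer(env0, np.ones(8)) + 0.1 * rng.standard_normal((1000, 8))
    decoder = np.full(8, 1 / 8)                          # trivial averaging decoder
    attended, corrs = decode_attended_speaker(neural, [env0, env1], decoder)
    ```

    In practice the decoder weights would be fit on training data (e.g. by regularized regression), and the correlation would be computed over short sliding windows to track attention switches.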

  19. Accuracy of MFCC-Based Speaker Recognition in Series 60 Device

    Directory of Open Access Journals (Sweden)

    Pasi Fränti

    2005-10-01

    Full Text Available A fixed-point implementation of speaker recognition based on MFCC signal processing is considered. We analyze the numerical error of the MFCC and its effect on recognition accuracy. Techniques to reduce the information loss in a converted fixed-point implementation are introduced. We increase the signal-processing accuracy by adjusting the ratio between the representation accuracy of the operators and that of the signal. The signal-processing error is found to matter more for speaker recognition accuracy than the error in the classification algorithm. The results are verified by applying the alternative technique to speech data. We also discuss the specific programming requirements set by Symbian and the Series 60 platform.
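    The core numerical issue is how much of the fixed-point word length the signal actually uses: a quiet signal quantized directly to Q15 wastes most of the 16-bit range, while scaling it up before quantization (and dividing back afterwards) reduces the rounding error. A small sketch of that accuracy/headroom trade-off (the Q15 helpers and the gain value are illustrative assumptions, not the paper's implementation):

    ```python
    import numpy as np

    def to_q15(x):
        """Quantize floats in [-1, 1) to Q15 fixed point (int16)."""
        return np.clip(np.round(x * 2**15), -2**15, 2**15 - 1).astype(np.int16)

    def from_q15(q):
        """Convert Q15 int16 values back to floats."""
        return q.astype(np.float64) / 2**15

    rng = np.random.default_rng(1)
    signal = 0.01 * rng.standard_normal(10_000)     # quiet signal, far below full scale

    # Direct quantization: rounding error is a fixed fraction of the Q15 step.
    err_raw = np.abs(from_q15(to_q15(signal)) - signal).mean()

    # Pre-scaling by a gain that still avoids clipping shrinks the relative error.
    gain = 16
    err_scaled = np.abs(from_q15(to_q15(signal * gain)) / gain - signal).mean()
    ```

    Choosing the gain is exactly the kind of representation-accuracy adjustment the abstract describes: too small wastes precision, too large clips the signal.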

  20. Lip-Synching Using Speaker-Specific Articulation, Shape and Appearance Models

    Directory of Open Access Journals (Sweden)

    Gaspard Breton

    2009-01-01

    Full Text Available We describe here the control, shape and appearance models that are built using an original photogrammetric method to capture characteristics of speaker-specific facial articulation, anatomy, and texture. Two original contributions are put forward here: the trainable trajectory formation model that predicts articulatory trajectories of a talking face from phonetic input and the texture model that computes a texture for each 3D facial shape according to articulation. Using motion capture data from different speakers and module-specific evaluation procedures, we show here that this cloning system restores detailed idiosyncrasies and the global coherence of visible articulation. Results of a subjective evaluation of the global system with competing trajectory formation models are further presented and commented.

  1. Deformation effect in the fast neutron total cross section of aligned 59Co

    International Nuclear Information System (INIS)

    Fasoli, U.; Pavan, P.; Toniolo, D.; Zago, G.; Zannoni, R.; Galeazzi, G.

    1983-01-01

    The variation of the total neutron cross section, Δσ_align, on 59Co due to nuclear alignment of the target has been measured over the energy range from 0.8 to 20 MeV, employing a cobalt single crystal with 34% nuclear alignment. The results show that Δσ_align oscillates from a minimum of -5% at about 2.5 MeV to a maximum of +1% at about 10 MeV. The data were successfully fitted by optical-model coupled-channel calculations. The coupling terms were deduced, without free parameters, from a model representing the 59Co nucleus as a vibrational 60Ni core coupled to a proton hole in the 1f7/2 shell. The optical-model parameters were determined by fitting the independently measured total cross section. The theoretical calculations show that, at lower energies, Δσ_align depends appreciably on the coupling with the low-lying levels

  2. Triangular Alignment (TAME). A Tensor-based Approach for Higher-order Network Alignment

    Energy Technology Data Exchange (ETDEWEB)

    Mohammadi, Shahin [Purdue Univ., West Lafayette, IN (United States); Gleich, David F. [Purdue Univ., West Lafayette, IN (United States); Kolda, Tamara G. [Sandia National Laboratories (SNL-CA), Livermore, CA (United States); Grama, Ananth [Purdue Univ., West Lafayette, IN (United States)

    2015-11-01

    Network alignment is an important tool with extensive applications in comparative interactomics. Traditional approaches aim to simultaneously maximize the number of conserved edges and the underlying similarity of aligned entities. We propose a novel formulation of the network alignment problem that extends topological similarity to higher-order structures and provide a new objective function that maximizes the number of aligned substructures. This objective function corresponds to an integer programming problem, which is NP-hard. Consequently, we approximate this objective function by a surrogate function whose maximization results in a tensor eigenvalue problem. Based on this formulation, we present an algorithm called Triangular AlignMEnt (TAME), which attempts to maximize the number of aligned triangles across networks. We focus on the alignment of triangles because of their enrichment in complex networks; however, our formulation and resulting algorithms can be applied to general motifs. Using a case study on the NAPABench dataset, we show that TAME is capable of producing alignments with up to 99% accuracy in terms of aligned nodes. We further evaluate our method by aligning the yeast and human interactomes. Our results indicate that TAME outperforms state-of-the-art alignment methods in terms of both the biological and the topological quality of the alignments.
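    The objective TAME maximizes is easy to state directly: given a node mapping between two graphs, count the triangles of one graph whose image under the mapping is also a triangle of the other. A brute-force sketch of that count (the tensor eigenvalue machinery that searches for a good mapping is beyond a short example; graph data and function names are illustrative):

    ```python
    from itertools import combinations

    def triangles(adj):
        """Enumerate triangles of an undirected graph given as adjacency sets."""
        return {frozenset(t) for t in combinations(adj, 3)
                if t[1] in adj[t[0]] and t[2] in adj[t[0]] and t[2] in adj[t[1]]}

    def aligned_triangles(adj_a, adj_b, mapping):
        """Count triangles of A whose image under `mapping` is a triangle of B."""
        tri_b = triangles(adj_b)
        return sum(frozenset(mapping[v] for v in t) in tri_b
                   for t in triangles(adj_a))

    # Two 4-node graphs, each a triangle plus a pendant vertex.
    A = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
    B = {'a': {'b', 'c'}, 'b': {'a', 'c'}, 'c': {'a', 'b', 'd'}, 'd': {'c'}}

    good = aligned_triangles(A, B, {0: 'a', 1: 'b', 2: 'c', 3: 'd'})  # preserves the triangle
    bad  = aligned_triangles(A, B, {0: 'a', 1: 'b', 2: 'd', 3: 'c'})  # breaks it
    ```

    Because triangle enumeration is cubic in the worst case and the number of mappings is factorial, the integer program is NP-hard, which is why the paper relaxes it to a tensor eigenvalue problem.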

  3. Musical practice and cognitive aging: two cross-sectional studies point phonemic fluency as a potential candidate for a use-dependent adaptation

    Directory of Open Access Journals (Sweden)

    Baptiste eFAUVEL

    2014-10-01

    Full Text Available Because of permanent use-dependent cerebral plasticity, experiences across the lifespan are believed to influence the quality of cognitive aging. In older individuals, both former and current musical practice have been associated with better verbal skills, visual memory, processing speed, and planning function. This work sought an interaction between musical practice and cognitive aging by comparing musicians and nonmusicians in two periods of life (late adulthood and old age). Long-term memory, auditory verbal short-term memory, processing speed, nonverbal reasoning, and verbal fluencies were assessed. In Study 1, musicians performed significantly better than controls on measures of processing speed and auditory verbal short-term memory, but both groups displayed the same age-related difference. For verbal fluencies, musicians scored higher and displayed different age effects compared with controls. In Study 2, we found that the life period at training onset (childhood versus adulthood) was associated with phonemic, but not semantic, fluency performance: musicians who had started practice in adulthood did not outperform nonmusicians on phonemic fluency. For these two measures, current frequency of training did not account for musicians' scores. These patterns of results are discussed by weighing the hypothesis of a transformative effect of musical practice against non-causal explanations.

  4. Umesh V Waghmare | Speakers | Indian Academy of Sciences

    Indian Academy of Sciences (India)

    Umesh V Waghmare. Theoretical Sciences Unit, Jawaharlal Nehru Centre for Advanced Scientific Research, Jakkur P.O., Bangalore 560 064, ... These ideas apply quite well to dynamical structure of a crystal, as described by the dispersion of its phonons or vibrational waves. The speakers group has shown an interesting ...

  5. A general auditory bias for handling speaker variability in speech? Evidence in humans and songbirds

    NARCIS (Netherlands)

    Kriengwatana, B.; Escudero, P.; Kerkhoven, A.H.; ten Cate, C.

    2015-01-01

    Different speakers produce the same speech sound differently, yet listeners are still able to reliably identify the speech sound. How listeners can adjust their perception to compensate for speaker differences in speech, and whether these compensatory processes are unique only to humans, is still

  6. MUON DETECTORS: ALIGNMENT

    CERN Multimedia

    G. Gomez and J. Pivarski

    2011-01-01

    Alignment efforts in the first few months of 2011 have shifted away from providing alignment constants (now a well established procedure) and focussed on some critical remaining issues. The single most important task left was to understand the systematic differences observed between the track-based (TB) and hardware-based (HW) barrel alignments: a systematic difference in r-φ and in z, which grew as a function of z, and which amounted to ~4-5 mm differences going from one end of the barrel to the other. This difference is now understood to be caused by the tracker alignment. The systematic differences disappear when the track-based barrel alignment is performed using the new “twist-free” tracker alignment. This removes the largest remaining source of systematic uncertainty. Since the barrel alignment is based on hardware, it does not suffer from the tracker twist. However, untwisting the tracker causes endcap disks (which are aligned ...

  7. MUON DETECTORS: ALIGNMENT

    CERN Multimedia

    Gervasio Gomez

    The main progress of the muon alignment group since March has been in the refinement of both the track-based alignment for the DTs and the hardware-based alignment for the CSCs. For DT track-based alignment, there has been significant improvement in the internal alignment of the superlayers inside the DTs. In particular, the distance between superlayers is now corrected, eliminating the residual dependence on track impact angles, and good agreement is found between survey and track-based corrections. The new internal geometry has been approved to be included in the forthcoming reprocessing of CRAFT samples. The alignment of DTs with respect to the tracker using global tracks has also improved significantly, since the algorithms use the latest B-field mapping, better run selection criteria, optimized momentum cuts, and an alignment is now obtained for all six degrees of freedom (three spatial coordinates and three rotations) of the aligned DTs. This work is ongoing and at a stage where we are trying to unders...

  8. Age differences in vocal emotion perception: on the role of speaker age and listener sex.

    Science.gov (United States)

    Sen, Antarika; Isaacowitz, Derek; Schirmer, Annett

    2017-10-24

    Older adults have greater difficulty than younger adults perceiving vocal emotions. To better characterise this effect, we explored its relation to age differences in sensory, cognitive and emotional functioning. Additionally, we examined the role of speaker age and listener sex. Participants (N = 163) aged 19-34 years and 60-85 years categorised neutral sentences spoken by ten younger and ten older speakers with a happy, neutral, sad, or angry voice. Acoustic analyses indicated that expressions from younger and older speakers denoted the intended emotion with similar accuracy. As expected, younger participants outperformed older participants, and this effect was statistically mediated by an age-related decline in both optimism and working memory. Additionally, age differences in emotion perception were larger for younger than for older speakers, and the advantage in perceiving younger over older speakers was greater among younger than among older participants. Last, the female perception benefit was less pervasive in the older than in the younger group. Together, these findings suggest that the role of age in emotion perception is multi-faceted: it is linked to emotional and cognitive change, to processing biases that favour young and own-age expressions, and to the different aptitudes of women and men.

  9. Acute alcohol intoxication impairs segmental body alignment in upright standing.

    Science.gov (United States)

    Hafstrom, A; Patel, M; Modig, F; Magnusson, M; Fransson, P A

    2014-01-01

    Balance control when standing upright is a complex process requiring input from several partly independent mechanisms such as coordination, feedback and feedforward control, and adaptation. Acute alcohol intoxication from ethanol is recognized as a major contributor to accidental falls requiring medical care. This study aimed to investigate if intoxication at 0.06 and 0.10% blood alcohol concentration affected body alignment. Mean angular positions of the head, shoulder, hip, and knee were measured with 3D-motion analysis and compared with the ankle position in 25 healthy adults during standing with or without perturbations, and with eyes open or closed. Alcohol intoxication had significant effects on body alignment during perturbed and unperturbed stance, and on adaptation to perturbations. It induced a significantly more posterior alignment of the knees and shoulders, and a tendency for a more posterior and left deviated head alignment in perturbed stance than when sober. The impact of alcohol intoxication was most apparent on the knee alignment, where availability of visual information deteriorated the adaptation to perturbations. Thus, acute alcohol intoxication resulted in inadequate balance control strategies with increased postural rigidity and impaired adaptation to perturbations. These factors probably contribute to the increased risk of falling when intoxicated with alcohol.

  10. What makes a charismatic speaker?

    DEFF Research Database (Denmark)

    Niebuhr, Oliver; Voße, Jana; Brem, Alexander

    2016-01-01

    The former Apple CEO Steve Jobs was one of the most charismatic speakers of the past decades. However, there is, as yet, no detailed quantitative profile of his way of speaking. We used state-of-the-art computer techniques to acoustically analyze his speech behavior and relate it to reference...... samples. Our paper provides the first-ever acoustic profile of Steve Jobs, based on about 4000 syllables and 12,000 individual speech sounds from his two most outstanding and well-known product presentations: the introductions of the iPhone 4 and the iPad 2. Our results show that Steve Jobs stands out...

  11. Conformation-independent structural comparison of macromolecules with ProSMART

    International Nuclear Information System (INIS)

    Nicholls, Robert A.; Fischer, Marcus; McNicholas, Stuart; Murshudov, Garib N.

    2014-01-01

    The Procrustes Structural Matching Alignment and Restraints Tool (ProSMART) has been developed to allow local comparative structural analyses independent of the global conformations and sequence homology of the compared macromolecules. This allows quick and intuitive visualization of the conservation of backbone and side-chain conformations, providing complementary information to existing methods. The identification and exploration of (dis)similarities between macromolecular structures can help to gain biological insight, for instance when visualizing or quantifying the response of a protein to ligand binding. Obtaining a residue alignment between compared structures is often a prerequisite for such comparative analysis. If the conformational change of the protein is dramatic, conventional alignment methods may struggle to provide an intuitive solution for straightforward analysis. To make such analyses more accessible, the Procrustes Structural Matching Alignment and Restraints Tool (ProSMART) has been developed, which achieves a conformation-independent structural alignment, as well as providing such additional functionalities as the generation of restraints for use in the refinement of macromolecular models. Sensible comparison of protein (or DNA/RNA) structures in the presence of conformational changes is achieved by enforcing neither chain nor domain rigidity. The visualization of results is facilitated by popular molecular-graphics software such as CCP4mg and PyMOL, providing intuitive feedback regarding structural conservation and subtle dissimilarities between close homologues that can otherwise be hard to identify. Automatically generated colour schemes corresponding to various residue-based scores are provided, which allow the assessment of the conservation of backbone and side-chain conformations relative to the local coordinate frame. Structural comparison tools such as ProSMART can help to break the complexity that accompanies the constantly growing
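    The "Procrustes" in the tool's name refers to optimal rigid superposition of point sets, classically solved by the Kabsch algorithm; ProSMART applies such fits to local fragments rather than whole chains, which is what makes the comparison conformation independent. A self-contained sketch of the underlying superposition step (a generic Kabsch fit on toy coordinates, not ProSMART's actual code):

    ```python
    import numpy as np

    def procrustes_rmsd(P, Q):
        """RMSD between two N x 3 point sets after optimal rigid superposition.

        Kabsch algorithm: centre both sets, find the rotation maximizing
        their overlap via SVD, then measure the residual deviation.
        """
        P = P - P.mean(axis=0)
        Q = Q - Q.mean(axis=0)
        U, _, Vt = np.linalg.svd(P.T @ Q)
        d = np.sign(np.linalg.det(U @ Vt))       # guard against improper reflection
        R = U @ np.diag([1.0, 1.0, d]) @ Vt
        return float(np.sqrt(((P @ R - Q) ** 2).sum(axis=1).mean()))

    # A rotated copy of a point set superposes back to ~zero RMSD.
    rng = np.random.default_rng(2)
    pts = rng.standard_normal((20, 3))
    theta = 0.7
    rot = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                    [np.sin(theta),  np.cos(theta), 0.0],
                    [0.0, 0.0, 1.0]])
    rmsd = procrustes_rmsd(pts, pts @ rot.T)
    ```

    Applying this fit fragment by fragment along the aligned residues, instead of once globally, is what lets two homologues in very different overall conformations still score as locally similar.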

  12. Conformation-independent structural comparison of macromolecules with ProSMART

    Energy Technology Data Exchange (ETDEWEB)

    Nicholls, Robert A., E-mail: nicholls@mrc-lmb.cam.ac.uk [MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge Biomedical Campus, Cambridge CB2 0QH (United Kingdom); Fischer, Marcus [University of California San Francisco, San Francisco, CA 94158 (United States); McNicholas, Stuart [University of York, Heslington, York YO10 5DD (United Kingdom); Murshudov, Garib N. [MRC Laboratory of Molecular Biology, Francis Crick Avenue, Cambridge Biomedical Campus, Cambridge CB2 0QH (United Kingdom)

    2014-09-01

    The Procrustes Structural Matching Alignment and Restraints Tool (ProSMART) has been developed to allow local comparative structural analyses independent of the global conformations and sequence homology of the compared macromolecules. This allows quick and intuitive visualization of the conservation of backbone and side-chain conformations, providing complementary information to existing methods. The identification and exploration of (dis)similarities between macromolecular structures can help to gain biological insight, for instance when visualizing or quantifying the response of a protein to ligand binding. Obtaining a residue alignment between compared structures is often a prerequisite for such comparative analysis. If the conformational change of the protein is dramatic, conventional alignment methods may struggle to provide an intuitive solution for straightforward analysis. To make such analyses more accessible, the Procrustes Structural Matching Alignment and Restraints Tool (ProSMART) has been developed, which achieves a conformation-independent structural alignment, as well as providing such additional functionalities as the generation of restraints for use in the refinement of macromolecular models. Sensible comparison of protein (or DNA/RNA) structures in the presence of conformational changes is achieved by enforcing neither chain nor domain rigidity. The visualization of results is facilitated by popular molecular-graphics software such as CCP4mg and PyMOL, providing intuitive feedback regarding structural conservation and subtle dissimilarities between close homologues that can otherwise be hard to identify. Automatically generated colour schemes corresponding to various residue-based scores are provided, which allow the assessment of the conservation of backbone and side-chain conformations relative to the local coordinate frame. Structural comparison tools such as ProSMART can help to break the complexity that accompanies the constantly growing

  13. Generic accelerated sequence alignment in SeqAn using vectorization and multi-threading.

    Science.gov (United States)

    Rahn, René; Budach, Stefan; Costanza, Pascal; Ehrhardt, Marcel; Hancox, Jonny; Reinert, Knut

    2018-05-03

    Pairwise sequence alignment is undoubtedly a central tool in many bioinformatics analyses. In this paper, we present a generically accelerated module for pairwise sequence alignments applicable to a broad range of applications. In our module, we unified the standard dynamic programming kernel used for pairwise sequence alignments and extended it with a generalized inter-sequence vectorization layout, such that many alignments can be computed simultaneously by exploiting SIMD (Single Instruction Multiple Data) instructions of modern processors. We then extended the module with two layers of thread-level parallelization, where we (a) distribute many independent alignments over multiple threads and (b) inherently parallelize a single alignment computation using a work-stealing approach that produces a dynamic wavefront progressing along the minor diagonal. We evaluated our alignment vectorization and parallelization on different processors, including the newest Intel® Xeon® (Skylake) and Intel® Xeon Phi™ (KNL) processors, and on several use cases. The AVX512-BW (Byte and Word) instruction set, available on Skylake processors, can genuinely improve the performance of vectorized alignments. We could run single alignments 1600 times faster on the Xeon Phi™ and 1400 times faster on the Xeon® than with our previous sequential alignment module. The module is programmed in C++ using the SeqAn library (Reinert et al., 2017) and distributed with version 2.4 under the BSD license. We support SSE4, AVX2 and AVX512 instructions and include UME::SIMD, a SIMD-instruction wrapper library, to extend our module to further instruction sets. We thoroughly test all alignment components with all major C++ compilers on various platforms. Contact: rene.rahn@fu-berlin.de.
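    The inter-sequence vectorization idea is that one DP cell update is applied across a whole batch of alignments at once, with each SIMD lane carrying a different sequence pair. A sketch of that layout using numpy array operations to stand in for SIMD instructions (a batched Needleman-Wunsch score computation under simple unit match/mismatch/gap costs; not SeqAn's actual kernel):

    ```python
    import numpy as np

    def batch_nw_scores(pairs, match=1, mismatch=-1, gap=-1):
        """Needleman-Wunsch scores for many equal-length sequence pairs at once.

        Inter-sequence vectorization: every DP cell update below runs over
        the whole batch in one numpy operation, mirroring one SIMD
        instruction over packed sequences.
        """
        a = np.array([[ord(c) for c in s] for s, _ in pairs])   # (batch, n)
        b = np.array([[ord(c) for c in t] for _, t in pairs])   # (batch, m)
        batch, n = a.shape
        m = b.shape[1]
        prev = np.tile(np.arange(m + 1) * gap, (batch, 1)).astype(float)
        for i in range(1, n + 1):
            cur = np.empty_like(prev)
            cur[:, 0] = i * gap
            sub = np.where(a[:, i - 1, None] == b, match, mismatch)  # (batch, m)
            for j in range(1, m + 1):
                cur[:, j] = np.maximum.reduce([
                    prev[:, j - 1] + sub[:, j - 1],   # diagonal: match/mismatch
                    prev[:, j] + gap,                  # up: gap in second sequence
                    cur[:, j - 1] + gap,               # left: gap in first sequence
                ])
            prev = cur
        return prev[:, -1]

    scores = batch_nw_scores([("GATT", "GATT"), ("GATT", "GCTT"), ("AAAA", "TTTT")])
    ```

    The remaining serial loop over columns reflects the DP dependency that the paper's second parallelization layer attacks with its wavefront along the minor diagonal.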

  14. Oxygen-promoted catalyst sintering influences number density, alignment, and wall number of vertically aligned carbon nanotubes.

    Science.gov (United States)

    Shi, Wenbo; Li, Jinjing; Polsen, Erik S; Oliver, C Ryan; Zhao, Yikun; Meshot, Eric R; Barclay, Michael; Fairbrother, D Howard; Hart, A John; Plata, Desiree L

    2017-04-20

    A lack of synthetic control and reproducibility during vertically aligned carbon nanotube (CNT) synthesis has stifled many promising applications of organic nanomaterials. Oxygen-containing species are particularly precarious in that they have both beneficial and deleterious effects and are notoriously difficult to control. Here, we demonstrated diatomic oxygen's ability, independent of water, to tune oxide-supported catalyst thin-film dewetting and influence nanoscale (diameter and wall number) and macro-scale (alignment and density) properties of as-grown vertically aligned CNTs. In particular, single- or few-walled CNT forests were achieved at very low oxygen loading, with single-to-multi-walled CNT diameters ranging from 4.8 ± 1.3 nm to 6.4 ± 1.1 nm over 0-800 ppm O2, and an expected variation in alignment, where both were related to the annealed catalyst morphology. Morphological differences were not the result of subsurface diffusion, but instead occurred via Ostwald ripening under several hundred ppm O2; this effect was mitigated by high H2 concentrations and was not due to water vapor (as confirmed in O2-free water addition experiments), supporting the importance of O2 specifically. Further characterization of the interface between the Fe catalyst and the Al2O3 support revealed that either an oxygen-deficient metal oxide or oxygen adsorption on metals could be the functional mechanism for the observed catalyst nanoparticle evolution. Taken as a whole, our results suggest that the impacts of O2 and H2 on catalyst evolution have been underappreciated and underleveraged in CNT synthesis, and these could present a route toward facile manipulation of CNT forest morphology through control of the reactive gaseous atmosphere alone.

  15. Processing ser and estar to locate objects and events: An ERP study with L2 speakers of Spanish.

    Science.gov (United States)

    Dussias, Paola E; Contemori, Carla; Román, Patricia

    2014-01-01

    In Spanish locative constructions, a different form of the copula is selected according to the semantic properties of the grammatical subject: sentences that locate objects require estar while those that locate events require ser (both translated in English as 'to be'). In an ERP study, we examined whether second-language (L2) speakers of Spanish are sensitive to the selectional restrictions that the different types of subjects impose on the choice between the two copulas. Twenty-four native speakers of Spanish and two groups of L2 Spanish speakers (24 beginners and 18 advanced speakers) were recruited to investigate the processing of 'object/event + estar/ser' permutations. Participants provided grammaticality judgments on correct (object + estar; event + ser) and incorrect (object + ser; event + estar) sentences while their brain activity was recorded. In line with previous studies (Leone-Fernández, Molinaro, Carreiras, & Barber, 2012; Sera, Gathje, & Pintado, 1999), the grammaticality-judgment results for the native speakers showed that participants correctly accepted object + estar and event + ser constructions. In addition, while 'object + ser' constructions were considered grossly ungrammatical, 'event + estar' combinations were perceived as unacceptable to a lesser degree. For these same participants, ERP recordings time-locked to the onset of the critical word 'en' showed a larger P600 for the ser predicates when the subject was an object than when it was an event (*La silla es en la cocina vs. La fiesta es en la cocina). This P600 effect is consistent with syntactic repair of the defining predicate when it does not fit the semantic properties of the subject. For estar predicates (La silla está en la cocina vs. *La fiesta está en la cocina), the findings showed a central-frontal negativity between 500-700 ms. Grammaticality-judgment data for the L2 speakers of Spanish showed that beginners were significantly less accurate than

  16. The Effect of Noise on Relationships Between Speech Intelligibility and Self-Reported Communication Measures in Tracheoesophageal Speakers.

    Science.gov (United States)

    Eadie, Tanya L; Otero, Devon Sawin; Bolt, Susan; Kapsner-Smith, Mara; Sullivan, Jessica R

    2016-08-01

    The purpose of this study was to examine how sentence intelligibility relates to self-reported communication in tracheoesophageal speakers when speech intelligibility is measured in quiet and noise. Twenty-four tracheoesophageal speakers who were at least 1 year postlaryngectomy provided audio recordings of 5 sentences from the Sentence Intelligibility Test. Speakers also completed self-reported measures of communication-the Voice Handicap Index-10 and the Communicative Participation Item Bank short form. Speech recordings were presented to 2 groups of inexperienced listeners who heard sentences in quiet or noise. Listeners transcribed the sentences to yield speech intelligibility scores. Very weak relationships were found between intelligibility in quiet and measures of voice handicap and communicative participation. Slightly stronger, but still weak and nonsignificant, relationships were observed between measures of intelligibility in noise and both self-reported measures. However, 12 speakers who were more than 65% intelligible in noise showed strong and statistically significant relationships with both self-reported measures (R2 = .76-.79). Speech intelligibility in quiet is a weak predictor of self-reported communication measures in tracheoesophageal speakers. Speech intelligibility in noise may be a better metric of self-reported communicative function for speakers who demonstrate higher speech intelligibility in noise.

  17. Native Speakers' Perception of Non-Native English Speech

    Science.gov (United States)

    Jaber, Maysa; Hussein, Riyad F.

    2011-01-01

    This study is aimed at investigating the rating and intelligibility of different non-native varieties of English, namely French English, Japanese English and Jordanian English by native English speakers and their attitudes towards these foreign accents. To achieve the goals of this study, the researchers used a web-based questionnaire which…

  18. The attentional blink is related to phonemic decoding, but not sight-word recognition, in typically reading adults.

    Science.gov (United States)

    Tyson-Parry, Maree M; Sailah, Jessica; Boyes, Mark E; Badcock, Nicholas A

    2015-10-01

    This research investigated the relationship between the attentional blink (AB) and reading in typical adults. The AB is a deficit in the processing of the second of two rapidly presented targets when it occurs in close temporal proximity to the first target. Specifically, this experiment examined whether the AB was related to both phonological and sight-word reading abilities, and whether the relationship was mediated by accuracy on a single-target rapid serial visual processing task (single-target accuracy). Undergraduate university students completed a battery of tests measuring reading ability, non-verbal intelligence, and rapid automatised naming, in addition to rapid serial visual presentation tasks in which they were required to identify either two (AB task) or one (single-target task) target(s) (outlined shapes: circle, square, diamond, cross, and triangle) in a stream of random-dot distractors. The duration of the AB was related to phonological reading (n = 41, β = -0.43): participants who exhibited longer ABs had poorer phonemic decoding skills. The AB was not related to sight-word reading. Single-target accuracy did not mediate the relationship between the AB and reading, but was significantly related to AB depth (non-linear fit, R² = .50): depth reflects the maximal cost in T2 reporting accuracy in the AB. The differential relationship between the AB and phonological versus sight-word reading implicates common resources used for phonemic decoding and target consolidation, which may be involved in cognitive control. The relationship between single-target accuracy and the AB is discussed in terms of cognitive preparation. Copyright © 2015 Elsevier Ltd. All rights reserved.

  19. Factors influencing the alignment of accounting information systems of accepted manufacturing firms in Tehran Stock Exchange

    Directory of Open Access Journals (Sweden)

    Fazel Tamoradi

    2014-03-01

    Full Text Available The primary objective of this paper is to detect factors influencing the alignment of accounting information systems for firms in the manufacturing sector listed on the Tehran Stock Exchange. The concept of alignment has been investigated for many years, and strategic alignment plays an essential role in increasing company performance. This paper investigates different levels of alignment and studies the factors that influence alignment. More specifically, the work concentrates on the alignment between the requirements for accounting information (AIS requirements and the capacity of accounting systems (AIS capacity to provide that information, in the specific context of manufacturing in Iran. The research sample consists of 216 companies over the period 2007-2011. The fit between these two sets was explored based on the moderation method, and the evidence indicates that AIS alignment in some firms was high. In addition, examining the relationship between the dependent variable and the independent variables through multiple regression yields a positive relationship between these variables.

  20. MUON DETECTORS: ALIGNMENT

    CERN Multimedia

    G.Gomez

    2011-01-01

    The Muon Alignment work now focuses on producing a new track-based alignment with higher track statistics, making systematic studies between the results of the hardware and track-based alignment methods and aligning the barrel using standalone muon tracks. Currently, the muon track reconstruction software uses a hardware-based alignment in the barrel (DT) and a track-based alignment in the endcaps (CSC). An important task is to assess the muon momentum resolution that can be achieved using the current muon alignment, especially for highly energetic muons. For this purpose, cosmic ray muons are used, since the rate of high-energy muons from collisions is very low and the event statistics are still limited. Cosmics have the advantage of higher statistics in the pT region above 100 GeV/c, but they have the disadvantage of having a mostly vertical topology, resulting in a very few global endcap muons. Only the barrel alignment has therefore been tested so far. Cosmic muons traversing CMS from top to bottom are s...